MCP (Model Context Protocol) support:
- Add MCPRef type for agent MCP server references
- Parse MCP command in Agentfiles (MCP name command [args...]; sketched after this list)
- Load and manage MCP servers with mcpManager
- Implement agentic loop for multi-turn tool execution
- Add /mcp REPL commands (add, remove, disable, enable)
- Add 'ollama mcp' CLI commands for global config management
- Support both model-bundled and global (~/.ollama/mcp.json) MCPs
- Display MCPs in 'ollama show' output
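A minimal Go sketch of the directive parsing. The MCPRef field names and parseMCPLine are assumptions for illustration; only the directive shape (MCP name command [args...]) comes from the change above:

```go
package main

import (
	"fmt"
	"strings"
)

// MCPRef sketches an agent's MCP server reference; field names are
// illustrative assumptions, not Ollama's actual type.
type MCPRef struct {
	Name    string   // server name, e.g. "filesystem"
	Command string   // executable that serves MCP over stdio
	Args    []string // arguments passed to the command
}

// parseMCPLine parses an Agentfile line of the form: MCP name command [args...]
func parseMCPLine(line string) (MCPRef, error) {
	fields := strings.Fields(line)
	if len(fields) < 3 || !strings.EqualFold(fields[0], "MCP") {
		return MCPRef{}, fmt.Errorf("expected: MCP name command [args...], got %q", line)
	}
	return MCPRef{Name: fields[1], Command: fields[2], Args: fields[3:]}, nil
}

func main() {
	ref, err := parseMCPLine(`MCP filesystem npx -y @modelcontextprotocol/server-filesystem /tmp`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", ref)
}
```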
ENTRYPOINT support:
- Add ENTRYPOINT command to Agentfiles for custom runtimes
- Allow agents without FROM when ENTRYPOINT is specified
- Execute entrypoint as subprocess with stdin/stdout connected
- Support a $PROMPT placeholder to control where the prompt is inserted (see the sketch after this list)
- Hide Model section in 'ollama show' for entrypoint-only agents
- Pass user prompt as argument to entrypoint command
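A minimal Go sketch of the subprocess execution; runEntrypoint and the exact fallback rule are illustrative assumptions, while the placeholder, the appended-prompt behavior, and the connected stdin/stdout come from the list above:

```go
package sketch

import (
	"os"
	"os/exec"
	"strings"
)

// runEntrypoint sketches the entrypoint execution. If any argument
// contains $PROMPT, the placeholder is replaced with the user prompt;
// otherwise the prompt is appended as the final argument.
func runEntrypoint(command string, args []string, prompt string) error {
	replaced := false
	for i, a := range args {
		if strings.Contains(a, "$PROMPT") {
			args[i] = strings.ReplaceAll(a, "$PROMPT", prompt)
			replaced = true
		}
	}
	if !replaced {
		args = append(args, prompt)
	}

	cmd := exec.Command(command, args...)
	cmd.Stdin = os.Stdin // keep stdin/stdout connected so the runtime is interactive
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}
```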
- Add check for registry references without digest in loadSkillsFromRefs
- Fix IsLocalSkillPath so registry refs are no longer treated as local paths (sketched below)
- Inject OLLAMA_WORKING_DIR env var so skill scripts can access the
directory where 'ollama run' was called from
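A sketch of both fixes with assumed bodies; only the names IsLocalSkillPath and OLLAMA_WORKING_DIR come from the change, and in the real code the env var carries the directory 'ollama run' was invoked from:

```go
package sketch

import (
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// IsLocalSkillPath: only filesystem-looking paths are local, so a
// registry ref such as "skill/calc:1.0.0" is not mistaken for a
// relative path. This body is a sketch of the fixed behavior.
func IsLocalSkillPath(s string) bool {
	return strings.HasPrefix(s, "./") ||
		strings.HasPrefix(s, "../") ||
		filepath.IsAbs(s)
}

// withWorkingDir injects the working directory for a skill script;
// cmd is the exec.Cmd about to run the script.
func withWorkingDir(cmd *exec.Cmd) error {
	cwd, err := os.Getwd()
	if err != nil {
		return err
	}
	cmd.Env = append(os.Environ(), "OLLAMA_WORKING_DIR="+cwd)
	return nil
}
```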
Add skill management commands and interactive REPL support:
CLI commands (cmd/skill_cmd.go):
ollama skill push NAME PATH - Push skill to registry
ollama skill pull NAME - Pull skill from registry
ollama skill list - List installed skills
ollama skill show NAME - Show skill details
ollama skill rm NAME - Remove a skill
Skill loading (cmd/skills.go):
- Load skills from model manifests
- Parse SKILL.md frontmatter for metadata (see the sketch after this list)
- Inject skill instructions into system prompt
- Provide run_skill_script tool for script execution
Interactive mode (cmd/interactive.go):
/skills - Show available skills
/skill add PATH - Add skill from local path
/skill remove NAME - Remove skill from session
/skill list - List session skills
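A sketch of the frontmatter parsing, assuming a "---"-delimited block of simple key: value pairs; the real parser may use a YAML library and richer fields:

```go
package sketch

import "strings"

// parseFrontmatter reads SKILL.md metadata from a leading frontmatter
// block and returns the remaining body (the skill instructions).
func parseFrontmatter(md string) (meta map[string]string, body string) {
	meta = map[string]string{}
	rest, found := strings.CutPrefix(md, "---\n")
	if !found {
		return meta, md // no frontmatter; whole file is instructions
	}
	front, body, found := strings.Cut(rest, "\n---\n")
	if !found {
		return meta, md // unterminated frontmatter; treat as plain body
	}
	for _, line := range strings.Split(front, "\n") {
		if k, v, ok := strings.Cut(line, ":"); ok {
			meta[strings.TrimSpace(k)] = strings.TrimSpace(v)
		}
	}
	return meta, body
}
```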
Add SKILL command to the Modelfile/Agentfile parser.
Supports both local paths and registry references:
SKILL ./path/to/skill # Local skill bundled with agent
SKILL skill/calc:1.0.0 # Registry skill reference
SKILL alice/skill/calc:1.0 # User skill from registry
Add skill-related types to the API and configuration:
- api/types.go: Skill reference types for API requests/responses (a sketch follows this list)
- types/model/config.go: Skill configuration in model config
- envconfig/config.go: Environment configuration for skills
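A hypothetical shape for the API-side reference type; every field here is an assumption for illustration, not the actual definition in api/types.go:

```go
package sketch

// SkillRef sketches a skill reference in API requests/responses.
type SkillRef struct {
	Name   string `json:"name"`             // e.g. "alice/skill/calc:1.0"
	Path   string `json:"path,omitempty"`   // set for local, bundled skills
	Digest string `json:"digest,omitempty"` // content digest for registry skills
}
```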
Add support for skill layers in model manifests:
- server/skill.go: New file with skill extraction and packaging
- GetSkillsPath: Returns path to extracted skills cache
- ExtractSkillBlob: Extracts skill tar.gz to cache
- CreateSkillLayer: Creates skill blob from directory
- ParseSkillName/GetSkillManifestPath: Skill name handling
- server/images.go: Extract skill layers on pull
- server/create.go: Create skill layers from SKILL directives
- server/routes.go: Skill-related route handling
Skills are stored as gzipped tar archives with MediaType
"application/vnd.ollama.image.skill".
Updates ModelPath struct and parsing to support the Kind field,
enabling skills and agents to use the 5-part naming structure.
- ParseModelPath detects valid kinds (skill, agent)
- GetNamespaceRepository includes kind in path
- GetManifestPath returns correct 5-part filepath
- GetFullTagname/GetShortTagname include kind when present
Extends the model name structure from 4-part to 5-part:
host/namespace/kind/model:tag
The Kind field is optional and supports:
- "skill" for skill packages
- "agent" for agent packages (future)
- empty for regular models
Parser detects valid kinds to distinguish between old format
(host/namespace/model) and new format (host/namespace/kind/model).
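A sketch of that disambiguation rule; splitKind is an illustrative helper, not the actual ParseModelPath:

```go
package sketch

import "strings"

var validKinds = map[string]bool{"skill": true, "agent": true}

// splitKind: a name parses as the 5-part form only when the third
// segment is a known kind, so "host/ns/mymodel" keeps its old 4-part
// meaning even though "host/ns/skill/mymodel" has one more segment.
func splitKind(name string) (kind, rest string) {
	parts := strings.Split(name, "/")
	if len(parts) == 4 && validKinds[parts[2]] { // host/namespace/kind/model
		return parts[2], strings.Join([]string{parts[0], parts[1], parts[3]}, "/")
	}
	return "", name
}
```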
On the llama engine, when we compute the memory layout, we reserve
a buffer to allow some headroom for incorrect estimates. This buffer
is subtracted from GPU free memory, and on GPUs with limited memory
the subtraction may underflow.
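One way to guard against this, as a sketch (the actual fix may differ):

```go
package sketch

// availableAfterReserve: free and reserve are unsigned byte counts, so a
// plain "free - reserve" wraps to a huge value when the reserve exceeds
// the GPU's free memory. Clamp to zero instead.
func availableAfterReserve(free, reserve uint64) uint64 {
	if reserve > free {
		return 0
	}
	return free - reserve
}
```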
Fixes #13494
* Revert "add support for NVIDIA Nemotron 3 Nano"
This reverts commit e7d2ae9d69.
* GGML update to 380b4c984
Remove MaskBatchPadding as GGML_KQ_MASK_PAD is no longer present (no
padding required)
* update to c45f89d55
* ec98e2002
Solar Pro needed more adjusting; needs verification
* review comments
Moved the ConfigV2 and RootFS types from server/images.go to a new types/model/config.go file under the model package, and updated all references to use model.ConfigV2 and model.RootFS. This allows the types to be used in other projects without pulling in the C code compiled by the llama package.
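Minimal usage after the move (module path assumed from the repository layout):

```go
package sketch

import "github.com/ollama/ollama/types/model"

// Callers now reference the moved types from the model package.
var (
	cfg model.ConfigV2 // previously defined in server/images.go
	fs  model.RootFS
)
```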
The ggml/src/CMakeLists.txt uses GGML_VERSION_MAJOR for the shared
library SOVERSION property, but these variables were not defined when
building from ollama's CMakeLists.txt.
This caused libggml-base.so to be named with a literal "SOVERSION"
suffix (libggml-base.so.SOVERSION) instead of the actual version
number (libggml-base.so.0).
The fix adds the required GGML_VERSION_* variables before including
the ggml subdirectory.
Fixes #13436
* flash attn: add auto mode for llama engine
If the user does not specify flash attention (fa) in the environment, use auto mode (sketched below).
* review comments
* ensure kv cache quantized types have FA explicitly enabled
additional review comments
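A sketch of the resulting policy, with illustrative names and an assumed tri-state mode:

```go
package sketch

import "errors"

type faMode int

const (
	faAuto faMode = iota // engine decides per model/backend
	faOn
	faOff
)

// resolveFlashAttn: an unset environment value means auto mode, and a
// quantized KV cache type is only valid when flash attention is
// explicitly enabled.
func resolveFlashAttn(envVal string, kvCacheQuantized bool) (faMode, error) {
	switch envVal {
	case "1", "true":
		return faOn, nil
	case "0", "false":
		return faOff, nil
	case "":
		if kvCacheQuantized {
			return faOff, errors.New("quantized KV cache requires flash attention to be explicitly enabled")
		}
		return faAuto, nil
	default:
		return faAuto, nil
	}
}
```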
This changes the default behavior to use the Ollama engine for supported
models, while retaining the ability to disable the Ollama engine and
fall back to the Llama engine. Models in the OllamaEngineRequired list
will always run on the Ollama engine.
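A sketch of the selection policy with illustrative names:

```go
package sketch

// Models that must always run on the Ollama engine (contents elided).
var ollamaEngineRequired = map[string]bool{}

// useOllamaEngine: required models always get the Ollama engine;
// otherwise supported models default to it unless the user explicitly
// disables it and falls back to the llama engine.
func useOllamaEngine(model string, supported, userDisabled bool) bool {
	if ollamaEngineRequired[model] {
		return true
	}
	return supported && !userDisabled
}
```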
* docs: add docs for v1/responses and rework openai compat section
I reworked the examples to be separated by topic and to be fully
runnable (i.e., they now log output instead of just suggesting how a
call might be made).
We now use `<CodeGroup>`s so that each example has a language dropdown on the
docs site, which makes the examples a lot more digestible (since you only see
roughly a third of the code you used to).
I also added a new tool to extract code examples into files so that it's
easier to actually run them and check that they work.
## Example
```shell
go run docs/tools/extract-examples/main.go docs/api/openai-compatibility.mdx
```
Output:
```
Extracting code examples to: /var/folders/vq/wfm2g6k917d3ldzpjdxc8ph00000gn/T/mdx-examples-3271754368
- 01_basic.py
- 01_basic.js
- 01_basic.sh
- 02_responses.py
- 02_responses.js
- 02_responses.sh
- 03_vision.py
- 03_vision.js
- 03_vision.sh
Extracted 9 file(s) to /var/folders/vq/wfm2g6k917d3ldzpjdxc8ph00000gn/T/mdx-examples-3271754368
To run examples:
cd /var/folders/vq/wfm2g6k917d3ldzpjdxc8ph00000gn/T/mdx-examples-3271754368
npm install # for JS examples
then run individual files with `node file.js`, `python file.py`, `bash file.sh`
```
In the future we should consider actually running the examples in CI and
having some sort of acceptance test so we can automatically detect when
our examples break. So this is just a start in that direction.
Co-authored-by: Parth Sareen <parth.sareen@ollama.com>
This PR detects embedding models and sets batch_size = context_size so the full input fits in a single batch.
Previously, if batch size was smaller than the input, tokens could be split across batches and cause a SIGTRAP crash.
This change ensures all tokens stay in one batch and prevents crashes.
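A minimal sketch of the sizing rule with illustrative names:

```go
package sketch

// batchSizeFor: embedding models get batchSize == contextSize so the full
// input always fits in a single batch and tokens are never split across
// batches (the crash trigger described above).
func batchSizeFor(isEmbedding bool, contextSize, requested int) int {
	if isEmbedding {
		return contextSize
	}
	return requested
}
```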
Fixes: #12938 #13054
Co-authored-by: Jesse Gross <jesse@ollama.com>