Commit Graph

4929 Commits

Author SHA1 Message Date
ParthSareen fceafefdce anthropic: fix ToolCallFunctionArguments type after rebase
Update tests and implementation to use the new ordered map-based
ToolCallFunctionArguments type which replaces the previous map[string]any.

- Add mapToArgs helper to convert map[string]any to ToolCallFunctionArguments
- Add testArgs and testProps helpers in tests
- Use cmpopts.IgnoreUnexported for cmp.Diff comparisons
2026-01-05 21:50:50 -08:00
ParthSareen bd4ab011ac middleware: use HTTP status code for Anthropic error mapping
Use w.ResponseWriter.Status() instead of parsing StatusCode from JSON
payload. routes.go typically sends errors as gin.H{"error": "..."}
without a StatusCode field, causing all errors to be mapped to
"api_error" instead of the appropriate type (not_found_error,
invalid_request_error, etc.).

Added tests to verify error handling for common routes.go patterns.
2026-01-05 19:19:34 -08:00
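The status-based mapping described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the middleware's actual code: `errorTypeForStatus` is a hypothetical helper name, and only the three error type strings are taken from the commit message. Which status codes the real implementation handles is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
)

// errorTypeForStatus maps an HTTP status code to an Anthropic-style error
// type string. Hypothetical helper: the real mapping may cover more codes.
func errorTypeForStatus(status int) string {
	switch status {
	case http.StatusBadRequest:
		return "invalid_request_error"
	case http.StatusNotFound:
		return "not_found_error"
	default:
		// Anything unrecognized falls back to the generic type.
		return "api_error"
	}
}

func main() {
	codes := []int{
		http.StatusBadRequest,
		http.StatusNotFound,
		http.StatusInternalServerError,
	}
	for _, code := range codes {
		fmt.Printf("%d -> %s\n", code, errorTypeForStatus(code))
	}
}
```

Keying on the status code written to the response avoids depending on whether the JSON error payload happens to carry a StatusCode field.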
ParthSareen c1a6aa8be5 docs: add JavaScript example for tool calling 2026-01-05 19:19:34 -08:00
ParthSareen 9c27c72952 middleware: fix test for pointer type Text field 2026-01-05 19:19:34 -08:00
ParthSareen fa42204da8 anthropic: use pointer types for Text and Thinking fields
Use *string instead of string for Text and Thinking fields in ContentBlock
so that omitempty works correctly:
- nil pointer: field omitted from JSON (for blocks that don't use it)
- ptr(""): field present as "" (for SDK streaming accumulation)
- ptr("content"): field present with content

This keeps the JSON output clean (text blocks don't have thinking field,
thinking blocks don't have text field) while still satisfying SDK
requirements for field presence during streaming.
2026-01-05 19:19:34 -08:00
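The three pointer states listed in the commit above can be reproduced with a small standalone program. The `ContentBlock` here is a simplified stand-in with assumed JSON tags, not the real struct, and `encode` is a hypothetical helper added for the demonstration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ContentBlock is a simplified stand-in for the real type; the field set
// and JSON tags are assumptions made for this sketch.
type ContentBlock struct {
	Type     string  `json:"type"`
	Text     *string `json:"text,omitempty"`
	Thinking *string `json:"thinking,omitempty"`
}

// ptr returns a pointer to a copy of s.
func ptr(s string) *string { return &s }

// encode marshals a block and returns the JSON text.
func encode(b ContentBlock) string {
	out, err := json.Marshal(b)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	// nil pointer: the field is omitted entirely.
	fmt.Println(encode(ContentBlock{Type: "text"}))
	// Pointer to an empty string: the field is present as "".
	fmt.Println(encode(ContentBlock{Type: "text", Text: ptr("")}))
	// Pointer to content: the field is present with that content.
	fmt.Println(encode(ContentBlock{Type: "thinking", Thinking: ptr("step 1")}))
}
```

With a plain `string` field, omitempty cannot distinguish "absent" from "present but empty"; the pointer adds that third state.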
ParthSareen 6188e90aab anthropic: preserve messages with only thinking content
Fix edge case where messages containing only a thinking block (no text,
images, or tool calls) would be dropped. Add thinking != "" to the
condition that creates messages from content blocks.
2026-01-05 19:19:34 -08:00
ParthSareen 515c46c176 docs: add Claude Code integration guide 2026-01-05 19:19:34 -08:00
ParthSareen b44d9b3347 anthropic: add tests for SDK-required empty fields
Add tests documenting that Text and Thinking fields must be present
in JSON output even when empty. The Anthropic SDK requires these fields
in content_block_start events to accumulate streaming deltas properly.

Tests verify:
- ContentBlock JSON includes empty text/thinking fields
- StreamConverter emits content_block_start with required fields
2026-01-05 19:19:34 -08:00
ParthSareen 5ba2092f0a anthropic: fix streaming with SDK by including empty fields
Remove omitempty from Text and Thinking fields in ContentBlock struct.
The Anthropic SDK requires these fields to be present (even if empty)
in content_block_start events to properly accumulate streaming deltas.
2026-01-05 19:19:34 -08:00
ParthSareen 90cf232df2 anthropic: remove redundant comments
Remove obvious comments that don't add value (e.g., "// Convert messages",
"// Handle done"). Keep godoc comments and those explaining API mappings.
2026-01-05 19:19:34 -08:00
ParthSareen ed1e17bb35 anthropic: fix error handling and update docs
- Add proper error handling for JSON marshal in StreamConverter to
  prevent corrupted streams when tool arguments cannot be serialized
- Add tests for unmarshalable arguments and mixed validity scenarios
- Fix documentation typo and update recommended models to qwen3-coder
2026-01-05 19:19:34 -08:00
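The marshal error handling mentioned in the commit above might look something like this sketch. `encodeArgs` is a hypothetical name, and the choice to skip the event on failure is an assumption. The point is only that the error from `json.Marshal` is checked before anything is written to the stream.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeArgs serializes tool-call arguments. It returns an error instead of
// partial output, so the caller can decide not to emit a broken event.
func encodeArgs(args map[string]any) (string, error) {
	b, err := json.Marshal(args)
	if err != nil {
		return "", fmt.Errorf("tool arguments not serializable: %w", err)
	}
	return string(b), nil
}

func main() {
	good := map[string]any{"city": "Paris"}
	// Channels cannot be encoded as JSON, so this value makes Marshal fail.
	bad := map[string]any{"ch": make(chan int)}

	for _, args := range []map[string]any{good, bad} {
		s, err := encodeArgs(args)
		if err != nil {
			fmt.Println("skipping event:", err)
			continue
		}
		fmt.Println(s)
	}
}
```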
ParthSareen 6229df5b90 anthropic: add unit and integration tests
- Unit tests for transformation functions (FromMessagesRequest, ToMessagesResponse)
- Unit tests for error handling and edge cases
- Middleware integration tests with httptest
- Fix lint issues (gofmt)
- Fix unused struct fields in StreamConverter
- Add fallback for crypto/rand errors
2026-01-05 19:19:34 -08:00
ParthSareen f760ae1fdd api: add Anthropic Messages API compatibility layer
Add middleware to support the Anthropic Messages API format at /v1/messages.
This enables tools like Claude Code to work with Ollama models through the
Anthropic API interface.

Features:
- Request/response transformation between Anthropic and internal formats
- Streaming support with SSE events (message_start, content_block_delta, etc.)
- Tool calling support (tool_use and tool_result content blocks)
- Thinking/extended thinking block support
- Image content block support (base64)
- System prompt handling
- Multi-turn conversation support
- Proper stop_reason mapping (end_turn, max_tokens, tool_use)
- Error responses in Anthropic format

New files:
- anthropic/anthropic.go: Types and transformation functions
- middleware/anthropic.go: Request/response middleware
2026-01-05 19:19:34 -08:00
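As a rough illustration of the SSE streaming listed in the commit above, each event can be framed as an `event:` line, a `data:` line carrying JSON, and a blank line. `formatSSE` is a hypothetical helper and the payload shape is an assumption; only the event name comes from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// formatSSE renders one server-sent event frame: an event name, a JSON
// payload on a data line, and a terminating blank line.
func formatSSE(event string, payload any) (string, error) {
	data, err := json.Marshal(payload)
	if err != nil {
		return "", fmt.Errorf("marshal %s payload: %w", event, err)
	}
	return fmt.Sprintf("event: %s\ndata: %s\n\n", event, data), nil
}

func main() {
	// The payload below is illustrative only, not the real message schema.
	frame, err := formatSSE("message_start", map[string]string{"type": "message_start"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(frame)
}
```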
Devon Rifkin e51dead636
preserve tool definition and call JSON ordering (#13525)
* preserve tool definition and call JSON ordering

This is another iteration of
<https://github.com/ollama/ollama/pull/12518>, but this time we've
simplified things by relaxing the competing requirements of being
compatible AND order-preserving with templates (vs. renderers). We
maintain backwards compatibility at the cost of not guaranteeing order
for templates. We plan on moving more and more models to renderers,
which have been updated to use these new data types, and additionally
we could add an opt-in way of templates getting an order-preserved list
(e.g., via sibling template vars)

* orderedmap_test: remove testify
2026-01-05 18:03:36 -08:00
Harry V. Kiselev d087e46bd1
docs/capabilities/vision: fix curl related code snippet (#13615) 2026-01-03 17:27:46 -05:00
lif 37f6f3af24
server: return error when embedding contains NaN or Inf values (#13599)
The normalize function now checks for NaN and Inf values in the
embedding vector before processing. This prevents JSON encoding
failures when models produce invalid floating-point values.

Fixes #13572

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-03 02:20:12 -05:00
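A check of the kind described in the commit above could be sketched like this. The function body is an illustration under assumptions (L2 normalization, returning zero vectors unchanged, the error wording), not the server's actual `normalize`.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// normalize returns an L2-normalized copy of vec, or an error if any
// element is NaN or Inf. Zero vectors are returned as-is; that choice is
// an assumption of this sketch.
func normalize(vec []float32) ([]float32, error) {
	var sum float64
	for _, v := range vec {
		f := float64(v)
		if math.IsNaN(f) || math.IsInf(f, 0) {
			return nil, errors.New("embedding contains NaN or Inf values")
		}
		sum += f * f
	}

	out := make([]float32, len(vec))
	norm := math.Sqrt(sum)
	if norm == 0 {
		return out, nil
	}
	for i, v := range vec {
		out[i] = float32(float64(v) / norm)
	}
	return out, nil
}

func main() {
	fmt.Println(normalize([]float32{3, 4}))
	fmt.Println(normalize([]float32{1, float32(math.NaN())}))
}
```

Rejecting the vector up front matters because `encoding/json` cannot encode NaN or Inf, so the failure would otherwise surface later as an encoding error.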
Nhan Nguyen e1bdc23dd2
docs: fix tool name mismatch and trailing commas in api.md example (#13559)
The tool calling example used "get_temperature" for tool_calls but
defined the tool as "get_weather". Also removed trailing commas that
made the JSON invalid.

Fixes #13031
2026-01-03 02:14:53 -05:00
lif 2e78653ff9
app/ui: add swift syntax highlighting support (#13574)
Fixes #13476

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-03 02:12:08 -05:00
lif f5f74e12c1
docs: add version note for /v1/responses API (#13596)
Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-03 01:58:20 -05:00
Vallabh Mahajan 18fdcc94e5
docs: fix broken .md links and render issues (#13550) 2025-12-23 12:44:55 -05:00
Daniel Hiltgen 7ad036992f
amd: use GTT on iGPUs on linux (#13196)
On Linux, look at the GTT memory information for iGPUs.
2025-12-23 09:30:05 -08:00
Jesse Gross 172b5924af llm: Avoid integer underflow on llama engine memory layout
On the llama engine, when we compute the memory layout, we reserve
a buffer to allow for some flexibility for incorrect estimates.
This is subtracted from GPU free memory and on GPUs with limited
memory, it may underflow.

Fixes #13494
2025-12-19 15:48:15 -08:00
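The underflow described in the commit above is easy to reproduce with unsigned arithmetic. The sketch below assumes both quantities are `uint64` and that clamping to zero is the desired behavior; `availableAfterReserve` is a hypothetical name, not the function the fix touched.

```go
package main

import "fmt"

// availableAfterReserve subtracts a reserved buffer from free memory,
// clamping at zero instead of wrapping around when reserve exceeds free.
func availableAfterReserve(free, reserve uint64) uint64 {
	if reserve >= free {
		return 0
	}
	return free - reserve
}

func main() {
	// A GPU with less free memory than the reserved buffer.
	var free, reserve uint64 = 256 << 20, 512 << 20

	// Unsigned subtraction wraps around to a very large value.
	fmt.Println("naive:  ", free-reserve)
	fmt.Println("clamped:", availableAfterReserve(free, reserve))
}
```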
Jeffrey Morgan 8852220f59
add REQUIRES command to Modelfile (#13361) 2025-12-18 13:21:29 -08:00
Parth Sareen 7325791599
parsers/renderers: functiongemma (#13521) 2025-12-18 07:55:37 -08:00
Grace 522c11a763
Revert "Omit args and params in tool function def and calls (#13516)" (#13518)
This reverts commit 0fadeffaee.
2025-12-17 19:06:56 -08:00
Grace 0fadeffaee
Omit args and params in tool function def and calls (#13516) 2025-12-17 18:42:21 -08:00
Daniel Hiltgen 49a9c9ba6a
GGML update to ec98e2002 (#13451)
* Revert "add support for NVIDIA Nemotron 3 Nano"

This reverts commit e7d2ae9d69.

* GGML update to 380b4c984

Remove MaskBatchPadding as GGML_KQ_MASK_PAD is no longer present (no
padding required)

* update to c45f89d55

* ec98e2002

solar pro needed more adjusting - needs verification

* review comments
2025-12-17 13:13:55 -08:00
Parth Sareen 1c094038bc
types: add nested property support for tool definitions (#13508) 2025-12-17 11:54:09 -08:00
Grace a013693f80
DeepseekV3 Family Parser (#13484) 2025-12-16 18:56:30 -08:00
Michael Yang f6a016f49d
revert granite-embedding (#13505) 2025-12-16 15:44:52 -08:00
Bruce MacDonald 45c4739374
types: ConfigV2 and RootFS (#13504)
Moved the ConfigV2 and RootFS types from server/images.go to a new types/model/config.go file under the model package, and updated all references to use model.ConfigV2 and model.RootFS. Other projects can now use these types without having to compile the C code in the llama package.
2025-12-16 15:18:17 -08:00
Michael Yang 2dd029de12
remove unnecessary code (#13502)
slog is already lazily evaluated so this code is completely redundant
2025-12-16 15:11:26 -08:00
Michael Yang 903b1fc97f
use ollama engine for bert models (#13501)
register bpe tokenizer which enables granite-embedding
2025-12-16 11:29:19 -08:00
Parth Sareen 89eb795293
parsers/renderers: use think from user for nemotron (#13492) 2025-12-15 18:55:17 -08:00
Parth Sareen 7e3ea813c1
llama/parsers/renderers: nemotron 3 nano (#13489)
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
2025-12-15 18:00:08 -08:00
Grace 7b95087b9d
Adding tool definitions to DeepseekV3 renderer (#13491) 2025-12-15 17:57:06 -08:00
Michael Yang 971d62595a
fix: qwen2.5 vl rope (#13486)
* qwen25vl: bump max pixels

* qwen25vl: mrope

fix qwen2.5vl window

* qwen25vl: vision rope
2025-12-15 17:30:33 -08:00
Parth Sareen ffbe8e076d
model: add olmo3 and olmo3.1 (#13415) 2025-12-15 15:20:04 -08:00
Grace 2c639431b1
DeepseekV3 family renderer (#13180) 2025-12-15 14:50:52 -08:00
Nhan Nguyen aacd1cb394
fix: define GGML_VERSION variables for proper SOVERSION expansion (#13469)
The ggml/src/CMakeLists.txt uses GGML_VERSION_MAJOR for the shared
library SOVERSION property, but these variables were not defined when
building from ollama's CMakeLists.txt.

This caused libggml-base.so to be named with a literal "SOVERSION"
suffix (libggml-base.so.SOVERSION) instead of the actual version
number (libggml-base.so.0).

The fix adds the required GGML_VERSION_* variables before including
the ggml subdirectory.

Fixes #13436
2025-12-15 14:42:15 -08:00
Parth Sareen e3731fb160
renderers: add olmo3.1 and olmo3 fixes (#13447) 2025-12-15 11:26:43 -08:00
Eva H 8dbc9e7b68
app/ui: handle unspecified bind addresses and wait for server in ollama proxy (#13159) 2025-12-15 13:33:09 -05:00
Daniel Hiltgen abe67acf8a
Revert "Enable Ollama engine by default" (#13481)
This reverts commit 56f754f46b.
2025-12-15 09:55:45 -08:00
Jeffrey Morgan 4ff8a691bc
model: default gemma 3 rope scale to 1.0, apply corrections based on layer counts (#13453) 2025-12-12 17:51:56 -08:00
Jeffrey Morgan 1b308e1d2a
model: fix global layer rope scale values for gemma 3 (#13452) 2025-12-12 16:29:01 -08:00
Daniel Hiltgen bd6c1d6b49
flash attn: add auto mode for llama engine (#13052)
* flash attn: add auto mode for llama engine

If the user does not specify fa in the environment, use auto-mode.

* review comments

* ensure kv cache quantized types have FA explicitly enabled

additional review comments
2025-12-12 13:27:19 -08:00
Jeffrey Morgan 3af5d3b738
model: force rope factor 1.0 for Gemma 3 (#13445) 2025-12-12 13:27:08 -08:00
Daniel Hiltgen 7730895158
Enable Ollama engine by default (#13443)
This changes the default behavior to use the Ollama engine for supported
models, while retaining the ability to disable the Ollama engine and
fall back to the Llama engine. Models in the OllamaEngineRequired list
will always run on the Ollama engine.
2025-12-12 11:48:43 -08:00
Eva H de9ecfd01c
tidy up lint warnings on windows (#13430) 2025-12-12 11:43:35 -05:00
Eva H 95fdd8d619
fix: select and update models folder in settings (#13412) 2025-12-12 11:09:37 -05:00