Commit Graph

4928 Commits

Author SHA1 Message Date
Eva Ho 68a3414761 fix test 2026-01-05 13:24:41 -05:00
Eva Ho 9a5c14c58b address comments 2026-01-05 09:38:44 -05:00
Eva Ho 391fb88bce address comment 2026-01-05 09:38:44 -05:00
Eva Ho 75500c8855 address comment 2026-01-05 09:38:44 -05:00
Eva Ho e55fbf2475 fix: gofmt formatting in updater_test.go 2026-01-05 09:38:44 -05:00
Eva Ho c6f941adb3 fix test 2026-01-05 09:38:44 -05:00
Eva Ho 0eb320e74c fix format 2026-01-05 09:38:44 -05:00
Eva Ho 880b4f95b4 fix test 2026-01-05 09:38:44 -05:00
Eva Ho ba25f4a898 fix test 2026-01-05 09:38:44 -05:00
Eva Ho dc573715c4 clean up 2026-01-05 09:38:44 -05:00
Eva Ho 5a5d3260f4 fix behaviour when switching between enabled and disabled 2026-01-05 09:38:44 -05:00
Eva Ho cf7e5e88bc fix test 2026-01-05 09:38:44 -05:00
Eva Ho e76abac24e app: add upgrade configuration to settings page 2026-01-05 09:38:44 -05:00
Harry V. Kiselev d087e46bd1
docs/capabilities/vision: fix curl related code snippet (#13615) 2026-01-03 17:27:46 -05:00
lif 37f6f3af24
server: return error when embedding contains NaN or Inf values (#13599)
The normalize function now checks for NaN and Inf values in the
embedding vector before processing. This prevents JSON encoding
failures when models produce invalid floating-point values.

Fixes #13572

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-03 02:20:12 -05:00
Nhan Nguyen e1bdc23dd2
docs: fix tool name mismatch and trailing commas in api.md example (#13559)
The tool calling example used "get_temperature" for tool_calls but
defined the tool as "get_weather". Also removed trailing commas that
made the JSON invalid.

Fixes #13031
2026-01-03 02:14:53 -05:00
lif 2e78653ff9
app/ui: add swift syntax highlighting support (#13574)
Fixes #13476

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-03 02:12:08 -05:00
lif f5f74e12c1
docs: add version note for /v1/responses API (#13596)
Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-03 01:58:20 -05:00
Vallabh Mahajan 18fdcc94e5
docs: fix broken .md links and render issues (#13550) 2025-12-23 12:44:55 -05:00
Daniel Hiltgen 7ad036992f
amd: use GTT on iGPUs on linux (#13196)
On Linux, look at the GTT memory information for iGPUs.
2025-12-23 09:30:05 -08:00
Jesse Gross 172b5924af llm: Avoid integer underflow on llama engine memory layout
On the llama engine, when we compute the memory layout, we reserve
a buffer to allow for some flexibility for incorrect estimates.
This is subtracted from GPU free memory and on GPUs with limited
memory, it may underflow.

Fixes #13494
2025-12-19 15:48:15 -08:00
Jeffrey Morgan 8852220f59
add REQUIRES command to Modelfile (#13361) 2025-12-18 13:21:29 -08:00
Parth Sareen 7325791599
parsers/renderers: functiongemma (#13521) 2025-12-18 07:55:37 -08:00
Grace 522c11a763
Revert "Omit args and params in tool function def and calls (#13516)" (#13518)
This reverts commit 0fadeffaee.
2025-12-17 19:06:56 -08:00
Grace 0fadeffaee
Omit args and params in tool function def and calls (#13516) 2025-12-17 18:42:21 -08:00
Daniel Hiltgen 49a9c9ba6a
GGML update to ec98e2002 (#13451)
* Revert "add support for NVIDIA Nemotron 3 Nano"

This reverts commit e7d2ae9d69.

* GGML update to 380b4c984

Remove MaskBatchPadding as GGML_KQ_MASK_PAD is no longer present (no
padding required)

* update to c45f89d55

* ec98e2002

solar pro needed more adjusting - needs verification

* review comments
2025-12-17 13:13:55 -08:00
Parth Sareen 1c094038bc
types: add nested property support for tool definitions (#13508) 2025-12-17 11:54:09 -08:00
Grace a013693f80
DeepseekV3 Family Parser (#13484) 2025-12-16 18:56:30 -08:00
Michael Yang f6a016f49d
revert granite-embedding (#13505) 2025-12-16 15:44:52 -08:00
Bruce MacDonald 45c4739374
types: ConfigV2 and RootFS (#13504)
Refactored the ConfigV2 and RootFS types from server/images.go to a new types/model/config.go file under the model package. Updated all references to use model.ConfigV2 and model.RootFS. This allows for use in other projects without worrying about compiling the c code in the llama package.
2025-12-16 15:18:17 -08:00
Michael Yang 2dd029de12
remove unnecessary code (#13502)
slog is already lazily evaluated so this code is completely redundant
2025-12-16 15:11:26 -08:00
Michael Yang 903b1fc97f
use ollama engine for bert models (#13501)
register bpe tokenizer which enables granite-embedding
2025-12-16 11:29:19 -08:00
Parth Sareen 89eb795293
parsers/renderers: use think from user for nemotron (#13492) 2025-12-15 18:55:17 -08:00
Parth Sareen 7e3ea813c1
llama/parsers/renderers: nemotron 3 nano (#13489)
---------

Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
2025-12-15 18:00:08 -08:00
Grace 7b95087b9d
Adding tool definitions to DeepseekV3 renderer (#13491) 2025-12-15 17:57:06 -08:00
Michael Yang 971d62595a
fix: qwen2.5 vl rope (#13486)
* qwen25vl: bump max pixels

* qwen25vl: mrope

fix qwen2.5vl window

* qwen25vl: vision rope
2025-12-15 17:30:33 -08:00
Parth Sareen ffbe8e076d
model: add olmo3 and olmo3.1 (#13415) 2025-12-15 15:20:04 -08:00
Grace 2c639431b1
DeepseekV3 family renderer (#13180) 2025-12-15 14:50:52 -08:00
Nhan Nguyen aacd1cb394
fix: define GGML_VERSION variables for proper SOVERSION expansion (#13469)
The ggml/src/CMakeLists.txt uses GGML_VERSION_MAJOR for the shared
library SOVERSION property, but these variables were not defined when
building from ollama's CMakeLists.txt.

This caused libggml-base.so to be named with a literal "SOVERSION"
suffix (libggml-base.so.SOVERSION) instead of the actual version
number (libggml-base.so.0).

The fix adds the required GGML_VERSION_* variables before including
the ggml subdirectory.

Fixes #13436
2025-12-15 14:42:15 -08:00
Parth Sareen e3731fb160
renderers: add olmo3.1 and olmo3 fixes (#13447) 2025-12-15 11:26:43 -08:00
Eva H 8dbc9e7b68
app/ui: handle unspecified bind addresses and wait for server in ollama proxy (#13159) 2025-12-15 13:33:09 -05:00
Daniel Hiltgen abe67acf8a
Revert "Enable Ollama engine by default" (#13481)
This reverts commit 56f754f46b.
2025-12-15 09:55:45 -08:00
Jeffrey Morgan 4ff8a691bc
model: default gemma 3 rope scale to 1.0, apply corrections based on layer counts (#13453) 2025-12-12 17:51:56 -08:00
Jeffrey Morgan 1b308e1d2a
model: fix global layer rope scale values for gemma 3 (#13452) 2025-12-12 16:29:01 -08:00
Daniel Hiltgen bd6c1d6b49
flash attn: add auto mode for llama engine (#13052)
* flash attn: add auto mode for llama engine

If the user does not specify fa in the environment, use auto-mode.

* review comments

* ensure kv cache quantized types have FA explicitly enabled

additional review comments
2025-12-12 13:27:19 -08:00
Jeffrey Morgan 3af5d3b738
model: force rope factor 1.0 for Gemma 3 (#13445) 2025-12-12 13:27:08 -08:00
Daniel Hiltgen 7730895158
Enable Ollama engine by default (#13443)
This changes the default behavior to use the Ollama engine for supported
models, while retaining the ability to disable the Ollama engine and
fall back to the Llama engine.  Models in the OllamaEngineRequired list
will always run on the Ollama engine.
2025-12-12 11:48:43 -08:00
Eva H de9ecfd01c
tidy up lint warnings on windows (#13430) 2025-12-12 11:43:35 -05:00
Eva H 95fdd8d619
fix: select and update models folder in settings (#13412) 2025-12-12 11:09:37 -05:00
Devon Rifkin 9f7822851c
docs: add docs for v1/responses and rework openai compat section (#13416)
* docs: add docs for v1/responses and rework openai compat section

I reworked the examples to be separated by topic and to be fully
runnable (i.e., they now log output instead of just suggesting how a
call might be made).

We now use `<CodeGroup>`s so that each example has a dropdown on the
docs site for users to choose, which makes the examples a lot more
digestible (since you only see approx 1/3 of the code you used to).

I also added a new tool to extract code examples into files so that it's
easier to actually run them and check that they work.

## Example

```shell
go run docs/tools/extract-examples/main.go docs/api/openai-compatibility.mdx
```

Output:

```
Extracting code examples to: /var/folders/vq/wfm2g6k917d3ldzpjdxc8ph00000gn/T/mdx-examples-3271754368

  - 01_basic.py
  - 01_basic.js
  - 01_basic.sh
  - 02_responses.py
  - 02_responses.js
  - 02_responses.sh
  - 03_vision.py
  - 03_vision.js
  - 03_vision.sh

Extracted 9 file(s) to /var/folders/vq/wfm2g6k917d3ldzpjdxc8ph00000gn/T/mdx-examples-3271754368

To run examples:

  cd /var/folders/vq/wfm2g6k917d3ldzpjdxc8ph00000gn/T/mdx-examples-3271754368
  npm install   # for JS examples

then run individual files with `node file.js`, `python file.py`, `bash file.sh`
```

In the future we should consider actually running the examples in CI and
having some sort of acceptance test so we can automatically detect when
our examples break. So this is just a start in that direction.

* Update docs/api/openai-compatibility.mdx

Co-authored-by: Parth Sareen <parth.sareen@ollama.com>

* Update docs/api/openai-compatibility.mdx

Co-authored-by: Parth Sareen <parth.sareen@ollama.com>

---------

Co-authored-by: Parth Sareen <parth.sareen@ollama.com>
2025-12-11 17:39:40 -08:00