Commit Graph

180 Commits

Author SHA1 Message Date
cvrunmin 9e70f0fcfd
Merge 328675151b into 6c3faafed2 2026-01-06 08:25:48 +01:00
Devon Rifkin 6c3faafed2
olmo3: fix flaky test (#13629)
I introduced this in <https://github.com/ollama/ollama/pull/13525>
2026-01-05 22:37:20 -08:00
Devon Rifkin e51dead636
preserve tool definition and call JSON ordering (#13525)
* preserve tool definition and call JSON ordering

This is another iteration of
<https://github.com/ollama/ollama/pull/12518>, but this time we've
simplified things by relaxing the competing requirements of being
compatible AND order-preserving with templates (vs. renderers). We
maintain backwards compatibility at the cost of not guaranteeing order
for templates. We plan on moving more and more models to renderers,
which have been updated to use these new data types, and additionally
we could add an opt-in way of templates getting an order-preserved list
(e.g., via sibling template vars)

* orderedmap_test: remove testify
2026-01-05 18:03:36 -08:00
cvrunmin 328675151b Merge branch 'main' into feat/split-gguf 2025-12-19 10:04:49 +08:00
Parth Sareen 7325791599
parsers/renderers: functiongemma (#13521) 2025-12-18 07:55:37 -08:00
Grace a013693f80
DeepseekV3 Family Parser (#13484) 2025-12-16 18:56:30 -08:00
Michael Yang f6a016f49d
revert granite-embedding (#13505) 2025-12-16 15:44:52 -08:00
Michael Yang 2dd029de12
remove unnecessary code (#13502)
slog is already lazily evaluated so this code is completely redundant
2025-12-16 15:11:26 -08:00
Michael Yang 903b1fc97f
use ollama engine for bert models (#13501)
register bpe tokenizer which enables granite-embedding
2025-12-16 11:29:19 -08:00
Parth Sareen 89eb795293
parsers/renderers: use think from user for nemotron (#13492) 2025-12-15 18:55:17 -08:00
Parth Sareen 7e3ea813c1
llama/parsers/renderers: nemotron 3 nano (#13489)
---------

Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
2025-12-15 18:00:08 -08:00
Grace 7b95087b9d
Adding tool definitions to DeepseekV3 renderer (#13491) 2025-12-15 17:57:06 -08:00
Michael Yang 971d62595a
fix: qwen2.5 vl rope (#13486)
* qwen25vl: bump max pixels

* qwen25vl: mrope

fix qwen2.5vl window

* qwen25vl: vision rope
2025-12-15 17:30:33 -08:00
Parth Sareen ffbe8e076d
model: add olmo3 and olmo3.1 (#13415) 2025-12-15 15:20:04 -08:00
Grace 2c639431b1
DeepseekV3 family renderer (#13180) 2025-12-15 14:50:52 -08:00
Parth Sareen e3731fb160
renderers: add olmo3.1 and olmo3 fixes (#13447) 2025-12-15 11:26:43 -08:00
cvrunmin c81b9eccda Merge branch 'main' into feat/split-gguf 2025-12-15 09:13:02 +08:00
Jeffrey Morgan 4ff8a691bc
model: default gemma 3 rope scale to 1.0, apply corrections based on layer counts (#13453) 2025-12-12 17:51:56 -08:00
Jeffrey Morgan 1b308e1d2a
model: fix global layer rope scale values for gemma 3 (#13452) 2025-12-12 16:29:01 -08:00
Jeffrey Morgan 3af5d3b738
model: force rope factor 1.0 for Gemma 3 (#13445) 2025-12-12 13:27:08 -08:00
Jeffrey Morgan 2dfb74410d
model: fix rotary embeddings for ministral 3 (#13432) 2025-12-11 16:02:05 -08:00
Jeffrey Morgan a838421ea3
model: conversion and hyperparameter fixes for ministral and devstral (#13424) 2025-12-11 13:04:00 -08:00
nicole pardal 76f88caf43
nomic-embed-text:v2: model implementation (#13162) 2025-12-09 14:24:51 -08:00
Parth Sareen 2bccf8c624
renderers/parsers: olmo3 instruct (#13383) 2025-12-09 11:12:27 -08:00
Parth Sareen 0c5e5f6630
parsers/renderers: olmo3 think (#13290) 2025-12-09 10:41:47 -08:00
Jeffrey Morgan d2f334c1f7
model: add rnj-1 inference support (#13354) 2025-12-08 16:49:17 -08:00
Michael Yang 603ceefaa6 refactor rope
change to a flatter directory structure and group the options with the
function

update models to call rope in one place
2025-12-08 14:42:22 -08:00
cvrunmin 44e6a789ca
Merge branch 'main' into feat/split-gguf 2025-12-08 21:53:49 +08:00
Patrick Devine d3e0a0dee4
model: ministral w/ llama4 scaling (#13292)
This change:

* fixes rope scaling in the mistral converter
* updates ministral to include llama4 scaling
* includes a new ministral parser for parsing reasoning and tool calling

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
2025-12-01 23:20:14 -08:00
cvrunmin efdd9b76da gguf: add split gguf loading 2025-11-24 16:59:06 +08:00
Grace d70e935526
Parser for Cogito v2 (#13145) 2025-11-19 17:21:07 -08:00
Michael Yang 5c1063df7f
deepseek2: upgrade to run v3+ models (#13166)
the check for mla omits v3 and r1 which should not return unsupported.
instead check the tokenizer for compatibility
2025-11-19 17:05:39 -08:00
Patrick Devine 604e43b28d
models: enable deepseek2 (deepseek v3.1 w/ MLA) on the new engine (#13151) 2025-11-18 22:03:50 -08:00
Grace 91935631ac
Renderer for Cogito v2 (#13139) 2025-11-18 19:06:34 -08:00
nicole pardal 8de30b568a
nomic-embed-text model implementation (#13071) 2025-11-18 18:28:10 -08:00
Michael Yang 92981ae3f2 deepseekocr 2025-11-18 16:11:37 -08:00
Michael Yang 440a3823a6
fix(tokenizer): add special tokens to empty inputs (#13091) 2025-11-18 11:16:56 -08:00
Grace 584e2d646f
Add deepseek v3.1 (#13063)
* Add mla for flash attention
* Revert to using chunks
2025-11-17 18:03:21 -08:00
Michael Yang 333203d871
chore: update models to use slice/chunk/chunksections (#12934)
* use slice/chunks

* bert

* llama4

* gemma3n

* gptoss

* mistral3

* qwen3vl

* qwen25vl

* deepseek2

* remove unused ops
2025-11-13 15:20:12 -08:00
Daniel Hiltgen 544b6739dd
ggml update to b6840 (#12791) 2025-11-06 10:19:22 -08:00
Michael Yang ce3eb0a315
chore(gptoss): cleanup dead code (#12932) 2025-11-03 11:27:15 -08:00
Michael Yang f67a6df110
interleaved mrope (#12807)
* ml(ggml): mrope
* interleave mrope
2025-10-30 11:29:00 -07:00
Michael Yang d432ade714
fix: qwen2.5vl, qwen3vl composite image (#12841)
this change fixes images with an alpha channel by overlaying the image
onto a white background
2025-10-30 10:33:19 -07:00
Grace 0a2d92081b
Removing whitespace between Thinking and Content in Qwen3VL (#12838)
Eats extra whitespace at the end/beginning of content
2025-10-29 15:14:28 -07:00
Michael Yang 7d25b9e194
feat(model): add qwen3vl (#12665) 2025-10-28 17:39:47 -07:00
Michael Yang 1188f408dd
s/From*Slice/From*s/ (#12255) 2025-10-28 12:08:49 -07:00
Michael Yang ec9eb28f4c
gemma3: make embedding non-causal (#12297) 2025-10-27 19:54:08 -07:00
Jeffrey Morgan 94f110b35a
model/parsers: remove warning for missing <think> tag for qwen3-vl (#12713) 2025-10-20 16:03:43 -07:00
Daniel Hiltgen bc1a818fdc
contiguous input per layer (#12686)
Co-authored-by: Michael Yang <git@mxy.ng>
2025-10-17 18:39:18 -07:00
Jeffrey Morgan 65fb3ff49d
renderers: add global flag for setting [img] tags (#12669)
Adds a temporary global flag to renderers that causes renderers to always
render images as [img]. In a follow up change, we will consider making this
the default, and this flag could eventually be removed
2025-10-16 16:37:32 -07:00