Commit Graph

176 Commits

Author SHA1 Message Date
cvrunmin afebb4aa37
Merge e68d6054c1 into e51dead636 2026-01-06 03:15:24 +01:00
Devon Rifkin e51dead636
preserve tool definition and call JSON ordering (#13525)
* preserve tool definition and call JSON ordering

This is another iteration of
<https://github.com/ollama/ollama/pull/12518>, but this time we've
simplified things by relaxing the competing requirements of being
compatible AND order-preserving with templates (vs. renderers). We
maintain backwards compatibility at the cost of not guaranteeing order
for templates. We plan on moving more and more models to renderers,
which have been updated to use these new data types, and additionally
we could add an opt-in way of templates getting an order-preserved list
(e.g., via sibling template vars)

* orderedmap_test: remove testify
2026-01-05 18:03:36 -08:00
cvrunmin e68d6054c1 model: add tensor names in mmproj 2025-12-19 17:42:53 +08:00
Parth Sareen 7325791599
parsers/renderers: functiongemma (#13521) 2025-12-18 07:55:37 -08:00
Grace a013693f80
DeepseekV3 Family Parser (#13484) 2025-12-16 18:56:30 -08:00
Michael Yang f6a016f49d
revert granite-embedding (#13505) 2025-12-16 15:44:52 -08:00
Michael Yang 2dd029de12
remove unnecessary code (#13502)
slog is already lazily evaluated so this code is completely redundant
2025-12-16 15:11:26 -08:00
Michael Yang 903b1fc97f
use ollama engine for bert models (#13501)
register bpe tokenizer which enables granite-embedding
2025-12-16 11:29:19 -08:00
Parth Sareen 89eb795293
parsers/renderers: use think from user for nemotron (#13492) 2025-12-15 18:55:17 -08:00
Parth Sareen 7e3ea813c1
llama/parsers/renderers: nemotron 3 nano (#13489)
---------

Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
2025-12-15 18:00:08 -08:00
Grace 7b95087b9d
Adding tool definitions to DeepseekV3 renderer (#13491) 2025-12-15 17:57:06 -08:00
Michael Yang 971d62595a
fix: qwen2.5 vl rope (#13486)
* qwen25vl: bump max pixels

* qwen25vl: mrope

fix qwen2.5vl window

* qwen25vl: vision rope
2025-12-15 17:30:33 -08:00
Parth Sareen ffbe8e076d
model: add olmo3 and olmo3.1 (#13415) 2025-12-15 15:20:04 -08:00
Grace 2c639431b1
DeepseekV3 family renderer (#13180) 2025-12-15 14:50:52 -08:00
Parth Sareen e3731fb160
renderers: add olmo3.1 and olmo3 fixes (#13447) 2025-12-15 11:26:43 -08:00
Jeffrey Morgan 4ff8a691bc
model: default gemma 3 rope scale to 1.0, apply corrections based on layer counts (#13453) 2025-12-12 17:51:56 -08:00
Jeffrey Morgan 1b308e1d2a
model: fix global layer rope scale values for gemma 3 (#13452) 2025-12-12 16:29:01 -08:00
Jeffrey Morgan 3af5d3b738
model: force rope factor 1.0 for Gemma 3 (#13445) 2025-12-12 13:27:08 -08:00
Jeffrey Morgan 2dfb74410d
model: fix rotary embeddings for ministral 3 (#13432) 2025-12-11 16:02:05 -08:00
Jeffrey Morgan a838421ea3
model: conversion and hyperparameter fixes for ministral and devstral (#13424) 2025-12-11 13:04:00 -08:00
nicole pardal 76f88caf43
nomic-embed-text:v2: model implementation (#13162) 2025-12-09 14:24:51 -08:00
Parth Sareen 2bccf8c624
renderers/parsers: olmo3 instruct (#13383) 2025-12-09 11:12:27 -08:00
Parth Sareen 0c5e5f6630
parsers/renderers: olmo3 think (#13290) 2025-12-09 10:41:47 -08:00
Jeffrey Morgan d2f334c1f7
model: add rnj-1 inference support (#13354) 2025-12-08 16:49:17 -08:00
Michael Yang 603ceefaa6 refactor rope
change to a flatter directory structure and group the options with the
function

update models to call rope in one place
2025-12-08 14:42:22 -08:00
Patrick Devine d3e0a0dee4
model: ministral w/ llama4 scaling (#13292)
This change:

* fixes rope scaling in the mistral converter
* updates ministral to include llama4 scaling
* includes a new ministral parser for parsing reasoning and tool calling

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
2025-12-01 23:20:14 -08:00
Grace d70e935526
Parser for Cogito v2 (#13145) 2025-11-19 17:21:07 -08:00
Michael Yang 5c1063df7f
deepseek2: upgrade to run v3+ models (#13166)
the check for mla omits v3 and r1 which should not return unsupported.
instead check the tokenizer for compatibility
2025-11-19 17:05:39 -08:00
Patrick Devine 604e43b28d
models: enable deepseek2 (deepseek v3.1 w/ MLA) on the new engine (#13151) 2025-11-18 22:03:50 -08:00
Grace 91935631ac
Renderer for Cogito v2 (#13139) 2025-11-18 19:06:34 -08:00
nicole pardal 8de30b568a
nomic-embed-text model implementation (#13071) 2025-11-18 18:28:10 -08:00
Michael Yang 92981ae3f2 deepseekocr 2025-11-18 16:11:37 -08:00
Michael Yang 440a3823a6
fix(tokenizer): add special tokens to empty inputs (#13091) 2025-11-18 11:16:56 -08:00
Grace 584e2d646f
Add deepseek v3.1 (#13063)
* Add mla for flash attention
* Revert to using chunks
2025-11-17 18:03:21 -08:00
Michael Yang 333203d871
chore: update models to use slice/chunk/chunksections (#12934)
* use slice/chunks

* bert

* llama4

* gemma3n

* gptoss

* mistral3

* qwen3vl

* qwen25vl

* deepseek2

* remove unused ops
2025-11-13 15:20:12 -08:00
Daniel Hiltgen 544b6739dd
ggml update to b6840 (#12791) 2025-11-06 10:19:22 -08:00
Michael Yang ce3eb0a315
chore(gptoss): cleanup dead code (#12932) 2025-11-03 11:27:15 -08:00
Michael Yang f67a6df110
interleaved mrope (#12807)
* ml(ggml): mrope
* interleave mrope
2025-10-30 11:29:00 -07:00
Michael Yang d432ade714
fix: qwen2.5vl, qwen3vl composite image (#12841)
this change fixes images with an alpha channel by overlaying the image
onto a white background
2025-10-30 10:33:19 -07:00
Grace 0a2d92081b
Removing whitespace between Thinking and Content in Qwen3VL (#12838)
Eats extra whitespace at the end/beginning of content
2025-10-29 15:14:28 -07:00
Michael Yang 7d25b9e194
feat(model): add qwen3vl (#12665) 2025-10-28 17:39:47 -07:00
Michael Yang 1188f408dd
s/From*Slice/From*s/ (#12255) 2025-10-28 12:08:49 -07:00
Michael Yang ec9eb28f4c
gemma3: make embedding non-causal (#12297) 2025-10-27 19:54:08 -07:00
Jeffrey Morgan 94f110b35a
model/parsers: remove warning for missing <think> tag for qwen3-vl (#12713) 2025-10-20 16:03:43 -07:00
Daniel Hiltgen bc1a818fdc
contiguous input per layer (#12686)
Co-authored-by: Michael Yang <git@mxy.ng>
2025-10-17 18:39:18 -07:00
Jeffrey Morgan 65fb3ff49d
renderers: add global flag for setting [img] tags (#12669)
Adds a temporary global flag to renderers that causes renderers to always
render images as [img]. In a follow up change, we will consider making this
the default, and this flag could eventually be removed
2025-10-16 16:37:32 -07:00
Grace e2a0b24435
Grace/qwen3 thinking (#12647)
* changing initial status to take into consideration prefill

* Add seperate strings for content and thinking builder

* thinking tests

* remove white space from string before closing think tag
2025-10-16 15:29:41 -07:00
Devon Rifkin 08fbb60bb2 qwen3-coder: support anyOf when parsing tool calls 2025-10-14 15:33:05 -07:00
Devon Rifkin ddaca643d0 add registries for parsers/renderers 2025-10-14 01:13:54 -07:00
Grace 05982a95cb
Qwen3VL Cloud Parser and Renderer (#12526)
* working (other than tool call is the incorrect order) for tool calls and tools

* Tests work, other than image tags (tests do not go through server) and tools (not in the correct order, but contents are the same)

* testing for qwen3vl parser - toolparser is working

* made changes to JSON tool parser, wraps the TollCallFunction with a TollCall object

* Working parser for thinking models - assumes state of thinking, emits unambiguous content in thinking, does not call tool call in thinking

* changed the parser to start with collecting content

* thinking prefill

* add hasThinkingSupport parameter to parser

* qwen3-vl -> qwen3-vl-instruct for renderer/parser

* Add hasThinkingSupport=false to QwenVLParser

---------

Co-authored-by: Devon Rifkin <drifkin@drifkin.net>
2025-10-13 16:52:33 -07:00