The updated interface supports variadic attention options, which
removes the need for individual `AttentionWith...` functions. This means
more models can use the attention interface, e.g. models with
custom masks, logit softcapping, etc.
Additionally, this interface should be less error-prone since there are
now reasonable defaults for all optional parameters.
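For illustration only, here is a minimal sketch of how variadic options with defaults are commonly expressed in Go (the functional options pattern). Every name below (`attentionConfig`, `AttentionOption`, `WithScale`, `WithLogitSoftcap`, `WithCustomMask`, `resolveOptions`) is hypothetical and is not taken from the actual interface; the default scale of 1/sqrt(headDim) is likewise an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// attentionConfig is a hypothetical holder for the optional settings.
type attentionConfig struct {
	scale      float64 // multiplier applied to the attention logits
	softcap    float64 // 0 means logit softcapping is disabled
	customMask bool    // false means the default mask is used
}

// AttentionOption mutates the config; callers pass any number of them.
type AttentionOption func(*attentionConfig)

// WithScale overrides the default scale.
func WithScale(s float64) AttentionOption {
	return func(c *attentionConfig) { c.scale = s }
}

// WithLogitSoftcap enables softcapping at the given limit.
func WithLogitSoftcap(limit float64) AttentionOption {
	return func(c *attentionConfig) { c.softcap = limit }
}

// WithCustomMask marks that the caller supplies its own mask.
func WithCustomMask() AttentionOption {
	return func(c *attentionConfig) { c.customMask = true }
}

// resolveOptions starts from defaults and applies the options in order,
// so a caller who passes nothing still gets a usable configuration.
func resolveOptions(headDim int, opts ...AttentionOption) attentionConfig {
	cfg := attentionConfig{scale: 1 / math.Sqrt(float64(headDim))}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	// No options: defaults only.
	fmt.Printf("%+v\n", resolveOptions(64))
	// A model that needs softcapping and a custom mask opts in explicitly.
	fmt.Printf("%+v\n", resolveOptions(64, WithLogitSoftcap(30), WithCustomMask()))
}
```

With this shape, a single entry point can replace a family of per-feature functions, and adding a new optional parameter does not change existing call sites.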
This change:
* fixes rope scaling in the mistral converter
* updates ministral to include llama4 scaling
* includes a new ministral parser for parsing reasoning and tool calling
---------
Co-authored-by: jmorganca <jmorganca@gmail.com>
Adds a temporary global flag to renderers that causes renderers to always
render images as [img]. In a follow-up change, we will consider making this
the default, and this flag could eventually be removed.
* changing initial status to take into consideration prefill
* Add separate strings for content and thinking builder
* thinking tests
* remove white space from string before closing think tag
* tool calls and tools working (other than tool calls being in the incorrect order)
* Tests work, other than image tags (tests do not go through server) and tools (not in the correct order, but contents are the same)
* testing for qwen3vl parser - toolparser is working
* made changes to JSON tool parser, wraps the ToolCallFunction with a ToolCall object
* Working parser for thinking models - assumes state of thinking, emits unambiguous content in thinking, does not call tool call in thinking
* changed the parser to start with collecting content
* thinking prefill
* add hasThinkingSupport parameter to parser
* qwen3-vl -> qwen3-vl-instruct for renderer/parser
* Add hasThinkingSupport=false to QwenVLParser
---------
Co-authored-by: Devon Rifkin <drifkin@drifkin.net>
When trimming whitespace at the end of every chunk, we were iterating
backwards over the string byte-by-byte instead of rune-by-rune.
As an example of how this can cause corruption, suppose we have the
multi-byte character ✅ (`"\u2705"`), which is represented in UTF-8 as
the three bytes `0xE2 0x9C 0x85`. It happens that `0x85` is NEL, which
passes `unicode.IsSpace()`. Because we were iterating byte-by-byte, this
caused us to mistakenly slice in the middle of the rune, removing `0x85`
and leaving `0xE2 0x9C`, which, beyond being the incorrect place to
slice, is not even a valid UTF-8 sequence.
`trailingWhitespaceLen()` was modified to count from the end in a
rune-aware way. Tests with various multibyte unicode characters were
also added.
Fixes: #12414