Commit Graph

5 Commits

Author SHA1 Message Date
Devon Rifkin c87b910232 WIP: stable ordering for tool args
Right now we deserialize tool call definitions' arguments into Go maps.
Go maps deliberately have no predictable iteration order, whereas we
want to preserve the order the user originally provided.

Unstable rendering of arguments changes the rendered prompt between
otherwise identical requests, which breaks the KV cache. This change
fixes that.

There's no way to build this in a fully backwards compatible way when
executing existing templates exactly as they are. We get around this by
rewriting templates dynamically just before they're rendered. This is
fragile, but perhaps the least bad option?
2025-10-07 15:38:58 -07:00
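
To illustrate the ordering problem described in c87b910232, here is a minimal sketch of one way to recover the key order of a JSON arguments object in Go. It is not taken from the commit: `orderedKeys` and the sample arguments are hypothetical, and the commit itself may use a different mechanism. The sketch relies only on the standard `encoding/json` token API.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// orderedKeys returns the top-level keys of a JSON object in the order
// they appear in the input. Hypothetical helper for illustration only.
func orderedKeys(data []byte) ([]string, error) {
	dec := json.NewDecoder(bytes.NewReader(data))

	// The first token must open an object.
	tok, err := dec.Token()
	if err != nil {
		return nil, err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return nil, fmt.Errorf("expected a JSON object")
	}

	var keys []string
	for dec.More() {
		// Inside an object, tokens alternate between key and value.
		tok, err := dec.Token()
		if err != nil {
			return nil, err
		}
		key, ok := tok.(string)
		if !ok {
			return nil, fmt.Errorf("expected a string key")
		}
		keys = append(keys, key)

		// Consume the whole value (scalar, object, or array) without
		// interpreting it.
		var raw json.RawMessage
		if err := dec.Decode(&raw); err != nil {
			return nil, err
		}
	}
	return keys, nil
}

func main() {
	// Sample tool call arguments (made up for this example).
	args := []byte(`{"location":"Paris","unit":"celsius","days":3}`)

	// Unmarshalling into a map discards the original key order;
	// ranging over m may visit keys in a different order on each run.
	var m map[string]any
	if err := json.Unmarshal(args, &m); err != nil {
		panic(err)
	}

	keys, err := orderedKeys(args)
	if err != nil {
		panic(err)
	}
	fmt.Println(keys) // prints "[location unit days]"
}
```

Rendering arguments by walking `keys` and looking each one up in the map would yield the same text on every request, which is the property the KV cache needs.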
Daniel Hiltgen c23e6f4cae
tests: add single threaded history test (#12295)
* tests: add single threaded history test

Also tidies up some existing tests to handle more model output variation

* test: add support for testing specific architectures
2025-09-22 11:23:14 -07:00
Parth Sareen 20b53eaa72
tests: add tool calling integration test (#12232) 2025-09-09 14:01:11 -07:00
Daniel Hiltgen 517807cdf2
perf: build graph for next batch async to keep GPU busy (#11863)
* perf: build graph for next batch in parallel to keep GPU busy

This refactors the main run loop of the ollama runner to perform the main
GPU-intensive tasks (Compute+Floats) in a goroutine so we can prepare the next
batch in parallel, reducing the amount of time the GPU stalls waiting for the
next batch of work.

* tests: tune integration tests for ollama engine

This tunes the integration tests to focus more on models supported
by the new engine.
2025-08-29 14:20:28 -07:00
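
The overlap described in 517807cdf2 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the runner's actual code: `prepare` and `compute` are hypothetical stand-ins for batch/graph building and for the GPU work, and batches are plain integers.

```go
package main

import "fmt"

// prepare stands in for building the next batch on the CPU (hypothetical).
func prepare(i int) int { return i * 2 }

// compute stands in for the GPU-intensive step (hypothetical).
func compute(batch int) int { return batch + 1 }

// runPipelined prepares batch i while the compute for batch i-1 is still
// running in a goroutine, then waits for that compute before launching
// the next one, so at most one compute is in flight at a time.
func runPipelined(n int) []int {
	results := make([]int, n)
	var prev chan struct{} // closed when the previous compute finishes

	for i := 0; i < n; i++ {
		batch := prepare(i) // overlaps with the previous compute

		if prev != nil {
			<-prev // wait for the previous compute to finish
		}

		done := make(chan struct{})
		go func(i, batch int) {
			defer close(done)
			results[i] = compute(batch)
		}(i, batch)
		prev = done
	}

	if prev != nil {
		<-prev // wait for the final compute
	}
	return results
}

func main() {
	fmt.Println(runPipelined(3)) // prints "[1 3 5]"
}
```

The design point is that the wait on the previous compute comes after `prepare`, so preparation time is hidden behind compute time instead of being added to it.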
Daniel Hiltgen ed4e139314
Integration test improvements (#9654)
Add some new test coverage for various model architectures,
and switch from orca-mini to the small llama model.
2025-04-16 14:25:55 -07:00