ollama/runner/ollamarunner
nicole pardal e082d60a24
truncation: fixed runner truncation logic + removed server truncation (#12839)
This PR consolidates all embedding prompt-length checking, truncation, and prompt token counting into the runner to ensure a single source of truth.
2025-12-08 11:20:28 -08:00
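The commit description above says that prompt-length checking, truncation, and prompt token counting now live in one place. A minimal sketch of what such a single check might look like follows. It is not the code from runner.go: `truncateTokens`, `ErrPromptTooLong`, the `truncate` flag, and the choice to keep the leading tokens are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPromptTooLong is a hypothetical sentinel error used only in this sketch.
var ErrPromptTooLong = errors.New("input exceeds maximum context length")

// truncateTokens is a hypothetical helper, not the runner's actual function.
// It checks the prompt length against numCtx, optionally truncates, and
// returns the tokens to embed together with the prompt token count, so the
// caller gets all three results from one place.
func truncateTokens(tokens []int32, numCtx int, truncate bool) ([]int32, int, error) {
	if numCtx <= 0 {
		return nil, 0, errors.New("context length must be positive")
	}
	// The prompt already fits: nothing to truncate.
	if len(tokens) <= numCtx {
		return tokens, len(tokens), nil
	}
	// The prompt is too long and the caller did not allow truncation.
	if !truncate {
		return nil, 0, ErrPromptTooLong
	}
	// Assumption: keep the first numCtx tokens and drop the tail.
	kept := tokens[:numCtx]
	return kept, len(kept), nil
}

func main() {
	tokens := []int32{101, 7, 8, 9, 10, 102}

	kept, count, err := truncateTokens(tokens, 4, true)
	fmt.Println(kept, count, err)

	_, _, err = truncateTokens(tokens, 4, false)
	fmt.Println(err)
}
```

Returning the token count from the same function that truncates means the reported count cannot drift from what was actually embedded, which is the "single source of truth" idea the description mentions.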
cache.go feat(model): add qwen3vl (#12665) 2025-10-28 17:39:47 -07:00
cache_test.go feat(model): add qwen3vl (#12665) 2025-10-28 17:39:47 -07:00
multimodal.go ggml: Enable op_offload to improve partial offload performance 2025-10-30 13:53:10 -07:00
runner.go truncation: fixed runner truncation logic + removed server truncation (#12839) 2025-12-08 11:20:28 -08:00