ollama/runner/ollamarunner

Latest commit: bcd5507f4b by Jesse Gross, 2025-12-29 06:39:51 -06:00
ggml: Support closing backends

In order to iteratively find the best memory allocation, we need to
be able to free backend memory so we can try again.
cache.go         ggml: Support closing backends                                  2025-12-29 06:39:51 -06:00
cache_test.go    ollamarunner: Separate text and multimodal graphs               2025-12-29 06:38:01 -06:00
multimodal.go    ml: Panic rather than return error on tensor allocation failure 2025-12-29 06:38:06 -06:00
runner.go        ggml: Support closing backends                                  2025-12-29 06:39:51 -06:00