ollama/model
Jesse Gross · d773b7d671 · backend: API to support full precision matmul

Most tensor backends optimize performance by using lower precision for
matmuls. However, some operations (such as the KQ attention product) on
some models are sensitive to this and require full precision.

2025-02-13 17:09:26 -08:00
Name                  Last commit message                                                Date
imageproc             imageproc mllama refactor (#7537)                                  2024-12-14 19:50:15 -08:00
llama                 backend: API to support full precision matmul                      2025-02-13 17:09:26 -08:00
mllama                backend: API to support full precision matmul                      2025-02-13 17:09:26 -08:00
pixtral               imageproc mllama refactor (#7537)                                  2024-12-14 19:50:15 -08:00
qwen2vl               imageproc mllama refactor (#7537)                                  2024-12-14 19:50:15 -08:00
testdata              next ollama runner (#7913)                                         2025-02-13 16:31:21 -08:00
model.go              backend: Support graph computation that does not return an output  2025-02-13 17:09:26 -08:00
model_test.go         next ollama runner (#7913)                                         2025-02-13 16:31:21 -08:00
process_text.go       next ollama runner (#7913)                                         2025-02-13 16:31:21 -08:00
process_text_test.go  next ollama runner (#7913)                                         2025-02-13 16:31:21 -08:00