ollama/ml/backend/ggml
Latest commit: bd6c1d6b49 (Daniel Hiltgen, 2025-12-12 13:27:19 -08:00)
flash attn: add auto mode for llama engine (#13052)

* flash attn: add auto mode for llama engine

  If the user does not specify flash attention (fa) in the environment, use auto mode.

* review comments

* ensure KV cache quantized types have FA explicitly enabled

  additional review comments
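The commit message above describes a three-way setting: explicitly on, explicitly off, or (when unset) an auto mode that follows backend support, with quantized KV cache types requiring flash attention to be on. Below is a minimal Go sketch of that decision, under stated assumptions: OLLAMA_FLASH_ATTENTION is ollama's actual environment variable, but resolveFlashAttn, its parameters, and the parsing shown are hypothetical illustrations, not the code in ggml.go.

```go
// Minimal sketch of the auto-mode decision described in the commit message.
// resolveFlashAttn and its signature are hypothetical; only the env var name
// OLLAMA_FLASH_ATTENTION comes from ollama itself.
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveFlashAttn picks a flash-attention setting. An unset env var means
// auto mode (follow backend support); quantized KV cache types need FA on.
func resolveFlashAttn(kvCacheType string, backendSupportsFA bool) (bool, error) {
	switch strings.ToLower(os.Getenv("OLLAMA_FLASH_ATTENTION")) {
	case "1", "true", "on":
		return true, nil
	case "0", "false", "off":
		// Quantized KV caches (e.g. q8_0, q4_0) depend on flash attention,
		// so an explicit "off" is rejected for them.
		if kvCacheType != "" && kvCacheType != "f16" {
			return false, fmt.Errorf("kv cache type %q requires flash attention", kvCacheType)
		}
		return false, nil
	default:
		// Auto mode: no user preference, so defer to backend support.
		return backendSupportsFA, nil
	}
}

func main() {
	enabled, err := resolveFlashAttn("q8_0", true)
	fmt.Println("flash attention:", enabled, "err:", err)
}
```

The point of the auto default is that users who never set the variable get flash attention wherever the backend supports it, while an explicit setting is still honored except where it would break a quantized KV cache.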
Name              Last commit message                                                       Last commit date
ggml              feat: llama.cpp bump (17f7f4) for SSM performance improvements (#13408)   2025-12-10 12:59:27 -08:00
ggml.go           flash attn: add auto mode for llama engine (#13052)                       2025-12-12 13:27:19 -08:00
ggml_test.go      ml: add slice operation (#12870)                                          2025-11-13 13:28:21 -08:00
quantization.go   chore: fix some inconsistent function name in comment                     2025-08-13 09:50:27 -07:00
threads.go        ollama debug tensor                                                       2025-03-11 14:49:19 -07:00
threads_debug.go  ollama debug tensor                                                       2025-03-11 14:49:19 -07:00