ollama/llm
Jesse Gross 71cb86af3e llm: Remove unneeded warning with flash attention enabled
If flash attention is enabled without KV cache quantization, we will
currently always get this warning:
level=WARN source=server.go:226 msg="kv cache type not supported by model" type=""
2025-09-10 16:40:45 -07:00
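The fix described above can be sketched as skipping the support check when no cache type was requested, so the empty default no longer trips the warning. This is a minimal illustration, not the actual server.go code; `kvCacheTypeSupported` and `validateKVCacheType` are hypothetical helpers, and the supported type names are assumed:

```go
package main

import "fmt"

// kvCacheTypeSupported is a stand-in for the model capability check in
// server.go; the real predicate and type names live in the ollama codebase.
func kvCacheTypeSupported(t string) bool {
	switch t {
	case "f16", "q8_0", "q4_0":
		return true
	}
	return false
}

// validateKVCacheType returns a warning message, or "" when none is needed.
// Returning early on the empty string avoids warning when flash attention
// is enabled but no KV cache quantization was requested.
func validateKVCacheType(t string) string {
	if t == "" {
		// No quantization requested; nothing to validate.
		return ""
	}
	if !kvCacheTypeSupported(t) {
		return fmt.Sprintf("kv cache type not supported by model: %q", t)
	}
	return ""
}

func main() {
	fmt.Println(validateKVCacheType(""))     // empty default: no warning
	fmt.Println(validateKVCacheType("q8_0")) // supported: no warning
	fmt.Println(validateKVCacheType("q5_1")) // unsupported: warning
}
```

With the early return, only a genuinely unsupported type produces the `kv cache type not supported by model` warning.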
llm_darwin.go Optimize container images for startup (#6547) 2024-09-12 12:10:30 -07:00
llm_linux.go Optimize container images for startup (#6547) 2024-09-12 12:10:30 -07:00
llm_windows.go win: lint fix (#10571) 2025-05-05 11:08:12 -07:00
memory.go llm: Remove unneeded warning with flash attention enabled 2025-09-10 16:40:45 -07:00
memory_test.go llm: New memory management 2025-08-14 15:24:01 -07:00
server.go llm: Remove unneeded warning with flash attention enabled 2025-09-10 16:40:45 -07:00
server_test.go llm: New memory management 2025-08-14 15:24:01 -07:00
status.go Improve crash reporting (#7728) 2024-11-19 16:26:57 -08:00