ollama/llama/patches
Jesse Gross 6db8a3771c ggml: Report graph memory for failed allocations
GGML has a function to report the allocated size of a backend buffer.
However, this returns 0 if we tried to allocate a buffer and it failed.
For memory management purposes, it's important to know how much we were
trying to allocate. This extends the API to report attempted sizes for
all buffers and whether each allocation succeeded.
2025-05-22 14:38:09 -07:00
0001-ggml-backend-malloc-and-free-using-the-same-compiler.patch llama: update to commit de4c07f93 (#10655) 2025-05-12 12:17:26 -07:00
0002-pretokenizer.patch llama: update to commit de4c07f93 (#10655) 2025-05-12 12:17:26 -07:00
0003-embeddings.patch llama: update to commit de4c07f93 (#10655) 2025-05-12 12:17:26 -07:00
0004-clip-unicode.patch llama: update to commit de4c07f93 (#10655) 2025-05-12 12:17:26 -07:00
0005-solar-pro.patch chore: update mllama to use ollama engine (#10637) 2025-05-13 17:36:02 -07:00
0006-fix-deepseek-deseret-regex.patch chore: update mllama to use ollama engine (#10637) 2025-05-13 17:36:02 -07:00
0007-maintain-ordering-for-rules-for-grammar.patch chore: update mllama to use ollama engine (#10637) 2025-05-13 17:36:02 -07:00
0008-ensure-KV-cache-is-fully-defragmented.patch chore: update mllama to use ollama engine (#10637) 2025-05-13 17:36:02 -07:00
0009-sort-devices-by-score.patch chore: update mllama to use ollama engine (#10637) 2025-05-13 17:36:02 -07:00
0010-add-phony-target-ggml-cpu-for-all-cpu-variants.patch chore: update mllama to use ollama engine (#10637) 2025-05-13 17:36:02 -07:00
0011-remove-amx.patch chore: update mllama to use ollama engine (#10637) 2025-05-13 17:36:02 -07:00
0012-fix-string-arr-kv-loading.patch chore: update mllama to use ollama engine (#10637) 2025-05-13 17:36:02 -07:00
0013-ollama-debug-tensor.patch chore: update mllama to use ollama engine (#10637) 2025-05-13 17:36:02 -07:00
0014-add-ollama-vocab-for-grammar-support.patch chore: update mllama to use ollama engine (#10637) 2025-05-13 17:36:02 -07:00
0015-add-argsort-and-cuda-copy-for-i32.patch model: add Qwen2.5-VL support (#10385) 2025-05-13 20:58:02 -07:00
0016-graph-memory-reporting-on-failure.patch ggml: Report graph memory for failed allocations 2025-05-22 14:38:09 -07:00