ollama/llama/patches
Latest commit 3d990dc451 by Jesse Gross, 2025-12-29 06:39:51 -06:00:

ggml: No-alloc mode

Callers can set a backend buffer type to be no-alloc, meaning that
it does not allocate memory for tensors or operations. This can
be used for calculating memory requirements. Tensors and graphs
must be recreated with no-alloc set to false before loading data.

Defaults to false for newly created backend buffer types.
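Below is a minimal C sketch of how a caller might use this mode to size a backend buffer before committing any memory. The setter ggml_backend_buft_set_alloc() stands in for whatever toggle the patch actually adds (its name and signature are assumptions here); the remaining calls are standard ggml API.

```c
// Minimal sketch: measure buffer requirements with no-alloc, then recreate for real.
// ggml_backend_buft_set_alloc() is an assumed name for the setter added by
// 0026-ggml-No-alloc-mode.patch; everything else is standard ggml API.
#include "ggml.h"
#include "ggml-alloc.h"
#include "ggml-backend.h"
#include <stdio.h>

int main(void) {
    struct ggml_init_params params = {
        /*.mem_size   =*/ ggml_tensor_overhead() * 8,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ true,   // tensor data lives in backend buffers, not the context
    };

    ggml_backend_buffer_type_t buft = ggml_backend_cpu_buffer_type();

    // Pass 1: no-alloc mode. The buffer reports how much memory the tensors
    // would need, but nothing is actually allocated.
    ggml_backend_buft_set_alloc(buft, false);               // assumed setter from the patch
    struct ggml_context *ctx = ggml_init(params);
    ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4096, 4096);
    ggml_backend_buffer_t meas = ggml_backend_alloc_ctx_tensors_from_buft(ctx, buft);
    printf("required: %zu bytes\n", ggml_backend_buffer_get_size(meas));
    ggml_backend_buffer_free(meas);
    ggml_free(ctx);

    // Pass 2: recreate the tensors with no-alloc turned off before loading data.
    ggml_backend_buft_set_alloc(buft, true);
    ctx = ggml_init(params);
    struct ggml_tensor *w = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4096, 4096);
    ggml_backend_buffer_t buf = ggml_backend_alloc_ctx_tensors_from_buft(ctx, buft);
    (void)w;  // ... load weights into w, build and run graphs ...
    ggml_backend_buffer_free(buf);
    ggml_free(ctx);
    return 0;
}
```

Because nothing from the measurement pass is backed by real memory, the tensors are recreated once no-alloc is switched off, matching the requirement above that tensors and graphs be rebuilt before loading data.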
Patch | Last commit | Date
0001-ggml-backend-malloc-and-free-using-the-same-compiler.patch | llama: update to commit de4c07f93 (#10655) | 2025-12-29 06:37:57 -06:00
0002-pretokenizer.patch | llama: update to commit de4c07f93 (#10655) | 2025-12-29 06:37:57 -06:00
0003-embeddings.patch | llama: update to commit de4c07f93 (#10655) | 2025-12-29 06:37:57 -06:00
0004-clip-unicode.patch | llama: update to commit de4c07f93 (#10655) | 2025-12-29 06:37:57 -06:00
0005-solar-pro.patch | add new gemma model (#11204) | 2025-12-29 06:39:38 -06:00
0006-fix-deepseek-deseret-regex.patch | chore: update mllama to use ollama engine (#10637) | 2025-12-29 06:37:59 -06:00
0007-maintain-ordering-for-rules-for-grammar.patch | chore: update mllama to use ollama engine (#10637) | 2025-12-29 06:37:59 -06:00
0008-ensure-KV-cache-is-fully-defragmented.patch | add new gemma model (#11204) | 2025-12-29 06:39:38 -06:00
0009-sort-devices-by-score.patch | chore: update mllama to use ollama engine (#10637) | 2025-12-29 06:37:59 -06:00
0010-add-phony-target-ggml-cpu-for-all-cpu-variants.patch | chore: update mllama to use ollama engine (#10637) | 2025-12-29 06:37:59 -06:00
0011-remove-amx.patch | chore: update mllama to use ollama engine (#10637) | 2025-12-29 06:37:59 -06:00
0012-fix-string-arr-kv-loading.patch | chore: update mllama to use ollama engine (#10637) | 2025-12-29 06:37:59 -06:00
0013-ollama-debug-tensor.patch | chore: update mllama to use ollama engine (#10637) | 2025-12-29 06:37:59 -06:00
0014-add-ollama-vocab-for-grammar-support.patch | chore: update mllama to use ollama engine (#10637) | 2025-12-29 06:37:59 -06:00
0015-add-argsort-and-cuda-copy-for-i32.patch | add new gemma model (#11204) | 2025-12-29 06:39:38 -06:00
0016-graph-memory-reporting-on-failure.patch | ggml: Report graph memory for failed allocations | 2025-12-29 06:38:06 -06:00
0017-ggml-Export-GPU-UUIDs.patch | ggml: Report ordinal IDs for AMD GPUs on Windows | 2025-12-29 06:39:42 -06:00
0018-temporary-prevent-rocm-cuda-mixed-loading.patch | Re-remove cuda v11 (#10694) | 2025-12-29 06:38:18 -06:00
0019-metal-add-mean-kernel-14267.patch | gpt-oss (#11672) | 2025-12-29 06:39:48 -06:00
0020-CUDA-add-mean-operation-14313.patch | Increase performance for Gemma3n models on NVGPUs by enabling CUDA Graph execution (#11525) | 2025-12-29 06:39:47 -06:00
0021-Enable-CUDA-Graphs-for-gemma3n.patch | Increase performance for Gemma3n models on NVGPUs by enabling CUDA Graph execution (#11525) | 2025-12-29 06:39:47 -06:00
0022-BF16-macos-version-guard.patch | gpt-oss (#11672) | 2025-12-29 06:39:48 -06:00
0023-MXFP4.patch | gpt-oss (#11672) | 2025-12-29 06:39:48 -06:00
0024-cuda-disable-graph-compat-check-for-OP_ADD.patch | gpt-oss (#11672) | 2025-12-29 06:39:48 -06:00
0025-Disable-ggml-blas-on-macos-v13-and-older.patch | gpt-oss (#11672) | 2025-12-29 06:39:48 -06:00
0026-ggml-No-alloc-mode.patch | ggml: No-alloc mode | 2025-12-29 06:39:51 -06:00