ollama/llama/patches
Daniel Hiltgen 27f1fde413
discovery: only retry AMD GPUs (#12894)
* discovery: only retry AMD GPUs

CUDA and Vulkan don't crash on unsupported devices, so retry isn't necessary.
This also refactors the code to shift the Library-specific logic into the ml
package.

* review comments
2025-11-04 15:33:46 -08:00
.gitignore update vendored llama.cpp and ggml (#11823) 2025-08-14 14:42:58 -07:00
0001-ggml-backend-malloc-and-free-using-the-same-compiler.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0002-pretokenizer.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0003-clip-unicode.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0004-solar-pro.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0005-fix-deepseek-deseret-regex.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0006-maintain-ordering-for-rules-for-grammar.patch Update GGML to b6646 (#12245) 2025-10-02 14:47:10 -07:00
0007-sort-devices-by-score.patch Update GGML to b6646 (#12245) 2025-10-02 14:47:10 -07:00
0008-add-phony-target-ggml-cpu-for-all-cpu-variants.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0009-remove-amx.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0010-fix-string-arr-kv-loading.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0011-ollama-debug-tensor.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0012-add-ollama-vocab-for-grammar-support.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0013-add-argsort-and-cuda-copy-for-i32.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0014-graph-memory-reporting-on-failure.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0015-ggml-Export-GPU-UUIDs.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0016-add-C-API-for-mtmd_input_text.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0017-no-power-throttling-win32-with-gnuc.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0018-BF16-macos-version-guard.patch Update GGML to b6646 (#12245) 2025-10-02 14:47:10 -07:00
0019-ggml-Add-batch-size-hint.patch ggml: Enable op_offload to improve partial offload performance 2025-10-30 13:53:10 -07:00
0020-Disable-ggml-blas-on-macos-v13-and-older.patch Update GGML to b6646 (#12245) 2025-10-02 14:47:10 -07:00
0021-fix-mtmd-audio.cpp-build-on-windows.patch llm: New memory management 2025-08-14 15:24:01 -07:00
0022-ggml-No-alloc-mode.patch ggml: Avoid cudaMemsetAsync during memory fitting 2025-10-31 15:23:28 -07:00
0023-decode-disable-output_all.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0024-ggml-Enable-resetting-backend-devices.patch logs: fix bogus "0 MiB free" log line (#12590) 2025-10-14 11:26:28 -07:00
0025-harden-uncaught-exception-registration.patch harden uncaught exception registration (#12120) 2025-09-02 09:43:55 -07:00
0026-GPU-discovery-enhancements.patch discovery: only retry AMD GPUs (#12894) 2025-11-04 15:33:46 -08:00
0027-NVML-fallback-for-unified-memory-GPUs.patch Fix vulkan PCI ID and ID handling (#12775) 2025-10-28 15:15:35 -07:00
0028-CUDA-Changing-the-CUDA-scheduling-strategy-to-spin-1.patch Fix vulkan PCI ID and ID handling (#12775) 2025-10-28 15:15:35 -07:00
0029-report-LoadLibrary-failures.patch Fix vulkan PCI ID and ID handling (#12775) 2025-10-28 15:15:35 -07:00
0030-Add-memory-detection-using-DXGI-PDH.patch discovery: only retry AMD GPUs (#12894) 2025-11-04 15:33:46 -08:00
0031-interleave-multi-rope.patch discovery: only retry AMD GPUs (#12894) 2025-11-04 15:33:46 -08:00