ollama/llama/patches
virajwad 220e133fca
vulkan: Add memory detection for Intel GPU using DXGI+PDH (#12664)
* PDH free memory skeleton

* Add PDH printing

* Add LUID support for Vulkan

* Wire LUID from ggml-vulkan to the mem-dxgi-pdh file

* Fix to ggml-impl

* Continue skeleton

* Implemented ggml_dxgi_pdh_get_device_memory

* fix comments

* Fix: report values in bytes instead of GB

* Add ifdefs to support only Windows, not Linux

* modify error codes

* Finished ggml_dxgi_pdh_init() function

* completed ggml_dxgi_pdh_release()

* Formatting changes, add static to functions

* fix build errors

* fix go build error

* Fix LUID so it now matches between DXGI and Vulkan

* Fix the free memory reporting (was passing by value; changed to pass by reference)

* keep only dxgi1_2.h

* Modifications based on PR feedback

* Fix merge conflicts (2) and the desc1.Description printout

* move dxgi + pdh api calls to before the vendor specific library calls

* change from 3 samples to 1 sample for PDH

* modify when old_mode is set

* Add fix for building on macOS

* fix release and returns for other vendors

* add patch file
2025-11-04 14:11:55 -08:00
.gitignore update vendored llama.cpp and ggml (#11823) 2025-08-14 14:42:58 -07:00
0001-ggml-backend-malloc-and-free-using-the-same-compiler.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0002-pretokenizer.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0003-clip-unicode.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0004-solar-pro.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0005-fix-deepseek-deseret-regex.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0006-maintain-ordering-for-rules-for-grammar.patch Update GGML to b6646 (#12245) 2025-10-02 14:47:10 -07:00
0007-sort-devices-by-score.patch Update GGML to b6646 (#12245) 2025-10-02 14:47:10 -07:00
0008-add-phony-target-ggml-cpu-for-all-cpu-variants.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0009-remove-amx.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0010-fix-string-arr-kv-loading.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0011-ollama-debug-tensor.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0012-add-ollama-vocab-for-grammar-support.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0013-add-argsort-and-cuda-copy-for-i32.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0014-graph-memory-reporting-on-failure.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0015-ggml-Export-GPU-UUIDs.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0016-add-C-API-for-mtmd_input_text.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0017-no-power-throttling-win32-with-gnuc.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0018-BF16-macos-version-guard.patch Update GGML to b6646 (#12245) 2025-10-02 14:47:10 -07:00
0019-ggml-Add-batch-size-hint.patch ggml: Enable op_offload to improve partial offload performance 2025-10-30 13:53:10 -07:00
0020-Disable-ggml-blas-on-macos-v13-and-older.patch Update GGML to b6646 (#12245) 2025-10-02 14:47:10 -07:00
0021-fix-mtmd-audio.cpp-build-on-windows.patch llm: New memory management 2025-08-14 15:24:01 -07:00
0022-ggml-No-alloc-mode.patch ggml: Avoid cudaMemsetAsync during memory fitting 2025-10-31 15:23:28 -07:00
0023-decode-disable-output_all.patch Llama cpp bump (df1b612): granite docling / mamba2 optimizations / multimodal encoding fixes (#12552) 2025-10-13 15:26:18 -07:00
0024-ggml-Enable-resetting-backend-devices.patch logs: fix bogus "0 MiB free" log line (#12590) 2025-10-14 11:26:28 -07:00
0025-harden-uncaught-exception-registration.patch harden uncaught exception registration (#12120) 2025-09-02 09:43:55 -07:00
0026-GPU-discovery-enhancements.patch Fix vulkan PCI ID and ID handling (#12775) 2025-10-28 15:15:35 -07:00
0027-NVML-fallback-for-unified-memory-GPUs.patch Fix vulkan PCI ID and ID handling (#12775) 2025-10-28 15:15:35 -07:00
0028-CUDA-Changing-the-CUDA-scheduling-strategy-to-spin-1.patch Fix vulkan PCI ID and ID handling (#12775) 2025-10-28 15:15:35 -07:00
0029-report-LoadLibrary-failures.patch Fix vulkan PCI ID and ID handling (#12775) 2025-10-28 15:15:35 -07:00
0031-Add-memory-detection-using-DXGI-PDH.patch vulkan: Add memory detection for Intel GPU using DXGI+PDH (#12664) 2025-11-04 14:11:55 -08:00
0032-interleave-multi-rope.patch interleaved mrope (#12807) 2025-10-30 11:29:00 -07:00