ollama/ml
Jesse Gross 015e39a8be
ggml: Disable unused pipeline parallelism
We're not currently using it, even in cases where we could. Disabling
it improves generation performance by 10-30% with multiple GPUs.
2025-12-29 06:39:42 -06:00
backend     ggml: Disable unused pipeline parallelism         2025-12-29 06:39:42 -06:00
nn          ml: add more rope options (#10775)                2025-12-29 06:38:03 -06:00
backend.go  ggml: Report ordinal IDs for AMD GPUs on Windows  2025-12-29 06:39:42 -06:00