ollama/ml/backend/ggml
Jesse Gross 3d990dc451
ggml: No-alloc mode
Callers can set a backend buffer type to be no-alloc, meaning that
it does not allocate memory for tensors or operations. This can
be used for calculating memory requirements. Tensors and graphs
must be recreated with no-alloc set to false before loading data.

Defaults to false for newly created backend buffer types.
2025-12-29 06:39:51 -06:00
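The two-pass pattern the commit message describes (measure with no-alloc set, then recreate with it cleared before loading data) can be illustrated with a minimal, self-contained Go sketch. Every name here (`bufferType`, `tensor`, `newTensor`, `buildModel`) is hypothetical and chosen only for illustration; none of it is the actual ggml or Ollama API, and the tensor shapes are made up.

```go
package main

import "fmt"

// bufferType is a hypothetical stand-in for a backend buffer type.
// It is NOT the real ggml/Ollama type; it only models the no-alloc flag.
type bufferType struct {
	noAlloc   bool // when true, sizes are recorded but nothing is allocated
	requested int  // running total of bytes requested from this buffer type
}

// tensor is a hypothetical tensor; data stays nil in no-alloc mode.
type tensor struct {
	name string
	size int
	data []byte
}

// newTensor records the size of the request and allocates backing memory
// only when no-alloc is false (the zero value, matching the stated default).
func (b *bufferType) newTensor(name string, elems, elemSize int) *tensor {
	size := elems * elemSize
	b.requested += size
	t := &tensor{name: name, size: size}
	if !b.noAlloc {
		t.data = make([]byte, size)
	}
	return t
}

// buildModel creates the same (made-up) set of tensors against any buffer
// type, so the measuring pass and the real pass see identical requests.
func buildModel(b *bufferType) []*tensor {
	return []*tensor{
		b.newTensor("embed", 1024*64, 2),
		b.newTensor("attn_q", 64*64, 2),
		b.newTensor("output", 64*1024, 4),
	}
}

func main() {
	// Pass 1: no-alloc mode, used only to calculate memory requirements.
	probe := &bufferType{noAlloc: true}
	buildModel(probe)
	fmt.Printf("estimated requirement: %d bytes\n", probe.requested)

	// Pass 2: recreate the tensors with no-alloc set to false, then load data.
	live := &bufferType{}
	tensors := buildModel(live)
	copy(tensors[0].data, []byte{1, 2, 3}) // loading data needs real memory
	fmt.Printf("allocated %d tensors, %d bytes\n", len(tensors), live.requested)
}
```

In this sketch the tensors from the measuring pass are discarded rather than reused, which mirrors the requirement that tensors and graphs be recreated before data is loaded.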
ggml ggml: No-alloc mode 2025-12-29 06:39:51 -06:00
ggml.go ggml: Support closing backends 2025-12-29 06:39:51 -06:00
ggml_test.go gpt-oss (#11672) 2025-12-29 06:39:48 -06:00
mxfp4_test.go gpt-oss (#11672) 2025-12-29 06:39:48 -06:00
quantization.go gpt-oss (#11672) 2025-12-29 06:39:48 -06:00
threads.go ollama debug tensor 2025-03-11 14:49:19 -07:00
threads_debug.go ollama debug tensor 2025-03-11 14:49:19 -07:00