This provides granular information about the backend memory allocations required by the runner:
- Per backend
- Per layer
- Weights, cache and graph
- Allocation status

This can be used for debugging and validating memory estimates.
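
As an illustration of the breakdown described above, the following is a minimal Go sketch of how such accounting could be modeled. All names here (`Status`, `Region`, `LayerMemory`, `BackendMemory`, `Total`, `AllAllocated`) are hypothetical and are not taken from the repository; placing the graph allocation at the backend level rather than per layer is also an assumption.

```go
package main

import "fmt"

// Status is a hypothetical allocation state for one memory region.
type Status int

const (
	NotAttempted Status = iota
	Allocated
	Failed
)

// Region is a hypothetical record of a single allocation request:
// its size in bytes and whether the allocation succeeded.
type Region struct {
	Bytes  uint64
	Status Status
}

// LayerMemory groups the weight and cache regions for one layer.
type LayerMemory struct {
	Weights Region
	Cache   Region
}

// BackendMemory holds the accounting for one backend (assumed layout):
// a slice of per-layer records plus one compute graph region.
type BackendMemory struct {
	Name   string
	Layers []LayerMemory
	Graph  Region
}

// Total returns the number of bytes requested on this backend,
// summing weights, cache and graph regardless of status.
func (b BackendMemory) Total() uint64 {
	total := b.Graph.Bytes
	for _, l := range b.Layers {
		total += l.Weights.Bytes + l.Cache.Bytes
	}
	return total
}

// AllAllocated reports whether every region on this backend
// has status Allocated.
func (b BackendMemory) AllAllocated() bool {
	if b.Graph.Status != Allocated {
		return false
	}
	for _, l := range b.Layers {
		if l.Weights.Status != Allocated || l.Cache.Status != Allocated {
			return false
		}
	}
	return true
}

func main() {
	// Example values are made up for illustration only.
	gpu := BackendMemory{
		Name: "gpu0",
		Layers: []LayerMemory{
			{Weights: Region{Bytes: 100, Status: Allocated}, Cache: Region{Bytes: 20, Status: Allocated}},
			{Weights: Region{Bytes: 100, Status: Allocated}, Cache: Region{Bytes: 20, Status: Failed}},
		},
		Graph: Region{Bytes: 50, Status: Allocated},
	}
	fmt.Printf("%s: requested=%d bytes, fully allocated=%v\n",
		gpu.Name, gpu.Total(), gpu.AllAllocated())
}
```

Comparing a structure like this against a prior estimate, layer by layer, is one way the reported data could support the debugging and validation use case mentioned above.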

| Name |
|---|
| ggml |
| ggml.go |
| quantization.go |
| threads.go |
| threads_debug.go |