To iteratively find the best memory allocation, we need to be able to free backend memory so that we can try again.

| Name |
|---|
| ggml |
| ggml.go |
| ggml_test.go |
| mxfp4_test.go |
| quantization.go |
| threads.go |
| threads_debug.go |