Commit Graph

4 Commits

Author SHA1 Message Date
Daniel Hiltgen 39ca55a1ba
Move quantization to new backend (#10363)
* Move quantization logic to GGML via new backend

This moves the model-aware logic to Go code and calls GGML's quantization code for model creation.

* Remove "add model quantizations"

This is no longer needed now that quantization is implemented in Go+GGML code directly.
2025-12-29 06:37:52 -06:00
Michael Yang 644d6c5256
fixes for maverick
2025-12-29 06:37:45 -06:00
Michael Yang d2d5c5e6d5
chunked attention
2025-12-29 06:37:45 -06:00
Michael Yang 0f5c45e19d
llama4
2025-12-29 06:37:44 -06:00