ollama/ml/backend/ggml
Michael Yang 2dc60d4620 ml/backend/ggml: offload vision to cpu
temporary until tensor loading can accurately account for vision models
2025-03-07 14:08:21 -08:00
ggml     model: load non-repeated tensors into multiple backends  2025-03-07 14:08:21 -08:00
ggml.go  ml/backend/ggml: offload vision to cpu                   2025-03-07 14:08:21 -08:00