ollama/ml/backend/ggml/ggml
Michael Yang bfce55db3d model: load non-repeated tensors into multiple backends
Some tensors are expected to be used in repeating layers but are not
themselves repeated. This change copies these tensors into the same
backends as their repeating counterparts to minimize copying tensors
between backends.
2025-03-07 14:08:21 -08:00
include llama: fix kv loading on snowflake-arctic-embed models (#9536) 2025-03-07 09:25:34 -08:00
src model: load non-repeated tensors into multiple backends 2025-03-07 14:08:21 -08:00
.rsync-filter ml/backend/ggml: follow on fixes after updating vendored code (#9388) 2025-02-26 22:33:53 -08:00
LICENSE next build (#8539) 2025-01-29 15:03:38 -08:00