ollama/kvcache
Jesse Gross 6da8b6a879 kvcache: Support non-causal attention
Models can disable causality for all or part of their processing
while continuing to store data in the KV cache.
2025-03-07 18:39:27 -08:00
cache.go attention: Remove unnecessary contiguous operations 2025-03-01 20:53:23 -08:00
causal.go kvcache: Support non-causal attention 2025-03-07 18:39:27 -08:00
causal_test.go kvcache: update tests 2025-03-07 14:08:21 -08:00
encoder.go ml/backend/ggml: create tensor on specific backend 2025-03-07 14:08:21 -08:00
wrapper.go attention: Remove unnecessary contiguous operations 2025-03-01 20:53:23 -08:00