Name       Last commit message                                 Last commit date
gemma2     gemma2: use fast attention                          2025-08-19 13:33:12 -07:00
gemma3     gemma3: scale in attention                          2025-08-19 13:43:47 -07:00
gptoss     update vendored llama.cpp and ggml (#11823)         2025-08-14 14:42:58 -07:00
llama      Only load supported models on new engine (#11362)   2025-07-11 12:21:54 -07:00
llama4     use nn.Linear in place of ml.Tensor (#11049)        2025-06-11 12:10:15 -07:00
qwen2      Only load supported models on new engine (#11362)   2025-07-11 12:21:54 -07:00
qwen3      use nn.Linear in place of ml.Tensor (#11049)        2025-06-11 12:10:15 -07:00
models.go  gpt-oss (#11672)                                    2025-08-05 12:21:16 -07:00