* Enable CUDA Graphs for gemma3n. Similar to https://github.com/ggml-org/llama.cpp/pull/14741, though ollama has a slightly different model graph than llama.cpp, which requires different workaround checks.
* Remove the residual check by reshaping differently in the gemma3n model. This should make the heuristics more robust.
||
|---|---|---|
| imageproc | ||
| input | ||
| models | ||
| testdata | ||
| bytepairencoding.go | ||
| bytepairencoding_test.go | ||
| model.go | ||
| model_test.go | ||
| sentencepiece.go | ||
| sentencepiece_test.go | ||
| textprocessor.go | ||
| vocabulary.go | ||
| vocabulary_test.go | ||