ollama/model
Jesse Gross 9679f40146 ml: Allow models to constrain inputs to a single batch
Models may require that a set of inputs all be processed as part
of the same batch. For example, if an image has multiple patches
with fully connected attention between them, we should not split
the batch in the middle of an image.

Fixes #9697
2025-03-14 15:38:54 -07:00
imageproc imageproc mllama refactor (#7537) 2024-12-14 19:50:15 -08:00
input ml: Allow models to constrain inputs to a single batch 2025-03-14 15:38:54 -07:00
models ml: Allow models to constrain inputs to a single batch 2025-03-14 15:38:54 -07:00
testdata gemma2 impl 2025-03-11 14:35:08 -07:00
model.go Update model/model.go 2025-03-13 13:11:52 -07:00
model_test.go model: Update encoder cache to use multimodal input processing handler 2025-03-09 17:05:26 -07:00
process_text.go set non-causal attention 2025-03-11 14:49:18 -07:00
process_text_spm.go model: validate left and right pairs before merging them 2025-03-11 14:49:20 -07:00
process_text_spm_test.go model: add more spm tokenizer tests 2025-03-11 14:49:20 -07:00
process_text_test.go model: Don't unconditionally add special tokens 2025-03-06 16:54:16 -08:00
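
The latest commit message above describes a constraint where certain inputs (for example, the patches of one image) must all land in the same batch. The following is a minimal sketch of that idea, not the code from this commit: the `Input` type, its `SameBatch` field, and `splitBatches` are hypothetical names introduced here for illustration, and the real types in the `input` package may differ.

```go
package main

import "fmt"

// Input is a hypothetical stand-in for one model input
// (a text token or an image patch).
type Input struct {
	ID int
	// SameBatch is an assumed field for this sketch: the number of
	// inputs following this one that must share its batch.
	SameBatch int
}

// splitBatches groups inputs into batches of at most batchSize without
// splitting a run that is constrained to a single batch. SameBatch values
// on inputs inside a run are ignored in this simplified version.
func splitBatches(inputs []Input, batchSize int) ([][]Input, error) {
	var batches [][]Input
	var cur []Input
	for i := 0; i < len(inputs); {
		if inputs[i].SameBatch < 0 {
			return nil, fmt.Errorf("input %d has a negative SameBatch", i)
		}
		// A run is this input plus the followers it must stay with.
		run := 1 + inputs[i].SameBatch
		if run > batchSize {
			return nil, fmt.Errorf("run of %d inputs exceeds batch size %d", run, batchSize)
		}
		if i+run > len(inputs) {
			return nil, fmt.Errorf("run starting at %d extends past the end of the inputs", i)
		}
		// Close the current batch early if the whole run would not fit.
		if len(cur)+run > batchSize {
			batches = append(batches, cur)
			cur = nil
		}
		cur = append(cur, inputs[i:i+run]...)
		i += run
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches, nil
}

func main() {
	// Two text tokens, then an "image" of three patches (IDs 2-4), then one token.
	inputs := []Input{{ID: 0}, {ID: 1}, {ID: 2, SameBatch: 2}, {ID: 3}, {ID: 4}, {ID: 5}}
	batches, err := splitBatches(inputs, 4)
	if err != nil {
		panic(err)
	}
	for n, b := range batches {
		fmt.Println("batch", n, "size", len(b))
	}
}
```

With a batch size of 4, the first batch is closed after two inputs so that the three-patch run stays together in the second batch, which then also takes the final token. A naive fixed-size split would instead have cut the image after its second patch.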