Commit Graph

22 Commits

Author SHA1 Message Date
Bruce MacDonald c1f9bcb4dd restructure
image processing

Update model.go

Update model.go

Update model.go

no projector

no projector

vision model scaffold

...

...

wip

...

rebase

fix patch merger

tidy

...

Update model_vision.go

server: do not attempt to parse offset file as gguf

This logic was causing issues for me when importing a gguf that had some padding at the end of the file. The valid gguf would be read, but then it would try to read the offset as a different gguf file. This does not seem right.

Update process_image_test.go

apply norm

prompt processing

prompt processing

fix post tokenize

fix gguf padding + populate the split patch embeddings

...

...

another shot at patch embeddings

...

patch embedding

Update model_vision.go

split pixels
2025-05-12 13:49:41 -07:00
Bruce MacDonald 51ad65f831 ml: structured rope config to allow specifying context len
This commit refactors the Rotary Position Embedding (RoPE) implementation across the codebase to use a structured configuration approach instead of individual parameters.

Key changes:
- Add new RoPEConfig struct with fields for dimension, type, base frequency, and scaling
- Add RopeType enum to formalize different RoPE implementation variants
- Add YarnConfig struct and related configuration for YaRN (Yet Another RoPE extensioN) context extension
- Update RoPE method signature across all tensor interfaces and implementations
- Refactor all model implementations (llama, gemma2, gemma3, mllama) to use the new configuration structure

This change improves code organization, makes the RoPE configuration more explicit, and provides better support for different RoPE variants and context extension methods.
2025-05-12 13:49:41 -07:00
Michael Yang d26c18e25c fix token type 2025-04-25 16:59:01 -07:00
Bruce MacDonald 6bd0a983cd model: support for mistral-small in the ollama runner
Mistral is a popular research lab making open source models. This updates
the forward pass of llama architecture models to support both llama models
and mistral models by accounting for additional metadata present in mistral
models, and finding the correct dimensions for the output projection.
2025-04-03 16:57:36 -07:00
Michael Yang 3b96a93672 fs: move ml.Config to fs package 2025-04-03 13:12:24 -07:00
Jeffrey Morgan b51e0f397c
model: fix issues with spm tokenizer for Gemma 3 (#10081) 2025-04-02 13:22:56 -07:00
Jesse Gross 0c220935bd input: Rename Options to Batch
Options is no longer very descriptive of this struct.
2025-03-20 13:28:13 -07:00
Jesse Gross 9679f40146 ml: Allow models to constrain inputs to a single batch
Models may require that a set of inputs all be processed as part
of the same batch. For example, if an image has multiple patches
with fully connected attention between them, we should not split
the batch in the middle of an image.

Fixes #9697
2025-03-14 15:38:54 -07:00
Bruce MacDonald a70820daa0
models/gemma3: remove final logit softcap (#9692)
Softcap isn't in the whitepaper/implementation for the language model so we should remove it. There is no discernible difference in output with it removed.
2025-03-12 10:17:57 -07:00
Jesse Gross a8e83a7654 Disable causal attention based on batch index
Currently we are using positions, which are relative to a
sequence and may not be unique.
2025-03-11 14:49:20 -07:00
Jesse Gross 2c40c4d35e Fix follow up images and images split across batches 2025-03-11 14:49:19 -07:00
Michael Yang e95278932b use non-causal mask only for image positions 2025-03-11 14:49:19 -07:00
Michael Yang 9d2a20a763 use non-causal mask for inputs with images 2025-03-11 14:49:19 -07:00
Michael Yang 6b32a2d549 compat with upstream gguf 2025-03-11 14:49:19 -07:00
Michael Yang f888912870 fix vision encoder 2025-03-11 14:49:19 -07:00
Patrick Devine 9b54267e69 fix configs 2025-03-11 14:49:19 -07:00
Michael Yang 46bb0169c4 update model 2025-03-11 14:49:19 -07:00
Patrick Devine c62861f4fa fix conversion 2025-03-11 14:49:18 -07:00
Michael Yang 0df1800436 set non-causal attention 2025-03-11 14:49:18 -07:00
Jesse Gross 4346c2409d fix drift from main 2025-03-11 14:49:18 -07:00
Michael Yang 4b037a97dc add gemma vision encoder 2025-03-11 14:49:17 -07:00
Patrick Devine 5f74d1fd47 gemma2 impl 2025-03-11 14:35:08 -07:00