This commit refactors the Rotary Position Embedding (RoPE) implementation across the codebase to use a structured configuration approach instead of individual parameters.
Key changes:
- Add new RoPEConfig struct with fields for dimension, type, base frequency, and scaling
- Add RopeType enum to formalize different RoPE implementation variants
- Add YarnConfig struct and related configuration for YaRN (Yet Another RoPE extensioN) context extension
- Update RoPE method signature across all tensor interfaces and implementations
- Refactor all model implementations (llama, gemma2, gemma3, mllama) to use the new configuration structure
This change improves code organization, makes the RoPE configuration more explicit, and provides better support for different RoPE variants and context extension methods.
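As a rough sketch of what the new structured configuration might look like in Go (field and constant names here are assumptions for illustration, not the commit's exact definitions):

```go
package ml

// RopeType formalizes the RoPE implementation variants.
type RopeType int

const (
	RopeTypeStandard RopeType = iota // classic rotary embedding
	RopeTypeNeoX                     // NeoX-style interleaved rotation
	RopeTypeYarn                     // YaRN context extension
)

// YarnConfig carries YaRN (Yet Another RoPE extensioN) parameters.
type YarnConfig struct {
	ExtrapolationFactor float32
	AttentionFactor     float32
	BetaFast            float32
	BetaSlow            float32
}

// RoPEConfig bundles what used to be passed as individual parameters.
type RoPEConfig struct {
	Dim   int         // number of dimensions to rotate
	Type  RopeType    // which RoPE variant to apply
	Base  float32     // base frequency (theta)
	Scale float32     // frequency scaling factor
	Yarn  *YarnConfig // optional context-extension settings, nil when unused
}

// The tensor interface method would then take the struct as one argument,
// e.g.:
//
//	RoPE(ctx Context, positions Tensor, cfg RoPEConfig) Tensor
```

Passing one config value instead of a growing list of positional parameters means new variants (like YaRN) can add fields without touching every call site.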
Mistral is a popular research lab that releases open-source models. This updates
the forward pass of llama-architecture models to support both llama and
mistral models by accounting for the additional metadata present in mistral
models and deriving the correct dimensions for the output projection.
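A minimal sketch of the dimension fix, assuming mistral metadata can set an explicit head dimension while llama-style models leave it implicit (all names here are illustrative, not the actual code):

```go
package model

// attnOutputWidth returns the input width of the attention output
// projection. When head_dim is set explicitly (as in mistral metadata),
// this is numHeads*headDim, which can differ from the hidden size.
func attnOutputWidth(headDim, numHeads, hiddenSize int) int {
	if headDim == 0 {
		// llama-style models leave head_dim implicit
		headDim = hiddenSize / numHeads
	}
	return numHeads * headDim
}
```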
Models may require that a set of inputs all be processed as part
of the same batch. For example, if an image has multiple patches
with fully connected attention between them, we should not split
the batch in the middle of an image.
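A minimal sketch of that rule, assuming each queued input carries a group marker tying together inputs that must share a batch (types and names are assumptions, not the actual implementation):

```go
package runner

// Input sketches one queued token or image patch; GroupID is an assumed
// marker grouping inputs that must share a batch (0 = unconstrained).
type Input struct {
	Token   int32
	GroupID int
}

// splitPoint returns how many pending inputs fit in the next batch without
// splitting a group (e.g. the patches of a single image) across batches.
// It returns 0 when the first group alone exceeds maxBatch, leaving the
// caller to decide how to handle an oversized group.
func splitPoint(inputs []Input, maxBatch int) int {
	n := maxBatch
	if n > len(inputs) {
		n = len(inputs)
	}
	// Back the boundary off while it would land inside a group.
	for n > 0 && n < len(inputs) &&
		inputs[n].GroupID != 0 && inputs[n].GroupID == inputs[n-1].GroupID {
		n--
	}
	return n
}
```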
Fixes #9697
Softcap isn't in the whitepaper or reference implementation for the language model, so we should remove it. There is no discernible difference in output with it removed.
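For context, the operation being removed is logit softcapping, which bounds a score smoothly to (-limit, limit) via tanh. A sketch of the computation (name and signature are illustrative, not taken from the codebase):

```go
package model

import "math"

// softcap smoothly squashes a score into (-limit, limit); scores well
// below the limit pass through nearly unchanged.
func softcap(x, limit float32) float32 {
	return limit * float32(math.Tanh(float64(x/limit)))
}
```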