ollama

Commit Graph

Author	SHA1	Message	Date
Bruce MacDonald	9ceee25d8b	chunk vision outputs	2025-05-12 13:49:44 -07:00
Bruce MacDonald	661bf04696	add picture prefix	2025-05-12 13:49:44 -07:00
Bruce MacDonald	2521a55ae6	fixes after rebase	2025-05-12 13:49:44 -07:00
Bruce MacDonald	32948ec952	increase rope base	2025-05-12 13:49:43 -07:00
Bruce MacDonald	9876c8453a	update exported functions for tests	2025-05-12 13:49:43 -07:00
Bruce MacDonald	919b3d6e21	require new engine for qwen25vl arch	2025-05-12 13:49:43 -07:00
Bruce MacDonald	16b13e0cfc	Revert "ropeTheta should be 1e5" This reverts commit cc1638b26763eae7daddd44e3975a885671ef9d3. This reverts commit b32385591307e2d33a8f43ce1626b529d2dac83e.	2025-05-12 13:49:43 -07:00
Bruce MacDonald	75441c56f3	add comment explaining rope theta	2025-05-12 13:49:43 -07:00
Bruce MacDonald	45f96e898d	ropeTheta should be 1e5	2025-05-12 13:49:43 -07:00
Bruce MacDonald	7c555d394c	simplify patch creation	2025-05-12 13:49:43 -07:00
Bruce MacDonald	39ee6d2bd0	ranges for lint	2025-05-12 13:49:43 -07:00
Bruce MacDonald	47705b5168	simplify rope changes	2025-05-12 13:49:43 -07:00
Michael Yang	698a92aa4a	reverse window	2025-05-12 13:49:43 -07:00
Michael Yang	150c499cae	use silu	2025-05-12 13:49:43 -07:00
Michael Yang	f1257a7de4	update vision rope theta default	2025-05-12 13:49:43 -07:00
Bruce MacDonald	b68af0370f	move sdpa to model forward pass	2025-05-12 13:49:43 -07:00
Bruce MacDonald	ca981c8a49	full attn block indexes should be []int32	2025-05-12 13:49:43 -07:00
Bruce MacDonald	b3da8a319e	Update model_vision.go	2025-05-12 13:49:42 -07:00
Bruce MacDonald	359e1d5b19	full attention layers	2025-05-12 13:49:42 -07:00
Michael Yang	bde6b46ce9	fix padding padding was being added to offset but not to the running count	2025-05-12 13:49:42 -07:00
Bruce MacDonald	ff1f74534b	block attention	2025-05-12 13:49:42 -07:00
Bruce MacDonald	104f802df1	remove todos	2025-05-12 13:49:42 -07:00
Bruce MacDonald	eed0ac2948	clean up vision model forward pass	2025-05-12 13:49:42 -07:00
Bruce MacDonald	fcfad744ff	fix patch merger	2025-05-12 13:49:42 -07:00
Michael Yang	fb3c16f2a2	window index	2025-05-12 13:49:42 -07:00
Michael Yang	ee869f35e4	fix image processing python built-in `round()` rounds to the nearest even number if the value is in the middle https://docs.python.org/3/library/functions.html#round	2025-05-12 13:49:42 -07:00
Michael Yang	ff5d1a3dc0	duplicate input embeddings	2025-05-12 13:49:42 -07:00
Michael Yang	88b231f903	use maxgridsize	2025-05-12 13:49:42 -07:00
Michael Yang	7e920c8d75	fix: patch merger and convert convert: - split patch embedding - split qkv remove duplicate PatchMerger	2025-05-12 13:49:42 -07:00
Bruce MacDonald	dd8c619fba	fixes after rebase	2025-05-12 13:49:42 -07:00
Bruce MacDonald	2af76d0e7a	default to 32 for vision block count	2025-05-12 13:49:42 -07:00
Bruce MacDonald	8d901825f0	reshape cos and sin	2025-05-12 13:49:41 -07:00
Bruce MacDonald	04936b719f	Update model_vision.go	2025-05-12 13:49:41 -07:00
Bruce MacDonald	0f0136d419	simplify by doing operations in Go rather than with tensors Co-Authored-By: Michael Yang <2372640+mxyng@users.noreply.github.com>	2025-05-12 13:49:41 -07:00
Bruce MacDonald	80498f76de	fix build	2025-05-12 13:49:41 -07:00
Bruce MacDonald	f8b48aa784	Delete model_external_test.go	2025-05-12 13:49:41 -07:00
Bruce MacDonald	5ff0d538b0	wip: implementing rope	2025-05-12 13:49:41 -07:00
Bruce MacDonald	eedc969c35	grid refactor	2025-05-12 13:49:41 -07:00
Bruce MacDonald	963531215e	update convert	2025-05-12 13:49:41 -07:00
Bruce MacDonald	3fe090f447	get patch embedding vals from config	2025-05-12 13:49:41 -07:00
Bruce MacDonald	1704072746	patch embeddings	2025-05-12 13:49:41 -07:00
Bruce MacDonald	c1f9bcb4dd	restructure image processing Update model.go Update model.go Update model.go no projector no projector vision model scaffold ... ... wip ... rebase fix patch merger tidy ... Update model_vision.go server: do not attempt to parse offset file as gguf This logic was causing issues for me when importing a gguf that had some padding at the end of the file. The valid gguf would be read, but then it would try to read the offset as a different gguf file. This does not seem right. Update process_image_test.go apply norm prompt processing prompt processing fix post tokenize fix gguf padding + populate the split patch embeddings ... ... another shot at patch embeddings ... patch embedding Update model_vision.go split pixels	2025-05-12 13:49:41 -07:00
Bruce MacDonald	198b1e6db9	text model forward pass	2025-05-12 13:49:41 -07:00
Bruce MacDonald	51ad65f831	ml: structured rope config to allow specifying context len This commit refactors the Rotary Position Embedding (RoPE) implementation across the codebase to use a structured configuration approach instead of individual parameters. Key changes: - Add new RoPEConfig struct with fields for dimension, type, base frequency, and scaling - Add RopeType enum to formalize different RoPE implementation variants - Add YarnConfig struct and related configuration for YaRN (Yet Another RoPE extensioN) context extension - Update RoPE method signature across all tensor interfaces and implementations - Refactor all model implementations (llama, gemma2, gemma3, mllama) to use the new configuration structure This change improves code organization, makes the RoPE configuration more explicit, and provides better support for different RoPE variants and context extension methods.	2025-05-12 13:49:41 -07:00
Jeffrey Morgan	0cefd46f23	llama: update to commit de4c07f93 (#10655)	2025-05-12 12:17:26 -07:00
Bruce MacDonald	ad035ad595	convert: quantize from safetensors needs kv (#10675) When creating a quantized model from safetensors we need the array KV values to be loaded. Changing this value to -1 loads the KV values on the returned layer to be used and saved during quantization.	2025-05-12 12:04:20 -07:00
Michael Yang	f95a1f2bef	feat: add trace log level (#10650) reduce prompt log to trace level	2025-05-12 11:43:00 -07:00
HardCodeDev	82a9e9462a	readme: add UnityCodeLama to community integrations (#10665)	2025-05-11 13:44:51 -07:00
HardCodeDev	76724e2f29	readme: add OllamaPlusPlus C++ library to community integrations (#10664)	2025-05-11 13:40:41 -07:00
frob	ecf14a220f	llama: allocate grammar buffer based on schema length (#10649)	2025-05-10 11:57:30 -07:00
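
Note on commit ee869f35e4 ("fix image processing"): its message points out that Python's built-in `round()` sends exact ties to the nearest even integer. The Go sketch below only illustrates that difference between the two rounding modes and is not the code from the commit. The helper names `roundHalfEven` and `roundHalfAway` are hypothetical; `math.Round` and `math.RoundToEven` are standard library functions.

```go
package main

import (
	"fmt"
	"math"
)

// roundHalfEven (hypothetical helper) rounds exact ties to the nearest even
// integer, which is how Python 3's round(x) treats values such as 2.5.
func roundHalfEven(x float64) int {
	return int(math.RoundToEven(x))
}

// roundHalfAway (hypothetical helper) uses math.Round, which rounds exact
// ties away from zero.
func roundHalfAway(x float64) int {
	return int(math.Round(x))
}

func main() {
	// Only exact .5 ties can differ between the two modes.
	for _, x := range []float64{0.5, 1.5, 2.5, 3.5} {
		fmt.Printf("%.1f: half-even=%d half-away=%d\n", x, roundHalfEven(x), roundHalfAway(x))
	}
}
```

When porting a size computation that relies on Python's `round()`, picking the wrong mode can shift a result by one on exact ties such as 2.5, so the two helpers are worth comparing against the reference values.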

4304 Commits, All Branches