ollama

Commit Graph

Author	SHA1	Message	Date
Jeffrey Morgan	ca3520de87	readme: update Ollama icon size	2025-12-29 06:39:40 -06:00
Daniel Hiltgen	55a4a37c3a	int: add performance integration tests (#11173 ) usage example: `go test --tags=integration,perf -count 1 ./integration -v -timeout 1h -run TestModelsPerf 2>&1 | tee int.log`; `cat int.log | grep MODEL_PERF_HEADER | cut -f2- -d: > perf.csv`; `cat int.log | grep MODEL_PERF_DATA | cut -f2- -d: >> perf.csv`	2025-12-29 06:39:40 -06:00
Daniel Hiltgen	ba750172ca	doc: add NVIDIA blackwell to supported list (#11307 )	2025-12-29 06:39:40 -06:00
Vincent RAMPAL	35bf6c0a41	Update base image to Ubuntu 24.04 LTS (#9681 )	2025-12-29 06:39:40 -06:00
Daniel Hiltgen	b23d28b549	doc: Update link for mac install (#11288 ) Favor the dmg now.	2025-12-29 06:39:40 -06:00
Daniel Hiltgen	e897624123	mimic logs for layers on new engine (#11278 ) This adds some extra logs to make the new engine a bit more consistent with the llama engine.	2025-12-29 06:39:39 -06:00
XuKecheng	a3e4bb7f58	readme: add NativeMind to community integrations (#11242 )	2025-12-29 06:39:39 -06:00
Jeffrey Morgan	9cf8ef9371	tools: fix parsing tool calls with empty arguments, missing required fields (#11233 )	2025-12-29 06:39:39 -06:00
Attogram Project	96be53fe6c	readme: add ollama-bash-toolshed to community integrations (#11224 )	2025-12-29 06:39:39 -06:00
Michael Yang	1cdab47113	chore: cleanup comments + unused vars (#11225 )	2025-12-29 06:39:39 -06:00
Jesse Gross	872d190c8f	ggml: Temporarily disable reporting UUIDs This is causing segfaults, so disable it. Currently UUIDs are only used for debugging purposes, although they are planned to be used in additional ways in the future. Bug #11211	2025-12-29 06:39:39 -06:00
Michael Yang	8f2099306f	skip quantizing per_layer_token_embd (#11207 ) this tensor isn't compatible with cuda when quantized to q4_K so skip it	2025-12-29 06:39:38 -06:00
Daniel Hiltgen	59112600d1	ci: multi-stage release process (#11001 )	2025-12-29 06:39:38 -06:00
Jeffrey Morgan	10119ec2ee	fs/ggml: add multiplier in graph estimates (#11208 )	2025-12-29 06:39:38 -06:00
Jeffrey Morgan	84998ae4ba	fs/ggml: add missing architecture to OllamaEngineRequired() (#11206 )	2025-12-29 06:39:38 -06:00
Michael Yang	801564fa8b	add new gemma model (#11204 ) * update patches * cherry pick metal mean kernel * cherry pick cuda mean kernel * gemma3n	2025-12-29 06:39:38 -06:00
Daniel Hiltgen	d6253f09c2	ci: arm sbsa fixes (#11194 )	2025-12-29 06:39:37 -06:00
Daniel Hiltgen	9cf1db79b4	ci: include dependencies	2025-12-29 06:39:37 -06:00
Daniel Hiltgen	46654149c9	ci: pick up arm sbsa cuda libs (#11192 )	2025-12-29 06:39:37 -06:00
Daniel Hiltgen	138c973d8f	ci: recombine linux amd64 binaries (#11188 ) Glue the rocm and archive builds back together.	2025-12-29 06:39:37 -06:00
Devon Rifkin	dd8d037c16	load arrays with up to 1024 elements when estimating This mirrors the old behavior before #10382	2025-12-29 06:39:37 -06:00
Devon Rifkin	558c1920fa	ggml: fix crash for array head counts If it's an array, it uses the max value in the array. If array values for head counts become more popular, we can consider a more invasive change like #10225 to calculate more accurate estimates. Fixes: #9984	2025-12-29 06:39:34 -06:00
Daniel Hiltgen	b9b179fe00	ci: rocm parallel builds on windows (#11187 ) The preset CMAKE_HIP_FLAGS isn't getting used on Windows. This passes the parallel flag in through the C/CXX flags, along with suppression for some log spew warnings to quiet down the build.	2025-12-29 06:38:19 -06:00
Daniel Hiltgen	38f92e7332	CI: switch windows to vs 2022 (#11184 ) * CI: switch windows to vs 2022 * ci: fix regex match	2025-12-29 06:38:18 -06:00
Daniel Hiltgen	c012d1805b	avoid context overflow (#11175 ) For smaller context models, make sure we do not exceed the training size.	2025-12-29 06:38:18 -06:00
Daniel Hiltgen	29ec3ddf9a	Re-remove cuda v11 (#10694 ) * Re-remove cuda v11 Revert the revert - drop v11 support requiring drivers newer than Feb 23 This reverts commit `c6bcdc4223`. * Simplify layout With only one version of the GPU libraries, we can simplify things down somewhat. (Jetsons still require special handling) * distinct sbsa variant for linux arm64 This avoids accidentally trying to load the sbsa cuda libraries on a jetson system which results in crashes. * temporarily prevent rocm+cuda mixed loading	2025-12-29 06:38:18 -06:00
AJ	d8b03acc1a	readme: add ai-hub to community integrations (#11169 )	2025-12-29 06:38:18 -06:00
Daniel Hiltgen	95571375dd	build speedups (#11142 ) Enable parallel building of the GPU architectures.	2025-12-29 06:38:18 -06:00
Michael Yang	69ee842b6e	convert: utility for merging tensors (#11069 )	2025-12-29 06:38:17 -06:00
Michael Yang	4585d231ee	Reapply "feat: incremental gguf parser (#10822 )" (#11114 ) (#11119 ) * Reapply "feat: incremental gguf parser (#10822)" (#11114) This reverts commit `a6e64fbdf2`. * fix older ggufs	2025-12-29 06:38:17 -06:00
Jesse Gross	290d4c2c6c	ggml: Check return status for computation. We don't check the return status after computing the graph, which can silently lead to bad outputs if we try to keep going and future computation succeeds. This appears to happen in certain cases on Apple M2 devices. Fixes #11070	2025-12-29 06:38:17 -06:00
Daniel Hiltgen	29b668e649	int: add coverage for older models (#11137 ) Verified these fail on 0.9.1 and pass on HEAD.	2025-12-29 06:38:17 -06:00
Jeffrey Morgan	6d36b8dcfb	benchmark: remove unused benchmark test (#11120 ) Removes a test under benchmark/ that is unused	2025-12-29 06:38:17 -06:00
Jeffrey Morgan	5e3fb4744b	Revert "Revert "ggml: Export GPU UUIDs" (#11115 )" (#11117 ) Reverts PR #11115. The original change was mistakenly reverted instead of #10822	2025-12-29 06:38:16 -06:00
Jeffrey Morgan	c5237d9462	Revert "ggml: Export GPU UUIDs" (#11115 ) This reverts commit `aaa7818000`.	2025-12-29 06:38:16 -06:00
Jeffrey Morgan	4f1588bc37	Revert "feat: incremental gguf parser (#10822 )" (#11114 ) This reverts commit `6b04cad7e8`.	2025-12-29 06:38:16 -06:00
曹家巧	8c3501c161	cache: fix comment function name in cache.go (#11110 )	2025-12-29 06:38:16 -06:00
Jeffrey Morgan	829e77105a	tools: return empty arguments object instead of null (#11113 )	2025-12-29 06:38:16 -06:00
Jeffrey Morgan	1dc12706c5	tools: fix parsing tool calls without any parameters (#11101 ) Fixes issue where tool calls that don't expect any parameters were not being parsed. This also fixes two additional issues: one where 2+ tool calls would not be correctly parsed, and cases where tool calls with invalid parameters would still get parsed	2025-12-29 06:38:15 -06:00
Jeffrey Morgan	2c371ff357	model: treat 'user defined' tokens as special tokens (#11077 )	2025-12-29 06:38:15 -06:00
Michael Yang	142efb91b1	gguf: fix write order (#11068 ) * ggml: test write gguf order * ggml: fix write tensor order	2025-12-29 06:38:15 -06:00
NGC13009	7e0b662c6c	readme: add ollama-launcher to community integrations (#11080 )	2025-12-29 06:38:15 -06:00
Phil	4c7cf115fe	readme: add GPTranslate to community integrations (#11071 )	2025-12-29 06:38:15 -06:00
Jeffrey Morgan	2d86651985	tools: loosen tool parsing to allow for more formats (#11030 )	2025-12-29 06:38:14 -06:00
Michael Yang	2c6f1dc9c8	feat: incremental gguf parser (#10822 ) * incremental gguf parser * gguf: update test to not rely on gguf on disc * re-use existing create gguf * read capabilities from gguf kv * kv exists * update tests * s/doneFunc/successFunc/g * new buffered reader --------- Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>	2025-12-29 06:38:14 -06:00
Michael Yang	db3a312edf	feat: uneven splits (#11048 ) The current splitDim function only operates on tensors that are split evenly which isn't always the case, e.g. a QKV tensor. This change allows the function to be used for arbitrary splits	2025-12-29 06:38:14 -06:00
Michael Yang	0d5c118679	skip tokenizer.model if possible (#11050 ) if tokenizer.json is already copied, skip tokenizer.model	2025-12-29 06:38:14 -06:00
Michael Yang	eb2c2d61e5	use nn.Linear in place of ml.Tensor (#11049 ) while nn.Linear.Forward isn't applicable for sparse MLP, it's still a nice container for the tensors	2025-12-29 06:38:13 -06:00
Attogram Project	4fff1738a4	readme: add ollama-multirun to community integrations (#11038 )	2025-12-29 06:38:13 -06:00
Jeffrey Morgan	26a1129d71	readme: update quickstart link text to Gemma 3	2025-12-29 06:38:13 -06:00


4377 Commits, All Branches
