Commit Graph

4428 Commits

Gabe Goodhart 4f462a9f67 feat: Bump llama.cpp to 4a4f42
This picks up support for Kimi K2 and PLaMO-2

Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-07-15 14:49:15 -06:00
Gabe Goodhart 91e4b10d40 fix: Sync patch changes for ggml-cpu.c
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-07-11 16:01:15 -06:00
Gabe Goodhart 0beea04b52 fix: Add a patch to avoid power throttling API on non-msvc windows builds
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-07-11 16:00:49 -06:00
Gabe Goodhart e8a303a701 build: Add top-level include for GNUInstallDirs in CMakeLists.txt
This is used to populate CMAKE_INSTALL_BINDIR

Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-07-11 13:44:10 -06:00
Gabe Goodhart 81d821ba9b build: Include cmake/common.cmake in ggml sync
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-07-11 13:25:01 -06:00
Gabe Goodhart bf1b261611 feat: Sync all patched code
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-07-11 11:44:18 -06:00
Gabe Goodhart 3020c462da fix: Add patch for GGML_VERSION and GGML_COMMIT constants
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-07-11 11:43:14 -06:00
Gabe Goodhart d7f98e0673 fix: Revert changes to ggml export GPU UUID patch
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-07-11 11:42:26 -06:00
Gabe Goodhart 111434ab39 feat: Bump back to the central repo and point at the latest master
This includes Granite 4 and a number of other model architectures!

Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-07-11 10:43:22 -06:00
Gabe Goodhart 06a5592dc5 fix: Update patches for bump
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-07-10 16:01:30 -06:00
Gabe Goodhart 0a7ddc4e17 feat: Bump to the latest tip of the branch
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-07-10 16:01:14 -06:00
Gabe Goodhart 152260e9c7 fix: Update patch 0015 for upstream implementation of uuid
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-07-10 14:33:12 -06:00
Gabe Goodhart e61826c180 Merge remote-tracking branch 'origin/main' into GraniteFour
* origin/main:
ggml: Report ordinal IDs for AMD GPUs on Windows
doc: add macOS docs (#11334)
Reduce default parallelism to 1 (#11330)
API/CLI context enhancements (#11331)
add `tool_name` to api.md (#11326)
template: add tool result compatibility (#11294)
ci: modularization (#11324)
Revert "ggml: Temporarily disable reporting UUIDs"
readme: update Ollama icon size
int: add performance integration tests (#11173)
doc: add NVIDIA blackwell to supported list (#11307)
Update base image to Ubuntu 24.04 LTS (#9681)
doc: Update link for mac install (#11288)
mimic logs for layers on new engine (#11278)
readme: add NativeMind to community integrations (#11242)
tools: fix parsing tool calls with empty arguments, missing required fields (#11233)
readme: add ollama-bash-toolshed to community integrations (#11224)
2025-07-10 14:01:24 -06:00
Jesse Gross 35fda7b4af ggml: Report ordinal IDs for AMD GPUs on Windows
We don't get valid UUIDs for AMD GPUs on Windows, so the best option
is to use the ordinal IDs. This brings us in line with what we currently
do on the Ollama server - the only exception is AMD GPUs on Linux, which
fall back to using ordinal IDs. The GGML implementation has no fallback,
but that case doesn't appear to occur for any of the GPUs that we support.

It's also possible that there are collisions between ordinal IDs for
different libraries - however the only places where we use them are
AMD on Windows and Metal on Mac, which can never occur on the same
system.
2025-07-09 10:35:31 -07:00
Daniel Hiltgen 66fb8575ce
doc: add macOS docs (#11334)
also removes stale model dir instructions for Windows
2025-07-08 15:38:04 -07:00
Daniel Hiltgen 20c3266e94
Reduce default parallelism to 1 (#11330)
The current scheduler algorithm of picking the parallelism based on available
VRAM complicates the upcoming dynamic layer memory allocation algorithm.  This
changes the default to 1, with the intent going forward that parallelism is
explicit and will no longer be dynamically determined.  Removal of the dynamic
logic will come in a follow up.
2025-07-08 12:08:37 -07:00
Daniel Hiltgen 34088dbcfb
API/CLI context enhancements (#11331)
* API: expose context size of loaded models

* CLI: add context UX

This adds a column in the ps output to show the model's context size.
2025-07-08 11:59:06 -07:00
Parth Sareen 43107b15b9
add `tool_name` to api.md (#11326) 2025-07-07 16:53:13 -07:00
Parth Sareen 1f91cb0c8c
template: add tool result compatibility (#11294) 2025-07-07 15:53:42 -07:00
Daniel Hiltgen 12d8ad0d38
ci: modularization (#11324)
switch a few constants to variables
2025-07-07 14:07:43 -07:00
Jesse Gross 592d21e7db Revert "ggml: Temporarily disable reporting UUIDs"
The root cause was an unclean upgrade - this code is fine.

This reverts commit 45f216a9c7.
2025-07-07 11:31:02 -07:00
Jeffrey Morgan 5a08b01f5b
readme: update Ollama icon size 2025-07-05 17:20:42 -07:00
Daniel Hiltgen 4f473e224c
int: add performance integration tests (#11173)
usage example:
  go test --tags=integration,perf -count 1 ./integration -v -timeout 1h -run TestModelsPerf 2>&1 | tee int.log
  cat int.log | grep MODEL_PERF_HEADER | cut -f2- -d: > perf.csv
  cat int.log | grep MODEL_PERF_DATA | cut -f2- -d: >> perf.csv
2025-07-05 16:07:09 -07:00
Daniel Hiltgen 9d60bb44cf
doc: add NVIDIA blackwell to supported list (#11307) 2025-07-05 16:06:30 -07:00
Vincent RAMPAL f371260e75
Update base image to Ubuntu 24.04 LTS (#9681) 2025-07-05 16:02:33 -07:00
Daniel Hiltgen c9e6d7719e
doc: Update link for mac install (#11288)
Favor the dmg now.
2025-07-03 09:48:45 -07:00
Daniel Hiltgen 2c4ce40334
mimic logs for layers on new engine (#11278)
This adds some extra logs to make the new engine a bit more consistent
with the llama engine.
2025-07-02 16:38:36 -07:00
XuKecheng 5d8c173529
readme: add NativeMind to community integrations (#11242) 2025-07-01 09:46:15 -07:00
Jeffrey Morgan 44b17d2bfa
tools: fix parsing tool calls with empty arguments, missing required fields (#11233) 2025-06-30 08:59:03 -07:00
Attogram Project 3b8b692218
readme: add ollama-bash-toolshed to community integrations (#11224) 2025-06-29 14:59:54 -07:00
Gabe Goodhart 34ff84df43 fix: Use c++17 and include vendor for go wrapper modules
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:23:27 -06:00
Gabe Goodhart d395132510 fix: Add sync'ed stb vendored header
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:17:23 -06:00
Gabe Goodhart 16c116c2b7 fix: Add missing stb to llama.cpp rsync-filter
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:16:58 -06:00
Gabe Goodhart 58300273f4 fix: Apply patch for mtmd_text_input
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:09:48 -06:00
Gabe Goodhart f358dd5a1c fix: Use mtmd_helper to correctly load the bitmap for the image
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:09:05 -06:00
Gabe Goodhart dbd8ee2654 fix: Fix support for arch-specific ggml-cpu source files with new arrangement
In https://github.com/ggml-org/llama.cpp/pull/13892, all arch-specific
implementations were split out into a nested tree structure under
ggml-cpu/arch. This conflicts with the standard cgo layout, where all
arch-specific source files are expected to live in the same directory as
the parent Go module and use suffixes based on GOOS and GOARCH. As such,
there were really two options for getting this to work:

1. Add a patch on top of the GGML sync to rearrange the files to match the
Go layout convention
2. Use CGO directives to conditionally include the nested source files in
the compilation units

This commit does (2) in order to minimize the set of changes needed on top
of the upstream file layout. To get this to work, there are two key things
needed:

1. In cpu.go, #cgo directives are added to explicitly set __${GOARCH}__ in
the preprocessor directives
2. In arch-impls.c|cpp, use an #ifdef | #elif defined | #endif chain to
explicitly include the .c|.cpp files for the given architecture from the
nested directory

Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:08:56 -06:00
Gabe Goodhart 7334a0ea07 chore: Ignore *.patched in the patch directory
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:08:42 -06:00
Gabe Goodhart 1664d52be6 fix: Add patch for mtmd_input_text
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:08:29 -06:00
Gabe Goodhart 3d70237fd1 fix: Update llama.go to use mtmd instead of clip/llava
It's _very_ possible that this is broken!

Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:06:47 -06:00
Gabe Goodhart fa54a3cf3a fix: Add missing include in sampling_ext.cpp
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:06:40 -06:00
Gabe Goodhart d0fd9e5aa2 fix: Remove mtmd main cpp files
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:06:31 -06:00
Gabe Goodhart 1cd9352cc3 fix: Narrow llama.cpp rsync-filter to not include mtmd main tool cpp files
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:06:18 -06:00
Gabe Goodhart 85aba511ec fix: Add ggml files missing from sync
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:06:05 -06:00
Gabe Goodhart 62af160d82 fix: Update ggml rsync-filter for new ggml-cpu/arch subdirs
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:05:39 -06:00
Gabe Goodhart 414a097372 fix: Add files missing from sync
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:05:25 -06:00
Gabe Goodhart 424e05c20e fix: Update rsync-filter for all moved/new/removed files
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:04:51 -06:00
Gabe Goodhart 2613f5da2d feat: Sync llama.cpp and ggml
Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 17:01:24 -06:00
Gabe Goodhart 73d089bb90 feat: Update all patches
There are a number that are no longer needed at all:

- 0003-embeddings: Embeddings entirely overhauled on master
- 0008-ensure-KV-cache-is-fully-defragmented: KV caching entirely
    overhauled on master
- 0019-metal-add-mean-kernel-14267: Merged upstream
- 0020-CUDA-add-mean-operation-14313: Merged upstream

Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 16:57:05 -06:00
Gabe Goodhart a30ae1fa20 TEMPORARY: Update the llama.cpp upstream to my fork's Granite Four branch
This will be redone once my branch is merged upstream in llama.cpp

Branch: GraniteFour

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
2025-06-27 16:24:42 -06:00
Michael Yang 4129af9205
chore: cleanup comments + unused vars (#11225) 2025-06-27 11:45:33 -07:00