ollama/server
Daniel Hiltgen 4879a234c4
build: Make target improvements (#7499)
* llama: wire up builtin runner

This adds a new entrypoint into the ollama CLI to run the cgo-built runner.
On Mac arm64, this will have GPU support, but on all other platforms it will
be the lowest-common-denominator CPU build. After we fully transition
to the new Go runners, more tech debt can be removed and we can stop building
the "default" runner via make and always rely on the builtin.

* build: Make target improvements

Add a few new targets and help for building locally.
This also adjusts the runner lookup to favor local builds, then
runners relative to the executable, and finally payloads.
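
The lookup order described above (local builds, then runners relative to the executable, then payloads) could be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function names `candidateRunnerDirs` and `firstExisting` and the directory layout are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// candidateRunnerDirs returns the directories to search, highest priority first.
// The concrete sub-paths ("build/runners", "runners") are assumptions made
// for illustration only.
func candidateRunnerDirs(cwd, exeDir, payloadDir string) []string {
	return []string{
		filepath.Join(cwd, "build", "runners"), // 1. local build output
		filepath.Join(exeDir, "runners"),       // 2. relative to the executable
		payloadDir,                             // 3. extracted payloads
	}
}

// firstExisting returns the first directory for which exists reports true.
// The exists callback is injected so the search order can be checked
// without touching the filesystem.
func firstExisting(dirs []string, exists func(string) bool) (string, bool) {
	for _, d := range dirs {
		if exists(d) {
			return d, true
		}
	}
	return "", false
}

func main() {
	dirs := candidateRunnerDirs("/src/ollama", "/usr/local/bin", "/tmp/payloads")
	// Pretend only the last two locations exist on this machine.
	present := map[string]bool{dirs[1]: true, dirs[2]: true}
	dir, ok := firstExisting(dirs, func(p string) bool { return present[p] })
	fmt.Println(dir, ok)
}
```

Because the candidates are tried in order, a developer's local build shadows an installed runner whenever both are present.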

* Support customized CPU flags for runners

This implements a simplified custom CPU flags pattern for the runners.
When built without overrides, the runner name contains the vector flag
we check for (AVX) to ensure we don't try to run on unsupported systems
and crash.  If the user builds a customized set, we omit the naming
scheme and don't check for compatibility.  This avoids checking
requirements at runtime, so that logic has been removed as well.  This
can be used to build GPU runners with no vector flags, or CPU/GPU
runners with additional flags (e.g. AVX512) enabled.
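
As a rough sketch of the naming pattern described above: a default runner carries the vector flag in its name and is checked against the host CPU, while a custom build has no such marker and is accepted without a check. The `_avx` suffix convention, the example runner names, and the functions `requiredVectorFlag` and `compatible` are assumptions for illustration, not the implementation in this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// requiredVectorFlag returns the vector flag encoded in a runner name,
// or "" when the name carries no flag (treated here as a custom build).
// The "_avx" suffix is a hypothetical naming convention.
func requiredVectorFlag(runnerName string) string {
	if strings.HasSuffix(runnerName, "_avx") {
		return "avx"
	}
	return ""
}

// compatible reports whether a runner may be used on a host with the given
// CPU features. Runners without a flag in their name are not checked.
func compatible(runnerName string, cpuFlags map[string]bool) bool {
	flag := requiredVectorFlag(runnerName)
	if flag == "" {
		return true // custom build: no naming scheme, no compatibility check
	}
	return cpuFlags[flag]
}

func main() {
	host := map[string]bool{"avx": false}
	for _, name := range []string{"cpu_avx", "cpu_custom"} {
		fmt.Printf("%s usable: %v\n", name, compatible(name, host))
	}
}
```

The trade-off is that a custom build is trusted entirely: if it was compiled with flags the host lacks, nothing stops it from being selected.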

* Use relative paths

If the user checks out the repo in a path that contains spaces, make gets
confused, so use relative paths for everything in-repo to avoid breakage.

* Remove payloads from main binary

* install: clean up prior libraries

This removes support for v0.3.6 and older versions (before the tar bundle)
and ensures we clean up prior libraries before extracting the bundle(s).
Without this change, runners and dependent libraries could leak when we
update and lead to subtle runtime errors.
2024-12-10 09:47:19 -08:00
imageproc add more tests for getting the optimal tiled canvas (#7411) 2024-10-29 16:28:02 -07:00
testdata/tools server: add tool parsing support for nemotron-mini (#6849) 2024-09-17 18:06:16 -07:00
auth.go fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
download.go server: fix blob download when receiving a 200 response (#6656) 2024-09-05 10:48:26 -07:00
fixblobs.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
fixblobs_test.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
images.go server: fix Transport override (#7834) 2024-11-25 15:08:34 -08:00
layer.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest_test.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
model.go image processing for llama3.2 (#6963) 2024-10-18 16:12:35 -07:00
model_test.go api: enable tool streaming (#7836) 2024-11-27 13:40:57 -08:00
modelpath.go validate model path 2024-08-28 09:32:57 -07:00
modelpath_test.go validate model path 2024-08-28 09:32:57 -07:00
prompt.go prompt: Don't trim whitespace from prompts 2024-12-09 11:02:55 -08:00
prompt_test.go prompt: Don't trim whitespace from prompts 2024-12-09 11:02:55 -08:00
routes.go build: Make target improvements (#7499) 2024-12-10 09:47:19 -08:00
routes_create_test.go Merge pull request #6534 from ollama/mxyng/messages 2024-08-30 09:39:59 -07:00
routes_delete_test.go server: clean up route names for consistency (#6524) 2024-08-26 19:36:11 -07:00
routes_generate_test.go api: enable tool streaming (#7836) 2024-11-27 13:40:57 -08:00
routes_list_test.go server: clean up route names for consistency (#6524) 2024-08-26 19:36:11 -07:00
routes_test.go server: allow mixed-case model names on push, pull, cp, and create (#7676) 2024-11-19 15:05:57 -08:00
sched.go sched: Lift parallel restriction for multimodal models except mllama 2024-11-06 13:32:18 -08:00
sched_test.go Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
sparse_common.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
sparse_windows.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
upload.go server: limit upload parts to 16 (#6411) 2024-08-19 09:20:52 -07:00