ollama

Commit Graph

Author	SHA1	Message	Date
Blake Mizerany	9f2d8d2117	...	2024-04-04 00:11:31 -07:00
Blake Mizerany	d42c3f6be1	x/build/blob: add fuzz test for ParseRef	2024-04-03 23:50:43 -07:00
Blake Mizerany	4ea3e9efa6	x/build/blob: lock in zero allocs for ParseRef	2024-04-03 23:03:36 -07:00
Blake Mizerany	2e1ea6ecaa	x/build/blob: move most commit value checks to emit func	2024-04-03 22:55:53 -07:00
Blake Mizerany	6d2da77ce2	x/build/blob: add Parts for streaming ref parts Also, make ParseRef use the new Parts method to parse the ref parts.	2024-04-03 22:27:55 -07:00
Blake Mizerany	def4d902bf	... wip still broke	2024-04-03 22:15:58 -07:00
Blake Mizerany	76a202c04e	...	2024-04-03 20:52:27 -07:00
Blake Mizerany	f7cfe946dc	x/registry: fixing tests wip	2024-04-03 16:37:27 -07:00
Blake Mizerany	005b6373e2	x/registry: fix startMinio	2024-04-03 16:19:50 -07:00
Blake Mizerany	d54e0fb3b2	...	2024-04-03 16:14:22 -07:00
Blake Mizerany	bdd05e0ae0	x/registry: skip ref test	2024-04-03 15:59:23 -07:00
Blake Mizerany	1a346640db	x/registry: work on getting basic test passing	2024-04-03 15:58:04 -07:00
Blake Mizerany	f5883070f8	x/registry: upload smoke test passing	2024-04-03 14:30:58 -07:00
Blake Mizerany	adc23d5f96	Add 'x/' from commit 'a10a11b9d371f36b7c3510da32a1d70b74e27bd1' git-subtree-dir: x git-subtree-mainline: `7d05a6ee8f` git-subtree-split: `a10a11b9d3`	2024-04-03 10:40:23 -07:00
Blake Mizerany	a10a11b9d3	registry: initial work on multipart pushes	2024-04-03 10:39:30 -07:00
Blake Mizerany	7d05a6ee8f	cmd: provide feedback if OLLAMA_MODELS is set on non-serve command (#3470) This also moves the checkServerHeartbeat call out of the "RunE" Cobra stuff (that's the only word I have for that) to on-site where it's after the check for OLLAMA_MODELS, which allows the helpful error message to be printed before the server heartbeat check. This also arguably makes the code more readable without the magic/superfluous "pre" function caller.	2024-04-02 22:11:13 -07:00
Daniel Hiltgen	464d817824	Merge pull request #3464 from dhiltgen/subprocess Fix numgpu opt miscomparison	2024-04-02 20:10:17 -07:00
Pier Francesco Contino	531324a9be	feat: add OLLAMA_DEBUG in ollama server help message (#3461) Co-authored-by: Pier Francesco Contino <pfcontino@gmail.com>	2024-04-02 18:20:03 -07:00
Daniel Hiltgen	6589eb8a8c	Revert options as a ref in the server	2024-04-02 16:44:10 -07:00
Michael Yang	a039e383cd	Merge pull request #3465 from ollama/mxyng/fix-metal fix metal gpu	2024-04-02 16:29:58 -07:00
Michael Yang	80163ebcb5	fix metal gpu	2024-04-02 16:06:45 -07:00
Daniel Hiltgen	a57818d93e	Merge pull request #3343 from dhiltgen/bump_more2 Bump llama.cpp to b2581	2024-04-02 15:08:26 -07:00
Blake Mizerany	94befe366a	...	2024-04-02 14:28:06 -07:00
Blake Mizerany	c95f97689b	utils/upload: init	2024-04-02 14:15:21 -07:00
Blake Mizerany	618eb5b909	registry: multipart push	2024-04-02 13:40:23 -07:00
Daniel Hiltgen	841adda157	Fix windows lint CI flakiness	2024-04-02 12:22:16 -07:00
Daniel Hiltgen	0035e31af8	Bump to b2581	2024-04-02 11:53:07 -07:00
Blake Mizerany	eb75418be9	build/blob: test ParseRef round-trip	2024-04-02 11:45:01 -07:00
Blake Mizerany	9959da05de	build/blob: break out test refs for other tests/fuzzing	2024-04-02 11:38:10 -07:00
Daniel Hiltgen	c863c6a96d	Merge pull request #3218 from dhiltgen/subprocess Switch back to subprocessing for llama.cpp	2024-04-02 10:49:44 -07:00
Blake Mizerany	aff7970628	build: remove superfluous parseCompleteRef	2024-04-01 23:41:42 -07:00
Blake Mizerany	628f1feb36	build: back to taking manifests as []byte It's nicer to have the manifests be an opaque []byte, rather than a struct. This way users of the build package don't need to know about the internal structure of the manifests. The registry can interpret the manifests as it sees fit, while letting build keep its own Go type of manifest which is easier to work with in the build package.	2024-04-01 23:18:58 -07:00
Blake Mizerany	ce3125afd5	registry: add New and take a minio client as argument	2024-04-01 22:53:49 -07:00
Blake Mizerany	f488652ba7	build: make Build accept only refs without builds	2024-04-01 22:12:43 -07:00
Blake Mizerany	2318ed2919	build: remove unused manifest()	2024-04-01 21:59:38 -07:00
Blake Mizerany	b1b8be33d9	build: cleanup error names and other things	2024-04-01 21:57:34 -07:00
Blake Mizerany	876f7eab81	build: move Manifest from internal/blobstore to build It was getting confusing to have the arbitrary handling of manifests in the blobstore. It also prevented us from using model.Ref in the blobstore because of cyclic dependencies. This is much easier to grok now.	2024-04-01 21:43:30 -07:00
Blake Mizerany	7cfc8a0838	build/blob: fix awkward Ref type	2024-04-01 21:25:18 -07:00
Daniel Hiltgen	1f11b52511	Refined min memory from testing	2024-04-01 16:48:33 -07:00
Daniel Hiltgen	526d4eb204	Release gpu discovery library after use Leaving the cudart library loaded kept ~30m of memory pinned in the GPU in the main process. This change ensures we don't hold GPU resources when idle.	2024-04-01 16:48:33 -07:00
Daniel Hiltgen	0a74cb31d5	Safeguard for noexec We may have users that run into problems with our current payload model, so this gives us an escape valve.	2024-04-01 16:48:33 -07:00
Daniel Hiltgen	10ed1b6292	Detect too-old cuda driver "cudart init failure: 35" isn't particularly helpful in the logs.	2024-04-01 16:48:33 -07:00
Daniel Hiltgen	4fec5816d6	Integration test improvements Cleaner shutdown logic, a bit of response hardening	2024-04-01 16:48:18 -07:00
Daniel Hiltgen	0a0e9f3e0f	Apply 01-cache.diff	2024-04-01 16:48:18 -07:00
Daniel Hiltgen	58d95cc9bd	Switch back to subprocessing for llama.cpp This should resolve a number of memory leak and stability defects by allowing us to isolate llama.cpp in a separate process and shutdown when idle, and gracefully restart if it has problems. This also serves as a first step to be able to run multiple copies to support multiple models concurrently.	2024-04-01 16:48:18 -07:00
Patrick Devine	3b6a9154dd	Simplify model conversion (#3422)	2024-04-01 16:14:53 -07:00
Michael Yang	d6dd2ff839	Merge pull request #3241 from ollama/mxyng/mem update memory estimations for gpu offloading	2024-04-01 13:59:14 -07:00
Michael Yang	e57a6ba89f	Merge pull request #2926 from ollama/mxyng/decode-ggml-v2 refactor model parsing	2024-04-01 13:58:13 -07:00
Michael Yang	12ec2346ef	Merge pull request #3442 from ollama/mxyng/generate-output fix generate output	2024-04-01 13:56:09 -07:00
Michael Yang	1ec0df1069	fix generate output	2024-04-01 13:47:34 -07:00

2343 Commits
