ollama

Author	SHA1	Message	Date
Josh Yan	163ee9a8b0	isLocal firstdraft	2024-07-22 15:51:52 -07:00
Josh Yan	de7b2f3948	clean	2024-07-22 15:51:52 -07:00
Josh Yan	f27c66fb0c	rm bench	2024-07-22 15:51:52 -07:00
Josh Yan	a238191798	rm config	2024-07-22 15:51:52 -07:00
Josh Yan	6436c7a375	rm config	2024-07-22 15:51:52 -07:00
Josh Yan	896a15874e	clean	2024-07-22 15:51:52 -07:00
Josh Yan	56008688a1	local path	2024-07-22 15:51:52 -07:00
Josh Yan	d14d38e940	still works	2024-07-22 15:51:52 -07:00
Josh Yan	03df02883d	rebase	2024-07-22 15:51:52 -07:00
Josh Yan	ae49abf80a	benchmark	2024-07-22 15:51:52 -07:00
Josh Yan	2c450502db	on disk copy	2024-07-22 15:51:52 -07:00
Josh Yan	46b76aeb46	start tests	2024-07-22 15:51:52 -07:00
Josh Yan	0e01da82d6	errorsis	2024-07-22 15:51:31 -07:00
Josh Yan	6b1b85ba3d	hide initialize keypair	2024-07-22 15:41:04 -07:00
Josh Yan	5603441538	test	2024-07-22 13:58:50 -07:00
Josh Yan	76b4dfcc9e	auth	2024-07-22 13:54:02 -07:00
Daniel Hiltgen	5784c05397	Merge pull request #5854 from dhiltgen/win_exit_status Refine error reporting for subprocess crash	2024-07-22 10:40:22 -07:00
Daniel Hiltgen	f14aa5435d	Merge pull request #5855 from dhiltgen/remove_max_vram Remove no longer supported max vram var	2024-07-22 10:35:29 -07:00
Jeffrey Morgan	f8fedbda20	Update llama.cpp submodule commit to `d94c6e0c` (#5805 ) v0.2.8-rc2	2024-07-22 12:42:00 -04:00
Jeffrey Morgan	b3e5491e41	server: collect nested tool call objects when parsing (#5824 )	2024-07-22 12:38:03 -04:00
Daniel Hiltgen	cc269ba094	Remove no longer supported max vram var The OLLAMA_MAX_VRAM env var was a temporary workaround for OOM scenarios. With Concurrency this was no longer wired up, and the simplistic value doesn't map to multi-GPU setups. Users can still set `num_gpu` to limit memory usage to avoid OOM if we get our predictions wrong.	2024-07-22 09:08:11 -07:00
Daniel Hiltgen	a3c20e3f18	Refine error reporting for subprocess crash On windows, the exit status winds up being the search term many users search for and end up piling in on issues that are unrelated. This refines the reporting so that if we have a more detailed message we'll suppress the exit status portion of the message.	2024-07-22 08:52:16 -07:00
Jeffrey Morgan	80ee9b5e47	Remove out of space test temporarily (#5825 )	2024-07-21 00:22:11 -04:00
Jeffrey Morgan	5534f2cc6a	llm: consider `head_dim` in llama arch (#5817 ) v0.2.8-rc1	2024-07-20 21:48:12 -04:00
Daniel Hiltgen	d321297d8a	Merge pull request #5815 from dhiltgen/win_rocm_gfx_features Adjust windows ROCm discovery	2024-07-20 16:02:55 -07:00
Daniel Hiltgen	06e5d74e34	Merge pull request #5506 from dhiltgen/sched_tests Refine scheduler unit tests for reliability	2024-07-20 15:48:39 -07:00
Daniel Hiltgen	5d707e6fd5	Merge pull request #5583 from dhiltgen/integration_improvements Fix context exhaustion integration test for small gpus	2024-07-20 15:48:21 -07:00
Daniel Hiltgen	283948c83b	Adjust windows ROCm discovery The v5 hip library returns unsupported GPUs which wont enumerate at inference time in the runner so this makes sure we align discovery. The gfx906 cards are no longer supported so we shouldn't compile with that GPU type as it wont enumerate at runtime.	2024-07-20 15:17:50 -07:00
Jeffrey Morgan	1475eab95f	add patch for tekken (#5807 )	2024-07-20 13:41:21 -04:00
Jeffrey Morgan	20090f3172	preserve last assistant message (#5802 )	2024-07-19 20:19:26 -07:00
Jeffrey Morgan	69a2d4ccff	Fix generate test flakyness (#5804 )	2024-07-19 19:11:25 -07:00
Josh	e8b954c646	server: validate template (#5734 ) add template validation to modelfile	2024-07-19 15:24:29 -07:00
royjhan	c57317cbf0	OpenAI: Function Based Testing (#5752 ) * distinguish error forwarding * more coverage * rm comment	2024-07-19 11:37:12 -07:00
royjhan	51b2fd299c	adjust openai chat msg processing (#5729 )	2024-07-19 11:19:20 -07:00
Michael Yang	d0634b1596	Merge pull request #5780 from ollama/mxyng/tools fix parsing tool calls: break on unexpected eofs v0.2.7	2024-07-18 12:14:10 -07:00
Michael Yang	43606d6d6a	fix parsing tool calls	2024-07-18 12:08:11 -07:00
Jeffrey Morgan	70b1010fa5	server: check for empty tools array too (#5779 )	2024-07-18 11:44:57 -07:00
Jeffrey Morgan	84e5721f3a	always provide content even if empty (#5778 )	2024-07-18 11:28:19 -07:00
Jeffrey Morgan	319fb1ce03	server: only parse tool calls if tools are provided (#5771 ) * server: only parse tool calls if tools are provided * still set `resp.Message.Content` v0.2.6	2024-07-18 08:50:23 -07:00
Michael Yang	b255445557	marshal json automatically for some template values (#5758 )	2024-07-17 15:35:11 -07:00
Michael Yang	b23424bb3c	Merge pull request #5753 from ollama/mxyng/parse-tool-call parse tool call as individual objects	2024-07-17 11:47:53 -07:00
Michael Yang	5fd6988126	parse tool call as individual objects	2024-07-17 11:19:04 -07:00
Michael Yang	5b82960df8	stub response (#5750 )	2024-07-17 10:39:22 -07:00
Michael Yang	cc9a252d8c	Merge pull request #5732 from ollama/mxyng/cleanup remove ToolCall from GenerateResponse	2024-07-17 10:26:54 -07:00
Pákozdi György	d281a6e603	add sidellama link (#5702 )	2024-07-17 10:24:44 -07:00
royjhan	154f6f45d4	OpenAI: Support Tools (#5614 ) * reopen pr * tools * remove tc from stream for now * ID and Function * openai expects arguments to be a string (#5739) * mutually exclusive content and tool calls * clean up --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2024-07-16 20:52:59 -07:00
royjhan	0d41623b52	OpenAI: Add Suffix to `v1/completions` (#5611 ) * add suffix * remove todo * remove TODO * add to test * rm outdated prompt tokens info md * fix test * fix test	2024-07-16 20:50:14 -07:00
Michael Yang	c279f96371	remove ToolCall from GenerateResponse	2024-07-16 15:22:49 -07:00
Michael Yang	499e87c9ba	Merge pull request #5730 from ollama/mxyng/cleanup remove unneeded tool calls	2024-07-16 14:42:13 -07:00
Michael Yang	cd0853f2d5	Merge pull request #5207 from ollama/mxyng/suffix add insert support to generate endpoint	2024-07-16 14:37:32 -07:00

1 2 3 4 5 ...

3195 Commits