ollama

Jesse Gross f560bd077f llm: Use Ollama engine memory layouts for both old and new engines Currently for both the old and new engines, there is code to calculate how much memory is required for a model and lay out the layers onto GPUs. This reuses the new engine's layout code for the old engine as well, bringing them closer together. The old engine continues to use its current method of estimating required memory. This reduces maintenance effort and improves consistency, as new features only need to be implemented in one place. The newer code is also more accurate, especially with multiple GPUs.		2025-11-11 13:11:08 -08:00
internal	refactor: use the built-in max/min to simplify the code (#12280)	2025-09-16 17:14:21 -07:00
auth.go	fix nil deref in auth.go	2024-07-26 14:14:48 -07:00
create.go	create: inherit FROM model's renderer/parser	2025-10-27 15:14:19 -07:00
create_test.go	engine: add remote proxy (#12307)	2025-09-17 14:40:53 -07:00
download.go	server: fix duplicate 'is' typo in comment (#12985)	2025-11-06 14:44:44 -08:00
fixblobs.go	server: replace blob prefix separator from ':' to '-' (#3146)	2024-03-14 20:18:06 -07:00
fixblobs_test.go	server: replace blob prefix separator from ':' to '-' (#3146)	2024-03-14 20:18:06 -07:00
images.go	templates: fix crash in improperly defined templates (#12483)	2025-10-02 17:25:55 -07:00
images_test.go	Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)	2025-06-20 11:11:40 -07:00
layer.go	One corrupt manifest should not wedge model operations (#7515)	2024-11-05 14:21:45 -08:00
logprob.go	server: add logprobs and top_logprobs support to Ollama's API (#12899)	2025-11-11 08:49:50 -08:00
manifest.go	One corrupt manifest should not wedge model operations (#7515)	2024-11-05 14:21:45 -08:00
manifest_test.go	One corrupt manifest should not wedge model operations (#7515)	2024-11-05 14:21:45 -08:00
model.go	tools: refactor tool call parsing and enable streaming (#10415)	2025-05-23 14:19:31 -07:00
modelpath.go	server: add hint to the error message when model path access fails (#10843)	2025-05-24 13:17:04 -07:00
modelpath_test.go	lint: enable usetesting, disable tenv (#10594)	2025-05-08 11:42:14 -07:00
prompt.go	add registries for parsers/renderers	2025-10-14 01:13:54 -07:00
prompt_test.go	Reapply "add truncate and shift parameters" (#12582)	2025-10-11 16:06:14 -07:00
quantization.go	skip quantizing per_layer_token_embd (#11207)	2025-06-26 21:49:35 -07:00
quantization_test.go	Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)	2025-06-20 11:11:40 -07:00
routes.go	server: add logprobs and top_logprobs support to Ollama's API (#12899)	2025-11-11 08:49:50 -08:00
routes_create_test.go	create: inherit FROM model's renderer/parser	2025-10-27 15:14:19 -07:00
routes_debug_test.go	api: add omitempty to required tool function parameter type (#12989)	2025-11-06 14:08:55 -08:00
routes_delete_test.go	fs(ggml): fill in arch prefix if necessary (#12646)	2025-10-20 16:42:18 -07:00
routes_generate_renderer_test.go	DRY out the runner lifecycle code (#12540)	2025-10-23 11:20:02 -07:00
routes_generate_test.go	server: add logprobs and top_logprobs support to Ollama's API (#12899)	2025-11-11 08:49:50 -08:00
routes_harmony_streaming_test.go	api: add omitempty to required tool function parameter type (#12989)	2025-11-06 14:08:55 -08:00
routes_list_test.go	Update the /api/create endpoint to use JSON (#7935)	2024-12-31 18:02:30 -08:00
routes_test.go	engine: add remote proxy (#12307)	2025-09-17 14:40:53 -07:00
sched.go	llm: Use Ollama engine memory layouts for both old and new engines	2025-11-11 13:11:08 -08:00
sched_test.go	app: add code for macOS and Windows apps under 'app' (#12933)	2025-11-04 11:40:17 -08:00
sparse_common.go	Don't hard fail on sparse setup error	2024-08-09 12:16:19 -07:00
sparse_windows.go	Don't hard fail on sparse setup error	2024-08-09 12:16:19 -07:00
upload.go	server: always print upload/download part info (#8832)	2025-02-04 19:30:49 -08:00