ollama

Jesse Gross d897a54f08 server: Reduce gpt-oss context length for small VRAM GPUs		2025-12-29 06:39:50 -06:00
gpt-oss works best with a context length of at least 8k. However, on GPUs with a limited amount of VRAM, this larger context causes a significant performance hit. In these cases, we switch to the Ollama default of 4k.
internal	cache: fix comment function name in cache.go (#11110)	2025-12-29 06:38:16 -06:00
auth.go	fix nil deref in auth.go	2024-07-26 14:14:48 -07:00
create.go	remove support for multiple ggufs in a single file (#10722)	2025-12-29 06:38:05 -06:00
create_test.go	server: validate local path on safetensor create (#9379)	2025-02-28 16:10:43 -08:00
download.go	server: abort download on empty digest	2025-12-29 06:38:09 -06:00
fixblobs.go	server: replace blob prefix separator from ':' to '-' (#3146)	2024-03-14 20:18:06 -07:00
fixblobs_test.go	server: replace blob prefix separator from ':' to '-' (#3146)	2024-03-14 20:18:06 -07:00
harmonyparser.go	gpt-oss (#11672)	2025-12-29 06:39:48 -06:00
harmonyparser_test.go	gpt-oss (#11672)	2025-12-29 06:39:48 -06:00
images.go	gpt-oss (#11672)	2025-12-29 06:39:48 -06:00
images_test.go	Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)	2025-12-29 06:38:17 -06:00
layer.go	One corrupt manifest should not wedge model operations (#7515)	2024-11-05 14:21:45 -08:00
manifest.go	One corrupt manifest should not wedge model operations (#7515)	2024-11-05 14:21:45 -08:00
manifest_test.go	One corrupt manifest should not wedge model operations (#7515)	2024-11-05 14:21:45 -08:00
model.go	tools: refactor tool call parsing and enable streaming (#10415)	2025-12-29 06:38:07 -06:00
modelpath.go	server: add hint to the error message when model path access fails (#10843)	2025-12-29 06:38:07 -06:00
modelpath_test.go	lint: enable usetesting, disable tenv (#10594)	2025-12-29 06:37:55 -06:00
prompt.go	gpt-oss (#11672)	2025-12-29 06:39:48 -06:00
prompt_test.go	gpt-oss (#11672)	2025-12-29 06:39:48 -06:00
quantization.go	skip quantizing per_layer_token_embd (#11207)	2025-12-29 06:39:38 -06:00
quantization_test.go	Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)	2025-12-29 06:38:17 -06:00
routes.go	server: Reduce gpt-oss context length for small VRAM GPUs	2025-12-29 06:39:50 -06:00
routes_create_test.go	Move quantization to new backend (#10363)	2025-12-29 06:37:52 -06:00
routes_delete_test.go	Update the /api/create endpoint to use JSON (#7935)	2024-12-31 18:02:30 -08:00
routes_generate_test.go	tools: support anyOf types	2025-12-29 06:39:49 -06:00
routes_harmony_streaming_test.go	tools: support anyOf types	2025-12-29 06:39:49 -06:00
routes_list_test.go	Update the /api/create endpoint to use JSON (#7935)	2024-12-31 18:02:30 -08:00
routes_test.go	server: use slices.Equal to simplify code (#11502)	2025-12-29 06:39:45 -06:00
sched.go	Reduce default parallelism to 1 (#11330)	2025-12-29 06:39:41 -06:00
sched_test.go	Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119)	2025-12-29 06:38:17 -06:00
sparse_common.go	Don't hard fail on sparse setup error	2024-08-09 12:16:19 -07:00
sparse_windows.go	Don't hard fail on sparse setup error	2024-08-09 12:16:19 -07:00
upload.go	server: always print upload/download part info (#8832)	2025-02-04 19:30:49 -08:00