ollama/server
nicole pardal 5d347f6d6f
server: Consolidate embedding truncation in runner (#12730)
Currently, the check that embedding prompts fit in the context window
(and any resulting truncation) happens in two places: the Ollama
server and the runner. This can lead to inconsistencies in both the
checks and the reported number of tokens processed. Since the runner
has to do this processing anyway, this change consolidates all of the
logic there.
2025-10-27 11:59:12 -07:00
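As a rough illustration of the idea in the commit message above, the sketch below keeps the length check and the truncation in one function, so the token count that is reported always comes from the same place that did the truncating. It is not Ollama's actual code: `fitToContext`, `errInputTooLong`, and the `numCtx` and `truncate` parameters are hypothetical names chosen for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// errInputTooLong is a hypothetical sentinel error for this sketch.
var errInputTooLong = errors.New("input exceeds context length")

// fitToContext is a hypothetical helper, not part of Ollama's API.
// It checks whether the tokenized prompt fits in a context window of
// numCtx tokens. If it does not fit, it either truncates to the first
// numCtx tokens (truncate == true) or returns errInputTooLong.
func fitToContext(tokens []int, numCtx int, truncate bool) ([]int, error) {
	if numCtx <= 0 {
		return nil, errors.New("context length must be positive")
	}
	if len(tokens) <= numCtx {
		return tokens, nil
	}
	if !truncate {
		return nil, errInputTooLong
	}
	return tokens[:numCtx], nil
}

func main() {
	tokens := []int{1, 2, 3, 4, 5, 6}

	// With truncation enabled, the caller reports len(kept) as the
	// number of tokens processed, so check and count cannot disagree.
	kept, err := fitToContext(tokens, 4, true)
	fmt.Println(kept, len(kept), err)

	// With truncation disabled, an oversized prompt is rejected.
	_, err = fitToContext(tokens, 4, false)
	fmt.Println(err)
}
```

Because a single function owns both decisions, a caller such as an HTTP handler only needs to forward the result and its length rather than repeat the check.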
internal refactor: use the built-in max/min to simplify the code (#12280) 2025-09-16 17:14:21 -07:00
auth.go fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
create.go engine: add remote proxy (#12307) 2025-09-17 14:40:53 -07:00
create_test.go engine: add remote proxy (#12307) 2025-09-17 14:40:53 -07:00
download.go server: abort download on empty digest 2025-05-27 11:28:48 -07:00
fixblobs.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
fixblobs_test.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
images.go templates: fix crash in improperly defined templates (#12483) 2025-10-02 17:25:55 -07:00
images_test.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
layer.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest_test.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
model.go tools: refactor tool call parsing and enable streaming (#10415) 2025-05-23 14:19:31 -07:00
modelpath.go server: add hint to the error message when model path access fails (#10843) 2025-05-24 13:17:04 -07:00
modelpath_test.go lint: enable usetesting, disable tenv (#10594) 2025-05-08 11:42:14 -07:00
prompt.go add registries for parsers/renderers 2025-10-14 01:13:54 -07:00
prompt_test.go Reapply "add truncate and shift parameters" (#12582) 2025-10-11 16:06:14 -07:00
quantization.go skip quantizing per_layer_token_embd (#11207) 2025-06-26 21:49:35 -07:00
quantization_test.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
routes.go server: Consolidate embedding truncation in runner (#12730) 2025-10-27 11:59:12 -07:00
routes_create_test.go fs(ggml): fill in arch prefix if necessary (#12646) 2025-10-20 16:42:18 -07:00
routes_debug_test.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
routes_delete_test.go fs(ggml): fill in arch prefix if necessary (#12646) 2025-10-20 16:42:18 -07:00
routes_generate_renderer_test.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
routes_generate_test.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
routes_harmony_streaming_test.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
routes_list_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_test.go engine: add remote proxy (#12307) 2025-09-17 14:40:53 -07:00
sched.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
sched_test.go server: Consolidate embedding truncation in runner (#12730) 2025-10-27 11:59:12 -07:00
sparse_common.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
sparse_windows.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
upload.go server: always print upload/download part info (#8832) 2025-02-04 19:30:49 -08:00