ollama/server
Devon Rifkin d20cd8df80 fix incorrect chat truncation
The dynamically calculated `NumCtx` value wasn't making it all the way
to the chat handler.

This fix also revealed that the minimum of 4 for `NumCtx`, set inside
`server/sched.go`, was accidentally removed in #10364. The minimum is
not propagated to the client code, which matters for embeddings, as
demonstrated in `TestAllMiniLMEmbedTruncate`. This needs further
cleanup; the likely cause is the start and end tokens in the embedding,
so small context sizes need more work there. See the comment in
`server/routes.go` for more information on the temporary hack that was
added to propagate the dynamically calculated `NumCtx` (the -1 guard
there keeps embeddings working if you set `NumCtx` to a small value
such as `1`).
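
As a rough illustration only (not the code from this commit), the sketch
below shows how a minimum applied to a by-value copy of the options can
fail to reach the caller unless the adjusted value is handed back
explicitly. The `options` type, `clampNumCtx`, and `minNumCtx` are
hypothetical names; the only detail taken from the message above is the
minimum of 4.

```go
package main

import "fmt"

// minNumCtx mirrors the minimum of 4 mentioned in the commit message.
// The constant name is hypothetical.
const minNumCtx = 4

// options is a hypothetical stand-in for the request options;
// only NumCtx matters for this sketch.
type options struct {
	NumCtx int
}

// clampNumCtx receives opts by value, raises NumCtx to the minimum if
// needed, and returns the adjusted copy. The caller's original value is
// left untouched, so the caller must use the returned copy.
func clampNumCtx(opts options) options {
	if opts.NumCtx < minNumCtx {
		opts.NumCtx = minNumCtx
	}
	return opts
}

func main() {
	requested := options{NumCtx: 1}
	effective := clampNumCtx(requested)

	// requested still holds 1; only effective carries the clamped value.
	fmt.Println(requested.NumCtx, effective.NumCtx) // prints "1 4"
}
```

If a handler kept using `requested` instead of `effective`, it would
work with a context size different from the one actually in effect,
which is the general kind of mismatch the message describes.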

Fixes: #10441
2025-04-28 16:11:36 -07:00
internal fix superfluous call to WriteHeader 2025-04-25 16:58:49 -07:00
testdata/tools all: fix typos in documentation, code, and comments (#7021) 2024-12-10 12:58:06 -08:00
auth.go fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
create.go explicitly decode maxarraysize 1024 2025-04-25 16:59:01 -07:00
create_test.go server: validate local path on safetensor create (#9379) 2025-02-28 16:10:43 -08:00
download.go server: organize error types (#9465) 2025-03-28 11:50:22 -07:00
fixblobs.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
fixblobs_test.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
images.go explicitly decode maxarraysize 1024 2025-04-25 16:59:01 -07:00
images_test.go api: return model capabilities from the show endpoint (#10066) 2025-04-01 15:21:46 -07:00
layer.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest_test.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
model.go explicitly decode maxarraysize 1024 2025-04-25 16:59:01 -07:00
model_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
modelpath.go server: organize error types (#9465) 2025-03-28 11:50:22 -07:00
modelpath_test.go server: more support for mixed-case model names (#8017) 2024-12-11 15:29:59 -08:00
prompt.go gemma3: Allow multiple image in a single input 2025-03-14 15:38:54 -07:00
prompt_test.go prompt: Don't trim whitespace from prompts 2024-12-09 11:02:55 -08:00
routes.go fix incorrect chat truncation 2025-04-28 16:11:36 -07:00
routes_create_test.go next ollama runner (#7913) 2025-02-13 16:31:21 -08:00
routes_delete_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_generate_test.go fix incorrect chat truncation 2025-04-28 16:11:36 -07:00
routes_list_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_test.go server/internal/client/ollama: hold DiskCache on Registry (#9463) 2025-03-02 20:55:44 -08:00
sched.go fix incorrect chat truncation 2025-04-28 16:11:36 -07:00
sched_test.go fix incorrect chat truncation 2025-04-28 16:11:36 -07:00
sparse_common.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
sparse_windows.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
upload.go server: always print upload/download part info (#8832) 2025-02-04 19:30:49 -08:00