ollama/server
Bruce MacDonald 883f655dd6 server: model info caching system for improved performance
Implements an in-memory cache for loaded models with file modification
time tracking to ensure cache validity. Models are now cached after
first load and retrieved from cache on subsequent requests if the
underlying manifest file hasn't changed.

Key changes:
- Add ModelCache with get/set methods and modification time validation
- Cache models in GetModel() and check cache before disk load
- Move capabilities calculation to model loading time and store in model
- Update capability access to use cached field instead of runtime calculation
- Add test coverage for cache behavior and model loading

This avoids reloading a model from disk on each request while its
manifest file is unchanged, which improves response times for model access.
2025-06-16 15:16:58 -07:00
internal lint: enable usetesting, disable tenv (#10594) 2025-05-08 11:42:14 -07:00
auth.go fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
cache.go server: model info caching system for improved performance 2025-06-16 15:16:58 -07:00
cache_test.go server: model info caching system for improved performance 2025-06-16 15:16:58 -07:00
create.go remove support for multiple ggufs in a single file (#10722) 2025-05-21 13:55:31 -07:00
create_test.go server: validate local path on safetensor create (#9379) 2025-02-28 16:10:43 -08:00
download.go server: abort download on empty digest 2025-05-27 11:28:48 -07:00
fixblobs.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
fixblobs_test.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
images.go server: model info caching system for improved performance 2025-06-16 15:16:58 -07:00
images_test.go feat: incremental gguf parser (#10822) 2025-06-12 11:04:11 -07:00
layer.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest_test.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
model.go tools: refactor tool call parsing and enable streaming (#10415) 2025-05-23 14:19:31 -07:00
modelpath.go server: add hint to the error message when model path access fails (#10843) 2025-05-24 13:17:04 -07:00
modelpath_test.go lint: enable usetesting, disable tenv (#10594) 2025-05-08 11:42:14 -07:00
prompt.go add thinking support to the api and cli (#10584) 2025-05-28 19:38:52 -07:00
prompt_test.go add thinking support to the api and cli (#10584) 2025-05-28 19:38:52 -07:00
quantization.go server: improve tensor quantization fallback logic (#10806) 2025-05-22 10:48:08 -07:00
quantization_test.go feat: incremental gguf parser (#10822) 2025-06-12 11:04:11 -07:00
routes.go server: model info caching system for improved performance 2025-06-16 15:16:58 -07:00
routes_create_test.go Move quantization to new backend (#10363) 2025-05-06 11:20:48 -07:00
routes_delete_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_generate_test.go add thinking support to the api and cli (#10584) 2025-05-28 19:38:52 -07:00
routes_list_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_test.go fix: stream accumulator exits early (#10593) 2025-05-08 13:17:30 -07:00
sched.go sched: fix runner leak during reloading unload (#10819) 2025-05-22 14:31:36 -07:00
sched_test.go feat: incremental gguf parser (#10822) 2025-06-12 11:04:11 -07:00
sparse_common.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
sparse_windows.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
upload.go server: always print upload/download part info (#8832) 2025-02-04 19:30:49 -08:00