Commit Graph

657 Commits

Author SHA1 Message Date
Roy Han c0b5bf0a36 testing clean up 2024-07-12 11:45:45 -07:00
Roy Han 53e9576f46 testing clean up 2024-07-11 20:20:14 -07:00
Roy Han dbe9527305 clean up 2024-07-11 17:28:55 -07:00
Roy Han 694388db90 set context length 2024-07-10 15:21:46 -07:00
Roy Han 8f6d0242b6 refactoring 2024-07-09 16:19:02 -07:00
Roy Han b686ac144c merge conflicts 2024-07-09 14:00:13 -07:00
royjhan 786848dfd3
Merge branch 'main' into royh-batchembed 2024-07-09 13:48:06 -07:00
Roy Han fb390b8902 embedding type 64 2024-07-09 13:41:48 -07:00
Roy Han bcb63e6e0e touches 2024-07-09 13:37:00 -07:00
Michael Yang 6bbbc50f10
Merge pull request #5440 from ollama/mxyng/messages-templates
update named templates
2024-07-09 09:36:32 -07:00
Michael Yang 9bbddc37a7
Merge pull request #5126 from ollama/mxyng/messages
update message processing
2024-07-09 09:20:44 -07:00
Jeffrey Morgan e4ff73297d
server: fix model reloads when setting `OLLAMA_NUM_PARALLEL` (#5560)
* server: fix unneeded model reloads when setting `OLLAMA_NUM_PARALLEL`

* remove whitespace change

* undo some changes
2024-07-08 22:32:15 -07:00
Roy Han 3342e5f035 merge conflicts 2024-07-08 15:15:09 -07:00
royjhan b7c622dd32
Merge branch 'main' into royh-batchembed 2024-07-08 15:10:52 -07:00
Jeffrey Morgan 0ee87615c7
sched: don't error if paging to disk on Windows and macOS (#5523) 2024-07-06 22:01:52 -04:00
Michael Yang fb6cbc02fb update named templates 2024-07-05 16:29:32 -07:00
Michael Yang ac7a842e55 fix model reloading
ensure runtime model changes (template, system prompt, messages,
options) are captured on model updates without needing to reload the
server
2024-07-05 13:17:25 -07:00
Michael Yang 2c3fe1fd97 comments 2024-07-05 13:17:24 -07:00
Michael Yang 269ed6e6a2 update message processing 2024-07-05 13:16:58 -07:00
Daniel Hiltgen af28b94533
Merge pull request #5469 from dhiltgen/prevent_system_oom
Prevent loading models larger than total memory
2024-07-05 08:22:20 -07:00
Anatoli Babenia 0d16eb310e
fix: use `envconfig.ModelsDir` directly (#4821)
* Co-authored-by: Anatoli Babenia <anatoli@rainforce.org>

Co-authored-by: Maas Lalani <maas@lalani.dev>
2024-07-03 15:36:11 -07:00
Daniel Hiltgen 955f2a4e03 Only set default keep_alive on initial model load
This change fixes the handling of keep_alive so that if a client
request omits the setting, we only set this on initial load.  Once
the model is loaded, if new requests leave this unset, we'll keep
whatever keep_alive was there.
2024-07-03 15:29:56 -07:00
Daniel Hiltgen 3c75113e37 Prevent loading models larger than total memory
Users may not realize the shiny new model they're trying to load
fits on their disk, but can't load into system+GPU memory.  Today
we crash, but with this fix, we'll give them a better error message
before even trying to load it.
2024-07-03 14:47:42 -07:00
Roy Han 6caac01494 clear comments 2024-07-03 14:05:34 -07:00
Roy Han 17de2b4405 Refactoring of legacy and new 2024-07-03 14:02:25 -07:00
Roy Han 922b8f2584 input handling and handler testing 2024-07-03 12:48:54 -07:00
Roy Han a413014aaf refactoring 2024-07-03 11:21:06 -07:00
royjhan a5f23d766e
Merge branch 'main' into royh-batchembed 2024-07-03 11:20:24 -07:00
Roy Han 95e46eeedf move normalize test 2024-07-03 09:45:42 -07:00
Michael Yang 65a5040e09 fix generate template 2024-07-02 16:42:17 -07:00
royjhan d626b99b54
OpenAI: v1/completions compatibility (#5209)
* OpenAI v1 models

* Refactor Writers

* Add Test

Co-Authored-By: Attila Kerekes

* Credit Co-Author

Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>

* Empty List Testing

* Use Namespace for Ownedby

* Update Test

* Add back envconfig

* v1/models docs

* Use ModelName Parser

* Test Names

* Remove Docs

* Clean Up

* Test name

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Add Middleware for Chat and List

* Completions Endpoint

* Testing Cleanup

* Test with Fatal

* Add functionality to chat test

* Rename function

* float types

* type cleanup

* cleaning

* more cleaning

* Extra test cases

* merge conflicts

* merge conflicts

* merge conflicts

* merge conflicts

* cleaning

* cleaning

---------

Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-07-02 16:01:45 -07:00
Michael Yang dddb58a38b
Merge pull request #5051 from ollama/mxyng/capabilities
add model capabilities
2024-07-02 14:26:07 -07:00
Michael Yang 400056e154
Merge pull request #5420 from ollama/mxyng/insecure-path
err on insecure path
2024-07-02 14:03:23 -07:00
royjhan 996bb1b85e
OpenAI: /v1/models and /v1/models/{model} compatibility (#5007)
* OpenAI v1 models

* Refactor Writers

* Add Test

Co-Authored-By: Attila Kerekes

* Credit Co-Author

Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>

* Empty List Testing

* Use Namespace for Ownedby

* Update Test

* Add back envconfig

* v1/models docs

* Use ModelName Parser

* Test Names

* Remove Docs

* Clean Up

* Test name

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Add Middleware for Chat and List

* Testing Cleanup

* Test with Fatal

* Add functionality to chat test

* OpenAI: /v1/models/{model} compatibility (#5028)

* Retrieve Model

* OpenAI Delete Model

* Retrieve Middleware

* Remove Delete from Branch

* Update Test

* Middleware Test File

* Function name

* Cleanup

* Test Update

* Test Update

---------

Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-07-02 11:50:56 -07:00
Roy Han 3d060e0ae9 move normalize 2024-07-02 10:35:02 -07:00
Roy Han 00a4cb26ca use float32 2024-07-02 10:30:29 -07:00
Roy Han 512e0a7bde Clean up 2024-07-01 16:29:54 -07:00
Roy Han 1a0c8b363c Truncation Integration Tests 2024-07-01 16:26:30 -07:00
Michael Yang 88bcd79bb9 err on insecure path 2024-07-01 15:55:59 -07:00
Roy Han aee25acb5b move normalization to go 2024-07-01 14:10:58 -07:00
Roy Han 9c32b6b9ed Truncation 2024-07-01 11:59:44 -07:00
Roy Han 1daac52651 Truncation 2024-07-01 11:55:16 -07:00
Michael Yang da8e2a0447 use kvs to detect embedding models 2024-07-01 10:47:43 -07:00
Michael Yang a30915bde1 add capabilities 2024-07-01 10:47:43 -07:00
Michael Yang 58e3fff311 rename templates to template 2024-07-01 10:40:54 -07:00
Michael Yang 3f0b309ad4 remove ManifestV2 2024-07-01 10:40:54 -07:00
Daniel Hiltgen cff3f44f4a Fix case for NumCtx 2024-07-01 09:43:59 -07:00
Daniel Hiltgen 3518aaef33
Merge pull request #4218 from dhiltgen/auto_parallel
Enable concurrency by default
2024-07-01 08:32:29 -07:00
Roy Han 80c1a3f812 playing around with truncate stuff 2024-06-28 18:17:09 -07:00
Roy Han c111d8bb51 normalization 2024-06-28 17:19:04 -07:00