Commit Graph

303 Commits

Author SHA1 Message Date
Michael Yang a250c2cb13 display messages 2024-07-26 13:39:57 -07:00
Michael Yang 3d9de805b7 fix: model save
stop parameter is saved as a slice which is incompatible with modelfile
parsing
2024-07-26 13:23:06 -07:00
Michael Yang 15af558423 include modelfile messages 2024-07-26 11:40:11 -07:00
Daniel Hiltgen 830fdd2715 Better explain multi-gpu behavior 2024-07-23 15:16:38 -07:00
Michael Yang 55cd3ddcca bool 2024-07-22 11:27:21 -07:00
Michael Yang 4f1afd575d host 2024-07-22 11:25:30 -07:00
Daniel Hiltgen cc269ba094 Remove no longer supported max vram var
The OLLAMA_MAX_VRAM env var was a temporary workaround for OOM
scenarios.  With Concurrency this was no longer wired up, and the simplistic
value doesn't map to multi-GPU setups.  Users can still set `num_gpu`
to limit memory usage to avoid OOM if we get our predictions wrong.
2024-07-22 09:08:11 -07:00
Patrick Devine 057d31861e
remove template (#5655) 2024-07-13 20:56:24 -07:00
Patrick Devine 23ebbaa46e Revert "remove template from tests"
This reverts commit 9ac0a7a50b.
2024-07-12 15:47:17 -07:00
Patrick Devine 9ac0a7a50b remove template from tests 2024-07-12 15:41:31 -07:00
royjhan 5f034f5b63
Include Show Info in Interactive (#5342) 2024-06-28 13:15:52 -07:00
royjhan b910fa9010
Ollama Show: Check for Projector Type (#5307)
* Check exists projtype

* Maintain Ordering
2024-06-28 11:30:16 -07:00
Michael Yang 123a722a6f
zip: prevent extracting files into parent dirs (#5314) 2024-06-26 21:38:21 -07:00
Blake Mizerany 2aa91a937b
cmd: defer stating model info until necessary (#5248)
This commit changes the 'ollama run' command to defer fetching model
information until it really needs it. That is, when in interactive mode.

It also removes one such case where the model information is fetched in
duplicate, just before calling generateInteractive and then again, first
thing, in generateInteractive.

This positively impacts the performance of the command:

    ; time ./before run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./before run llama3 'hi'  0.02s user 0.01s system 2% cpu 1.168 total
    ; time ./before run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./before run llama3 'hi'  0.02s user 0.01s system 2% cpu 1.220 total
    ; time ./before run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./before run llama3 'hi'  0.02s user 0.01s system 2% cpu 1.217 total
    ; time ./after run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./after run llama3 'hi'  0.02s user 0.01s system 4% cpu 0.652 total
    ; time ./after run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./after run llama3 'hi'  0.01s user 0.01s system 5% cpu 0.498 total
    ; time ./after run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with or would you like to chat?

    ./after run llama3 'hi'  0.01s user 0.01s system 3% cpu 0.479 total
    ; time ./after run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./after run llama3 'hi'  0.02s user 0.01s system 5% cpu 0.507 total
    ; time ./after run llama3 'hi'
    Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?

    ./after run llama3 'hi'  0.02s user 0.01s system 5% cpu 0.507 total
2024-06-24 20:14:03 -07:00
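The deferral described in the commit above can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: `modelInfo`, `lazyInfo`, and `run` are hypothetical names, and the fetch function stands in for whatever call retrieves model information. The idea is that a one-shot prompt never triggers the lookup, and interactive mode triggers it at most once.

```go
package main

import (
	"fmt"
	"sync"
)

// modelInfo is a hypothetical stand-in for the data returned by a model lookup.
type modelInfo struct {
	Name string
}

// lazyInfo defers an expensive lookup until Get is first called,
// and caches the result so repeated calls do not fetch again.
type lazyInfo struct {
	once  sync.Once
	fetch func() (modelInfo, error)
	info  modelInfo
	err   error
}

// Get runs the fetch function on first use only.
func (l *lazyInfo) Get() (modelInfo, error) {
	l.once.Do(func() { l.info, l.err = l.fetch() })
	return l.info, l.err
}

// run is a hypothetical command body: only interactive mode needs model info.
func run(interactive bool, l *lazyInfo) error {
	if !interactive {
		return nil // one-shot prompt: the lookup is never performed
	}
	info, err := l.Get()
	if err != nil {
		return err
	}
	fmt.Println("interactive session for", info.Name)
	return nil
}

func main() {
	l := &lazyInfo{fetch: func() (modelInfo, error) {
		fmt.Println("fetching model info")
		return modelInfo{Name: "llama3"}, nil
	}}
	_ = run(false, l) // no fetch
	_ = run(true, l)  // fetches once
	_ = run(true, l)  // reuses the cached result
}
```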
royjhan fedf71635e
Extend api/show and ollama show to return more model info (#4881)
* API Show Extended

* Initial Draft of Information

Co-Authored-By: Patrick Devine <pdevine@sonic.net>

* Clean Up

* Descriptive arg error messages and other fixes

* Second Draft of Show with Projectors Included

* Remove Chat Template

* Touches

* Prevent wrapping from files

* Verbose functionality

* Docs

* Address Feedback

* Lint

* Resolve Conflicts

* Function Name

* Tests for api/show model info

* Show Test File

* Add Projector Test

* Clean routes

* Projector Check

* Move Show Test

* Touches

* Doc update

---------

Co-authored-by: Patrick Devine <pdevine@sonic.net>
2024-06-19 14:19:02 -07:00
Patrick Devine c69bc19e46
move OLLAMA_HOST to envconfig (#5009) 2024-06-12 18:48:16 -04:00
Michael Yang 201d853fdf nolintlint 2024-06-04 11:13:30 -07:00
Michael Yang e40145a39d lint 2024-06-04 11:13:30 -07:00
Michael Yang 8ffb51749f nolintlint 2024-06-04 11:13:30 -07:00
Michael Yang 04f3c12bb7 replace x/exp/slices with slices 2024-06-04 11:13:30 -07:00
Josh Yan 914f68f021 replaced duplicate call with variable 2024-05-30 10:38:07 -07:00
Josh Yan bd1d119ba9 fixed japanese characters deleted at end of line 2024-05-30 10:24:21 -07:00
Lei Jitang a03be18189
Fix OLLAMA_LLM_LIBRARY with wrong map name and add more env vars to help message (#4663)
* envconfig/config.go: Fix wrong description of OLLAMA_LLM_LIBRARY

Signed-off-by: Lei Jitang <leijitang@outlook.com>

* serve: Add more env to help message of ollama serve

Add more environment variables to `ollama serve --help`
to let users know what can be configured.

Signed-off-by: Lei Jitang <leijitang@outlook.com>

---------

Signed-off-by: Lei Jitang <leijitang@outlook.com>
2024-05-30 09:36:51 -07:00
Patrick Devine 4cc3be3035
Move envconfig and consolidate env vars (#4608) 2024-05-24 14:57:15 -07:00
Josh 9f18b88a06
Merge pull request #4566 from ollama/jyan/shortcuts
add Ctrl + W shortcut
2024-05-21 22:49:36 -07:00
Josh Yan 353f83a9c7 add Ctrl + W shortcut 2024-05-21 16:55:09 -07:00
Patrick Devine d355d2020f add fixes for llama 2024-05-20 16:13:57 -07:00
Patrick Devine ccdf0b2a44
Move the parser back + handle utf16 files (#4533) 2024-05-20 11:26:45 -07:00
Patrick Devine 105186aa17
add OLLAMA_NOHISTORY to turn off history in interactive mode (#4508) 2024-05-18 11:51:57 -07:00
Josh Yan 3d90156e99 removed comment 2024-05-16 14:12:03 -07:00
Josh Yan 26bfc1c443 go fmt'd cmd.go 2024-05-15 17:26:39 -07:00
Josh Yan 799aa9883c go fmt'd cmd.go 2024-05-15 17:24:17 -07:00
Josh Yan c9e584fb90 updated double-width display 2024-05-15 16:45:24 -07:00
Josh Yan 17b1e81ca1 fixed width and word count for double spacing 2024-05-15 16:29:33 -07:00
Patrick Devine c344da4c5a
fix keepalive for non-interactive mode (#4438) 2024-05-14 15:17:04 -07:00
Patrick Devine a4b8d1f89a
re-add system context (#4435) 2024-05-14 11:38:20 -07:00
Patrick Devine 7ca71a6b0f
don't abort when an invalid model name is used in /save (#4416) 2024-05-13 18:48:28 -07:00
Patrick Devine 6845988807
Ollama `ps` command for showing currently loaded models (#4327) 2024-05-13 17:17:36 -07:00
Josh Yan f8464785a6 removed inconsistencies 2024-05-13 14:50:52 -07:00
Josh Yan 91a090a485 removed inconsistent punctuation 2024-05-13 14:08:22 -07:00
todashuta 8080fbce35
fix `ollama create`'s usage string (#4362) 2024-05-11 14:47:49 -07:00
Jeffrey Morgan 6602e793c0
Use `--quantize` flag and `quantize` api parameter (#4321)
* rename `--quantization` to `--quantize`

* backwards

* Update api/types.go

Co-authored-by: Michael Yang <mxyng@pm.me>

---------

Co-authored-by: Michael Yang <mxyng@pm.me>
2024-05-10 13:06:13 -07:00
Tobias Gårdhus 06ac829e70
Fix help string for stop parameter (#2307) 2024-05-07 16:48:35 -07:00
Jeffrey Morgan 39d9d22ca3
close server on receiving signal (#4213) 2024-05-06 16:01:37 -07:00
Michael Yang b7a87a22b6
Merge pull request #4059 from ollama/mxyng/parser-2
rename parser to model/file
2024-05-03 13:01:22 -07:00
Michael Yang e9ae607ece
Merge pull request #3892 from ollama/mxyng/parser
refactor modelfile parser
2024-05-02 17:04:47 -07:00
Bryce Reitano bf4fc25f7b
Add a /clear command (#3947)
* Add a /clear command

* change help messages

---------

Co-authored-by: Patrick Devine <patrick@infrahq.com>
2024-05-01 17:44:36 -04:00
Michael Yang 45b6a12e45 server: target invalid 2024-05-01 12:40:45 -07:00
Michael Yang 119589fcb3 rename parser to model/file 2024-05-01 09:53:50 -07:00
Michael Yang 5ea844964e cmd: import regexp 2024-05-01 09:53:45 -07:00
Michael Yang 176ad3aa6e parser: add commands format 2024-05-01 09:52:54 -07:00
Bruce MacDonald 0a7fdbe533
prompt to display and add local ollama keys to account (#3717)
- return descriptive error messages when unauthorized to create blob or push a model
- display the local public key associated with the request that was denied
2024-04-30 11:02:08 -07:00
Patrick Devine 9009bedf13
better checking for OLLAMA_HOST variable (#3661) 2024-04-29 19:14:07 -04:00
Michael Yang 41e03ede95 check file type before zip 2024-04-26 14:18:07 -07:00
Michael Yang ac0801eced only replace if it matches command 2024-04-24 14:49:26 -07:00
Michael Yang ad66e5b060 split temp zip files 2024-04-24 14:18:01 -07:00
Bruce MacDonald 658e60cf73 Revert "stop running model on interactive exit"
This reverts commit fad00a85e5.
2024-04-22 17:23:11 -07:00
Bruce MacDonald fad00a85e5 stop running model on interactive exit 2024-04-22 16:22:14 -07:00
Blake Mizerany 949d7832cf
Revert "cmd: provide feedback if OLLAMA_MODELS is set on non-serve command (#3470)" (#3662)
This reverts commit 7d05a6ee8f.

This proved to be more painful than useful.

See: https://github.com/ollama/ollama/issues/3624
2024-04-15 16:58:00 -07:00
Patrick Devine 9f8691c6c8
Add llama2 / torch models for `ollama create` (#3607) 2024-04-15 11:26:42 -07:00
Michael Yang 9502e5661f cgo quantize 2024-04-08 15:31:08 -07:00
Blake Mizerany 7d05a6ee8f
cmd: provide feedback if OLLAMA_MODELS is set on non-serve command (#3470)
This also moves the checkServerHeartbeat call out of the "RunE" Cobra
stuff (that's the only word I have for that) to on-site where it's after
the check for OLLAMA_MODELS, which allows the helpful error message to
be printed before the server heartbeat check. This also arguably makes
the code more readable without the magic/superfluous "pre" function
caller.
2024-04-02 22:11:13 -07:00
Pier Francesco Contino 531324a9be
feat: add OLLAMA_DEBUG in ollama server help message (#3461)
Co-authored-by: Pier Francesco Contino <pfcontino@gmail.com>
2024-04-02 18:20:03 -07:00
Patrick Devine 5a5efee46b
Add gemma safetensors conversion (#3250)
Co-authored-by: Michael Yang <mxyng@pm.me>
2024-03-28 18:54:01 -07:00
Patrick Devine 1b272d5bcd
change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347) 2024-03-26 13:04:17 -07:00
Daniel Hiltgen da20786e3e
Merge pull request #3068 from dhiltgen/win_pipe
Use stdin for term discovery on windows
2024-03-14 11:55:19 -07:00
Jeffrey Morgan 672ffe9b7d
add `OLLAMA_KEEP_ALIVE` to environment variable docs for `ollama serve` (#3127) 2024-03-13 14:35:33 -07:00
Daniel Hiltgen c1a81c6fe3 Use stdin for term discovery on windows
When you feed input to the cmd via a pipe it no longer reports a warning
2024-03-13 10:37:31 -07:00
Blake Mizerany 2ada81e068
cmd: tighten up env var usage sections (#2962)
Also, document OLLAMA_HOST client semantics per command that honors it.
This looks nicer than having a general purpose environment variable
section in the root usage which was showing up after the "additional help
topics" section output by Cobra's default template.

It was decided this was easier to work with than using a custom template
for Cobra right now.
2024-03-07 13:57:07 -08:00
Patrick Devine 2c017ca441
Convert Safetensors to an Ollama model (#2824) 2024-03-06 21:01:51 -08:00
Blake Mizerany 0ded7fdc4b
cmd: document environment variables for serve command
Updates #2944
2024-03-06 13:48:46 -08:00
Michael Yang fd10a2ad4b remove format/openssh.go
this is unnecessary now that x/crypto/ssh.MarshalPrivateKey has been
added
2024-02-23 16:52:23 -08:00
lulz ce0c95d097
[fix] /bye and /exit are now treated as prefixes (#2381)
* [fix] /bye and /exit are now treated as prefixes
instead of being treated as entire lines which doesn't align with the way the rest of the commands are treated

* Update cmd/interactive.go

Fixing whitespace

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-02-19 21:56:49 -05:00
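A minimal sketch of the prefix matching that the commit above describes. The helper name `isExitCommand` is an assumption for illustration and is not taken from cmd/interactive.go. With prefix matching, input such as `/exit now` ends the session, whereas a whole-line comparison would not recognize it.

```go
package main

import (
	"fmt"
	"strings"
)

// isExitCommand is a hypothetical helper: it reports whether the input
// line starts with /bye or /exit, rather than requiring an exact match.
func isExitCommand(line string) bool {
	line = strings.TrimSpace(line)
	return strings.HasPrefix(line, "/bye") || strings.HasPrefix(line, "/exit")
}

func main() {
	for _, in := range []string{"/bye", "/exit now", "hello /bye"} {
		fmt.Printf("%q -> %v\n", in, isExitCommand(in))
	}
}
```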
Bruce MacDonald 88622847c6
fix: chat system prompting overrides (#2542) 2024-02-16 14:42:43 -05:00
Daniel Hiltgen a468ae0459
Merge pull request #2499 from ollama/windows-preview
Windows Preview
2024-02-15 16:06:32 -08:00
Daniel Hiltgen 4a10e7a7fa Harden the OLLAMA_HOST lookup for quotes 2024-02-15 13:46:56 -08:00
Daniel Hiltgen 823a520266 Fix lint error on ignored error for win console 2024-02-15 05:56:45 +00:00
vinjn 66ef308abd Import "containerd/console" lib to support colorful output in Windows terminal 2024-02-15 05:56:45 +00:00
Daniel Hiltgen 29e90cc13b Implement new Go based Desktop app
This focuses on Windows first, but could be used for Mac
and possibly Linux in the future.
2024-02-15 05:56:45 +00:00
Jeffrey Morgan 1f9078d6ae
Check image filetype in api handlers (#2467) 2024-02-12 11:16:20 -08:00
Jeffrey Morgan 09a6f76f4c fix error on `ollama run` with a non-existent model 2024-02-01 23:11:52 -08:00
Jeffrey Morgan e135167484
Add multimodal support to `ollama run` in noninteractive mode (#2317) 2024-02-01 21:33:06 -08:00
Jeffrey Morgan 38296ab352
clear previous images when submitting an image to `ollama run` (#2316) 2024-02-01 21:30:26 -08:00
Jeffrey Morgan 7913104527
Improvements to `ollama run` for multimodal models (#2300) 2024-02-01 17:09:51 -08:00
Patrick Devine 7c40a67841
Save and load sessions (#2063) 2024-01-25 12:12:36 -08:00
Michael Yang b6c0ef1e70
Merge pull request #1961 from jmorganca/mxyng/rm-double-newline
remove double newlines in /set parameter
2024-01-12 15:18:19 -08:00
Patrick Devine 565f8a3c44
Convert the REPL to use /api/chat for interactive responses (#1936) 2024-01-12 12:05:52 -08:00
Michael Yang 5121b7ac9c remove double newlines in /set parameter 2024-01-12 11:21:15 -08:00
Michael Yang 2bb2bdd5d4 fix lint 2024-01-09 09:36:58 -08:00
Michael Yang 62023177f6
Merge pull request #1614 from jmorganca/mxyng/fix-set-template
fix: set template without triple quotes
2024-01-09 09:36:24 -08:00
Bruce MacDonald 7e8f7c8358
remove ggml automatic re-pull (#1856) 2024-01-08 14:41:01 -05:00
Daniel Hiltgen e0d05b0f1e Accept windows paths for image processing
This enhances our regex to support windows style paths.  The regex will
match invalid path specifications, but we'll still validate file
existence and filter out mismatches
2024-01-06 10:50:27 -08:00
Michael Yang 5580ae2472 fix: set template without triple quotes 2024-01-05 15:51:33 -08:00
Bruce MacDonald 3a9f447141
only pull gguf model if already exists (#1817) 2024-01-05 18:50:00 -05:00
Patrick Devine 9c2941e61b
switch api for ShowRequest to use the name field (#1816) 2024-01-05 15:06:43 -08:00
Bruce MacDonald 4f4980b66b
simplify ggml update logic (#1814)
- additional information is now available in show response, use this to pull gguf before running
- make gguf updates cancellable
2024-01-05 15:22:32 -05:00
Patrick Devine 22e93efa41 add show info command and fix the modelfile 2024-01-05 12:20:05 -08:00
Patrick Devine 2909dce894 split up interactive generation 2024-01-05 12:20:05 -08:00
Patrick Devine d0409f772f
keyboard shortcut help (#1764) 2024-01-02 18:04:12 -08:00
Daniel Hiltgen 96fb441abd
Merge pull request #1146 from dhiltgen/ext_server_cgo
Add cgo implementation for llama.cpp
2023-12-22 08:16:31 -08:00