Patrick Devine
1c70a00f71
adjust image sizes
2024-08-27 11:15:25 -07:00
Patrick Devine
ac80010db8
update the import docs ( #6104 )
2024-08-26 19:57:26 -07:00
Michael Yang
bb362caf88
update faq
2024-08-23 13:37:21 -07:00
Daniel Hiltgen
f9e31da946
Review comments
2024-08-19 10:36:15 -07:00
Daniel Hiltgen
88bb9e3328
Adjust layout to bin+lib/ollama
2024-08-19 09:38:53 -07:00
Bruce MacDonald
eda8a32a09
update chatml template format to latest in docs ( #6344 )
2024-08-13 16:39:18 -07:00
Pamela Fox
1f32276178
Update openai.md to remove extra checkbox ( #6345 )
2024-08-13 13:36:05 -07:00
Michael Yang
bd5e432630
update import.md
2024-08-12 15:13:29 -07:00
royjhan
5b3a21b578
add metrics to docs ( #6079 )
2024-08-07 14:43:44 -07:00
Kyle Kelley
ad0c19dde4
Use llama3.1 in tools example ( #5985 )
...
* Use llama3.1 in tools example
* Update api.md
2024-08-07 17:20:50 -04:00
Michael Yang
39f2bc6bfc
Merge pull request #6167 from ollama/mxyng/line-feed
...
line feed
2024-08-05 00:06:28 -07:00
frob
b73b0940ef
Disable paging for journalctl ( #6154 )
...
Users who run `journalctl` to collect logs for issue reports sometimes don't realize that paging causes information to be missed.
2024-08-05 00:10:53 -04:00
Michael Yang
6a07344786
line feed
2024-08-04 17:25:41 -07:00
royjhan
4addf6b587
Update OpenAI Compatibility Docs with /v1/completions ( #5311 )
...
* Update docs
* token bug corrected
* Update docs/openai.md
* Update docs/openai.md
* add suffix
* merge conflicts
* merge conflicts
2024-08-02 13:16:23 -07:00
royjhan
85c7f11170
Update docs ( #5310 )
2024-08-02 13:05:57 -07:00
Kim Hallberg
ce1fb4447e
Fix models/{model} URL ( #6132 )
2024-08-01 16:31:47 -07:00
royjhan
558a54b098
Update OpenAI Compatibility Docs with /v1/embeddings ( #5470 )
...
* docs without usage
* no usage
* rm metric note
2024-08-01 16:00:29 -07:00
royjhan
ed52833bb1
Add to docs ( #5309 )
2024-08-01 15:58:13 -07:00
royjhan
f561eecfb8
Update OpenAI Compatibility Docs with /v1/models ( #5151 )
...
* OpenAI Docs
* Update docs/openai.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Remove newline
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-08-01 15:48:44 -07:00
Daniel Hiltgen
1a83581a8e
Merge pull request #5895 from dhiltgen/sched_faq
...
Better explain multi-gpu behavior
2024-07-29 14:25:41 -07:00
Daniel Hiltgen
161e12cecf
Merge pull request #5932 from dhiltgen/win_font
...
Explain font problems on windows 10
2024-07-29 13:40:24 -07:00
Veit Heller
6f26e9322f
Fix typo in image docs ( #6041 )
2024-07-29 08:50:53 -07:00
Jeffrey Morgan
0e4d653687
update to `llama3.1` elsewhere in repo ( #6032 )
2024-07-28 19:56:02 -07:00
Tibor Schmidt
f3d7a481b7
feat: add support for min_p ( resolve #1142 ) ( #1825 )
2024-07-27 14:37:40 -07:00
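As a sketch of what the `min_p` option added above does (an illustration of the general min-p sampling idea, not Ollama's actual implementation): tokens whose probability falls below `min_p` times the most likely token's probability are filtered out before sampling, and the survivors are renormalized.

```python
def min_p_filter(probs, min_p=0.05):
    """Keep tokens whose probability is at least min_p times the top probability."""
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    # Renormalize the surviving tokens so they form a distribution again.
    return {tok: p / total for tok, p in kept.items()}

# Toy distribution: with min_p=0.1 the threshold is 0.1 * 0.6 = 0.06,
# so "d" (0.04) is dropped and the rest are renormalized.
probs = {"a": 0.6, "b": 0.3, "c": 0.06, "d": 0.04}
filtered = min_p_filter(probs, min_p=0.1)
```

Unlike a fixed top-p cutoff, the threshold scales with the model's confidence in its top choice.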
Jeffrey Morgan
f5e3939220
Update api.md ( #5968 )
2024-07-25 23:10:18 -04:00
Jeffrey Morgan
ae27d9dcfd
Update openai.md
2024-07-25 20:27:33 -04:00
Michael Yang
37096790a7
Merge pull request #5552 from ollama/mxyng/messages-docs
...
docs
2024-07-25 16:26:19 -07:00
Michael Yang
997c903884
Update docs/template.md
...
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-07-25 16:23:40 -07:00
Jeffrey Morgan
455e61170d
Update openai.md
2024-07-25 18:34:47 -04:00
royjhan
4de1370a9d
openai tools doc ( #5617 )
2024-07-25 18:34:06 -04:00
Daniel Hiltgen
6c2129d5d0
Explain font problems on windows 10
2024-07-24 15:22:00 -07:00
Daniel Hiltgen
830fdd2715
Better explain multi-gpu behavior
2024-07-23 15:16:38 -07:00
Michael Yang
9b60a038e5
update api.md
2024-07-22 13:49:51 -07:00
Michael Yang
83a0cb8d88
docs
2024-07-22 13:38:09 -07:00
royjhan
c0648233f2
api embed docs ( #5282 )
2024-07-22 13:37:08 -07:00
Daniel Hiltgen
283948c83b
Adjust windows ROCm discovery
...
The v5 hip library returns unsupported GPUs which won't enumerate at
inference time in the runner, so this makes sure we align discovery. The
gfx906 cards are no longer supported, so we shouldn't compile with that
GPU type as it won't enumerate at runtime.
2024-07-20 15:17:50 -07:00
royjhan
0d41623b52
OpenAI: Add Suffix to `v1/completions` ( #5611 )
...
* add suffix
* remove todo
* remove TODO
* add to test
* rm outdated prompt tokens info md
* fix test
* fix test
2024-07-16 20:50:14 -07:00
Daniel Hiltgen
1f50356e8e
Bump ROCm on windows to 6.1.2
...
This also adjusts our algorithm to favor our bundled ROCm.
I've confirmed VRAM reporting still doesn't work properly so we
can't yet enable concurrency by default.
2024-07-10 11:01:22 -07:00
Jeffrey Morgan
8f8e736b13
update llama.cpp submodule to `d7fd29f` ( #5475 )
2024-07-05 13:25:58 -04:00
Daniel Hiltgen
52abc8acb7
Document older win10 terminal problems
...
We haven't found a workaround, so for now we recommend updating.
2024-07-03 17:32:14 -07:00
Daniel Hiltgen
ef757da2c9
Better nvidia GPU discovery logging
...
Refine the way we log GPU discovery to improve the non-debug
output, and report more actionable log messages when possible
to help users troubleshoot on their own.
2024-07-03 10:50:40 -07:00
Daniel Hiltgen
d2f19024d0
Merge pull request #5442 from dhiltgen/concurrency_docs
...
Add windows radeon concurrency note
2024-07-02 12:47:47 -07:00
Daniel Hiltgen
69c04eecc4
Add windows radeon concurrency note
2024-07-02 12:46:14 -07:00
royjhan
996bb1b85e
OpenAI: /v1/models and /v1/models/{model} compatibility ( #5007 )
...
* OpenAI v1 models
* Refactor Writers
* Add Test
Co-Authored-By: Attila Kerekes
* Credit Co-Author
Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>
* Empty List Testing
* Use Namespace for Ownedby
* Update Test
* Add back envconfig
* v1/models docs
* Use ModelName Parser
* Test Names
* Remove Docs
* Clean Up
* Test name
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Add Middleware for Chat and List
* Testing Cleanup
* Test with Fatal
* Add functionality to chat test
* OpenAI: /v1/models/{model} compatibility (#5028 )
* Retrieve Model
* OpenAI Delete Model
* Retrieve Middleware
* Remove Delete from Branch
* Update Test
* Middleware Test File
* Function name
* Cleanup
* Test Update
* Test Update
---------
Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-07-02 11:50:56 -07:00
Daniel Hiltgen
dfded7e075
Merge pull request #5364 from dhiltgen/concurrency_docs
...
Document concurrent behavior and settings
2024-07-01 09:49:48 -07:00
Eduard
27402cb7a2
Update gpu.md ( #5382 )
...
Runs fine on an NVIDIA GeForce GTX 1050 Ti
2024-06-30 21:48:51 -04:00
Jeffrey Morgan
c1218199cf
Update api.md
2024-06-29 16:22:49 -07:00
Daniel Hiltgen
aae56abb7c
Document concurrent behavior and settings
2024-06-28 13:15:57 -07:00
royjhan
6d4219083c
Update docs ( #5312 )
2024-06-28 09:58:14 -07:00
royjhan
fedf71635e
Extend api/show and ollama show to return more model info ( #4881 )
...
* API Show Extended
* Initial Draft of Information
Co-Authored-By: Patrick Devine <pdevine@sonic.net>
* Clean Up
* Descriptive arg error messages and other fixes
* Second Draft of Show with Projectors Included
* Remove Chat Template
* Touches
* Prevent wrapping from files
* Verbose functionality
* Docs
* Address Feedback
* Lint
* Resolve Conflicts
* Function Name
* Tests for api/show model info
* Show Test File
* Add Projector Test
* Clean routes
* Projector Check
* Move Show Test
* Touches
* Doc update
---------
Co-authored-by: Patrick Devine <pdevine@sonic.net>
2024-06-19 14:19:02 -07:00
Daniel Hiltgen
9d8a4988e8
Implement log rotation for tray app
2024-06-19 12:53:34 -07:00
Jeffrey Morgan
176d0f7075
Update import.md
2024-06-17 19:44:14 -04:00
Jeffrey Morgan
c7b77004e3
docs: add missing powershell package to windows development instructions ( #5075 )
...
* docs: add missing instruction for powershell build
The powershell script for building Ollama on Windows now requires the `ThreadJob` module. Add this to the instructions and dependency list.
* Update development.md
2024-06-15 23:08:09 -04:00
Jeffrey Morgan
6b800aa7b7
openai: do not set temperature to 0 when setting seed ( #5045 )
2024-06-14 13:43:56 -07:00
Patrick Devine
4dc7fb9525
update 40xx gpu compat matrix ( #5036 )
2024-06-13 17:10:33 -07:00
Jeffrey Morgan
ead259d877
llm: fix seed value not being applied to requests ( #4986 )
2024-06-11 14:24:41 -07:00
Michael Yang
5bc029c529
Merge pull request #4921 from ollama/mxyng/import-md
...
update import.md
2024-06-10 11:41:09 -07:00
Napuh
896495de7b
Add instructions to easily install specific versions on faq.md ( #4084 )
...
* Added instructions to easily install specific versions on faq.md
* Small typo
* Moved instructions on how to install specific version to linux.md
* Update docs/linux.md
* Update docs/linux.md
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-06-09 10:49:03 -07:00
Jeffrey Morgan
943172cbf4
Update api.md
2024-06-08 23:04:32 -07:00
Michael Yang
b9ce7bf75e
update import.md
2024-06-07 16:45:15 -07:00
royjhan
28c7813ac4
API PS Documentation ( #4822 )
...
* API PS Documentation
2024-06-05 11:06:53 -07:00
Shubham
60323e0805
add embed model command and fix question invoke ( #4766 )
...
* add embed model command and fix question invoke
* Update docs/tutorials/langchainpy.md
Co-authored-by: Kim Hallberg <hallberg.kim@gmail.com>
* Update docs/tutorials/langchainpy.md
---------
Co-authored-by: Kim Hallberg <hallberg.kim@gmail.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-06-03 22:20:48 -07:00
Daniel Hiltgen
0fc0cfc6d2
Merge pull request #4594 from dhiltgen/doc_container_workarounds
...
Add isolated gpu test to troubleshooting
2024-05-30 13:10:54 -07:00
Daniel Hiltgen
1b2d156094
Tidy up developer guide a little
2024-05-23 15:14:05 -07:00
Daniel Hiltgen
f77713bf1f
Add isolated gpu test to troubleshooting
2024-05-23 09:33:25 -07:00
Patrick Devine
3bade04e10
doc updates for the faq/troubleshooting ( #4565 )
2024-05-21 15:30:09 -07:00
alwqx
8800c8a59b
chore: fix typo in docs ( #4536 )
2024-05-20 14:19:03 -07:00
Patrick Devine
f1548ef62d
update the FAQ to be more clear about windows env variables ( #4415 )
2024-05-13 18:01:13 -07:00
睡觉型学渣
9c76b30d72
Correct typos. ( #4387 )
...
* Correct typos.
* Correct typos.
2024-05-12 18:21:11 -07:00
Daniel Hiltgen
8cc0ee2efe
Doc container usage and workaround for nvidia errors
2024-05-09 09:26:45 -07:00
Jeffrey Morgan
d5eec16d23
use model defaults for `num_gqa`, `rope_frequency_base` and `rope_frequency_scale` ( #1983 )
2024-05-09 09:06:13 -07:00
Carlos Gamez
daa1a032f7
Update langchainjs.md ( #2027 )
...
Updated sample code as per warning notification from the package maintainers
2024-05-08 20:21:03 -07:00
boessu
5d3f7fff26
Update langchainpy.md ( #4236 )
...
Fix the pip code.
2024-05-07 16:36:34 -07:00
CrispStrobe
7c5330413b
note on naming restrictions ( #2625 )
...
* note on naming restrictions
otherwise push would fail with the cryptic:
retrieving manifest
Error: file does not exist
==> maybe change that in code too
* Update docs/import.md
---------
Co-authored-by: C-4-5-3 <154636388+C-4-5-3@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-05-06 16:03:21 -07:00
Jeffrey Chen
d091fe3c21
Windows automatically recognizes username ( #3214 )
2024-05-06 15:03:14 -07:00
Mohamed A. Fouad
ee02f548c8
Update linux.md ( #3847 )
...
Add `-e` to the log viewing command to show the end of the ollama logs
2024-05-06 15:02:25 -07:00
Darinka
3ecae420ac
Update api.md ( #3945 )
...
* Update api.md
Changed the calculation of tps (tokens/s) in the documentation
* Update docs/api.md
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-05-06 14:39:58 -07:00
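The tps calculation corrected in that commit can be sketched as follows (a minimal illustration, assuming the `eval_count` and `eval_duration` response fields, where durations are reported in nanoseconds):

```python
def tokens_per_second(eval_count, eval_duration_ns):
    """Tokens per second from an API response: count / duration, scaled from ns."""
    return eval_count / eval_duration_ns * 1e9

# Hypothetical response fragment for illustration.
resp = {"eval_count": 282, "eval_duration": 4535599000}
tps = tokens_per_second(resp["eval_count"], resp["eval_duration"])
```

The `1e9` factor is the easy thing to get wrong: dividing a token count by a nanosecond duration without it yields a uselessly tiny number.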
Adrien Brault
aa93423fbf
docs: pbcopy on mac ( #3129 )
2024-05-06 13:47:00 -07:00
Hyden Liu
fb8ddc564e
chore: delete `HEAD` ( #4194 )
2024-05-06 10:32:30 -07:00
Daniel Hiltgen
20f6c06569
Make maximum pending request configurable
...
This also bumps the default up to 50 queued requests
instead of 10.
2024-05-04 21:00:52 -07:00
Daniel Hiltgen
e006480e49
Explain the 2 different windows download options
2024-05-04 12:50:05 -07:00
Dr Nic Williams
e8aaea030e
Update 'llama2' -> 'llama3' in most places ( #4116 )
...
* Update 'llama2' -> 'llama3' in most places
---------
Co-authored-by: Patrick Devine <patrick@infrahq.com>
2024-05-03 15:25:04 -04:00
Michael Yang
94c369095f
fix line ending
...
replace CRLF with LF
2024-05-02 14:53:13 -07:00
alwqx
68755f1f5e
chore: fix typo in docs/development.md ( #4073 )
2024-05-01 15:39:11 -04:00
Christian Frantzen
5950c176ca
Update langchainpy.md ( #4037 )
...
Updated the code a bit
2024-04-29 23:19:06 -04:00
Quinten van Buul
2a80f55e2a
Update windows.md ( #3855 )
...
Fixed a typo
2024-04-26 16:04:15 -04:00
Patrick Devine
74d2a9ef9a
add OLLAMA_KEEP_ALIVE env variable to FAQ ( #3865 )
2024-04-23 21:06:51 -07:00
Sri Siddhaarth
e6f9bfc0e8
Update api.md ( #3705 )
2024-04-20 15:17:03 -04:00
Jeremy
85bdf14b56
update jetson tutorial
2024-04-17 16:17:42 -04:00
Carlos Gamez
a27e419b47
Update langchainjs.md ( #2030 )
...
Changed `ollama.call()` to `ollama.invoke()`, as `call()` is deprecated per the langchain documentation
2024-04-15 18:37:30 -04:00
Jeffrey Morgan
e54a3c7fcd
Update modelfile.md
...
Remove Modelfile parameters that are decided at runtime
2024-04-15 15:35:44 -04:00
Blake Mizerany
1524f323a3
Revert "build.go: introduce a friendlier way to build Ollama ( #3548 )" ( #3564 )
2024-04-09 15:57:45 -07:00
Blake Mizerany
fccf3eecaa
build.go: introduce a friendlier way to build Ollama ( #3548 )
...
This commit introduces a friendlier way to build Ollama dependencies
and the binary without abusing `go generate`, removing the
unnecessary extra steps it brings with it.
This script also provides nicer feedback to the user about what is
happening during the build process.
At the end, it prints a helpful message to the user about what to do
next (e.g. run the new local Ollama).
2024-04-09 14:18:47 -07:00
Thomas Vitale
cb03fc9571
Docs: Remove wrong parameter for Chat Completion ( #3515 )
...
Fixes gh-3514
Signed-off-by: Thomas Vitale <ThomasVitale@users.noreply.github.com>
2024-04-06 09:08:35 -07:00
Daniel Hiltgen
0a74cb31d5
Safeguard for noexec
...
We may have users who run into problems with our current
payload model, so this gives us an escape valve.
2024-04-01 16:48:33 -07:00
Jeffrey Morgan
856b8ec131
remove need for `$VSINSTALLDIR` since build will fail if `ninja` cannot be found ( #3350 )
2024-03-26 16:23:16 -04:00
Patrick Devine
1b272d5bcd
change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` ( #3347 )
2024-03-26 13:04:17 -07:00
Jeffrey Morgan
f38b705dc7
Fix ROCm link in `development.md`
2024-03-25 16:32:44 -04:00
Blake Mizerany
22921a3969
doc: specify ADAPTER is optional ( #3333 )
2024-03-25 09:43:19 -07:00
Daniel Hiltgen
d8fdbfd8da
Add docs for GPU selection and nvidia uvm workaround
2024-03-21 11:52:54 +01:00
Bruce MacDonald
a5ba0fcf78
doc: faq gpu compatibility ( #3142 )
2024-03-21 05:21:34 -04:00
Jeffrey Morgan
3a30bf56dc
Update faq.md
2024-03-20 17:48:39 +01:00
Jeffrey Morgan
7ed3e94105
Update faq.md
2024-03-18 10:24:39 +01:00
jmorganca
2297ad39da
update `faq.md`
2024-03-18 10:17:59 +01:00
Daniel Hiltgen
6459377ae0
Add ROCm support to linux install script ( #2966 )
2024-03-14 18:00:16 -07:00
Jeffrey Morgan
5ce997a7b9
Update README.md
2024-03-13 21:12:17 -07:00
Patrick Devine
ba7cf7fb66
add more docs on for the modelfile message command ( #3087 )
2024-03-12 16:41:41 -07:00
Daniel Hiltgen
b53229a2ed
Add docs explaining GPU selection env vars
2024-03-12 11:33:06 -07:00
Jeffrey Morgan
6d3adfbea2
Update troubleshooting.md
2024-03-11 13:22:28 -07:00
Daniel Hiltgen
0fdebb34a9
Doc how to set up ROCm builds on windows
2024-03-09 11:29:45 -08:00
Daniel Hiltgen
4a5c9b8035
Finish unwinding idempotent payload logic
...
The recent ROCm change partially removed idempotent
payloads, but the ggml-metal.metal file for mac was still
idempotent. This finishes switching to always extract
the payloads, and now that idempotency is gone, the
version directory is no longer useful.
2024-03-09 08:34:39 -08:00
Jeffrey Morgan
6c0af2599e
Update docs `README.md` and table of contents
2024-03-08 22:45:11 -08:00
Daniel Hiltgen
280da44522
Merge pull request #2988 from dhiltgen/rocm_docs
...
Refined ROCm troubleshooting docs
2024-03-08 13:33:30 -08:00
Jeffrey Morgan
b886bec3f9
Update api.md
2024-03-07 23:27:51 -08:00
Daniel Hiltgen
69f0227813
Refined ROCm troubleshooting docs
2024-03-07 11:22:37 -08:00
Daniel Hiltgen
6c5ccb11f9
Revamp ROCm support
...
This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var, that defaults to `~/.ollama`. The logic was already
idempotent, so this should speed up startups after the first time a
new release is deployed. It also cleans up after itself.
We now build only a single ROCm version (latest major) on both windows
and linux. Given the large size of ROCm's tensor files, we split the
dependency out. It's bundled into the installer on windows, and a
separate download on linux. The linux install script is now smart and
detects the presence of AMD GPUs and looks to see if rocm v6 is already
present, and if not, then downloads our dependency tar file.
For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us. For Windows, we now use go's windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.
2024-03-07 10:36:50 -08:00
Jeffrey Morgan
d481fb3cc8
update go to 1.22 in other places ( #2975 )
2024-03-07 07:39:49 -08:00
John
23ebe8fe11
fix some typos ( #2973 )
...
Signed-off-by: hishope <csqiye@126.com>
2024-03-06 22:50:11 -08:00
Jeffrey Morgan
ce9f7c4674
Update api.md
2024-03-05 13:13:23 -08:00
Jeffrey Morgan
3b4bab3dc5
Fix embeddings load model behavior ( #2848 )
2024-02-29 17:40:56 -08:00
elthommy
1f087c4d26
Update langchain python tutorial ( #2737 )
...
Remove unused GPT4all
Use nomic-embed-text as the embedding model
Fix a deprecation warning (`__call__`)
2024-02-25 00:31:36 -05:00
Jeffrey Morgan
bdc0ea1ba5
Update import.md
2024-02-22 02:08:03 -05:00
Jeffrey Morgan
7fab7918cc
Update import.md
2024-02-22 02:06:24 -05:00
Jeffrey Morgan
f0425d3de9
Update faq.md
2024-02-20 20:44:45 -05:00
Jeffrey Morgan
8125ce4cb6
Update import.md
...
Add instructions to get public key on windows
2024-02-19 22:48:24 -05:00
Jeffrey Morgan
df56f1ee5e
Update faq.md
2024-02-19 22:16:42 -05:00
Jeffrey Morgan
41aca5c2d0
Update faq.md
2024-02-19 21:11:01 -05:00
Jeffrey Morgan
753724d867
Update api.md to include examples for reproducible outputs
2024-02-19 20:36:16 -05:00
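The reproducible-outputs examples added in that commit rest on one idea: fix the sampling `seed` (and pin `temperature`) in the request options so the same prompt yields the same completion. A minimal sketch of such a request body (the model name and prompt are placeholders, not from the docs):

```python
import json

# Hypothetical /api/generate request body illustrating reproducible outputs:
# a fixed seed plus temperature 0 should make the completion deterministic.
payload = {
    "model": "mistral",
    "prompt": "Why is the sky blue?",
    "options": {"seed": 101, "temperature": 0},
}
body = json.dumps(payload)
```

Sending `body` twice to the same server and model version should return identical responses; change the seed and the output may differ.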
Patrick Devine
9a7a4b9533
add faqs for memory pre-loading and the keep_alive setting ( #2601 )
2024-02-19 14:45:25 -08:00
Daniel Hiltgen
b338c0635f
Document setting server vars for windows
2024-02-19 13:30:46 -08:00
Tristan Rhodes
9774663013
Update faq.md with the location of models on Windows ( #2545 )
2024-02-16 11:04:19 -08:00
Daniel Hiltgen
1ba734de67
typo
2024-02-15 14:56:55 -08:00
Daniel Hiltgen
29e90cc13b
Implement new Go based Desktop app
...
This focuses on Windows first, but could be used for Mac
and possibly linux in the future.
2024-02-15 05:56:45 +00:00
Jeffrey Morgan
48a273f80b
Fix issues with templating prompt in chat mode ( #2460 )
2024-02-12 15:06:57 -08:00
Jeffrey Morgan
1c8435ffa9
Update domain name references in docs and install script ( #2435 )
2024-02-09 15:19:30 -08:00
Jeffrey Morgan
42b797ed9c
Update openai.md
2024-02-08 15:03:23 -05:00
Jeffrey Morgan
336aa43f3c
Update openai.md
2024-02-08 12:48:28 -05:00
Jeffrey Morgan
ab0d37fde4
Update openai.md
2024-02-07 17:25:33 -05:00
Jeffrey Morgan
14e71350c8
Update openai.md
2024-02-07 17:25:24 -05:00
Jeffrey Morgan
453f572f83
Initial OpenAI `/v1/chat/completions` API compatibility ( #2376 )
2024-02-07 17:24:29 -05:00
Bruce MacDonald
128fce5495
docs: keep_alive ( #2258 )
2024-02-06 11:00:05 -05:00
Jeffrey Morgan
b9f91a0b36
Update import instructions to use convert and quantize tooling from llama.cpp submodule ( #2247 )
2024-02-05 00:50:44 -05:00
Jeffrey Morgan
f0e9496c85
Update api.md
2024-02-02 12:17:24 -08:00
Daniel Hiltgen
e7dbb00331
Add container hints for troubleshooting
...
Some users are new to containers and unsure where the server logs go
2024-01-29 08:53:41 -08:00
Daniel Hiltgen
e02ecfb6c8
Merge pull request #2116 from dhiltgen/cc_50_80
...
Add support for CUDA 5.0 cards
2024-01-27 10:28:38 -08:00
Jeffrey Morgan
5be9bdd444
Update modelfile.md
2024-01-25 16:29:48 -08:00
Jeffrey Morgan
b706794905
Update modelfile.md to include `MESSAGE`
2024-01-25 16:29:32 -08:00
Michael Yang
93a756266c
faq: update to use launchctl setenv
2024-01-22 13:10:13 -08:00
Daniel Hiltgen
df54c723ae
Make CPU builds parallel and AMD GPU targets customizable
...
The linux build now supports parallel CPU builds to speed things up.
This also exposes AMD GPU targets as an optional setting for advanced
users who want to alter our default set.
2024-01-21 15:12:21 -08:00
Daniel Hiltgen
a447a083f2
Add compute capability 5.0, 7.5, and 8.0
2024-01-20 14:24:05 -08:00