Commit Graph

3794 Commits

Author SHA1 Message Date
ParthSareen
a2a73ce5e0 wip! 2025-01-23 20:21:50 -08:00
ParthSareen
6ba557f25b checkpoint 2025-01-23 09:46:14 -08:00
ParthSareen
a7c8cc06da json checkpoint 2025-01-21 17:48:12 -08:00
Blake Mizerany
089bbb537d grammar: introduce new grammar package
This package provides a way to convert JSON schemas to equivalent EBNF.
It is intended to be a replacement for llama.cpp's schema_to_grammar.

This is still an early version and does not yet support all JSON schema
features. The to-do list includes:

- minimum/maximum constraints on integer types
- minLength/maxLength constraints on string types
- defs and refs
2025-01-20 14:40:50 -08:00
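As a rough illustration of what converting a JSON schema to EBNF can look like, here is a minimal sketch. It is not the grammar package's actual API: the `Schema` and `Property` types, the `toEBNF` function, and the terminal rule names (`string`, `integer`, `boolean`, `value`) are all hypothetical, and only a tiny subset of JSON schema is handled.

```go
package main

import (
	"fmt"
	"strings"
)

// Schema is a hypothetical, heavily simplified stand-in for a JSON schema.
// Properties is a slice rather than a map so that key order is preserved.
type Schema struct {
	Type       string
	Properties []Property
}

// Property is a named sub-schema of an object schema (hypothetical type).
type Property struct {
	Name   string
	Schema Schema
}

// toEBNF returns an EBNF expression for s. The rule names "string",
// "integer", "boolean" and "value" are assumed to be defined elsewhere
// in the grammar; this sketch only emits references to them.
func toEBNF(s Schema) string {
	switch s.Type {
	case "string", "integer", "boolean":
		return s.Type
	case "object":
		parts := []string{`"{"`}
		for i, p := range s.Properties {
			if i > 0 {
				parts = append(parts, `","`)
			}
			// Emit the quoted key, a colon, then the value expression.
			parts = append(parts, fmt.Sprintf(`"\"%s\"" ":" %s`, p.Name, toEBNF(p.Schema)))
		}
		parts = append(parts, `"}"`)
		return strings.Join(parts, " ")
	default:
		// Unsupported or missing type: fall back to any JSON value.
		return "value"
	}
}

func main() {
	s := Schema{Type: "object", Properties: []Property{
		{Name: "name", Schema: Schema{Type: "string"}},
		{Name: "age", Schema: Schema{Type: "integer"}},
	}}
	fmt.Println("root ::= " + toEBNF(s))
}
```

The sketch ignores whitespace rules, required versus optional properties, and the constraints named in the to-do list above.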
ParthSareen
7cd9fbbbb1 improve temperature sampler 2025-01-20 13:53:07 -08:00
ParthSareen
b91487f289 Working json sampler 2025-01-15 14:08:54 -08:00
ParthSareen
5b19d4941a addressing comments + cleanup 2025-01-14 16:23:48 -08:00
ParthSareen
5e73f24e16 sampling package 2025-01-14 16:23:38 -08:00
Michael Yang
4aac178cac update ci 2025-01-14 14:25:56 -08:00
Michael Yang
800e21d060 sort devices by type 2025-01-14 14:25:56 -08:00
Michael Yang
5827999e9e next ollama runner
implement llama and mllama model architectures in go using ggml (through
cgo)
2025-01-14 14:25:56 -08:00
Jeffrey Morgan
61676fb506 llama: move grammar tests to llama_test.go (#8411) 2025-01-14 12:55:45 -08:00
Bruce MacDonald
f6f3713001 convert: qwen2 from safetensors (#8408)
Add native support for converting Qwen2 family models (including Qwen2.5)
from safetensors to gguf format so we can run them.
2025-01-14 10:34:37 -08:00
Steve Berdy
a30f347201 readme: add LangChain for .NET to community integrations (#8352) 2025-01-14 09:37:35 -08:00
Jeffrey Morgan
74ea4fb604 remove .prettierrc.json (#8413) 2025-01-14 09:30:34 -08:00
Jeffrey Morgan
6982e9cc96 readme: remove link to missing page 2025-01-13 18:56:31 -08:00
Patrick Devine
ab39872cb4 add new create api doc (#8388) 2025-01-13 17:30:24 -08:00
Parth Sareen
84a2314463 examples: remove codified examples (#8267) 2025-01-13 11:26:22 -08:00
Jeffrey Morgan
17fcdea698 readme: move discord link 2025-01-12 22:45:47 -08:00
Patrick Devine
32bd37adf8 make the modelfile path relative for ollama create (#8380) v0.5.5 2025-01-10 16:14:08 -08:00
Michael Yang
9446c2c902 Merge pull request #8196 from ollama/mxyng/gods-v2
chore: upgrade to gods v2
2025-01-10 13:50:11 -08:00
Jeffrey Morgan
9aa141d023 readme: remove discord badge image for now 2025-01-09 22:02:18 -08:00
Patrick Devine
8bccae4f92 show a more descriptive error in the client if it is newer than the server (#8351) 2025-01-09 10:12:30 -08:00
isamu arimoto
6ae2adc1af openai: accept additional headers to fix CORS errors (#8343) v0.5.5-rc0 2025-01-08 11:28:11 -08:00
Jeffrey Morgan
1deafd8254 llama: update vendored code to commit 46e3556 (#8308) 2025-01-08 11:22:01 -08:00
Michael
57f038ec7b readme: add phi4 model (#8350) 2025-01-08 11:21:39 -08:00
frob
cdf3a181dc Add CUSTOM_CPU_FLAGS to Dockerfile. (#8284)
* Add CUSTOM_CPU_FLAGS.

* fix golangci-lint error.

---------

Co-authored-by: Richard Lyons <rick@frob.com.au>
2025-01-06 09:17:19 -08:00
Ubaldo Porcheddu
3919f4ba3d llama: fix runner api example url in README.md (#8307) 2025-01-04 15:45:16 -08:00
Bruce MacDonald
2d33c4e97d discover: remove leading new-line for linter 2025-01-03 12:03:58 -08:00
Bruce MacDonald
29a8975c66 api: remove unused create fields
These fields are deprecated, and specifying them has no effect. The other deprecated fields still work, but these do not, so they don't match our existing pattern and are removed.
2025-01-03 12:03:58 -08:00
Patrick Devine
86a622cbdc Update the /api/create endpoint to use JSON (#7935)
Changes `POST /api/create` to accept JSON instead of a Modelfile.

This is a breaking change.
2024-12-31 18:02:30 -08:00
Jeffrey Morgan
459d822b51 readme: link header to ollama.com 2024-12-29 17:36:07 -05:00
Simon Schampijer
844899440a examples: updated deprecated imports (#3602) 2024-12-29 14:36:25 -05:00
Anas Khan
103db4216d docs: add /api/version endpoint documentation (#8082)
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-12-29 14:33:44 -05:00
Jeffrey Morgan
6daddcde01 readme: update import header 2024-12-29 14:12:23 -05:00
Emilien Lancelot
07f7e69b36 readme: add Yacana multi-agent framework to community integrations (#7259) 2024-12-28 15:05:57 -05:00
CIIDMike
b68e8e5727 docs: add syntax highlighting on Go template code blocks (#8215) 2024-12-27 13:17:49 -05:00
Adarsh Mishra
369fb529e2 readme: add TextLLaMA to community integrations 2024-12-27 13:16:06 -05:00
Jared Donnell
023e4bca14 readme: add neollama to terminal section of community integrations (#8242) 2024-12-25 17:16:11 -05:00
aritra saha
51af455f62 readme: add alpaca client application to community integrations (#8227) 2024-12-24 23:05:35 -05:00
Emanuil Rusev
ffe3549064 readme: add IntelliBar to community integrations (#7950) 2024-12-23 12:04:18 -05:00
湛露先生
928de9050e server: reuse InvalidModelNameErrMsg type (#8163) 2024-12-23 10:38:34 -05:00
ItzCrazyKns
36aea6154a readme: add Perplexica to community-integrations (#8198) 2024-12-22 20:04:01 -05:00
Patrick Devine
dd352ab27f fix crash bug with /save when quotes are used (#8208) 2024-12-21 22:31:37 -08:00
Michael Yang
cb40d60469 chore: upgrade to gods v2
gods v2 uses go generics rather than interfaces which simplifies the
code considerably
2024-12-21 00:05:16 -08:00
Patrick Devine
d8bab8ea44 remove tutorials.md which pointed to removed tutorials (#8189) 2024-12-20 14:04:20 -08:00
Squishedmac
9ab62eb96f update golang.org/x dependencies (#8172) 2024-12-20 09:29:30 -08:00
Parth Sareen
290cf2040a llama: test key order preservation in schema_to_grammar (#8078)
This change adds a test to catch a regression in schema_to_grammar where
the order of keys in the JSON schema is not preserved in the generated
grammar, which is critical for step-by-step reasoning.
2024-12-18 19:44:50 -08:00
Jeffrey Morgan
a72f2dce45 scripts: sign renamed macOS binary (#8131) 2024-12-17 18:03:49 -08:00
Jesse Gross
08a832b482 llama: Ensure KV cache is fully defragmented.
Sometimes the KV cache requires defragmentation even without
will not being able to find a KV cache slot. This is particularly
difficult for the caller to handle if it happens in between
ubatches. To avoid this, we should immediately trigger a defrag.

In addition, a heavily fragmented cache can require more than
max_moves to defragment. Currently, we stop when we hit the limit
but this can leave a cache that still does not have adequate space
even after defragmentation is triggered. Instead, we should do
multiple batches of processing until everything is complete.

Fixes #7949
2024-12-17 14:01:19 -08:00