Commit Graph

477 Commits

Author SHA1 Message Date
Vadim Grinco 45dbd14645
Merged latest ollama 0.6.2 and nasrally's Flash Attention patches (#5)
* readme: add Ellama to list of community integrations (#9800)

* readme: add screenpipe to community integrations (#9786)

* Add support for ROCm gfx1151 (#9773)

* conditionally enable parallel pipelines

* sample: make mutations in transforms explicit (#9743)

* updated minP to use early exit making use of sorted tokens

* ml/backend/ggml: allocate memory with malloc when loading model (#9822)

* runner: remove cache prompt flag from ollama runner (#9826)

We do not need to bypass the prompt caching in the ollama runner yet, as
only embedding models needed to bypass the prompt caching. When embedding
models are implemented they can skip initializing this cache completely.

* ollamarunner: Check for minBatch of context space when shifting

Models can specify that a group of inputs needs to be handled in a single
batch. However, context shifting didn't respect this and could trigger
a break anyways. In this case, we should instead trigger a context
shift earlier so that it occurs before the grouped batch.

Note that there are still some corner cases:
 - A long prompt that exceeds the context window can get truncated
   in the middle of an image. With the current models, this will
   result in the model not recognizing the image at all, which is
   pretty much the expected result with truncation.
 - The context window is set less than the minimum batch size. The
   only solution to this is to refuse to load the model with these
   settings. However, this can never occur with current models and
   default settings.

Since users are unlikely to run into these scenarios, fixing them is
left as a follow up.
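
As a rough illustration of the check described above (not the actual ollamarunner code), the sketch below decides whether a context shift should happen before the next input is added. The function name `needsShift` and its parameters are hypothetical; `minBatch` stands for the number of inputs that must stay together in one batch, and is assumed to be 1 for an ordinary token.

```go
package main

import "fmt"

// needsShift is a hypothetical helper. It reports whether the context must
// be shifted before the next input is added, given:
//   used     - number of context slots already occupied
//   numCtx   - total size of the context window
//   minBatch - number of inputs that must be processed together
//              (assumed to be 1 for an ordinary token)
func needsShift(used, numCtx, minBatch int) bool {
	if minBatch < 1 {
		minBatch = 1
	}
	// Shift early if the whole group would not fit, so the shift
	// never lands in the middle of the grouped batch.
	return used+minBatch > numCtx
}

func main() {
	const numCtx = 8

	// A single token with 7 slots used still fits.
	fmt.Println(needsShift(7, numCtx, 1)) // false

	// A group of 4 inputs with 6 slots used does not fit: shift first.
	fmt.Println(needsShift(6, numCtx, 4)) // true
}
```

Checking for `minBatch` free slots rather than one is what moves the shift ahead of the group. The second corner case listed above corresponds to `minBatch > numCtx`, where no amount of shifting helps.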

* Applied latest patches from McBane87

See this for details: https://github.com/whyvl/ollama-vulkan/issues/7#issuecomment-2708820861

Signed-off-by: Vadim Grinco <vadim@grinco.eu>

* Add ability to enable flash attention on vulkan (#4)

* discover: add flash attention handling for vulkan
* envconfig: fix typo in config.go

As part of the process some code was refactored and I added a new field
FlashAttention to GpuInfo since the previous solution didn't allow for a
granular check via vulkan extensions. As a side effect, this now allows
for granular per-device FA support checking in other places.
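
A minimal sketch of what a per-device check could look like, under stated assumptions: only the names `GpuInfo` and `FlashAttention` come from the message above. The field's `bool` type, the other struct fields, and the helper `allSupportFlashAttention` are hypothetical and do not reflect the real definitions in the repository.

```go
package main

import "fmt"

// GpuInfo is a pared-down stand-in for illustration only. The
// FlashAttention field name comes from the commit message; its bool
// type and the other fields are assumptions.
type GpuInfo struct {
	Library        string
	Name           string
	FlashAttention bool
}

// allSupportFlashAttention is a hypothetical helper. It reports whether
// every device in the list supports flash attention. An empty list is
// treated as "not supported".
func allSupportFlashAttention(gpus []GpuInfo) bool {
	if len(gpus) == 0 {
		return false
	}
	for _, g := range gpus {
		if !g.FlashAttention {
			return false
		}
	}
	return true
}

func main() {
	gpus := []GpuInfo{
		{Library: "vulkan", Name: "device-0", FlashAttention: true},
		{Library: "vulkan", Name: "device-1", FlashAttention: false},
	}
	// One device lacks support, so FA would stay disabled for this set.
	fmt.Println(allSupportFlashAttention(gpus)) // false
}
```

Storing the capability on each device record, rather than in one global flag, is what makes a granular decision like this possible.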

---------

Signed-off-by: Vadim Grinco <vadim@grinco.eu>
Co-authored-by: zeo <108888572+zeozeozeo@users.noreply.github.com>
Co-authored-by: Louis Beaumont <louis.beaumont@gmail.com>
Co-authored-by: Daniel Hiltgen <dhiltgen@users.noreply.github.com>
Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Parth Sareen <parth.sareen@ollama.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
Co-authored-by: Jesse Gross <jesse@ollama.com>
Co-authored-by: Nikita <50599445+nasrally@users.noreply.github.com>
2025-03-23 12:27:37 +01:00
Michael ad4e0bf3be
Adding Gemma 3 to readme (#9671) 2025-03-12 07:39:25 +01:00
Vincent Koc 8585b7b151
docs: add opik to observability integrations (#9626) 2025-03-10 16:15:10 -07:00
Xiaowei Zhu 757668c42f
docs: add SwiftChat (#9540) 2025-03-10 11:01:09 -07:00
Sam 96ec8afd09
docs(tool): add mcp-llm (#9537) 2025-03-10 09:52:02 -07:00
Breaker 1f6986e919
readme: add QwQ to the supported models list (#9565) 2025-03-07 09:30:07 -08:00
aritra saha 8fe6f69f28
docs: add granite-3.2 to the readme 2025-03-04 11:10:56 -08:00
KindBrave fefbf8f74b
docs: add Ollama Android Chat community integration 2025-03-03 16:38:32 -08:00
Mark 36dfb906bb
docs: don't use self-closing tag for anchor element (#9456) 2025-03-03 11:56:34 -08:00
aritra saha a6f0f908b9
docs: update phi3-mini to phi4-mini (#9424)
* Update README.md

removed phi 3 mini and added phi4-mini

* Update README.md

---------

Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2025-03-03 11:09:21 -08:00
İbrahim Çetin 3b1ddb2b3a
docs: add reins to community integrations (#9411) 2025-03-03 11:06:30 -08:00
Soulter af68d60a58
readme: add AstrBot to community integrations (#9442) 2025-03-01 21:58:34 -08:00
王贺 25885e5335
docs: Add 1Panel to Community Integrations (#9312) 2025-02-28 09:53:03 -08:00
Gordon Kamer 2db96c18e7
readme: add Nichey to community integrations (#9370) 2025-02-26 10:40:53 -08:00
Junyan Qin (Chin) 5d81c1a184
docs: add `RockChinQ/LangBot` to integrations list (#9272) 2025-02-21 09:36:55 -08:00
danielekp 3d4cc7833c
docs: Add yla to community integrations 2025-02-20 11:34:24 -08:00
zyxucp 778603a818
docs: Add AntSK to Community Integrations (#9214) 2025-02-19 13:22:48 -08:00
maninhill 3c874df46e
docs: Add MaxKB to Community Integrations (#9212) 2025-02-19 13:20:09 -08:00
benhaotang 33ad61b112
Add OpenDeepResearcher-via-searxng to Community Integrations (#9138) 2025-02-18 11:39:11 -08:00
innightwolfsleep 3b4424ff98
readme: add LLM Telegram Bot to community integrations (#9150) 2025-02-18 10:04:30 -05:00
Bùi Đức Nhật 8cf16063a5
docs: add ollamazing to the README.md (#9075) 2025-02-13 10:47:09 -08:00
Clinton 82658c3eec
readme: add Homebrew to package managers section (#9052) 2025-02-12 11:17:39 -08:00
bloominstrong 378d6e1e6a
docs: fix nix package link (#9045)
removing the channel tag from the url so it will always go to the current stable channel.
2025-02-12 09:16:26 -08:00
Hugues Chocart afa55bc70c
doc: fix link for Abso (#9043) 2025-02-12 09:15:08 -08:00
Hugues Chocart 0189bdd0b7
readme: add Abso SDK to community integrations (#8973) 2025-02-11 00:14:45 -08:00
Hugues Chocart 38117fba83
readme: add Lunary to observability community integrations (#8975) 2025-02-09 22:08:46 -08:00
Qusai Ismael 484a99e428
docs: add LocalLLM app to community integrations (#8953) 2025-02-08 12:28:01 -08:00
DravenK ec6121c331
docs: ollama zig community lib (#8688) 2025-02-08 11:10:47 -08:00
Guddu Kumar 7e402ebb8c
readme: add deepseek to supported models 2025-02-07 11:28:28 -08:00
Azis Alvriyanto b901a712c6
docs: improve syntax highlighting in code blocks (#8854) 2025-02-07 09:55:07 -08:00
annilq 6ab4ba4c26
readme: add React Native client to community integrations (#8877) 2025-02-06 17:15:48 -08:00
CosmicEventHorizon e8d4eb3e68
readme: add ChibiChat to community integrations (#8883) 2025-02-06 16:08:46 -08:00
oslook 31acd1ebf9
readme: add Ollama Chat WebUI for Docker to community integrations (#8084) 2025-02-06 15:41:02 -08:00
zyphixor 330b6c50b0
readme: add simple-discord-ai to community integrations (#8659) 2025-02-05 18:35:04 -08:00
Daniel Lok 451c1596af
readme: add MLflow Tracing as an observability integration (#8811) 2025-02-05 16:04:24 -08:00
Tilman Griesel d4d338c224
readme: add Chipper to community integrations (#8803) 2025-02-03 14:18:19 -08:00
Anıl Kaynar f4321a421c
readme: add MinimalNextOllamaChat to community integrations (#8767) 2025-02-02 12:56:10 -08:00
Xiaofu Huang 2ef3c803a1
readme: add AI Toolkit for VSCode to community integrations (#8604) 2025-01-27 00:36:23 -08:00
Matěj Štágl 453e4d090b
readme: add LlmTornado to community integrations (#8551) 2025-01-25 01:04:07 -08:00
Jannik Maierhöfer 021817e59a
readme: add link to Langfuse (#8455) 2025-01-16 22:41:12 -08:00
Steve Berdy a30f347201
readme: add LangChain for .NET to community integrations (#8352) 2025-01-14 09:37:35 -08:00
Jeffrey Morgan 6982e9cc96
readme: remove link to missing page 2025-01-13 18:56:31 -08:00
Jeffrey Morgan 17fcdea698
readme: move discord link 2025-01-12 22:45:47 -08:00
Jeffrey Morgan 9aa141d023
readme: remove discord badge image for now 2025-01-09 22:02:18 -08:00
Michael 57f038ec7b
readme: add phi4 model (#8350) 2025-01-08 11:21:39 -08:00
Jeffrey Morgan 459d822b51
readme: link header to ollama.com 2024-12-29 17:36:07 -05:00
Jeffrey Morgan 6daddcde01
readme: update import header 2024-12-29 14:12:23 -05:00
Emilien Lancelot 07f7e69b36
readme: add Yacana multi-agent framework to community integrations (#7259) 2024-12-28 15:05:57 -05:00
Adarsh Mishra 369fb529e2
readme: add TextLLaMA to community integrations 2024-12-27 13:16:06 -05:00
Jared Donnell 023e4bca14
readme: add neollama to terminal section of community integrations (#8242) 2024-12-25 17:16:11 -05:00