Commit Graph

3742 Commits

Author SHA1 Message Date
Blake Mizerany 87f0a49fe6
llm: do not silently fail for supplied, but invalid formats (#8130)
Changes in #8002 introduced fixes for bugs with mangling JSON Schemas.
It also fixed a bug where the server would silently fail when clients
requested invalid formats. It also, unfortunately, introduced a bug
where the server would reject requests with an empty format, which
should be allowed.

The change in #8127 updated the code to allow the empty format, but also
reintroduced the regression where the server would silently fail when
the format was set, but invalid.

This commit fixes both regressions. The server does not reject the empty
format, but it does reject invalid formats. It also adds tests to help
us catch regressions in the future.

Also, the updated code provides a more detailed error message when a
client sends a non-empty, but invalid format, echoing the invalid format
in the response.

This commits also takes the opportunity to remove superfluous linter
checks.
2024-12-16 21:57:49 -08:00
Jeffrey Morgan 0f06a6daa7
llm: loosen format check to default to no format (#8127) 2024-12-16 18:45:46 -08:00
Daniel Hiltgen 8f805dd74b
darwin: restore multiple runners for x86 (#8125)
In 0.5.2 we simplified packaging to have avx only for macos x86.  It looks like
there may still be some non-AVX systems out there, so this puts back the prior
logic of building no-AVX for the primary binary, and now 2 runners for avx and avx2.
These will be packaged in the App bundle only, so the stand-alone binary will now be
without AVX support on macos.  On arm, we'll also see these runners reported
as available in the log, but they're dormant and will never be used at runtime.
2024-12-16 18:45:02 -08:00
Michael 89d5e2f2fd
readme: example/get started guide for pgai with Ollama (#8115)
readme: example/get started guide for pgai with Ollama
2024-12-16 17:14:37 +08:00
Jascha Beste 297ada6c87
readme: add pgai to readme for semantic search (#8028)
* docs: switch around database integrations order and link to quickstart

* docs: link to blog post in example readme

* chore: link to main readme

* readme: removing example to link externally

readme: removing example to link externally so we don't have to keep this example up-to-date

---------
2024-12-16 17:02:28 +08:00
Patrick Devine 8c9fb8eb73
imageproc mllama refactor (#7537)
Refactor mllama image processing code, and add pixtral and qwen2vl
2024-12-14 19:50:15 -08:00
Daniel Hiltgen b75ccfc5ec
ci: be more aggressive on parallelism in build (#8102) 2024-12-14 14:56:05 -08:00
Jeffrey Morgan 7a81daf026
llama: update vendor code to commit ba1cb19c (#8101) 2024-12-14 14:55:51 -08:00
Daniel Hiltgen 60f75560a2
runner: switch logging back to stderr (#8091)
This puts the low-level runner logging back on stderr for consistency with prior releases
2024-12-13 14:36:50 -08:00
Anuraag (Rag) Agrawal e28f2d4900
openai: return usage as final chunk for streams (#6784)
* openai: return usage as final chunk for streams

---------

Co-authored-by: ParthSareen <parth.sareen@ollama.com>
2024-12-12 17:09:30 -08:00
Pascal Patry c216850523
llama: parse JSON schema using nlohmann::ordered_json to maintain ordering (#8071) 2024-12-12 09:57:28 -08:00
Parth Sareen 18f6a98bd6
llama: enable JSON schema key ordering for generating grammars (#8055) 2024-12-11 17:17:36 -08:00
Blake Mizerany b1fd7fef86
server: more support for mixed-case model names (#8017)
Fixes #7944
2024-12-11 15:29:59 -08:00
Daniel Hiltgen 36d111e788
ci: fix linux version (#8054)
Pass through the version override so the makefiles use it
2024-12-11 14:09:57 -08:00
Blake Mizerany 9039c821a2
llama: preserve field order in user-defined JSON schemas (#8002)
Previously we decoded and re-encoded JSON schemas during validation,
which served no purpose since json.RawMessage already validates JSON
syntax. Worse, the re-encoding lost field ordering from the original
schema, which affects inference quality during step-by-step reasoning.

While fixing this ordering issue by using json.RawMessage directly,
testing revealed that schema_to_grammar (from llama.cpp) also fails to
preserve field order during grammar generation. This appears to be the
root cause of inference degradation.

This change prevents us from mangling the user's original schema order,
but we still need to address the ordering issue in schema_to_grammar.
That will be a separate change.

Updates #7978
2024-12-11 14:07:30 -08:00
Daniel Hiltgen 581a4a5553
ci: fix artifact path prefix for missing windows payloads (#8052)
upload-artifacts strips off leading common paths so when
the ./build/ artifacts were removed, the ./dist/windows-amd64
prefix became common and was stripped, making the
later download-artifacts place them in the wrong location
2024-12-11 10:59:32 -08:00
Daniel Hiltgen cf4d7c52c4
win: builtin arm runner (#8039)
The new build embeds the arm runner in the
main binary, so there is no longer a lib/ollama
2024-12-11 08:32:13 -08:00
Daniel Hiltgen 6a6328a5e9
ci: build dir changed (#8037)
Remove no longer relevant build log dir
2024-12-10 20:33:34 -08:00
Jeffrey Morgan 527cc97899
llama: update vendored code to commit 40c6d79f (#7875) 2024-12-10 19:21:34 -08:00
Blake Mizerany a37f4a86a7
go.mod: go 1.22.8 -> 1.23.4 (#8036) 2024-12-10 18:16:16 -08:00
湛露先生 46f74e0cb5
Return err when NewHipLib() detect error. (#8012)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2024-12-10 16:32:29 -08:00
Phil Wornath 7622ea21af
readme: add AI summary helper plugin to community-integrations (#7202) 2024-12-10 16:13:06 -08:00
Tao Zuhong c5d3947084
readme: add Kangaroo, an AI-powered SQL admin tool to community integrations (#7948) 2024-12-10 13:48:32 -08:00
frob 757eeacc1b
server: lowercase hostname for Host header check (#5851) 2024-12-10 13:43:22 -08:00
Dr. Daniel Bender dd42acf737
readme: add aidful-ollama-model-delete to community integrations (#8024) 2024-12-10 13:03:19 -08:00
Daniel Hiltgen b9ccb3741e
Remove unused runner CpuFeatures (#8032)
The final implementation of #7499 removed dynamic vector requirements
in favor of a simpler filename based model, and this was left over logic that
is no longer needed.
2024-12-10 12:59:39 -08:00
Stefan Weil abfdc4710f
all: fix typos in documentation, code, and comments (#7021) 2024-12-10 12:58:06 -08:00
Daniel Hiltgen 82a02e18d9
build: fix typo in override variable (#8031)
The "F" was missing.
2024-12-10 10:51:16 -08:00
Daniel Hiltgen 4879a234c4
build: Make target improvements (#7499)
* llama: wire up builtin runner

This adds a new entrypoint into the ollama CLI to run the cgo built runner.
On Mac arm64, this will have GPU support, but on all other platforms it will
be the lowest common denominator CPU build.  After we fully transition
to the new Go runners more tech-debt can be removed and we can stop building
the "default" runner via make and rely on the builtin always.

* build: Make target improvements

Add a few new targets and help for building locally.
This also adjusts the runner lookup to favor local builds, then
runners relative to the executable, and finally payloads.

* Support customized CPU flags for runners

This implements a simplified custom CPU flags pattern for the runners.
When built without overrides, the runner name contains the vector flag
we check for (AVX) to ensure we don't try to run on unsupported systems
and crash.  If the user builds a customized set, we omit the naming
scheme and don't check for compatibility.  This avoids checking
requirements at runtime, so that logic has been removed as well.  This
can be used to build GPU runners with no vector flags, or CPU/GPU
runners with additional flags (e.g. AVX512) enabled.

* Use relative paths

If the user checks out the repo in a path that contains spaces, make gets
really confused so use relative paths for everything in-repo to avoid breakage.

* Remove payloads from main binary

* install: clean up prior libraries

This removes support for v0.3.6 and older versions (before the tar bundle)
and ensures we clean up prior libraries before extracting the bundle(s).
Without this change, runners and dependent libraries could leak when we
update and lead to subtle runtime errors.
2024-12-10 09:47:19 -08:00
frob 63269668c0
Prevent underflow when FreeMemory < overhead (#8014)
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
2024-12-10 09:10:40 -08:00
Jesse Gross 900f64e6be prompt: Don't trim whitespace from prompts
New lines can be an important part of a user's prompt and trimming
it can alter the results. We previously only trimmed prompts with
images but refactoring brought this behavior to all prompts, where
it became more noticable.

The /generate endpoint adds less whitespace and therefore doesn't
need to trim it out - this brings the same behavior to /chat.

Thanks to @gabe-l-hart for spotting the issue!

Fixes #7795
2024-12-09 11:02:55 -08:00
Yannick Gloster da09488fbf
docs: remove comment regarding tool streaming in openai.md (#7960) 2024-12-07 22:16:21 -08:00
湛露先生 7f0ccc8a9d
docs: fix syntax error in openai.md (#7986) 2024-12-07 22:14:36 -08:00
Parth Sareen de52b6c2f9
bugfix: "null" value json mode (#7979) 2024-12-06 14:13:15 -08:00
Michael acd7d03266
readme: add llama3.3 to readme (#7975)
readme: add llama3.3 to readme
2024-12-06 14:05:11 -05:00
Parth Sareen f6e87fd628
docs: update readmes for structured outputs (#7962) 2024-12-06 10:35:37 -08:00
Jeffrey Morgan aed1419c64
ci: skip go build for tests (#7899) 2024-12-04 21:22:36 -08:00
Parth Sareen c6c526275d
api: add generate endpoint for structured outputs (#7939) 2024-12-04 17:37:12 -08:00
Parth Sareen 630e7dc6ff
api: structured outputs - chat endpoint (#7900)
Adds structured outputs to chat endpoint
---------

Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Hieu Nguyen <hieunguyen1053@outlook.com>
2024-12-04 16:31:19 -08:00
Michael Yang eb8366d658
Merge pull request #7932 from ollama/mxyng/fix-merges 2024-12-04 10:04:52 -08:00
Michael Yang 4456012956 fix unmarshaling merges 2024-12-04 09:21:56 -08:00
Sam 539be43640
llm: normalise kvct parameter handling (#7926) 2024-12-03 16:30:40 -08:00
Sam 1bdab9fdb1
llm: introduce k/v context quantization (vRAM improvements) (#6279) 2024-12-03 15:57:19 -08:00
owboson 2b82c5a8a1
docs: correct default num_predict value in modelfile.md (#7693) 2024-12-03 15:00:05 -08:00
Tigran 55c3efa900
docs: remove extra quote in modelfile.md (#7908) 2024-12-02 09:28:56 -08:00
David Mayboroda 1aedffad93
readme: add minima to community integrations (#7906) 2024-12-02 01:14:47 -08:00
Jeffrey Morgan ff6c2d6dc8
cmd: don't rely on reading repo file for test (#7898) 2024-11-30 14:12:53 -08:00
Jeffrey Morgan d543b282a7
server: add warning message for deprecated context field (#7878) 2024-11-30 14:05:50 -08:00
Parth Sareen 5f8051180e
Enable index tracking for tools - openai api support (#7888) 2024-11-29 20:00:09 -08:00
Jeffrey Morgan 39e29ae5dd
llama: fix typo and formatting in readme (#7876) 2024-11-28 17:27:11 -08:00