Commit Graph

  • d261042cad
    int: adjust a few models for integration tests (#11872) Daniel Hiltgen 2025-08-13 15:42:36 -0700
  • bbb91d392a
    cuda: leverage JIT for smaller footprint (#11635) Daniel Hiltgen 2025-08-13 15:42:16 -0700
  • 89f7077c93
    chore: fix some inconsistent function names in comments youzichuan 2025-08-13 16:22:45 +0800
  • 858b91f1e5
    ggml: Use ordinal IDs for AMD GPUs on Linux when UUID is unavailable Jesse Gross 2025-08-11 17:01:07 -0700
  • 6c24c82942
    fix(openai): handle reasoning_effort (#11868) Michael Yang 2025-08-12 11:02:01 -0700
  • 8ea0abf658
    discover: CPU supports flash attention Jesse Gross 2025-08-11 14:45:45 -0700
  • 257f0b6daa
    server: fix error when parsing bad harmony tool calls Devon Rifkin 2025-08-11 14:09:13 -0700
  • fb9aa928c1
    sched: Add support for grouping GPUs (#10678) Daniel Andersen 2025-08-11 22:59:38 +0200
  • 436ecf22fa
    CONTRIBUTING: Explicitly note docs:... as a good example (#11755) Michael Vorburger 2025-08-10 03:12:30 +0200
  • 3d990dc451
    ggml: No-alloc mode Jesse Gross 2025-07-23 14:18:24 -0700
  • bcd5507f4b
    ggml: Support closing backends Jesse Gross 2025-04-17 17:12:01 -0700
  • 723dfd2a33
    ggml: Use GGML's typedef'ed pointer types Jesse Gross 2025-08-06 11:39:08 -0700
  • ae5e109613
    tests: add integration coverage for gpt-oss (#11696) Daniel Hiltgen 2025-08-07 15:06:57 -0700
  • d897a54f08
    server: Reduce gpt-oss context length for small VRAM GPUs Jesse Gross 2025-08-07 13:49:26 -0700
  • afe0c10dbc
    openai: always provide reasoning Devon Rifkin 2025-08-06 18:54:20 -0700
  • 1a7d34231f
    openai: when converting role=tool messages, propagate the tool name Devon Rifkin 2025-08-06 17:00:24 -0700
  • 45eabc3083
    docs: update the faq (#11760) Patrick Devine 2025-08-06 16:55:57 -0700
  • ae9664c01d
    openai: allow for content _and_ tool calls in the same message Devon Rifkin 2025-08-06 15:50:30 -0700
  • cb241cab63
    clean up debugging (#11756) Daniel Hiltgen 2025-08-06 13:31:22 -0700
  • 3e2a98ad55
    Update "downloading" to "pulling" in api.md (#11170) Gao feng 2025-08-07 02:33:09 +0800
  • 179bbf2640
    docs: update turbo model name (#11707) Parth Sareen 2025-08-05 17:29:08 -0700
  • c9304f161a
    tools: support anyOf types Devon Rifkin 2025-08-05 16:46:24 -0700
  • e5b777a8d9
    win: static link msvc libs (#11612) Daniel Hiltgen 2025-08-05 16:10:42 -0700
  • b643362f9f
    gptoss: fix memory calc (#11700) Michael Yang 2025-08-05 15:56:12 -0700
  • 063d3e8163
    docs: add docs for Ollama Turbo (#11687) Jeffrey Morgan 2025-08-05 13:09:10 -0700
  • ae8a041461
    ggml: Prevent kv cache quantization on gpt-oss Jesse Gross 2025-08-05 12:42:07 -0700
  • ed2e8a9022
    gpt-oss (#11672) Michael Yang 2025-08-05 12:21:16 -0700
  • 275510ddf5
    kvcache: Log contents of cache when unable to find a slot Jesse Gross 2025-08-04 16:44:23 -0700
  • c24014a55d
    kvcache: Enable SWA to retain additional entries Jesse Gross 2025-07-30 14:42:57 -0700
  • b923797e99
    fixing broken AMD driver link (#11579) Sajal Kulshreshtha 2025-07-31 00:32:54 +0530
  • 612a87dc69
    Revert "CI: switch back to x86 macos builder" (#11588) Daniel Hiltgen 2025-07-30 08:56:01 -0700
  • 5038e33776
    mac: disable bf16 on unsupported OS versions (#11585) Daniel Hiltgen 2025-07-30 08:50:54 -0700
  • 1d064a0e20
    CI: switch back to x86 macos builder (#11572) Daniel Hiltgen 2025-07-29 16:41:25 -0700
  • 1ee3fe46f3
    Increase performance for Gemma3n models on NVIDIA GPUs by enabling CUDA Graph execution (#11525) Oliver Simons 2025-07-29 21:37:06 +0200
  • 279e632945
    kvcache: Don't shift empty batches Jesse Gross 2025-07-28 11:29:25 -0700
  • 9bd69d0110
    docs: fix typos and remove trailing whitespaces (#11554) Yoshi 2025-07-28 11:19:13 -0700
  • 4975cc042e
    readme: add Mayan EDMS to community integrations (#11543) Mayan EDMS 2025-07-27 18:02:52 -0400
  • cdceaff4e1
    kvcache: Group shift operations into batches Jesse Gross 2025-07-25 14:50:05 -0700
  • 9574ed9bb7
    CONTRIBUTING: fix typo in commit message example (#11528) Ruyut 2025-07-26 05:24:06 +0800
  • 0ab1b140af
    cli: catch upstream errors gracefully (#11512) Patrick Devine 2025-07-23 22:16:55 -0700
  • d9a78742ad
    tools: loosen tool argument parsing (#11509) Jeffrey Morgan 2025-07-23 21:21:29 -0700
  • a35d1c358f
    server: use slices.Equal to simplify code (#11502) minxinyi 2025-07-24 05:25:39 +0800
  • 26cd61e41f
    s#x/exp/maps#maps# (#11506) Michael Yang 2025-07-23 13:23:32 -0700
  • 95f5d9d6da
    Fix GetModelInfo (#11496) Patrick Devine 2025-07-22 13:40:47 -0700
  • f5319ac72b
    Update linux.md (#11462) ycomiti 2025-07-22 20:17:31 +0200
  • 59b034f040
    readme: add GMAI - Gradle Managed to community integrations (#11461) Stefan Wärting 2025-07-20 23:55:47 +0200
  • 30ec10cb05
    tools: fix parsing issue when a tool name is a substring of another (#11456) Jeffrey Morgan 2025-07-20 14:55:14 -0700
  • ffa61a51fc
    readme: update argo description to support deep research (#11455) zmldndx 2025-07-20 04:29:38 +0800
  • 5274cd2ead
    ci: switch mac builder to arm64 (#11379) Daniel Hiltgen 2025-07-17 07:33:44 -0700
  • a1a350b608
    docs: document using `ollama create` without a Modelfile (#9077) frob 2025-07-17 15:16:10 +1000
  • b2a00a0d2a
    openai: allow openai endpoint to accept webp images (#11412) frob 2025-07-17 12:31:49 +0800
  • 2e57f92b0c
    readme: update the llama.cpp github link (#11427) Haiyue Wang 2025-07-17 12:20:28 +0800
  • 7221b90fe1
    compile bf16 support into ggml-metal (#11430) Michael Yang 2025-07-16 17:32:57 -0700
  • 1c48526e2e
    cmd: add default assistant role to message construction (#11431) Parth Sareen 2025-07-16 11:18:16 -0700
  • 9e9238103d
    api: fix unreachable status err (#11423) Bruce MacDonald 2025-07-16 11:03:28 -0700
  • 8c885fe5eb
    docs: fix typo in macos.md (#11425) Marcelo Fornet 2025-07-16 19:50:46 +0200
  • 43cacd9309
    docs: update modelfile.md to reflect current default num_ctx (#11189) 先知 2025-07-11 22:15:00 +0000
  • b47aa7e75a
    ggml: Use assigned layers when reporting loading stats Jesse Gross 2025-07-07 13:10:14 -0700
  • 015e39a8be
    ggml: Disable unused pipeline parallelism Jesse Gross 2025-07-10 16:55:34 -0700
  • 39cec5338a
    Only load supported models on new engine (#11362) Daniel Hiltgen 2025-07-11 12:21:54 -0700
  • 387cb031b3
    ggml: Report ordinal IDs for AMD GPUs on Windows Jesse Gross 2025-06-25 17:13:32 -0700
  • 50e4df359b
    doc: add macOS docs (#11334) Daniel Hiltgen 2025-07-08 15:38:04 -0700
  • 4fcc030739
    Reduce default parallelism to 1 (#11330) Daniel Hiltgen 2025-07-08 12:08:37 -0700
  • 1c94c9919b
    API/CLI context enhancements (#11331) Daniel Hiltgen 2025-07-08 11:59:06 -0700
  • 25f6571f34
    add `tool_name` to api.md (#11326) Parth Sareen 2025-07-07 16:53:13 -0700
  • 1efadee48c
    template: add tool result compatibility (#11294) Parth Sareen 2025-07-07 15:53:42 -0700
  • fc4cb04cb9
    ci: modularization (#11324) Daniel Hiltgen 2025-07-07 14:07:43 -0700
  • 5f139b96ab
    Revert "ggml: Temporarily disable reporting UUIDs" Jesse Gross 2025-06-27 16:19:44 -0700
  • ca3520de87
    readme: update Ollama icon size Jeffrey Morgan 2025-07-05 17:20:42 -0700
  • 55a4a37c3a
    int: add performance integration tests (#11173) Daniel Hiltgen 2025-07-05 16:07:09 -0700
  • ba750172ca
    doc: add NVIDIA Blackwell to supported list (#11307) Daniel Hiltgen 2025-07-05 16:06:30 -0700
  • 35bf6c0a41
    Update base image to Ubuntu 24.04 LTS (#9681) Vincent RAMPAL 2025-07-06 01:02:33 +0200
  • b23d28b549
    doc: Update link for mac install (#11288) Daniel Hiltgen 2025-07-03 09:48:45 -0700
  • e897624123
    mimic logs for layers on new engine (#11278) Daniel Hiltgen 2025-07-02 16:38:36 -0700
  • a3e4bb7f58
    readme: add NativeMind to community integrations (#11242) XuKecheng 2025-07-02 00:46:15 +0800
  • 9cf8ef9371
    tools: fix parsing tool calls with empty arguments, missing required fields (#11233) Jeffrey Morgan 2025-06-30 08:59:03 -0700
  • 96be53fe6c
    readme: add ollama-bash-toolshed to community integrations (#11224) Attogram Project 2025-06-29 23:59:54 +0200
  • 1cdab47113
    chore: cleanup comments + unused vars (#11225) Michael Yang 2025-06-27 11:45:33 -0700
  • 872d190c8f
    ggml: Temporarily disable reporting UUIDs Jesse Gross 2025-06-27 11:11:49 -0700
  • 8f2099306f
    skip quantizing per_layer_token_embd (#11207) Michael Yang 2025-06-26 21:49:35 -0700
  • 59112600d1
    ci: multi-stage release process (#11001) Daniel Hiltgen 2025-06-26 10:32:48 -0700
  • 10119ec2ee
    fs/ggml: add multiplier in graph estimates (#11208) Jeffrey Morgan 2025-06-26 00:19:44 -0700
  • 84998ae4ba
    fs/ggml: add missing architecture to OllamaEngineRequired() (#11206) Jeffrey Morgan 2025-06-26 00:11:23 -0700
  • 801564fa8b
    add new gemma model (#11204) Michael Yang 2025-06-25 21:47:09 -0700
  • d6253f09c2
    ci: arm sbsa fixes (#11194) Daniel Hiltgen 2025-06-24 21:00:15 -0700
  • 9cf1db79b4
    ci: include dependencies Daniel Hiltgen 2025-06-24 20:27:43 -0700
  • 46654149c9
    ci: pick up arm sbsa cuda libs (#11192) Daniel Hiltgen 2025-06-24 18:59:22 -0700
  • 138c973d8f
    ci: recombine linux amd64 binaries (#11188) Daniel Hiltgen 2025-06-24 18:45:01 -0700
  • dd8d037c16
    load arrays with up to 1024 elements when estimating Devon Rifkin 2025-04-27 13:45:13 -0700
  • 558c1920fa
    ggml: fix crash for array head counts Devon Rifkin 2025-04-25 16:16:15 -0700
  • b9b179fe00
    ci: rocm parallel builds on windows (#11187) Daniel Hiltgen 2025-06-24 15:27:09 -0700
  • 38f92e7332
    CI: switch windows to vs 2022 (#11184) Daniel Hiltgen 2025-06-24 13:26:55 -0700
  • c012d1805b
    avoid context overflow (#11175) Daniel Hiltgen 2025-06-23 15:52:50 -0700
  • 29ec3ddf9a
    Re-remove cuda v11 (#10694) Daniel Hiltgen 2025-06-23 14:07:00 -0700
  • d8b03acc1a
    readme: add ai-hub to community integrations (#11169) AJ 2025-06-23 21:51:12 +0530
  • 95571375dd
    build speedups (#11142) Daniel Hiltgen 2025-06-20 12:32:51 -0700
  • 69ee842b6e
    convert: utility for merging tensors (#11069) Michael Yang 2025-06-20 11:12:01 -0700
  • 4585d231ee
    Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) Michael Yang 2025-06-20 11:11:40 -0700
  • 290d4c2c6c
    ggml: Check return status for computation. Jesse Gross 2025-06-19 14:39:20 -0700
  • 29b668e649
    int: add coverage for older models (#11137) Daniel Hiltgen 2025-06-19 12:10:19 -0700