Ollama

Get up and running with large language models.

  • macOS: Download
  • Windows: Download
  • Linux: curl -fsSL https://ollama.com/install.sh | sh (manual install instructions are also available)

Docker

The official Ollama Docker image ollama/ollama is available on Docker Hub.
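As a sketch of typical usage (the flags follow the image's Docker Hub instructions; adjust the volume name and port mapping to your setup), a CPU-only container can be started with:

```shell
# run the Ollama server in the background, persisting models in the "ollama" volume
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# then run a model inside the container
docker exec -it ollama ollama run gemma3
```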

Libraries

Community

Quickstart

To run and chat with Gemma 3:

ollama run gemma3

Models

Model        Parameters  Size   Download
Gemma 3      1B          815MB  ollama run gemma3:1b
Gemma 3      4B          3.3GB  ollama run gemma3
DeepSeek-R1  7B          4.7GB  ollama run deepseek-r1
gpt-oss      20B         14GB   ollama run gpt-oss

See a full list of models on ollama.com

CLI Reference

Download a model

ollama pull gemma3

Remove a model

ollama rm gemma3

Create a model

ollama create is used to create a model from a Modelfile.

ollama create mymodel -f ./Modelfile
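As an illustrative sketch (the base model, parameter, and system prompt below are examples, not requirements), a Modelfile can be as small as:

```
# Modelfile: build a custom model on top of gemma3
FROM gemma3

# higher temperature = more creative answers
PARAMETER temperature 1

# system prompt baked into the model
SYSTEM """You are a concise assistant. Answer in one short paragraph."""
```

Then create and run it with ollama create mymodel -f ./Modelfile followed by ollama run mymodel.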

Show model information

ollama show gemma3

Copy a model

ollama cp gemma3 my-model

List models on your computer

ollama list

Multiline input

For multiline input, you can wrap text with """:

>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.

Multimodal models

ollama run gemma3 "What's in this image? /Users/jmorgan/Desktop/smile.png"

Pass the prompt as an argument

ollama run gemma3 "Summarize this file: $(cat README.md)"

List which models are currently loaded

ollama ps

Stop a model that is currently running

ollama stop gemma3

Start Ollama

To run Ollama's server, use:

ollama serve
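For example (a sketch assuming Ollama is already installed; the two commands go in separate terminals):

```shell
# terminal 1: start the Ollama server
ollama serve

# terminal 2: chat with a model via the running server
ollama run gemma3
```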

Building

See the development guide

REST API

Ollama has a REST API for running and managing models.

Generate a response

curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?"
}'
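For multi-turn conversations there is also a chat endpoint; a minimal request (assuming the server is running on the default port) looks like:

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    { "role": "user", "content": "Why is the sky blue?" }
  ]
}'
```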

See the API documentation for more information.

Community Integrations

Web & Desktop

  • Open WebUI
  • SwiftChat (macOS with React Native)
  • Enchanted (macOS native)
  • Hollama
  • Lollms-Webui
  • LibreChat
  • Bionic GPT
  • HTML UI
  • Saddle
  • TagSpaces (A platform for file-based apps, utilizing Ollama for the generation of tags and descriptions)
  • Chatbot UI
  • Chatbot UI v2
  • Typescript UI
  • Minimalistic React UI for Ollama Models
  • Ollamac
  • big-AGI
  • Cheshire Cat assistant framework
  • Amica
  • chatd
  • Ollama-SwiftUI
  • Dify.AI
  • MindMac
  • NextJS Web Interface for Ollama
  • Msty
  • Chatbox
  • WinForm Ollama Copilot
  • NextChat with Get Started Doc
  • Alpaca WebUI
  • OllamaGUI
  • OpenAOE
  • Odin Runes
  • LLM-X (Progressive Web App)
  • AnythingLLM (Docker + macOS/Windows/Linux native app)
  • Ollama Basic Chat: Uses HyperDiv Reactive UI
  • Ollama-chats RPG
  • IntelliBar (AI-powered assistant for macOS)
  • Jirapt (Jira Integration to generate issues, tasks, epics)
  • ojira (Jira chrome plugin to easily generate descriptions for tasks)
  • QA-Pilot (Interactive chat tool that can leverage Ollama models for rapid understanding and navigation of GitHub code repositories)
  • ChatOllama (Open Source Chatbot based on Ollama with Knowledge Bases)
  • CRAG Ollama Chat (Simple Web Search with Corrective RAG)
  • RAGFlow (Open-source Retrieval-Augmented Generation engine based on deep document understanding)
  • StreamDeploy (LLM Application Scaffold)
  • chat (chat web app for teams)
  • Lobe Chat with Integrating Doc
  • Ollama RAG Chatbot (Local Chat with multiple PDFs using Ollama and RAG)
  • BrainSoup (Flexible native client with RAG & multi-agent automation)
  • macai (macOS client for Ollama, ChatGPT, and other compatible API back-ends)
  • RWKV-Runner (RWKV offline LLM deployment tool, also usable as a client for ChatGPT and Ollama)
  • Ollama Grid Search (app to evaluate and compare models)
  • Olpaka (User-friendly Flutter Web App for Ollama)
  • Casibase (An open source AI knowledge base and dialogue system combining the latest RAG, SSO, ollama support, and multiple large language models.)
  • OllamaSpring (Ollama Client for macOS)
  • LLocal.in (Easy to use Electron Desktop Client for Ollama)
  • Shinkai Desktop (Two click install Local AI using Ollama + Files + RAG)
  • AiLama (A Discord User App that allows you to interact with Ollama anywhere in Discord)
  • Ollama with Google Mesop (Mesop Chat Client implementation with Ollama)
  • R2R (Open-source RAG engine)
  • Ollama-Kis (A simple easy-to-use GUI with sample custom LLM for Drivers Education)
  • OpenGPA (Open-source offline-first Enterprise Agentic Application)
  • Painting Droid (Painting app with AI integrations)
  • Kerlig AI (AI writing assistant for macOS)
  • AI Studio
  • Sidellama (browser-based LLM client)
  • LLMStack (No-code multi-agent framework to build LLM agents and workflows)
  • BoltAI for Mac (AI Chat Client for Mac)
  • Harbor (Containerized LLM Toolkit with Ollama as default backend)
  • PyGPT (AI desktop assistant for Linux, Windows, and Mac)
  • Alpaca (An Ollama client application for Linux and macOS made with GTK4 and Adwaita)
  • AutoGPT (AutoGPT Ollama integration)
  • Go-CREW (Powerful Offline RAG in Golang)
  • PartCAD (CAD model generation with OpenSCAD and CadQuery)
  • Ollama4j Web UI - Java-based Web UI for Ollama built with Vaadin, Spring Boot, and Ollama4j
  • PyOllaMx - macOS application capable of chatting with both Ollama and Apple MLX models.
  • Cline (formerly known as Claude Dev) - VSCode extension for multi-file/whole-repo coding
  • Cherry Studio (Desktop client with Ollama support)
  • ConfiChat (Lightweight, standalone, multi-platform, and privacy-focused LLM chat interface with optional encryption)
  • Archyve (RAG-enabling document library)
  • crewAI with Mesop (Mesop Web Interface to run crewAI with Ollama)
  • Tkinter-based client (Python tkinter-based Client for Ollama)
  • LLMChat (Privacy focused, 100% local, intuitive all-in-one chat interface)
  • Local Multimodal AI Chat (Ollama-based LLM Chat with support for multiple features, including PDF RAG, voice chat, image-based interactions, and integration with OpenAI.)
  • ARGO (Locally download and run Ollama and Huggingface models with RAG and deep research on Mac/Windows/Linux)
  • OrionChat - OrionChat is a web interface for chatting with different AI providers
  • G1 (Prototype of using prompting strategies to improve the LLM's reasoning through o1-like reasoning chains.)
  • Web management (Web management page)
  • Promptery (desktop client for Ollama.)
  • Ollama App (Modern and easy-to-use multi-platform client for Ollama)
  • chat-ollama (a React Native client for Ollama)
  • SpaceLlama (Firefox and Chrome extension to quickly summarize web pages with ollama in a sidebar)
  • YouLama (Webapp to quickly summarize any YouTube video, supporting Invidious as well)
  • DualMind (Experimental app allowing two models to talk to each other in the terminal or in a web interface)
  • ollamarama-matrix (Ollama chatbot for the Matrix chat protocol)
  • ollama-chat-app (Flutter-based chat app)
  • Perfect Memory AI (Productivity AI assistant personalized by what you have seen on your screen, heard, and said in meetings)
  • Hexabot (A conversational AI builder)
  • Reddit Rate (Search and Rate Reddit topics with a weighted summation)
  • OpenTalkGpt (Chrome Extension to manage open-source models supported by Ollama, create custom models, and chat with models from a user-friendly UI)
  • VT (A minimal multimodal AI chat app, with dynamic conversation routing. Supports local models via Ollama)
  • Nosia (Easy to install and use RAG platform based on Ollama)
  • Witsy (An AI Desktop application available for Mac/Windows/Linux)
  • Abbey (A configurable AI interface server with notebooks, document storage, and YouTube support)
  • Minima (RAG with on-premises or fully local workflow)
  • aidful-ollama-model-delete (User interface for simplified model cleanup)
  • Perplexica (An AI-powered search engine & an open-source alternative to Perplexity AI)
  • Ollama Chat WebUI for Docker (Support for local docker deployment, lightweight ollama webui)
  • AI Toolkit for Visual Studio Code (Microsoft-official VSCode extension to chat, test, evaluate models with Ollama support, and use them in your AI applications.)
  • MinimalNextOllamaChat (Minimal Web UI for Chat and Model Control)
  • Chipper AI interface for tinkerers (Ollama, Haystack RAG, Python)
  • ChibiChat (Kotlin-based Android app to chat with Ollama and Koboldcpp API endpoints)
  • LocalLLM (Minimal Web-App to run ollama models on it with a GUI)
  • Ollamazing (Web extension to run Ollama models)
  • OpenDeepResearcher-via-searxng (A Deep Research equivalent endpoint with Ollama support for running locally)
  • AntSK (Out-of-the-box & Adaptable RAG Chatbot)
  • MaxKB (Ready-to-use & flexible RAG Chatbot)
  • yla (Web interface to freely interact with your customized models)
  • LangBot (LLM-based instant messaging bots platform, with Agents, RAG features, supports multiple platforms)
  • 1Panel (Web-based Linux Server Management Tool)
  • AstrBot (User-friendly LLM-based multi-platform chatbot with a WebUI, supporting RAG, LLM agents, and plugins integration)
  • Reins (Easily tweak parameters, customize system prompts per chat, and enhance your AI experiments with reasoning model support.)
  • Flufy (A beautiful chat interface for interacting with Ollama's API. Built with React, TypeScript, and Material-UI.)
  • Ellama (Friendly native app to chat with an Ollama instance)
  • screenpipe (Build agents powered by your screen history)
  • Ollamb (Simple yet rich in features, cross-platform built with Flutter and designed for Ollama. Try the web demo.)
  • Writeopia (Text editor with integration with Ollama)
  • AppFlowy (AI collaborative workspace with Ollama, cross-platform and self-hostable)
  • Lumina (A lightweight, minimal React.js frontend for interacting with Ollama servers)
  • Tiny Notepad (A lightweight, notepad-like interface to chat with ollama available on PyPI)
  • macLlama (macOS native) (A native macOS GUI application for interacting with Ollama models, featuring a chat interface.)
  • GPTranslate (A fast and lightweight, AI powered desktop translation application written with Rust and Tauri. Features real-time translation with OpenAI/Azure/Ollama.)
  • ollama launcher (A launcher for Ollama, aiming to provide users with convenient functions such as ollama server launching, management, or configuration.)
  • ai-hub (AI Hub supports multiple models via API keys and Chat support via Ollama API.)
  • Mayan EDMS (Open source document management system to organize, tag, search, and automate your files with powerful Ollama driven workflows.)
  • Serene Pub (Beginner-friendly, open source AI roleplaying app for Windows, macOS, and Linux. Search, download, and use models with Ollama all inside the app.)
  • Andes (A Visual Studio Code extension that provides a local UI interface for Ollama models)
  • Clueless (Open Source & Local Cluely: A desktop application LLM assistant to help you talk to anything on your screen using locally served Ollama models. Also undetectable to screenshare)

Cloud

Terminal

Apple Vision Pro

  • SwiftChat (Cross-platform AI chat app supporting Apple Vision Pro via "Designed for iPad")
  • Enchanted

Database

  • pgai - PostgreSQL as a vector database (Create and search embeddings from Ollama models using pgvector)
  • MindsDB (Connects Ollama models with nearly 200 data platforms and apps)
  • chromem-go with example
  • Kangaroo (AI-powered SQL client and admin tool for popular databases)

Package managers

Libraries

Mobile

  • SwiftChat (Lightning-fast Cross-platform AI chat app with native UI for Android, iOS, and iPad)
  • Enchanted
  • Maid
  • Ollama App (Modern and easy-to-use multi-platform client for Ollama)
  • ConfiChat (Lightweight, standalone, multi-platform, and privacy-focused LLM chat interface with optional encryption)
  • Ollama Android Chat (No need for Termux, start the Ollama service with one click on an Android device)
  • Reins (Easily tweak parameters, customize system prompts per chat, and enhance your AI experiments with reasoning model support.)

Extensions & Plugins

Supported backends

  • llama.cpp project founded by Georgi Gerganov.

Observability

  • Opik is an open-source platform to debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards. Opik supports native integration with Ollama.
  • Lunary is the leading open-source LLM observability platform. It provides a variety of enterprise-grade features such as real-time analytics, prompt templates management, PII masking, and comprehensive agent tracing.
  • OpenLIT is an OpenTelemetry-native tool for monitoring Ollama Applications & GPUs using traces and metrics.
  • HoneyHive is an AI observability and evaluation platform for AI agents. Use HoneyHive to evaluate agent performance, interrogate failures, and monitor quality in production.
  • Langfuse is an open source LLM observability platform that enables teams to collaboratively monitor, evaluate and debug AI applications.
  • MLflow Tracing is an open source LLM observability tool with a convenient API to log and visualize traces, making it easy to debug and evaluate GenAI applications.