# Ollama

Ollama is a tool for running large language models on any machine. It's designed to be easy to use and fast, supporting as many models as possible by using the fastest loader available for your platform and model.

> Note: this project is a work in progress. Certain models that can be run with Ollama are intended for research and/or non-commercial use only.

## Install

Using pip:

```shell
pip install ollama
```

Using Docker:

```shell
docker run ollama/ollama
```

## Quickstart

To run a model, use `ollama run`:

```shell
ollama run orca-mini-3b
```

You can also run models from Hugging Face:

```shell
ollama run huggingface.co/TheBloke/orca_mini_3B-GGML
```

Or directly from a downloaded model file:

```shell
ollama run ~/Downloads/orca-mini-13b.ggmlv3.q4_0.bin
```
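The quickstart accepts three kinds of model references: a registry name, a Hugging Face path, and a local file. As a minimal sketch of how such references could be told apart (this classification logic is an illustration, not Ollama's actual resolution code — only the three example inputs come from the quickstart above):

```python
import os

def classify_model_ref(ref):
    """Classify a model reference like those accepted by `ollama run`.

    The three accepted forms (registry name, Hugging Face path, local
    file) come from the quickstart; the heuristics below are assumptions.
    """
    # Hugging Face references start with the huggingface.co domain.
    if ref.startswith(("huggingface.co/", "https://huggingface.co/")):
        return "huggingface"
    # Local files are paths: a home-relative prefix, a path separator,
    # or a GGML weights extension.
    if ref.startswith("~") or os.path.sep in ref or ref.endswith(".bin"):
        return "local-file"
    # Anything else is treated as a plain registry model name.
    return "registry"

print(classify_model_ref("orca-mini-3b"))                               # -> registry
print(classify_model_ref("huggingface.co/TheBloke/orca_mini_3B-GGML"))  # -> huggingface
print(classify_model_ref("~/Downloads/orca-mini-13b.ggmlv3.q4_0.bin"))  # -> local-file
```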

## Python SDK

### Example

```python
import ollama
ollama.generate("orca-mini-3b", "hi")
```

### `ollama.generate(model, message)`

Generate a completion from a model:

```python
ollama.generate("./llama-7b-ggml.bin", "hi")
```

### `ollama.models()`

List available local models:

```python
models = ollama.models()
```

### `ollama.load(model)`

Manually load a model ahead of generation:

```python
ollama.load("model")
```

### `ollama.unload(model)`

Unload a model:

```python
ollama.unload("model")
```

### `ollama.pull(model)`

Download a model:

```python
ollama.pull("huggingface.co/thebloke/llama-7b-ggml")
```

### `ollama.search(query)`

Search for compatible models that Ollama can run:

```python
ollama.search("llama-7b")
```
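The return shape of `ollama.search` isn't documented here, so as a hedged sketch assume it yields a list of model-name strings; a small helper can then filter the results for a preferred quantization tag (the helper, its name, and the sample result list are all hypothetical):

```python
def pick_quantized(names, quant="q4_0"):
    """Return the first model name containing the given quantization tag,
    or None if no candidate matches."""
    for name in names:
        if quant in name:
            return name
    return None

# Hypothetical search results -- an assumption, since the real return
# type of ollama.search is not documented in this README:
results = ["llama-7b-ggml.q5_1", "llama-7b-ggml.q4_0"]
print(pick_quantized(results))  # -> llama-7b-ggml.q4_0
```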

## Documentation