ollama/docs/agent-skills-changes.md

7.5 KiB

Agent Skills Feature - Implementation Summary

This document summarizes all changes made to implement agent skills in Ollama, enabling ollama run <agent> with skill-based capabilities.

Overview

Agents are models with attached skills. Skills are directories containing a SKILL.md file with instructions and optional executable scripts. When an agent runs, skills are loaded and injected into the system prompt, and the model can execute scripts via tool calls.

Files Changed

1. cmd/skills.go (NEW FILE)

Core skills implementation:

// Key types
type skillMetadata struct {
    Name        string `yaml:"name"`
    Description string `yaml:"description"`
}

type skillDefinition struct {
    Name        string
    Description string
    Content     string  // SKILL.md body content
    Dir         string  // Absolute path to skill directory
    SkillPath   string  // Absolute path to SKILL.md
}

type skillCatalog struct {
    Skills []skillDefinition
    byName map[string]skillDefinition
}

Key functions:

  • loadSkills(paths []string) - Walks skill directories, parses SKILL.md files
  • parseSkillFile(path, skillDir) - Extracts YAML frontmatter and body content
  • SystemPrompt() - Generates system prompt with skill instructions
  • Tools() - Returns run_skill_script and read_skill_file tools
  • RunToolCall(call) - Executes tool calls from the model
  • runSkillScript(skillDir, command) - Executes shell commands in skill directory

Tools provided to model:

Tool Description
run_skill_script Execute a script in a skill's directory
read_skill_file Read a file from a skill's directory

Security note: runSkillScript has documented limitations (no sandboxing, no path validation). See the function's doc comment for details.


2. cmd/cmd.go

Changes to runOptions struct:

type runOptions struct {
    // ... existing fields ...
    IsAgent   bool
    AgentType string
    Skills    []string
}

Agent detection in RunHandler (~line 497-503):

// Check if this is an agent
isAgent := info.AgentType != "" || len(info.Skills) > 0
if isAgent {
    opts.IsAgent = true
    opts.AgentType = info.AgentType
    opts.Skills = info.Skills
}

Route agents to chat API (~line 557-562):

// For agents, use chat API even in non-interactive mode to support tools
if opts.IsAgent {
    opts.Messages = append(opts.Messages, api.Message{Role: "user", Content: opts.Prompt})
    _, err := chat(cmd, opts)
    return err
}

Skills loading in chat function (~line 1347-1361):

var skillsCatalog *skillCatalog
if opts.IsAgent && len(opts.Skills) > 0 {
    skillsCatalog, err = loadSkills(opts.Skills)
    // ... error handling ...
    // Print loaded skills
    fmt.Fprintf(os.Stderr, "Loaded skills: %s\n", strings.Join(skillNames, ", "))
}

System prompt injection (~line 1448-1455):

  • Skills system prompt is prepended to messages

Tool execution (~line 1497-1533):

  • Executes pending tool calls via skillsCatalog.RunToolCall()
  • Displays script execution and output to terminal

3. parser/parser.go

New valid commands in isValidCommand():

case "from", "license", "template", "system", "adapter", "renderer",
     "parser", "parameter", "message", "requires", "skill", "agent_type":

Command handling in CreateRequest():

case "skill":
    skills = append(skills, c.Args)
case "agent_type":
    req.AgentType = c.Args

Underscore support in command names (~line 545):

case isAlpha(r), r == '_':
    return stateName, r, nil

4. api/types.go

CreateRequest additions (~line 560-564):

// Skills is a list of skill directories for the agent
Skills []string `json:"skills,omitempty"`

// AgentType defines the type of agent (e.g., "conversational", "task-based")
AgentType string `json:"agent_type,omitempty"`

ShowResponse additions (~line 633-637):

// Skills loaded for this agent
Skills []string `json:"skills,omitempty"`

// AgentType for this agent
AgentType string `json:"agent_type,omitempty"`

5. types/model/config.go

ConfigV2 additions:

type ConfigV2 struct {
    // ... existing fields ...

    // Agent-specific fields
    Skills    []string `json:"skills,omitempty"`
    AgentType string   `json:"agent_type,omitempty"`
}

6. server/create.go

Store agent fields (~line 65-66):

config.Skills = r.Skills
config.AgentType = r.AgentType

7. server/routes.go

Return agent fields in ShowResponse (~line 1107):

resp := &api.ShowResponse{
    // ... existing fields ...
    Skills:    m.Config.Skills,
    AgentType: m.Config.AgentType,
}

8. envconfig/config.go

Environment variable support:

func Skills() []string {
    raw := strings.TrimSpace(Var("OLLAMA_SKILLS"))
    if raw == "" {
        return []string{}
    }
    return strings.Split(raw, ",")
}

Agentfile Format

Agentfiles use the same syntax as Modelfiles with additional commands:

FROM gpt-oss:20b

AGENT_TYPE conversational
SKILL /path/to/skills/directory

SYSTEM You are a helpful assistant.

PARAMETER temperature 0.3
PARAMETER top_p 0.9
Command Description
SKILL Path to a directory containing skill subdirectories
AGENT_TYPE Type of agent (e.g., "conversational")

SKILL.md Format

Each skill is a directory with a SKILL.md file:

calculator-skill/
├── SKILL.md
└── scripts/
    └── calculate.py

SKILL.md structure:

---
name: calculator-skill
description: A skill for performing calculations.
---

# Calculator Skill

## Instructions

1. Use `run_skill_script` to execute calculations
2. Call: `python3 scripts/calculate.py '<expression>'`

## Examples

For "What is 25 * 4?":
- Call: run_skill_script with skill="calculator-skill" and command="python3 scripts/calculate.py '25 * 4'"

Requirements:

  • name must match directory name
  • name must be lowercase alphanumeric with hyphens only
  • name max 64 characters
  • description required, max 1024 characters

Usage

# Create an agent
ollama create math-agent -f math-agent.Agentfile

# Run the agent
ollama run math-agent "What is 25 * 4?"

# Output:
# Loaded skills: calculator-skill
# Running script in calculator-skill: python3 scripts/calculate.py '25 * 4'
# Output:
# 25 * 4 = 100

Flow Diagram

1. ollama run math-agent "query"
       │
       ▼
2. RunHandler detects agent (AgentType or Skills present)
       │
       ▼
3. Routes to chat() instead of generate()
       │
       ▼
4. loadSkills() parses SKILL.md files
       │
       ▼
5. SystemPrompt() injects skill instructions
       │
       ▼
6. Tools() provides run_skill_script, read_skill_file
       │
       ▼
7. Model generates response (may include tool calls)
       │
       ▼
8. RunToolCall() executes scripts, returns output
       │
       ▼
9. Display results to user

Security Considerations

The runSkillScript function has known limitations documented in the code:

  • No sandboxing (commands run with user permissions)
  • No path validation (model can run any command)
  • Shell injection risk (sh -c is used)
  • No executable allowlist
  • No environment isolation

Potential improvements (documented as TODOs):

  • Restrict to skill directory paths only
  • Allowlist executables (python3, node, bash)
  • Use sandboxing (Docker, nsjail, seccomp)
  • Require explicit script registration in SKILL.md