ollama/cmd/eval/README.md

51 lines
847 B
Markdown

# eval
Evaluation tool for testing Ollama models.
## Usage
Run all tests:
```bash
go run . -model llama3.2:latest
```
Run specific suite:
```bash
go run . -model llama3.2:latest -suite tool-calling-basic -v
```
List available suites:
```bash
go run . -list
```
## Adding Tests
Edit `suites.go` to add new test suites. Each test needs:
- `Name`: test identifier
- `Prompt`: what to send to the model
- `Check`: function to validate the response
Example:
```go
{
Name: "my-test",
Prompt: "What is 2+2?",
Check: Contains("4"),
}
```
Available check functions:
- `HasResponse()` - response is non-empty
- `Contains(s)` - response contains substring
- `CallsTool(name)` - model called specific tool
- `NoTools()` - model called no tools
- `MinTools(n)` - model called at least n tools
- `All(checks...)` - all checks pass