When truncating inputs to fit the context window at the beginning of a sequence, we remove the minimum amount possible. However, this may leave the cut point in the middle of a set of inputs that the model specified should not be split up. To avoid this, we also remove the rest of the partial batch.
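
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in `cache.go` or `runner.go`: the `Input` type, its `SameBatch` field (assumed here to count how many following inputs must stay with this one), and the `truncateFront` helper are all hypothetical names chosen for the example.

```go
package main

import "fmt"

// Input is a hypothetical stand-in for one element of a sequence.
type Input struct {
	Token int
	// SameBatch is an assumed field: the number of inputs after this one
	// that must be kept together with it (0 means no constraint).
	SameBatch int
}

// truncateFront drops inputs from the front so that at most limit remain.
// It starts from the minimum cut, then extends the cut to the end of any
// no-split group that the cut point would otherwise land inside.
func truncateFront(inputs []Input, limit int) []Input {
	if limit < 0 {
		limit = 0
	}
	if len(inputs) <= limit {
		return inputs
	}

	// Minimum number of inputs that must go to fit the limit.
	discard := len(inputs) - limit

	// groupEnd is the index one past the furthest input covered by a
	// no-split group that starts inside the discarded prefix.
	groupEnd := 0
	for i := 0; i < discard; i++ {
		if end := i + 1 + inputs[i].SameBatch; end > groupEnd {
			groupEnd = end
		}
	}

	// The cut fell mid-group: remove the rest of the partial batch too.
	if groupEnd > discard {
		discard = groupEnd
		if discard > len(inputs) {
			discard = len(inputs)
		}
	}
	return inputs[discard:]
}

func main() {
	inputs := make([]Input, 8)
	for i := range inputs {
		inputs[i].Token = i
	}
	inputs[2].SameBatch = 3 // inputs 2..5 must not be split up

	kept := truncateFront(inputs, 5)
	fmt.Println(len(kept), kept[0].Token) // prints "2 6"
}
```

In this example the minimum cut would remove three inputs and leave the sequence starting at index 3, inside the group spanning indices 2 to 5. The sketch therefore removes through index 5 and keeps only the last two inputs, trading a shorter context for an intact batch.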