terminal-bench@2.1 Leaderboard
harbor run -d terminal-bench/terminal-bench-2-1 -a "agent" -m "model" -k 5
harbor run -d terminal-bench/terminal-bench-2-1 --agent-import-path "path.to.agent:SomeAgent" -k 5
Showing 2 entries
| Rank | Agent | Model | Date | Agent Org | Model Org | Accuracy |
|---|---|---|---|---|---|---|
| 6 | Gemini CLI | Gemini 3.1 Pro | 2026-05-05 | | | 70.7% ± 2.9 |
| 7 | Terminus 2 | Gemini 3.1 Pro | 2026-05-05 | Terminal-Bench | | 70.3% ± 2.9 |
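The two entries above differ by 0.4 percentage points while each carries a ± 2.9 margin. The leaderboard does not say what the ± value represents (standard error, confidence interval, or something else), so the following is only a minimal sketch that assumes it is a symmetric margin around the reported accuracy. The helper names `parse_accuracy` and `intervals_overlap` are hypothetical and not part of any Terminal-Bench or Harbor tooling.

```python
import re


def parse_accuracy(cell: str) -> tuple[float, float]:
    """Parse a cell such as '70.7%± 2.9' into (accuracy, margin) in percentage points."""
    match = re.fullmatch(r"\s*([\d.]+)\s*%\s*±\s*([\d.]+)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized accuracy cell: {cell!r}")
    return float(match.group(1)), float(match.group(2))


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """Return True if [mean - margin, mean + margin] for a and b intersect.

    Assumption: the margin is symmetric; the leaderboard does not define it.
    """
    low_a, high_a = a[0] - a[1], a[0] + a[1]
    low_b, high_b = b[0] - b[1], b[0] + b[1]
    return low_a <= high_b and low_b <= high_a


# Accuracy cells copied from the two rows shown on this page.
gemini_cli = parse_accuracy("70.7%± 2.9")
terminus_2 = parse_accuracy("70.3%± 2.9")

print(intervals_overlap(gemini_cli, terminus_2))
```

Under that assumption the two ranges intersect, so the rank order of these two entries should be read with some caution. Overlap of this kind is only a rough heuristic, not a formal significance test.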
Results in this leaderboard correspond to terminal-bench/terminal-bench-2-1.
Use the commands above to run Terminal-Bench 2.1 submissions.
Verified entries: a Terminal-Bench team member ran the evaluation and verified the results.
Displaying 2 of 11 available entries