News
Latest updates and announcements from the Terminal-Bench team.
Fri May 23 2025 · News
Terminal-Bench on the Claude 4 Model Card
Anthropic features Terminal-Bench in their latest release and sets a new SOTA.
By Mike Merrill and Alex Shaw
Mon May 19 2025 · Release
Introducing Terminal-Bench
An evaluation framework and benchmark to quantify agents' ability to complete complex tasks in the terminal.
By Mike Merrill, Alex Shaw, Chris Rytting, Ludwig Schmidt, and Andy Konwinski