Benchmarks Registry

Evaluate and train computer-use agents across the tasks in this registry.