WebArbiter - Datasets
Benchmark, training data, and search trajectories for WebArbiter. ICLR 2026.
ZYao720/WEBPRMBENCH • Updated Apr 9 • 4.6k • 108 • 1
ZYao720/WebArbiter-Data • Updated Apr 9 • 9.64k • 59 • 2
ZYao720/WebArbiter-Trajectories • Updated Apr 8 • 12
WebArbiter - Models
WebArbiter process reward models for web agents. Reasoning distillation + RL. ICLR 2026.
ZYao720/WebArbiter-8B-Qwen3 • Text Generation • 8B • Updated Apr 9 • 6
ZYao720/WebArbiter-7B • Text Generation • 8B • Updated Apr 9 • 4 • 1
ZYao720/WebArbiter-4B-Qwen3 • Text Generation • 4B • Updated Apr 9 • 3 • 1
ZYao720/WebArbiter-3B • Text Generation • 3B • Updated Apr 9 • 5