OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Submitted by
Style Pass
2024-04-28 19:30:04

Key statistics of OSWorld. “Supp. tasks” refers to the Windows-based tasks, which can only be used after activation due to copyright restrictions.

Distribution of task instructions in OSWorld by app domain and operation type, giving an intuitive overview of the content.

We thank Sida Wang, Peter Shaw, Alane Suhr, Luke Zettlemoyer, Chen Henry Wu, Pengcheng Yin, Shunyu Yao, Xing Han Lu, Siva Reddy, Ruoxi Sun, Zhiyuan Zeng, and Lei Li for their helpful feedback on this work.

This website is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
