🪄 Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs

Instead of using the Self-Instruct method, we use LLMs to convert ground-truth intermediate reasoning steps into high-quality annotations that align with our proposed formulations.
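As a rough illustration of what such a conversion step could look like, here is a minimal sketch that prompts an LLM to restructure a ground-truth reasoning trace into subgoal/action-style annotations. The prompt wording, model choice, and function names are illustrative assumptions, not the authors' actual pipeline:

```python
# Hypothetical sketch of the annotation-conversion step (not the official code).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

CONVERSION_PROMPT = """\
Rewrite the ground-truth reasoning steps below as a numbered list of
high-level subgoals, each paired with the low-level action that executes it.

Reasoning steps:
{steps}
"""

def convert_to_annotation(ground_truth_steps: str) -> str:
    """Ask the LLM to restructure a reasoning trace into subgoal/action pairs."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": CONVERSION_PROMPT.format(steps=ground_truth_steps),
        }],
        temperature=0,  # deterministic output keeps annotations consistent
    )
    return response.choices[0].message.content

print(convert_to_annotation("Add 3 and 4 to get 7. Multiply 7 by 2 to get 14."))
```

Starting from ground-truth traces rather than free-form self-generated data is what lets the conversion stay faithful: the LLM only reformats steps that are already known to be correct.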

Finally, we generate ~40K annotations to train the Lumos planning and grounding modules, one of the largest resources for language agent fine-tuning. The annotation sources cover web, complex QA, and math task types. See our final annotation data on Hugging Face Datasets and prompt details on GitHub.
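Loading the released annotations should follow the standard Hugging Face workflow; a minimal sketch is below. The repository id is a placeholder assumption, so substitute the actual dataset names from the project's Hugging Face page:

```python
# Minimal sketch of pulling the released annotations from the Hugging Face Hub.
from datasets import load_dataset

# Hypothetical repo id -- replace with a real dataset name from the Lumos release.
dataset = load_dataset("ai2lumos/lumos-annotations", split="train")

print(len(dataset))  # the release totals ~40K annotations across task types
print(dataset[0])    # each record pairs an input with a planner or grounding output
```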

We find that Lumos outperforms GPT-4- and GPT-3.5-based agents on complex QA and web tasks. In particular, Lumos beats GPT-4 by 5.1% in step success rate on Mind2Web and GPT-3.5-turbo-based ReAct by 5.1% in LLM accuracy. Lumos also outperforms language agents 2-4x its size on math tasks.

We compare the Lumos formulation against baseline formulations for training open-source agents: Chain-of-Thought Training and Integrated Agent Training.
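To make the contrast concrete, the sketch below shows what one training record might look like under each formulation. The field names and example task are made up for exposition; they are not the released annotation schema:

```python
# Illustrative (not official) training records contrasting the three formulations.

# Chain-of-Thought Training: one sequence carrying the full free-form rationale.
cot_example = {
    "input": "Question: What is 2 * (3 + 4)?",
    "output": "First add 3 and 4 to get 7, then multiply by 2 to get 14. Answer: 14",
}

# Integrated Agent Training: a single model interleaves subgoals and actions.
integrated_example = {
    "input": "Question: What is 2 * (3 + 4)?",
    "output": "Subgoal 1: Add 3 and 4 -> Action: add(3, 4) = 7; "
              "Subgoal 2: Multiply by 2 -> Action: multiply(7, 2) = 14",
}

# Lumos formulation: separate records train the planning and grounding modules,
# so each module learns a narrower, cleaner mapping.
planning_example = {
    "input": "Question: What is 2 * (3 + 4)?",
    "output": "Subgoal 1: Add 3 and 4. Subgoal 2: Multiply the result by 2.",
}
grounding_example = {
    "input": "Subgoal 1: Add 3 and 4.",
    "output": "add(3, 4)",
}
```

The design trade-off is that the modular split adds an extra inference hop but keeps each module's output space small and well-defined, whereas the integrated baselines ask one model to handle planning and execution in a single sequence.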
