RepoQA for Evaluating Long-Context Code Understanding

Submitted by

Style Pass

2024-04-28 06:30:03

🔊 The goal of RepoQA is to create a series of long-context code understanding tasks that challenge chat/instruction models for code.

Overview: This task asks the model to retrieve 10 needle functions from each of 10 repositories in each of 5 languages (5 x 10 x 10 = 500 sub-tasks/tests). For each sub-task, the model is given a long chunk of source code (following import dependencies) and a precise function description, and is asked to find the function in the context that corresponds to the description. More details can be found at 🔗 How It Works.
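As a rough illustration of the setup described above, the sketch below models a single sub-task and a toy pass/fail check. Everything named here is hypothetical and not taken from RepoQA itself: the `NeedleTask` class, the prompt wording in `build_prompt`, and the `is_match` check are assumptions. In particular, the character-level `difflib` similarity and the 0.8 threshold are stand-ins chosen for the example, not the benchmark's actual scoring method.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

# Task dimensions as stated in the overview.
N_LANGUAGES, N_REPOS, N_NEEDLES = 5, 10, 10
TOTAL_SUBTASKS = N_LANGUAGES * N_REPOS * N_NEEDLES


@dataclass
class NeedleTask:
    """One sub-task (hypothetical structure, for illustration only)."""
    context: str      # long chunk of repository source code
    description: str  # precise natural-language description of the needle
    needle: str       # source of the function the model should return


def build_prompt(task: NeedleTask) -> str:
    # Assumed prompt layout: code context first, then the description,
    # then the retrieval instruction. The real prompt may differ.
    return (
        f"{task.context}\n\n"
        f"Function description:\n{task.description}\n\n"
        "Return the function from the code above that matches the description."
    )


def is_match(model_output: str, needle: str, threshold: float = 0.8) -> bool:
    # Stand-in scoring: character-level similarity between the model's
    # answer and the needle function. The threshold is an assumption.
    ratio = SequenceMatcher(None, model_output.strip(), needle.strip()).ratio()
    return ratio >= threshold
```

With this sketch, a model answer that reproduces the needle function verbatim passes the check, while an answer that returns an unrelated function from the same context does not.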

SNF (Searching Needle Function) includes 500 sub-tasks: 5 languages x 10 repositories x 10 needles. The prompt and expected output are demonstrated in the following figure: