submitted by
Style Pass
2024-10-10 13:30:03

If you've ever wondered how to make sure an LLM performs well on your specific task, this guide is for you! It covers the different ways you can evaluate a model, guides on designing your own evaluations, and tips and tricks from practical experience.

Whether you work with production models, are a researcher, or are a hobbyist, I hope you'll find what you need; and if not, open an issue (to suggest improvements or missing resources) and I'll expand the guide!

These are mostly beginner guides to LLM basics, but they still contain some tips and cool references! If you're an advanced user, I suggest skipping ahead to the Going further sections.

Many thanks also to all the people who inspired this guide through discussions, either at events or online, notably but not limited to:
