In March 2024, a new AI company burst onto the scene with impressive backing: a $21 million Series A led by Founders Fund, with support from industry

Thoughts On A Month With Devin

submited by
Style Pass
2025-01-17 07:00:18

In March 2024, a new AI company burst onto the scene with impressive backing: a $21 million Series A led by Founders Fund, with support from industry leaders including the Collison brothers, Elad Gil, and other tech luminaries. The team behind it? IOI gold medalists - the kind of people that solve programming problems most of us can’t even understand. Their product, Devin, promised to be a fully autonomous software engineer that could chat with you like a human colleague, capable of everything from learning new technologies and debugging mature codebases to deploying full applications and even training AI models.

The early demos were compelling. A video showed Devin independently completing an Upwork bounty, installing and running a PyTorch project without human intervention.1 The company claimed Devin could resolve 13.86% of real-world GitHub issues end-to-end on the SWE-bench benchmark - ~3x times better than previous systems. Only a select group of users could access it initially, leading to breathless tweets about how this would revolutionize software development.

As a team at Answer.AI that routinely experiments with AI developer tools, something about Devin felt different. If it could deliver even half of what it promised, it could transform how we work. But while Twitter was full of enthusiasm, we couldn’t find many detailed accounts of people actually using it. So we decided to put it through its paces, testing it against a wide range of real-world tasks. This is our story - a thorough, real-world attempt to work with one of the most hyped AI products of 2024.

Leave a Comment