How far can we push AI autonomy in code generation?

submitted by

Style Pass

2025-08-05 16:30:01

We ran a series of experiments to explore how far Generative AI can currently be pushed toward autonomously developing high-quality, up-to-date software without human intervention. As a test case, we created an agentic workflow to build a simple Spring Boot application end to end. We found that the workflow could ultimately generate these simple applications, but we still observed significant issues in the results, especially as we increased the complexity. The model would generate features we hadn't asked for, make shifting assumptions to fill gaps in the requirements, and declare success even when tests were failing. We concluded that while many of our strategies, such as reusable prompts or a reference application, are valuable for enhancing AI-assisted workflows, a human in the loop to supervise generation remains essential.

Birgitta is a Distinguished Engineer and AI-assisted delivery expert at Thoughtworks. She has over 20 years of experience as a software developer, architect and technical leader.
