The Best AI Software Engineer is Open Source

OpenHands (fka OpenDevin) tops the SWE-bench leaderboard

Nov 04, 2024

OpenHands (formerly OpenDevin) just became the most accurate AI engineer in the world. We solved 53% of the Verified set, 41.67% of the Lite set, and 29.4% of the Full set, putting us squarely at the top of the SWE-bench leaderboard.

This is a huge moment for LLM-based code generation: for the first time, we’re able to solve a majority of well-written, tightly scoped issues using nothing but AI.

But the best AI software engineer isn't just about benchmark scores—it's about how the technology is developed, who controls it, and most importantly, who benefits from it. Being open source makes OpenHands fundamentally better not just in terms of technical capability, but in ways that will persist regardless of what future benchmarks might show.

The Power of Many

In addition to our ~33k stars, OpenHands has over 200 contributors. That's not just a vanity metric—it represents a fundamentally different approach to development. When someone finds a better way to handle browser actions, or discovers a more efficient approach to file localization, that improvement is immediately available to everyone.

And this isn’t just a community of hobbyists. We have researchers from over a dozen leading institutions working on:

Long-term planning strategies
New file editing techniques
File localization
Multi-agent architectures
Browser interactions

This diversity of expertise and approach is impossible to replicate in a closed environment. Each research team brings their own perspective, their own methodologies, and their own insights. They're not just using OpenHands—they're building it, pushing the boundaries of what's possible in AI-assisted development.

Speed and Agility

When Claude 3.5 shipped with new tool-use capabilities, we had them deeply integrated within a couple of days, and then evaluated within a few hours, largely thanks to the hard work of Xingyao Wang.

This wasn't luck or exceptional programming—it was the natural result of having an extensible agent framework built through hundreds of iterations of community feedback. Every researcher has their own ideas about how agents should be implemented, so the OpenHands framework has prioritized flexibility, allowing us to move quickly when new ideas crop up.

Of course, our initial implementation had a few bugs here and there, but they were hard to spot by poring over the mountain of data generated by our evaluation pipeline. Fortunately, we have thousands of users applying OpenHands to their daily work, and they were able to surface bug reports that pushed us up a few more percentage points. (Big thanks to all the brave folks working off the main branch instead of a stable release!)

Free as in Freedom

But again, being open source doesn’t just allow us to build better technology. It keeps control in the hands of the software engineering community.

Agentic technology is too important, too transformative to be controlled by any single entity. It’s changing how we work, how we interact with machines, even how we interact with each other.

Every advance we make with OpenHands is immediately available to everyone. Every breakthrough becomes a building block for the next innovation. Every contributor owns a piece of the future we're building.

This matters more than any benchmark score.

The Long Game

To be clear, I don’t expect OpenHands to stay permanently at the top of the leaderboard; someone new leapfrogs to the top every few weeks. Closed-source companies—many of which have hundreds of millions in funding—will occasionally surge ahead on specific benchmarks. They'll have a moment in the sun, a big press release, and a temporary advantage.

But they're fighting a losing battle against basic economics and network effects. The collective intelligence of a global community, working in the open, will always outpace what can be accomplished behind closed doors.

The future of software development will be increasingly AI-assisted. The question isn't whether AI will help write code, but who will control that technology and how it will evolve. OpenHands proves that the open source model isn't just more equitable—it's more effective.

Our position at the top of the SWE-bench leaderboard is gratifying, but it's not the point. The point is that we're building something that belongs to everyone, that improves through the contributions of everyone, and that ultimately serves everyone.

That's what makes OpenHands the best AI software engineer, regardless of what any particular benchmark might say.

Want to help us reimagine software development? Join the community on GitHub or come work with us!

