It can be hard to tell whether Devin (or any other coding agent) is likely to succeed with a task. When Devin tackles a task that's outside of its current capabilities, the result is often wasted time and ACUs. One of the biggest blockers to successful sessions is also one of the simplest: initial prompts are underspecified and don't contain enough detail for Devin to work with.
Today, we’re announcing some major changes in Devin 2.1, with these issues in mind.
Devin now reports its confidence that it can complete tasks using 🟢 🟡 🔴, and is more likely to succeed in large codebases with better codebase context 🧠. We've also added these upgrades to the Linear and Jira integrations.
At multiple points in each session, Devin will express how confident it is in its approach using 🟢, 🟡, or 🔴.
When Devin doesn’t have 🟢 confidence, it will ask clarifying questions to improve its understanding and raise the score.
Developers can boost Devin's confidence by providing guidance or answering the questions that Devin raises.
We've found that Confidence Scores are highly correlated with task success, with 🟢 scores resulting in twice the likelihood of a merged PR compared to 🔴.
When using Devin via our Linear and Jira integrations, it's easy to request confidence scores for multiple issues at once. This happens without starting actual Devin sessions, meaning you can score as many issues as you'd like and only have Devin work on the highest-confidence issues.
In Settings, you can configure Devin to automatically scan all issues when they're created.
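The triage workflow described above (score many issues, then hand only the most promising ones to Devin) can be sketched in a few lines. This is an illustrative sketch, not Devin's API: the `Confidence` enum, the `ScoredIssue` type, the `pick_issues` helper, and the issue keys are all hypothetical, and it assumes you have already copied the scores out of Linear or Jira yourself.

```python
from dataclasses import dataclass
from enum import IntEnum


class Confidence(IntEnum):
    """Hypothetical encoding of the three confidence levels; higher is better."""
    RED = 0
    YELLOW = 1
    GREEN = 2


@dataclass(frozen=True)
class ScoredIssue:
    key: str                # hypothetical issue key, e.g. from Linear or Jira
    confidence: Confidence  # score reported for that issue


def pick_issues(scored, minimum=Confidence.GREEN):
    """Return the issues at or above `minimum`, highest confidence first."""
    eligible = [issue for issue in scored if issue.confidence >= minimum]
    # sorted() is stable, so issues with equal confidence keep their input order.
    return sorted(eligible, key=lambda issue: issue.confidence, reverse=True)


# Made-up example data: three issues with one score each.
scored = [
    ScoredIssue("ENG-101", Confidence.GREEN),
    ScoredIssue("ENG-102", Confidence.RED),
    ScoredIssue("ENG-103", Confidence.YELLOW),
]

green_only = [issue.key for issue in pick_issues(scored)]
green_or_yellow = [issue.key for issue in pick_issues(scored, Confidence.YELLOW)]
```

Lowering `minimum` to `Confidence.YELLOW` widens the queue to issues that may need a clarifying answer before Devin starts on them.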
The same codebase understanding that powers DeepWiki is now built into Devin. At any point during a task, you can ask a question about the code, get clarification on an implementation detail, or ask a follow-up, and Devin will provide a response informed by codebase snippets.
Devin auto-detects when it should scan your codebase to answer your question, but you can also trigger it using !ask.
Now that Devin is aware of its own confidence, Devin will wait for user approval when it is unsure about its plan. Otherwise, it will proceed automatically and accept async feedback.
Note: The setting to control whether Devin automatically proceeds with its plan will be deprecated. The planning process can still be controlled with user input: simply tell Devin to wait for approval, or add knowledge expressing any default preferences around getting plan confirmation.
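The approval behavior described above amounts to a small decision rule, sketched below. This is a rough model rather than Devin's actual implementation: the `Confidence` enum, the `next_step` function, and its return values are hypothetical, and it assumes that "unsure" means any score other than 🟢.

```python
from enum import Enum


class Confidence(Enum):
    """Hypothetical encoding of the three confidence levels."""
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def next_step(confidence, user_requested_approval=False):
    """Decide whether a plan should proceed or pause for the user.

    Assumption: anything other than GREEN counts as "unsure".
    An explicit user request to wait always takes priority.
    """
    if user_requested_approval or confidence is not Confidence.GREEN:
        return "wait_for_approval"
    return "proceed"
```

In this sketch, the `user_requested_approval` flag stands in for telling Devin to wait or adding knowledge with a default preference for plan confirmation.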
You can start working with Devin 2.1 today at app.devin.ai.
For more information about Devin Enterprise, reach out to our Sales team here.