Using Devin to Recover from the CrowdStrike Outage
July 20, 2024 by The Cognition Team
A lot of businesses and critical systems were affected by the CrowdStrike outage yesterday, and we wanted to explore how Devin can help.
The CrowdStrike incident yesterday left Windows machines around the world stuck in the infamous Blue Screen of Death. Recovery efforts are ongoing but painful, and sometimes require manually fixing each machine.
To test Devin’s ability to help, we set up a Windows machine in a cloud environment with simulated CrowdStrike failure conditions. Below, we dive into the fix. You can also review the Devin session yourself to see the full details.
Start of the session
To start, we instructed Devin to follow a playbook to recover machine i-0deda09f7e624a5d8. The playbook was written by a Cognition engineer, based on the remediation steps recommended on CrowdStrike’s blog for cloud Windows machines. It contains 8 steps, including an overview of the task, high-level guidance on the approach to take, and details on which specific files need to be removed.
Playbooks are a great way to give Devin, up front, all the context it needs to find the most efficient path to solving the task. With a playbook, Devin asks fewer clarifying questions and needs less help from the user mid-session.
After these initial instructions, Devin completed the rest of the session on its own. Let’s see what it did.
Devin’s Work Log gives users an overview of what Devin completed
Devin’s work log shows that Devin successfully completed all steps of the playbook. One specific step, remove_crowdstrike_files, is expanded above to show its execution details: it deleted all files matching Windows/System32/drivers/CrowdStrike/C-00000291-*.
Devin’s shell history shows the exact commands that were run
We see that Devin successfully ran key commands, for example mounting the drive and removing the bad files:
> ubuntu@ip-10-240-169-238:~$ ssh -i ~/.ssh/devin-us-west-2.pem ubuntu@54.191.93.12 "sudo mkdir /mnt/windows && sudo mount /dev/xvdf1 /mnt/windows"
> ubuntu@ip-10-240-169-238:~$ ssh -i ~/.ssh/devin-us-west-2.pem ubuntu@54.191.93.12 "sudo rm /mnt/windows/Windows/System32/drivers/CrowdStrike/C-00000291-*"
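The removal step above can be made a little more defensive. The following is a minimal sketch, not the command Devin ran: it assumes the Windows volume is already mounted on the helper machine, and the function name remove_channel_files is hypothetical. It checks that the CrowdStrike driver directory exists, then prints and deletes only the files matching the pattern from the work log.

```shell
#!/bin/sh
# Sketch only: remove_channel_files is a hypothetical helper, not part of the playbook.
# It expects the path where the Windows volume is mounted (e.g. /mnt/windows).

remove_channel_files() {
  dir="$1/Windows/System32/drivers/CrowdStrike"

  # Refuse to continue if the expected directory is missing,
  # which usually means the wrong volume or partition was mounted.
  if [ ! -d "$dir" ]; then
    echo "no CrowdStrike driver directory under $1" >&2
    return 1
  fi

  # Print each matching file, then delete it. Files that do not
  # match C-00000291-* are left untouched.
  find "$dir" -maxdepth 1 -type f -name 'C-00000291-*' -print -delete
}

# Example (run with sufficient privileges on the helper machine):
#   remove_channel_files /mnt/windows
```

Printing the file names before deletion leaves a record of exactly what was removed, which helps when the same fix is repeated across many machines.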
Devin encountered an error it had to debug
Here’s another illustrative moment: after stopping the target machine, Devin tried detaching the root volume. The operation failed because the instance had not fully stopped yet. To debug this, Devin ran a command to check the instance state, since an instance sometimes takes a few seconds to shut down fully, and then retried the detach. Devin solved this problem without requesting any additional help from the user.
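The check-then-retry pattern described here can be sketched with the AWS CLI. This is an illustration under assumptions, not a transcript of Devin's commands: the function names, the default attempt count, and the delay are hypothetical, and the instance and volume IDs are supplied by the caller.

```shell
#!/bin/sh
# Sketch only: instance_state and detach_when_stopped are hypothetical helpers.

# Print the current state name of an EC2 instance (e.g. "stopping", "stopped").
instance_state() {
  aws ec2 describe-instances \
    --instance-ids "$1" \
    --query 'Reservations[0].Instances[0].State.Name' \
    --output text
}

# Poll until the instance reports "stopped", then detach the volume.
# Usage: detach_when_stopped <instance-id> <volume-id> [max-attempts] [delay-seconds]
detach_when_stopped() {
  instance_id="$1"
  volume_id="$2"
  max_attempts="${3:-30}"   # illustrative default
  delay="${4:-5}"           # illustrative default, in seconds

  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    state=$(instance_state "$instance_id")
    if [ "$state" = "stopped" ]; then
      # Safe to detach now; the function returns the CLI's exit status.
      aws ec2 detach-volume --volume-id "$volume_id"
      return
    fi
    echo "attempt $attempt: instance is '$state', waiting ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done

  echo "instance $instance_id did not stop after $max_attempts attempts" >&2
  return 1
}
```

The AWS CLI also ships a built-in waiter, aws ec2 wait instance-stopped --instance-ids <instance-id>, which blocks until the instance has stopped. The explicit loop is shown because it makes the intermediate states visible.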
Run was started with a snapshot
Here’s a small detail: this session was started with a snapshot called with-us-west-2-pem. The machine snapshots feature allows users of Devin to preload its machine with whatever they want; in this case, private key files used for ssh. Users often use this to preinstall software, pre-authenticate into systems, or have repositories pre-cloned on Devin’s machine. (Snapshot details are only visible from inside the organization that created the session, and won’t show up in the public session link.)
Devin retrieved Knowledge in the middle of the session.
Devin comes equipped with a Knowledge database that users can add to or edit. In this run, “AWS CLI Best Practices” was automatically loaded to help Devin know the right way to use AWS tools given its CLI-only environment. Although it is not visible in this screenshot, this Knowledge entry also includes guidance on how to choose security groups, regions, key pairs, and other configs when launching EC2 instances.
Knowledge is a collection of documentation, tips, custom internal libraries, and other materials that Devin needs to be successful within an organization. Devin will use relevant Knowledge automatically to improve its performance, and automatically suggest Knowledge to add based on what it learns.
End of the session
At the end, Devin reported that it had completed the task successfully and confirmed that the machine was bootable again.
Check out the session yourself.
To fix a single impacted Windows machine, it would probably have been faster to do the work manually than to create a playbook and run Devin. But when the same type of task has to be done many times, whether for DevOps, code migrations, or refactors, playbooks with Devin are a powerful feature.