Fixing Files on the Root EBS Volume of an EC2 Instance


You can examine and edit files on the root EBS volume of an EC2 instance even in a situation that seems disastrous, such as:

  • You lost your ssh key or forgot your password

  • You made a mistake editing the /etc/sudoers file and can no longer gain root access with sudo to fix it

  • Your long running instance is hung for some reason, cannot be contacted, and fails to boot properly

  • You need to recover files off of the instance but cannot get to it

On a physical computer sitting at your desk, you could simply boot the system with a CD or USB stick, mount the hard drive, check out and fix the files, then reboot the computer to be back in business.

A remote EC2 instance, however, seems distant and inaccessible when you are in one of these situations. Fortunately, AWS provides us with the power and flexibility to be able to recover a system like this, provided that we are running EBS boot instances and not instance-store.

The approach on EC2 is somewhat similar to the physical solution, but we’re going to move and mount the faulty “hard drive” (root EBS volume) to a different instance, fix it, then move it back.

In some situations, it might simply be easier to start a new EC2 instance and throw away the bad one, but if you really want to fix your files, here is the approach that has worked for many:

Set Up

Identify the original instance (A) and the root EBS volume that contains the broken root file system with the files you want to view and edit.


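# $instance_a must already hold the instance id of the broken instance A.
# The "." in the pattern matches the tab between the output fields.
volume=$(ec2-describe-instances $instance_a |
  egrep '^BLOCKDEVICE./dev/sda1' | cut -f3)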

Identify the second EC2 instance (B) that you will use to fix the files on the original EBS volume. This instance must be running in the same availability zone as instance A so that it can have the EBS volume attached to it. If you don’t have an instance already running, start a temporary one.


Stop (do not terminate) the broken instance A, wait for it to come to a complete stop, detach the root EBS volume from the instance, wait for it to be detached, then attach the volume to instance B on an unused device.

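ec2-stop-instances $instance_a
# Wait until instance A is completely stopped before detaching.
ec2-detach-volume $volume
# Wait until the volume is detached before attaching it to instance B.
ec2-attach-volume --instance $instance_b --device /dev/sdj $volume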
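The waiting between these commands can be scripted. Here is a minimal sketch of a polling helper. The name wait_for and the WAIT_INTERVAL and WAIT_MAX_TRIES variables are made up for this example, and the state words in the usage comments ("stopped", "available") are assumptions about the text your version of the tools prints, so check the real output before relying on them.

```shell
# Poll a command until its output contains the expected text.
# usage: wait_for <expected-text> <command> [args...]
wait_for() {
  expected=$1
  shift
  tries=0
  # Re-run the command until grep finds the expected text in its output.
  until "$@" 2>/dev/null | grep -q "$expected"; do
    tries=$((tries + 1))
    if [ "$tries" -ge "${WAIT_MAX_TRIES:-60}" ]; then
      echo "timed out waiting for '$expected'" >&2
      return 1
    fi
    sleep "${WAIT_INTERVAL:-5}"
  done
}

# Assumed usage (state words may differ in your tool version):
#   ec2-stop-instances $instance_a
#   wait_for stopped ec2-describe-instances $instance_a
#   ec2-detach-volume $volume
#   wait_for available ec2-describe-volumes $volume
```

The helper gives up after a fixed number of tries so a typo in the expected text does not leave the script looping forever.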

ssh to instance B and mount the volume so that you can access its file system.

ssh [instance b]

sudo mkdir -m 000 /vol-a
sudo mount /dev/xvdj /vol-a

Note: On older kernels, you may need to use /dev/sdj instead of /dev/xvdj inside the instance.
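If you are not sure which name the kernel on instance B used, you can check which device node actually exists before mounting. This is a rough sketch to run on instance B. The function name pick_device is hypothetical, and the two candidate names are simply the ones mentioned above.

```shell
# Print the first argument that exists as a block device.
# Returns non-zero if none of the candidates exist.
pick_device() {
  for dev in "$@"; do
    if [ -b "$dev" ]; then
      echo "$dev"
      return 0
    fi
  done
  return 1
}

# Assumed usage on instance B:
#   device=$(pick_device /dev/xvdj /dev/sdj) && sudo mount "$device" /vol-a
```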

Fix It

At this point your entire root file system from instance A is available for viewing and editing under /vol-a on instance B. For example, you may want to:

  • Put the correct ssh keys in /vol-a/home/ubuntu/.ssh/authorized_keys

  • Edit and fix /vol-a/etc/sudoers

  • Look for error messages in /vol-a/var/log/syslog

  • Copy important files out of /vol-a/

Note: The uids on the two instances may not be identical, so take care if you are creating, editing, or copying files that belong to non-root users. For example, your mysql user on instance A may have the same UID as your postfix user on instance B which could cause problems if you chown files with one name and then move the volume back to A.
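One way to sidestep the mismatch is to look up the numeric ids in the passwd file on the mounted volume and chown with those numbers instead of names. This is only a sketch. The helper name uid_on_volume is made up for this example, and it assumes the standard colon-separated passwd layout with the uid and gid in the third and fourth fields.

```shell
# Print "uid:gid" for a user as recorded in the given passwd file.
# usage: uid_on_volume <passwd-file> <username>
uid_on_volume() {
  awk -F: -v user="$2" '$1 == user { print $3 ":" $4 }' "$1"
}

# Assumed usage on instance B, with instance A's volume mounted on /vol-a:
#   ids=$(uid_on_volume /vol-a/etc/passwd ubuntu)
#   sudo chown "$ids" /vol-a/home/ubuntu/.ssh/authorized_keys
```

Because chown is given numbers taken from instance A's own passwd file, the ownership is correct once the volume is moved back, whatever names those numbers map to on instance B.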

Wrap Up

After you are done and you are happy with the files under /vol-a, unmount the file system (still on instance B):

sudo umount /vol-a
sudo rmdir /vol-a

Now, back on your system with ec2-api-tools, continue moving the EBS volume back to its home on the original instance A and start the instance again:

ec2-detach-volume $volume
# Wait until the volume is detached from instance B before attaching.
ec2-attach-volume --instance $instance_a --device /dev/sda1 $volume
ec2-start-instances $instance_a

Hopefully, you fixed the problem, instance A comes up just fine, and you can accomplish what you originally set out to do. If not, repeat these steps until you have it working.

Note: If you had an Elastic IP address assigned to instance A when you stopped it, you’ll need to reassociate it after starting it up again.

Remember! If your instance B was temporarily started just for this process, don’t forget to terminate it now.

[Update 2014-08-09: Most modern AMIs use xvdX instead of sdX for attached volumes.]


I would like to know whether this can change the private/internal IP address of the original instance, particularly when the volume in question is a root volume. What can be done if the internal IP address must not change? For example, some software licenses are tied to the host IP.


hpk. warrier :

Stopping and starting a standard instance is likely to give you a different internal IP address. An Elastic IP address lets you keep a specific external IP address. Starting an instance in VPC should allow you to assign and keep a specific internal IP address.

Hi Eric

I'm facing a problem here. I launched a new instance with a new key pair. In the new instance I mounted the volume of the original instance on the directory /vol-old. I renamed the authorized_keys file to /vol-old/home/ubuntu/.ssh/authorized_keys_bkup. Then I copied the authorized_keys file from /home/ubuntu/.ssh/authorized_keys to /vol-old/home/ubuntu/.ssh/authorized_keys.

Then I detached the volume and re-attached it to the original instance. However, when I try to access the original instance with the new key details, I get "Network error: connection refused".

Any suggestions or hint to check further?


"connection refused" is completely different from "Permission denied". You haven't even made a connection to the ssh server, so it doesn't matter what your authorized_keys file contains. Here's an article where I point out information you'll want to provide when asking for help with EC2 connection problems:

Thank you so much for this. You just saved my bacon.


I don't usually approve and post comments that just say "thanks" but I like bacon, so you're welcome.

Help, I lost the reference to $volume by accident during "Fix It", and now I can't properly do "Wrap Up". Also, attempting to get a new reference with the help of ec2-describe-instances doesn't work anymore. How do I get a reference to the volume now?


Use this to see which volume is attached to /dev/sdj on the second instance:

ec2-describe-instances $instance_b

Then follow the instructions to move it to the first instance and restart.
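To get the id back into $volume without copying it by hand, you can filter that output the same way the Set Up step does, just with the device on instance B. This is a sketch only. The function name volume_on_device is made up for this example, and it assumes the tab-separated BLOCKDEVICE lines (device in the second field, volume id in the third) that the Set Up command relies on.

```shell
# Read ec2-describe-instances output on stdin and print the id of
# the volume attached at the given device name.
# usage: ... | volume_on_device <device>
volume_on_device() {
  awk -F'\t' -v dev="$1" '$1 == "BLOCKDEVICE" && $2 == dev { print $3 }'
}

# Assumed usage:
#   volume=$(ec2-describe-instances $instance_b | volume_on_device /dev/sdj)
```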

Thanks for this topic. It saved my butt after I accidentally trashed sudo on my server. One note, though: Apparently volumes named sd(*) get automatically renamed to xvd(*) on the type of server I use, so my system couldn't seem to find the volume after it was attached and it took me a while to figure out why!


The device name depends on the kernel version. I've updated the instructions to make this easier to catch.

