A Simpler Way To Replace Instance Hardware on EC2

A while back I wrote an article describing a way to move the root EBS volume from one running instance to another. I pitched this as a way to replace the hardware for your instance in the event of failures.

Since then, I have realized that there is a much simpler way to move your instance to new hardware. I have been using it for months whenever I run into problems that I suspect are caused by the underlying hardware.

This method is so simple that I am almost embarrassed to have written the previous article, but I’ll point out below at least one benefit that the more complicated approach still has.

I now use this process as the second step, after a simple reboot, when I am experiencing odd problems like not being able to connect to a long-running EC2 instance. (The zeroth step is to start launching and setting up a replacement instance in case steps one and two do not produce the desired results.)

Here goes…

Method

To move your EBS boot instance to new hardware on EC2:

  1. Stop the EC2 instance

     ec2-stop-instances $instanceid
    
  2. Start the EC2 instance

     ec2-start-instances $instanceid
    
  3. (optional) If you had an Elastic IP address associated with the instance, re-associate it:

     ec2-associate-address --instance $instanceid $ipaddress
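
If you want to script these steps, the main thing to add is a wait between the stop and the start, since an instance takes a little while to reach the "stopped" state. The following is a minimal sketch, not a tested production script. It assumes the same EC2 API command line tools used above, and it assumes that ec2-describe-instances prints tab-separated rows with the instance state in the sixth column of the INSTANCE row. The function names and the POLL_SECONDS variable are made up for this example.

```shell
#!/bin/sh
# Sketch: stop an EBS boot instance, wait for it to stop, then start it again.
# Assumes the EC2 API command line tools are installed and configured.

# Read ec2-describe-instances output on stdin and print the instance state.
# Assumption: tab-separated output, state in column 6 of the INSTANCE row.
instance_state() {
  awk -F'\t' '$1 == "INSTANCE" { print $6 }'
}

# Poll until the instance reports the wanted state (for example "stopped").
# Note: this loops forever if the state is never reached.
wait_for_state() {
  instanceid=$1
  wanted=$2
  while :; do
    state=$(ec2-describe-instances "$instanceid" | instance_state)
    [ "$state" = "$wanted" ] && break
    sleep "${POLL_SECONDS:-5}"
  done
}

# Stop, wait, start, wait; re-associate the Elastic IP only if one is given.
replace_hardware() {
  instanceid=$1
  ipaddress=${2:-}
  ec2-stop-instances "$instanceid" || return 1
  wait_for_state "$instanceid" stopped
  ec2-start-instances "$instanceid" || return 1
  wait_for_state "$instanceid" running
  if [ -n "$ipaddress" ]; then
    ec2-associate-address --instance "$instanceid" "$ipaddress"
  fi
}
```

With these functions loaded, calling replace_hardware "$instanceid" "$ipaddress" would run all three steps, and leaving off the second argument skips the Elastic IP step. You may want to add a retry limit to the polling loop before trusting it unattended.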
    

It’s that simple. In my experience I almost always get new hardware for my instance by performing these steps. But…

Caveats

Some things to consider when using this approach:

  1. Make sure you “stop” the instance and do not “terminate” it. Terminating an instance generally destroys all disk-based information.

  2. This only works with EBS boot instances. S3-based (instance-store) instances cannot be stopped.

  3. Stopping an EBS boot instance preserves files on attached EBS volumes, but all information on ephemeral instance-store disks will be lost (e.g., /mnt).

  4. There is a small chance that you will get the exact same hardware after starting the instance again. If the internal IP address is the same before and after, or if you continue to observe what you sincerely believe is a host system issue, you may want to run the process again.

  5. There will be a short outage while your instance is stopped and started. In my experience this lasts about as long as a normal system takes to boot up.

  6. There is a risk that after stopping the instance, you will not be able to start it again because that availability zone no longer has available capacity for that instance type.
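
For the internal IP address check in caveat 4, here is a small sketch of one way to compare the address before and after. It uses the private DNS name as a stand-in for the internal IP address, on the assumption that the name is derived from the address and that ec2-describe-instances prints it in the fifth tab-separated column of the INSTANCE row. The function names are made up for this example.

```shell
#!/bin/sh
# Sketch: check whether the internal address changed across a stop/start.

# Read ec2-describe-instances output on stdin and print the private DNS name.
# Assumption: tab-separated output, private DNS in column 5 of the INSTANCE row.
private_dns() {
  awk -F'\t' '$1 == "INSTANCE" { print $5 }'
}

# Succeed (exit status 0) only if both names are non-empty and identical.
same_internal_address() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example use (not run here):
#   before=$(ec2-describe-instances "$instanceid" | private_dns)
#   ... stop and start the instance, wait until it is running ...
#   after=$(ec2-describe-instances "$instanceid" | private_dns)
#   if same_internal_address "$before" "$after"; then
#     echo "Same internal address; consider repeating the stop/start."
#   fi
```

Capture the first value while the instance is still running, since a stopped instance has no private DNS name to report. A matching address is only a hint that you may be on the same hardware, not proof either way.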

I ran into the capacity problem from the last caveat recently when I stopped an m2.4xlarge instance in a us-east-1 availability zone. When I tried to start the instance, I received an error saying that instances of that type were not currently available in that zone. I ended up starting a replacement instance from scratch in another us-east-1 availability zone, which worked out fine, but I would have preferred to keep my instances closer to each other. Eventually capacity freed up and I moved the server back to its home zone.

If I had used the more complicated approach of moving the root EBS volume to a new instance, I would have made sure that an instance of the right type was available before stopping the original instance.