Three Ways to Protect EC2 Instances from Accidental Termination and Loss of Data

Here are a few little-publicized benefits that were launched with Amazon EC2’s new EBS boot instances: You can lock them from being accidentally terminated; you can prevent termination even when you try to shutdown the server from inside the instance; and you can automatically save your data storage when they are terminated.

In discussions with other EC2 users, I’ve found that it is a common feeling of near panic when you go to terminate an instance and you check very carefully to make sure that you are deleting the right instance instead of an active production system. Slightly less common but even worse is the feeling of dread when you realize you just casually terminated the wrong EC2 instance.

It is always recommended to set up your AWS architecture so that you are able to restart production systems on EC2 easily, as they could, in theory, be lost at any point due to hardware failure or other actions. However, new features released with the EBS boot instances help reduce the risks associated with human error.

The following examples will demonstrate with the EC2 API command line tools ec2-run-instances, ec2-modify-instance-attribute, and ec2-terminate-instances. Since AWS is Amazon’s “web service” all of these features are also available through the API and should be coming available (if they aren’t already) in other tools, packages, and interfaces using the web service API.

  1. Shutdown Behavior

First, let’s look at what happens when you run a command like the following in an EC2 instance:

sudo shutdown -h now
# or, equivalently and much easier to type:
sudo halt

Using the legacy S3 based AMIs, either of the above terminates the instance and you lose all local and ephemeral storage (boot disk and /mnt) forever. Hope you remembered to save the important stuff elsewhere!

A shutdown from within an EBS boot instance, by default, will initiate a “stop” instead of a “terminate”. This means that your instance is not in a running state and not getting charged, but the EBS volumes still exist and you can “start” the same instance id again later, losing nothing.

You can explicitly set or change this with the --instance-initiated-shutdown-behavior option in ec2-run-instances. For example:

ec2-run-instances \
  --instance-initiated-shutdown-behavior stop \
  [...]

This is the first safety precaution and, as mentioned, should already be built in, though it doesn’t hurt to document your intentions with an explicit option.

  1. Delete on Termination

Though EBS volumes created and attached to an instance at instantiation are preserved through a “stop”/“start” cycle, by default they are destroyed and lost when an EC2 instance is terminated. This behavior can be changed with a delete-on-termination boolean value buried in the documentation for the --block-device-mapping option of ec2-run-instances.

Here is an example that says “Don’t delete the root EBS volume when this instance is terminated”:

ec2-run-instances \
  --block-device-mapping /dev/sda1=::false \
  [...]

If you are associating other EBS snapshots with the instance at run time, you can also specify that those created EBS volumes should be preserved past the lifetime of the instance:

  --block-device-mapping /dev/sdh=SNAPSHOTID::false

When you use these options, you’ll need to manually clean up the EBS volume(s) if you no longer want to pay for the storage costs after an instance is gone.

Note: EBS volumes attached to an instance after it is already running are, by default, left alone on termination (i.e., not deleted). The default rules are: If the EBS volume is created by the creation of the instance, then the termination of the instance deletes the volumes. If you created the volume explicitly, then you must delete it explicitly.

  1. Disable Termination

This is my favorite new safety feature. Years ago, I asked Amazon for the ability to lock an EC2 instance from being accidentally terminated, and with the launch of EBS boot instances, this is now possible. Using ec2-run-instances, the key option is:

ec2-run-instances \
  --disable-api-termination \
  [...]

Now, if you try to terminate the instance, you will get an error like:

Client.OperationNotPermitted: The instance 'i-XXXXXXXX' may not be terminated.
Modify its 'disableApiTermination' instance attribute and try again.

Yay!

To unlock the instance and allow termination through the API, use a command like:

ec2-modify-instance-attribute \
  --disable-api-termination false \
  INSTANCEID

Then end it all with:

ec2-terminate-instances INSTANCEID

Oh, wait! did you make sure you unlocked and terminated the right instance?! :)

Note: disableApiTermination is also available when you run S3 based AMIs today, but since the instance can still be terminated from inside (shutdown/halt) I am moving towards EBS based instances for all around security.

Put It Together

Here’s a command line I just used to start up an EC2 instance of an Ubuntu 9.10 Karmic EBS boot AMI which I intend to use as a long-term production server with strong uptime and data safety requirements. I wanted to add all the protection available:

ec2-run-instances \
  --key $keypair \
  --availability-zone $availabilityzone \
  --user-data-file $startupscript \
  --block-device-mapping /dev/sda1=::false \
  --block-device-mapping /dev/sdh=$snapshotid::false \
  --instance-initiated-shutdown-behavior stop \
  --disable-api-termination \
  $amiid

With regular EBS snapshots of the volumes, copies to off site backups, and documented/automated restart processes, I feel pretty safe.

Runtime Modification

If you have a valuable running EC2 instance, but forgot to specify the above options to protect it when you started it, or you are now ready to turn a test system into a production system, you can still lock things after the fact using the ec2-modify-instance-attribute command (or equivalent API call).

For example:

ec2-modify-instance-attribute --disable-api-termination true INSTANCEID
ec2-modify-instance-attribute --instance-initiated-shutdown-behavior stop INSTANCEID
ec2-modify-instance-attribute --block-device-mapping /dev/sda1=:false INSTANCEID
ec2-modify-instance-attribute --block-device-mapping /dev/sdh=:false INSTANCEID

Notes:

  • Only one type of option can be specified with each invocation.

  • The --disable-api-termination option has no argument value when used in in ec2-run-instances, but it takes a value of true or false in ec2-modify-instance-attribute.

  • You don’t specify the snapshot id when changing the delete-on-terminate state of an attached EBS volume.

  • You may change the delete-on-terminate to “true” for an EBS volume you created and attached after the instance was running. By default it will not be deleted since you created it.

  • The above --block-device-mapping option requires recent ec2-api-tools to avoid a bug.

You can find out the current state of the options using ec2-describe-instance-attribute, which only takes one option at a time:

ec2-describe-instance-attribute --disable-api-termination INSTANCEID
ec2-describe-instance-attribute --instance-initiated-shutdown-behavior INSTANCEID
ec2-describe-instance-attribute --block-device-mapping INSTANCEID

Unfortunately, the block-device-mapping output does not currently show the state of delete-on-termination value, but thanks to Andrew Lusk for pointing out in a comment below that it is available through the API. Here’s a hack which extracts the information from the debug output:

ec2-describe-instance-attribute -v --block-device-mapping INSTANCEID | 
  perl -0777ne 'print "$1\t$2\t$3\n" while 
  m%<deviceName>(.*?)<.*?<volumeId>(.*?)<.*?<deleteOnTermination>(.*?)<%sg'

While we’re on the topic of EC2 safety, I should mention the well known best practice of separating development and production systems by using a different AWS account for each. Amazon lets you create and use as many accounts as you’d like even with the same credit card as long as you use a unique email address for each. Now that you can share EBS snapshots between accounts, this practice is even more useful.

What additional safety precautions do you take with your EC2 instances and data?

[Update 2011-09-13: Corrected syntax for modifying block-device-mapping on running instance (only one “:")]