Amazon published a tutorial about best practices in creating public AMIs for use on EC2 last week:
Though the general principles put forth in the tutorial are good, some of the specifics on how to apply those principles are flawed. (Comments here relate to the article update from June 7, 2011 3:45 AM GMT.)
The primary message of the article is that you should not publish private information on a public AMI. Excellent advice!
Unfortunately, the article seems to recommend, or at least to assume, that you are building the public AMI by taking a snapshot of a running instance. Though this method seems an easy way to build an AMI and is fine for private AMIs, it is a dangerous approach for public AMIs because of how difficult it is to identify private information and to clear that private information from a running system in such a way that it does not leak into the public AMI.
The article recommends the use of the rm command to remove files containing confidential information. This is inadequate if you are building an EBS boot AMI for public use by taking a snapshot of the volume, as the contents of deleted files are likely to remain on the EBS block device and be copied into the snapshot and public AMI.
In September, 2009, I published an article that describes this danger with an example of an EBS snapshot containing deleted files. (An EBS boot AMI is simply an EBS snapshot that has been registered).
Almost immediately after I published that article with a sample public EBS snapshot, a reader found and restored the deleted file and redeemed the $100 Amazon gift certificate I had hidden in it. Now, imagine that the deleted file contained your AWS credentials with power over your entire infrastructure on EC2, power to charge tens of thousands of dollars to your credit card with a single API call, and that the person finding them does not have the best of intentions. Ouch.
If you are generating a public AMI by creating an EBS snapshot of a running instance, then the use of rm is not sufficient to clear and protect secret information. As much as I love srm, even attempting to overwrite blocks may not be sufficient with modern journaling file systems.
Amazon recommends “Always delete the shell history before creating your AMI” with an example of an rm of dot history files. Unfortunately, depending on exactly when this is run and what you do between that command and the creation of the AMI, a history file might be silently re-created and included on your AMI, including commands that you typed before removing the history file. This is true even disregarding the fact that a file removed with rm isn’t really removed if you are creating an EBS boot AMI with a snapshot.
For example, simply exiting a shell will cause bash to re-create .bash_history with all the commands entered into that shell, perhaps including the ones where you had passed AWS credentials to commands.
You can test this with a sequence like:
- ssh to your server
- run echo "my secret"
- remove .bash_history with rm
- exit the shell
- ssh to your server again
- look at the contents of .bash_history
It looks like you removed your secret, but there it is!
There are ways of configuring bash to not create history files, but bash is not the only program that creates history files, and you may have multiple users to worry about, and some information gets stored in system log files, and… It just gets messy. With security, you don’t want to say “I probably got everything”. You want to know that you are not at risk.
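As one illustration of the bash configuration mentioned above, here is a minimal sketch. It relies on the documented bash behavior that history is not saved on exit when HISTFILE is unset. It only covers the current bash session for the current user; it does nothing about other shells, other programs, other users, or log files, which is exactly why this route gets messy.

```shell
# Run this first thing after logging in to the instance you plan to image.
# With HISTFILE unset, bash has no file to save history to when the shell exits.
unset HISTFILE

# Sanity check: report whether a history file is still configured.
if [ -z "${HISTFILE+set}" ]; then
    echo "history file disabled for this session"
else
    echo "history will be written to $HISTFILE"
fi
```

Any history file written by an earlier session is untouched by this and still has to be dealt with separately.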
Amazon recommends removing all authorized_keys files from your disk before creating the AMI. This is a great recommendation because you don’t want to release public AMIs with back doors letting you access your users’ servers. It’s bad form and makes you look suspicious and untrustworthy once discovered.
Unfortunately, this makes it difficult to reconnect to your running instance, so you need to make sure you keep some ssh connections alive. You would need to restore the authorized_keys files to copy files or create new ssh connections to the instance, and then hope you remember to remove them again before taking the next snapshot.
This becomes error prone if you are in the test/development loop trying to perfect an AMI.
The article recommends you remove any unrecognized authorized_keys files and user login accounts when you run a public AMI. I would go further. If you run a public AMI that includes back doors like this, you should question what other security problems might exist with that AMI and get off of it as soon as possible. In fact, Amazon sends out alerts and turns off access when they find AMIs with pre-existing authorized_keys, so it’s a bit odd that this article seems to take such a security hole casually.
So now you’re sufficiently worried and are wondering how to create public AMIs without including your secret information or accidentally releasing AMIs with embarrassing back doors.
The cleanest and safest approach is to build the file system for the new image separate from the file system you are using with the running system. This means that you are creating a chroot environment with a complete operating system in a directory structure that is not the same root directory where you are logging in and running commands.
The great thing about this approach is that you know exactly what files you are putting on the new image; you only run the exact commands you want to in the chroot environment; no history or log files show up there that you don’t control; your authorized_keys files are never seen in that file system; and any files you delete are not leaked when you copy that file system onto a new EBS volume to snapshot and register the public AMI.
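Because the image lives in its own directory tree, it is easy to check that tree before copying it to an EBS volume. The following is a rough sketch of such a check, not part of any published tooling: audit_image_root is a hypothetical helper, and the file name patterns it looks for are illustrative assumptions rather than a complete list of what can hold private data.

```shell
#!/bin/sh
# audit_image_root DIR  (hypothetical helper)
# Lists files under DIR that commonly leak private data into an AMI:
# ssh authorized_keys files and shell or program history dotfiles.
# Prints each match and returns 1 if anything was found, 0 otherwise.
audit_image_root() {
    root=$1
    # -xdev keeps find on the file system that holds DIR.
    matches=$(find "$root" -xdev -type f \
        \( -name 'authorized_keys*' -o -name '.*_history' -o -name '.history' \) \
        -print 2>/dev/null)
    if [ -n "$matches" ]; then
        printf '%s\n' "$matches"
        return 1
    fi
    return 0
}
```

It could be run as audit_image_root /path/to/image/root just before the copy, treating a non-zero exit status as a reason to stop and investigate. A clean result only means none of these particular patterns matched.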
This is a reasonably advanced Linux concept and sometimes requires some special tools and knowledge, but the tools and knowledge are publicly available and folks in the community are standing by to help guide you when you run into problems.
I think this approach is made easiest with Ubuntu, as Canonical has provided downloadable file system images that are configured correctly for the EC2 environment and that can be easily customized to add software and configuration for making AMIs to your own specification.
I published a tutorial on how to build AMIs with Canonical’s downloadable images back in 2009 with Ubuntu Karmic (now past end of life). It is similar with modern versions of Ubuntu, and if there is interest I can publish a new article with the updated steps.
The code I use to build the Alestic Git Server AMIs uses this approach with Ubuntu 11.04 Natty and is available as a resource to study:
If you aren’t building an Ubuntu AMI and your Linux distro does not provide clean, downloadable file systems for EC2 images, and you really believe you can identify which files on your running system contain private information, and you really want to create a public AMI from a running system, then here is how you can avoid releasing an AMI with recoverable deleted files.
Start with a fresh instance of a clean public AMI to install and configure your software. If your instance has been running for a while, you really don’t know what has leaked into the file system in visible and deleted files.
Put as little private information on the running system as possible. If you need to have AWS credentials on the running system, drop them in a mounted ephemeral store disk (/mnt on some distros). Don’t type private information like keys or passwords into command lines where they might get dropped into history files.
Delete all sensitive files and all authorized_keys. (This is still risky as you might miss files, and history files can sometimes reappear as described above.)
Do not snapshot the live EBS volume as it still contains the deleted files and you don’t want to make them public in the new AMI. Instead,
Create a new EBS volume, attach, and mount it on the running instance, say under
Copy the root file system over to the new EBS volume. This only copies the current view of the undeleted files and does not copy the blocks containing the deleted files or any other modified file information. The command might look something like:
rsync -axvSHAX / /image/
umount and detach the new EBS volume.
Create an EBS snapshot of the new EBS volume.
Register the EBS snapshot as a new AMI.
Test by running an instance of the new AMI and verifying it works and contains no private information before you change its attributes so the public can run it. You can then terminate instances and delete the extra EBS volume. Only the EBS snapshot is required to be kept for the new AMI.
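The copy-to-a-new-volume steps above can be sketched as a script. This is only an outline under several assumptions: it uses the current aws command line tool rather than whatever EC2 tools you have installed, every ID, size, device name, and AMI name is a hypothetical placeholder, the ext4 file system is my choice for the example, and a real register-image call usually needs more options (architecture, kernel or virtualization type) than shown. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Dry-run sketch of copying a live root file system to a fresh EBS volume
# and registering the result. All values below are placeholders.
DRY_RUN=${DRY_RUN:-1}                    # 1 = print commands, do not run them
INSTANCE_ID=${INSTANCE_ID:-i-EXAMPLE}
AZ=${AZ:-us-east-1a}
SIZE_GB=${SIZE_GB:-8}
VOLUME_ID=${VOLUME_ID:-vol-EXAMPLE}      # ID reported by create-volume
SNAPSHOT_ID=${SNAPSHOT_ID:-snap-EXAMPLE} # ID reported by create-snapshot
DEVICE=${DEVICE:-/dev/sdf}               # may show up as /dev/xvdf on the instance
MOUNT_POINT=${MOUNT_POINT:-/image}
AMI_NAME=${AMI_NAME:-example-public-ami}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

publish_steps() {
    run aws ec2 create-volume --availability-zone "$AZ" --size "$SIZE_GB"
    run aws ec2 attach-volume --volume-id "$VOLUME_ID" --instance-id "$INSTANCE_ID" --device "$DEVICE"
    run mkfs -t ext4 "$DEVICE"
    run mkdir -p "$MOUNT_POINT"
    run mount "$DEVICE" "$MOUNT_POINT"
    # -a archive, -x stay on one file system, -S sparse files,
    # -H hard links, -A ACLs, -X extended attributes, -v verbose
    run rsync -axvSHAX / "$MOUNT_POINT/"
    run umount "$MOUNT_POINT"
    run aws ec2 detach-volume --volume-id "$VOLUME_ID"
    run aws ec2 create-snapshot --volume-id "$VOLUME_ID" --description "clean copy for public AMI"
    run aws ec2 register-image --name "$AMI_NAME" --root-device-name /dev/sda1 \
        --block-device-mappings "DeviceName=/dev/sda1,Ebs={SnapshotId=$SNAPSHOT_ID}"
}

publish_steps
```

In practice the volume and snapshot IDs come from the output of the create-volume and create-snapshot calls, so the steps would be run one at a time rather than in a single pass.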
Warning: Even though these instructions are under a section titled “Recommendation 2”, I would not build public AMIs this way myself. It just seems too risky that something sensitive might slip into the public AMI. It’s better than snapshotting a running system directly because it removes the possibility that deleted files leak out, but “Recommendation 1” is a safer path.
I’m a bit surprised that the above article’s authorized_keys instructions came from Amazon, as I’ve seen their security folks and documentation warn against these very practices in the past. Amazon knows better as an organization and I hope that this misinformation is corrected soon so that people building AMIs don’t follow the guidance and leave secret information or back doors on public AMIs.
This is a complicated topic, as security related issues tend to be. I welcome comments, clarifications, and questions on the original post. Articles from Alestic.com are republished on a couple specific sites with explicit permission, but I only read and respond to comments on my primary blog.
[Update 2011-06-17: Corrected comments about authorized_keys. Amazon does recommend removing all of these files before publishing a new public AMI and recommends removing “unrecognized” authorized_keys files when running public AMIs.]