Creating a New Image for EC2 by Rebundling a Running Instance

NOTE: This is an article from 2009, back before EBS boot instances were available on Amazon EC2. I recommend you use EBS boot instances which make it trivial to create new AMIs (single command/API call). Please stop reading this article now and convert to EBS boot AMIs!

When you start up an instance (server) on Amazon EC2, you need to pick the image or AMI (Amazon Machine Image) to run. This determines the Linux distribution and version as well as the initial software installed and how it is configured.

There are a number of public images to choose from with EC2, including the Ubuntu and Debian images published on https://alestic.com, but sometimes it is appropriate to create your own private or public images. There are two primary ways to create an image for EC2:

  1. Create an EC2 image from scratch. This process lets you control every detail of what goes into the image and is the easiest way to automate image creation.

  2. Rebundle a running EC2 instance into a new image. This approach is the topic of the rest of this article.

After you rebundle a running instance to create a new image, you can then run new EC2 instances of that image. Each instance starts off looking exactly like the original instance as far as the files on the disk go (with a few exceptions).

This guide is primarily written in the context of running Ubuntu on EC2, but the concepts should apply without much change to Debian and other Linux distributions.

To use this rebundling approach, you start by running an instance of an image that (1) is as close as possible to the image you want to create, and (2) is published by a source you trust. You then proceed to install software and configure that instance so that it contains exactly what you want to be available on new instances right down to the startup scripts.

The next step is to bundle the instance’s disk image into a new AMI, but before we get to that, it is important to understand a few things about security.

Security

If you are creating a new EC2 image, you need to be very careful what pieces of information you inadvertently leave on the image, especially if you have the goal of publishing it as a public AMI. Anybody who runs an instance of that AMI will have access to the files you included in the bundle, and there is no way to modify an AMI after it has been created (though you can delete it).

For example, you don’t want to leave your AWS certificate or private key on the disk. You’ll even want to clear out the shell history file in case you had typed secret information in commands or in setting environment variables.

You also want to consider the security concerns from the perspective of the people who run the new image. For example, you don’t want to leave any passwords active on accounts. You should also make sure you don’t include your public ssh key in authorized_keys files. Leaving a back door into other people’s servers is in poor taste even if you have no intention of ever using it.

Here are some sample commands, but only you can decide if this wipes out too much or what other files you need to exclude depending on how you set up and used the instance you are bundling:

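# Remove shell history for root and the current user
sudo rm -f /root/.*hist* $HOME/.*hist*
# Remove rotated, compressed logs
sudo rm -f /var/log/*.gz
# Truncate the remaining log files, skipping the mysql directory
sudo find /var/log -name mysql -prune -o -type f -print |
  while read -r i; do sudo cp /dev/null "$i"; done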

Whole directories can be excluded from the image using the --exclude option of the ec2-bundle-vol command (see below).
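
Before bundling, it can help to list any files that still look sensitive. The following is a minimal sketch, not a complete audit: the function name find_sensitive_files and the file name patterns are assumptions chosen to match the examples in this section (ssh authorized_keys, shell history, AWS certificate and private key files). The -xdev option keeps find on a single filesystem, so a separately mounted /mnt is skipped when you scan from /.

```shell
# List files under a root directory that commonly hold secrets or back doors.
# The name patterns are illustrative assumptions, not a complete audit.
find_sensitive_files() {
  root=${1:-/}
  find "$root" -xdev -type f \
    \( -name 'authorized_keys*' \
       -o -name '.*hist*' \
       -o -name 'pk-*.pem' \
       -o -name 'cert-*.pem' \) \
    -print 2>/dev/null | sort
}
```

Run it as root against / and review each file it prints. Anything you want to keep off the image should be removed or excluded from the bundle.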

Rebundling

Now we’re ready to bundle the actual EC2 image (AMI). To start, you need to copy your certificate and key to the instance ephemeral storage. Adjust the sample command to use the appropriate keypair file for authentication and the appropriate location of your certificate and private key files. If you are not running a modern Ubuntu image, then change remoteuser to “root”.

remotehost=<ec2-instance-hostname>
remoteuser=ubuntu

rsync \
  --rsh="ssh -i KEYPAIR.pem" \
  --rsync-path="sudo rsync" \
  PATHTOKEYS/{cert,pk}-*.pem \
  $remoteuser@$remotehost:/mnt/

Set up some environment variables for convenience in the following commands. A single S3 bucket can be used for multiple AMIs. The manifest prefix should be descriptive, especially if you plan to publish the AMI publicly, as it is the only piece of documentation many users will see when they look through AMI lists. At a minimum, I recommend including the Linux distribution (e.g., “ubuntu”), the architecture (e.g., “i386” or “32”), and the date (e.g., “20090621”), as well as some tag that indicates the special nature of the image (e.g., “desktop” or “lamp”).

bucket=<your-bucket-name>
prefix=<descriptive-image-title>
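
If you build images regularly, a small helper can keep the prefixes consistent. This is only a sketch: make_prefix is a hypothetical function, and the order of the fields is one possible convention rather than anything EC2 requires.

```shell
# Compose a manifest prefix from distribution, release, architecture,
# a purpose tag, and a date. The field order is a hypothetical convention.
make_prefix() {
  distro=$1 release=$2 arch=$3 tag=$4
  stamp=${5:-$(date +%Y%m%d)}   # default to today's date if none is given
  printf '%s-%s-%s-%s-%s\n' "$distro" "$release" "$arch" "$tag" "$stamp"
}
```

For example, prefix=$(make_prefix ubuntu 9.04-jaunty i386 lamp 20090621) sets the prefix to ubuntu-9.04-jaunty-i386-lamp-20090621.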

On the EC2 instance itself, you also set up some environment variables to help the bundle and upload commands. You can find these values in your EC2 account.

export AWS_USER_ID=<your-value>
export AWS_ACCESS_KEY_ID=<your-value>
export AWS_SECRET_ACCESS_KEY=<your-value>

if [ "$(uname -m)" = 'x86_64' ]; then
  arch=x86_64
else
  arch=i386
fi

Bundle the files on the current instance into a copy of the image under /mnt:

sudo -E ec2-bundle-vol \
  -r $arch \
  -d /mnt \
  -p $prefix \
  -u $AWS_USER_ID \
  -k /mnt/pk-*.pem \
  -c /mnt/cert-*.pem \
  -s 10240 \
  -e /mnt,/root/.ssh,/home/ubuntu/.ssh

Upload the bundle to a bucket on S3:

ec2-upload-bundle \
   -b $bucket \
   -m /mnt/$prefix.manifest.xml \
   -a $AWS_ACCESS_KEY_ID \
   -s $AWS_SECRET_ACCESS_KEY

Now that the AMI files have been uploaded to S3, you register the image as a new AMI. This is done back on your local system (with the API tools installed):

ec2-register \
  --name "$bucket/$prefix" \
  $bucket/$prefix.manifest.xml

The output of this command is the new AMI id which is used to run new instances of that image.
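
If you script the registration, you will probably want to capture the AMI id in a variable. The sketch below assumes that the id appears in the command output in the usual form of "ami-" followed by hexadecimal digits. The function name extract_ami_id is made up for this example.

```shell
# Print the first AMI id found on stdin.
# Assumes ids look like "ami-" followed by hexadecimal digits.
extract_ami_id() {
  grep -Eo 'ami-[0-9a-f]+' | head -n 1
}
```

You could then pipe the output of ec2-register through extract_ami_id and store the result, for example in a variable named ami, for use in later commands.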

It is important to use the same account access information for the ec2-bundle-vol and ec2-register commands even though they are run on different systems. If you don’t, you’ll get an error indicating you don’t have the rights to register the image.

Public Images

By default, the new EC2 image is private, which means it can only be seen and run by the user who created it. You can share access with another individual account or with the public.

To let another EC2 user run the image without giving access to the world:

ec2-modify-image-attribute -l -a <other-user-id> <ami-id>

To let all other EC2 users run instances of your image:

ec2-modify-image-attribute -l -a all <ami-id>

Cost

AWS will charge you standard S3 charges for the stored AMI files, which come out to $0.15 per GB per month. Note, however, that the bundling process uses sparse files and compression, so the final storage size is generally very small and your resulting cost may only be pennies per month.
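
To put a number on this, here is a rough estimate of the monthly storage charge for a bundle of a given size. It is a sketch under two assumptions: the $0.15 per GB per month rate quoted above, and 1 GB counted as 1024^3 bytes. The function name s3_monthly_cost is hypothetical, and the rate can be overridden with a second argument.

```shell
# Estimate the monthly S3 charge for a bundle of the given size in bytes.
# Assumes the $0.15 per GB-month rate quoted above and 1 GB = 1024^3 bytes.
s3_monthly_cost() {
  awk -v bytes="$1" -v rate="${2:-0.15}" \
    'BEGIN { printf "%.4f\n", bytes / (1024 * 1024 * 1024) * rate }'
}
```

For example, s3_monthly_cost 314572800 (a 300 MB bundle) prints 0.0439, or a bit over four cents per month.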

The AMI owner incurs no charge when users run the image in new instances. The users who run the AMI are responsible for the standard hourly instance charges.

Cleanup

Before removing any public image, please consider the impact this might have on people who depend on that image to run their business. Once you publish an AMI, there is no way to tell how many users are regularly creating instances of that AMI and expecting it to stay available. There is also no way to communicate with these users to let them know that the image is going away.

If you decide you want to remove an image anyway, here are the steps to take.

Deregister the AMI:

ec2-deregister ami-XXX

Delete the AMI bundle in S3:

ec2-delete-bundle \
  --access-key $AWS_ACCESS_KEY_ID \
  --secret-key $AWS_SECRET_ACCESS_KEY \
  --bucket $bucket \
  --prefix $prefix

[Update 2009-09-12: Security tweak for running under non-root.] [Update 2010-02-01: Update to use latest API/AMI tools and work for Ubuntu 9.10 Karmic.]

New Releases of Ubuntu Images for Amazon EC2 2009-06-23 (Karmic Koala Alpha released)

Ubuntu Karmic Koala Alpha is being developed and will be released as Ubuntu 9.10 in October. If you want to play around with Karmic Alpha on Amazon EC2, I have published new AMIs in the US and EU regions for 32- and 64-bit:

https://alestic.com

A Karmic desktop image for EC2 is also available if you wish to monitor progress in that area.

Warning! Karmic is an unstable alpha developer version and is not intended for use in anything resembling a production environment.

Please note that we are still defaulting to Amazon’s 2.6.21fc8 kernel which, though functional and stable, is getting older and older for each new release of Ubuntu. One effect of this is that AppArmor will not be enabled, though this should not affect the functionality of any software.

Enjoy!

New Releases of Ubuntu and Debian Images for Amazon EC2 2009-06-14 (Reliability and Security)

New updates have been released for the Ubuntu and Debian AMIs (EC2 images) published on:

https://alestic.com

The following improvements are included in this release:

  • Ubuntu 9.04 Jaunty now uses an Ubuntu mirror inside of EC2 hosted by RightScale. This dramatically improves the performance of updates and upgrades. Hardy and Intrepid were already using the mirrors inside EC2.

  • The Hardy, Intrepid, and Jaunty images have been enhanced to add failover for Ubuntu archive mirror hosts across availability zones (data centers). This change lets an Ubuntu instance perform package updates and upgrades even if one or two of the EC2 availability zones are completely unavailable.

  • The denyhosts package is now installed on desktop images for improved security. The Amazon abuse team has identified the Ubuntu desktop images as a source of compromised systems. The cause for this is believed to be insecure passwords set by users, since the desktop images have PasswordAuthentication enabled by default so that the NX client can connect. The denyhosts package blocks ssh attacks by adding remote systems to /etc/hosts.deny if they keep failing password logins.

    The published Ubuntu and Debian server images continue to have PasswordAuthentication turned off by default for improved security. If you choose to turn this on, I recommend installing a package like denyhosts and using software like the following to generate secure passwords:

      sudo apt-get install pwgen
      pwgen -s 10 1
    
  • The EC2 AMI tools have been upgraded to version 1.3-31780.

  • All software packages have been updated to versions current as of 2009-06-14.

Community support for Ubuntu on EC2 is available in this group:

http://groups.google.com/group/ec2ubuntu

Community support for Debian on EC2 is available in this group:

http://groups.google.com/group/ec2debian

The 32-bit Debian squeeze images and the 32-bit Debian etch desktop image have not been updated yet due to problems with initial package installation. Images will be released when these issues are resolved.

The following enhancements have been made to the ec2ubuntu-build-ami software which is used to build Ubuntu and Debian images for EC2.

  • New --kernel and --ramdisk options have been added to specify AKI and ARI. If you specify a different kernel, you should also specify kernel modules with --package or install them with the --script option.

  • Support has been removed for Ubuntu Edgy, Feisty, and Gutsy. These releases have reached their end of life. To improve the clarity of the code this software no longer supports building these images.

  • There has been a typo fix for $originaldir for folks who were using the --script option.

  • There has been a typo fix for /dev/ptmx though it apparently had no effect given how these images are built.

Thanks to Stephen Parkes and Paul Dowman for submitting patches.

Enjoy!

Automate EC2 Instance Setup with user-data Scripts

user-data Scripts

The Ubuntu and Debian EC2 images published on https://alestic.com allow you to send in a startup script using the EC2 user-data parameter when you run a new instance. This functionality is useful for automating the installation and configuration of software on EC2 instances.

The basic rule followed by the image is:

If the instance user-data starts with the two characters #! then the instance runs it as the root user on the first boot.

The “user-data script” is run late in the startup process, so you can assume that networking and other system services are functional.

If you start an EC2 instance with any user-data which does not start with #! the image simply ignores it and allows your own software to access and use the data as it sees fit.
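
The rule can be sketched in a few lines of shell. This is an illustration of the check only, not the code the images actually run. The function name is_user_data_script is hypothetical, and the file argument stands in for wherever the image has saved the user-data.

```shell
# Succeed if the file's first two bytes are "#!", which is the condition
# under which the user-data would be run as a script on first boot.
is_user_data_script() {
  [ "$(head -c 2 "$1")" = '#!' ]
}
```

A startup hook could then run the file as root when is_user_data_script succeeds and leave it alone otherwise.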

This same user-data startup script functionality has been copied in the Ubuntu images published by Canonical, and your existing user-data script should be portable across images with little change. Read a comparison of the Alestic and Canonical EC2 images.

Example

Here is a sample user-data script which sets up an Ubuntu LAMP server on a new EC2 instance:

#!/bin/bash
set -e -x
export DEBIAN_FRONTEND=noninteractive
apt-get update && apt-get upgrade -y
tasksel install lamp-server
echo "Please remember to set the MySQL root password!"

Save this to a file named, say, install-lamp and then pass it to a new EC2 instance, say, Ubuntu 9.04 Jaunty:

ec2-run-instances --key KEYPAIR --user-data-file install-lamp ami-bf5eb9d6

Please see https://alestic.com for the latest AMI ids for Ubuntu and Debian.

Note: This simplistic user-data script is for demonstration purposes only. Though it does set up a fully functional LAMP server which may be as good as some public LAMP AMIs, it does not take into account important design issues like database persistence. Read Running MySQL on Amazon EC2 with Elastic Block Store.

Debugging

Since you are passing code to the new EC2 instance, there is a very small chance that you may have made a mistake in writing the software. Well maybe not you, but somebody else out there might not be perfect, so I have to write this for them.

The stdout and stderr of your user-data script are written to /var/log/syslog, and you can review this for any success and failure messages. It will contain both things you echo directly in the script and output from programs you run.

Tip: If you add set -x at the top of a bash script, then it will output every command executed. If you add set -e to the script, then the user-data script will exit on the first command which does not succeed. These help you quickly identify where problems might have started.
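
Here is a small local demonstration of the effect of these two options, which you can try on any system with bash before sending a script to EC2. The script contents are made up for the demo: the second command fails on purpose.

```shell
# Write a tiny script that fails on its second command, run it, and
# report what happened. The commands are made up for the demo.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/bash
set -e -x
echo "step 1"
false
echo "step 2 (never reached)"
EOF

status=0
output=$(bash "$demo" 2>&1) || status=$?
rm -f "$demo"

echo "exit status: $status"
echo "$output"
```

The trace ends at the failing command (+ false), the exit status is non-zero, and the final echo never runs. In a user-data script, the same trace would end up in the log where you can see exactly which command stopped the setup.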

Limitations

Amazon EC2 limits the size of user-data to 16KB. If your startup instructions are larger than this limit, you can write a user-data script which downloads the full program(s) from somewhere else like S3 and runs them.
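
A sketch of both ideas follows: a check that a file fits within the limit, and a generator for a small user-data script that fetches and runs a larger one. The function names, the download location /root/setup-full, and the use of wget are assumptions for illustration. The URL you pass in is a placeholder for wherever you host the full script.

```shell
# Succeed if the file fits within the 16KB user-data limit (16384 bytes).
fits_user_data_limit() {
  size=$(( $(wc -c < "$1") ))
  [ "$size" -le 16384 ]
}

# Print a small user-data script that downloads and runs a larger one.
# The URL argument is a placeholder for wherever the full script is hosted.
make_bootstrap() {
  cat <<EOF
#!/bin/bash
set -e -x
wget -q -O /root/setup-full "$1"
chmod 700 /root/setup-full
/root/setup-full
EOF
}
```

You might save the output of make_bootstrap to a file, confirm it with fits_user_data_limit, and then pass that file with --user-data-file when starting the instance.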

Though a shell is a handy tool for writing scripts to install and configure software, the user-data script can be written in any language which supports the shebang (#!) mechanism for running programs. This includes bash, Perl, Python, Ruby, tcl, awk, sed, vim, make, or any other language you can find pre-installed on the image.

If you want to use another language, a user-data script written in bash could install the language, install the program, and then run it.

Security

Setting up a new EC2 instance often requires installing private information like EC2 keys and certificates (e.g., to make AWS API calls). You should be aware that if you pass secrets in the user-data parameter, the complete input is available to any user or process running on the instance.

There is no way to change the instance user-data after instance startup, so anybody who has access to the instance can simply request http://169.254.169.254/latest/user-data

Depending on what software you install on your instance, even Internet users may be able to exploit holes to get at your user-data. For example, if your web server lets users specify a URL to upload a file, they might be able to enter the above URL and then read the contents.

Alternatives

Though user-data scripts are my favorite method to set up EC2 instances, it’s not always the appropriate approach. Alternatives include:

  1. Manually ssh in to the instance and enter commands to install and configure software.

  2. Automatically ssh in to the instance with automated commands to install and configure software.

  3. Install and configure software using (1) or (2) and then rebundle the instance to create a new AMI. Use the new image when running instances.

  4. Build your own EC2 images from scratch.

The ssh options have the benefit of not putting any private information into the user-data accessible from the instance. They have the disadvantage of needing to monitor new instances waiting for the ssh server to accept connections; this complicates the startup process compared to user-data scripts.
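
The waiting step can be handled with a simple retry loop. This is a generic sketch rather than anything specific to EC2: retry is a hypothetical helper that runs a command until it succeeds or a maximum number of attempts is reached.

```shell
# Run a command until it succeeds, up to a maximum number of attempts,
# sleeping between tries. Returns 0 on success, 1 if every attempt failed.
retry() {
  attempts=$1 delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    "$@" && return 0
    n=$((n + 1))
    # Only sleep if another attempt is coming
    [ "$n" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}
```

For example, retry 30 10 ssh -i KEYPAIR.pem -o ConnectTimeout=5 -o BatchMode=yes ubuntu@HOSTNAME true would try for roughly five minutes before giving up, where KEYPAIR.pem and HOSTNAME are placeholders for your own values.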

The rebundled AMI approach and building your own AMI approach are useful when the installation and configuration of your required software take a very long time or can’t be done with automated processes (less common than you might think). A big drawback of creating your own AMIs is maintaining them, keeping up with security patches and other enhancements and fixes which might be applied by the base image maintainers.

Software

Note to AMI authors: If you wish to add to your EC2 images the same ability to run user-data scripts, feel free to include the following code and make it run on image startup:

http://ec2-run-user-data.notlong.com

Credits

Thanks to RightScale for the original idea of EC2 images with user-data startup hooks. RightScale has advanced startup plugins which include scripts, software packages, and attachments, all of which integrate with the RightScale service.

Thanks to Kim Scheibel and Jorge Oliveira who submitted code used in the original ec2-run-user-data script.

What do you use EC2 user-data for?

Official Ubuntu Images for Amazon EC2 from Canonical

Canonical has released official Ubuntu images for EC2 for Ubuntu 9.10 Karmic.

The primary technical benefit brought by Canonical's involvement in building official Ubuntu images is that custom kernels can be built for EC2 through a relationship with Amazon. This means that the Ubuntu images can now run on more modern Ubuntu kernels instead of on Amazon's older Fedora kernels.

Other differences are listed below:

                  Alestic.com Ubuntu images             Canonical Ubuntu images
Kernel            2.6.21                                Karmic: 2.6.31
Releases          9.04 Jaunty                           9.10 Karmic
                  8.10 Intrepid
                  8.04 Hardy (LTS)
                  7.10 Gutsy (obsolete)
                  7.04 Feisty (obsolete)
                  6.10 Edgy (obsolete)
                  6.06 Dapper (LTS)
Flavors           server                                server
                  desktop
ssh access        ssh to root                           ssh to "ubuntu" with sudo to root
Apt Sources       main                                  main
                  restricted                            restricted
                  universe                              universe
                  multiverse
                  Alestic PPA
Apt Mirror        Jaunty, Intrepid, Hardy:              US: us.ec2.archive.ubuntu.com
                  ec2-us-east-mirror.rightscale.com     EU: eu.ec2.archive.ubuntu.com
                  (load balanced with failover)
                  Others: us.archive.ubuntu.com
Default runlevel  runlevel 4                            runlevel 2
Tools             Amazon EC2 AMI tools installed        euca2ools installed
                  runurl installed                      Amazon tools available (multiverse)
                                                        runurl available through Alestic PPA

Items listed are likely to change as images are enhanced. This table may or may not be updated to match. Please leave comments if you notice or question other differences.

Note: There are some older (2009-04) Canonical AMIs floating around for Hardy and Intrepid. These have not been maintained and are not recommended at this point.

Updated 2009-06-15: Alestic.com Jaunty is using an Ubuntu mirror inside EC2. Alestic.com images using load balanced mirror with failover between EC2 availability zones.

Updated 2009-06-25: Alestic.com published Karmic (Alpha) but later withdrew.

Updated 2009-10-29: Canonical released Karmic. None of the images currently have RightScale support built in, but RightScale has their own Ubuntu AMIs.

New releases of Ubuntu AMIs for Amazon EC2 2009-04-23 (Jaunty released)

As you may have heard, Ubuntu 9.04 Jaunty has been officially released by Ubuntu today, right on schedule:

http://ubuntu.com

Matching updates have been released for the Ubuntu 9.04 Jaunty AMIs listed on:

https://alestic.com

Please note that we are still defaulting to Amazon’s 2.6.21fc8 kernel which is getting older and older for each new release of Ubuntu. Please do let the group know if you find incompatibilities with Ubuntu Jaunty other than the known problem that AppArmor is not enabled.

You might be able to run the 9.04 Jaunty image with the official Ubuntu 2.6.27 kernel (for Intrepid) which is currently in release candidate state from Canonical.

For what it’s worth, I still run Ubuntu 8.04 LTS Hardy on Amazon EC2 personally and for my company.

New releases of Ubuntu AMIs for Amazon EC2 2009-04-18 (XFS fixes)

New updates have been released for all* of the Ubuntu and Debian AMIs listed on:

https://alestic.com

The primary enhancements in this release are:

  • The images which were experiencing problems with XFS and the Amazon 2.6.21fc8 kernel have been fixed by installing an XFS kernel module which matches Amazon’s kernel. This includes Ubuntu Intrepid, Ubuntu Jaunty, Debian Lenny, and Debian Squeeze.

  • The Ubuntu 9.04 Jaunty image is using release candidate software. The official Jaunty release is expected April 23.

  • At the request of the Amazon security folks, ssh PasswordAuthentication has been disabled by default on the server images. Even though the base images have passwords disabled on the root account, some folks may be creating accounts with poor passwords susceptible to attacks. The desktop images require password authentication for NX (as far as I know) so please use secure passwords.

  • The desktop images have been upgraded to a recent version of NX Free Edition software.

  • This is the last published image for Ubuntu 7.10 Gutsy. This version has reached its end of life on April 18 and should not be used any more unless you really need to test something on Gutsy and you aren’t going to leave it running long (no security patches available).

All of the AMIs are available in both the US and European regions.

Notes:

  • The Ubuntu 6.10 Edgy, 7.04 Feisty, and 7.10 Gutsy AMIs are obsolete and unsupported. Running these images introduces a security risk as no security patches are being produced any more by Ubuntu.

New releases of Ubuntu Jaunty AMIs for Amazon EC2 2009-03-29

New updates have been released for the Ubuntu Jaunty AMIs on

https://alestic.com

Jaunty recently moved from “alpha” to “beta” in preparation for its official release as Ubuntu 9.04 next month.

For details on what is new in Jaunty, see:

http://www.ubuntu.com/testing/jaunty/beta

This is beta software and is not suitable for production use.

All of the AMIs are available in both the US and European regions.

New releases of Ubuntu AMIs for Amazon EC2 2009-02-16 (EC2 mirrors)

New updates have been released for all* of the Ubuntu and Debian AMIs listed on:

https://alestic.com

The primary enhancements in this release are:

  • Ubuntu Hardy and Intrepid have new apt sources.list pointing to the local EC2 mirrors provided by RightScale. Please let me know if you have any problems with updates.

  • Debian “lenny” has been released as the new “stable”. Debian “squeeze” is the new “testing”, so the latest Debian mapping is as follows:

    squeeze - “testing”
    lenny - “stable”
    etch - “oldstable”

As always, “sid” is “unstable” and I can’t imagine why you would want to run this on EC2 unless you’re a Debian developer, in which case you should probably build your own AMIs.

When I run “squeeze” it thinks that it is “lenny” (lsb_release -a). I assume that this is because it has just been branched from lenny but it’s possible that I didn’t build it correctly. Let me know if you have further information on this.

Notes:

  • The Ubuntu 6.10 Edgy and 7.04 Feisty AMIs are obsolete, unsupported, and are not updated.

  • The AMIs are in the process of being copied to eu-west-1 (Europe). Documentation will be updated soon.

New releases of Ubuntu AMIs for Amazon EC2 2008-12-22

New updates have been released for all* of the Ubuntu and Debian AMIs listed on:

https://alestic.com

The primary enhancements in this release are:

  • The EC2 AMI tools have been upgraded to 1.3-30748. This adds support for EC2 regions including the new eu-west-1 European region.

  • AMIs have been created for Ubuntu Jaunty Jackalope alpha (planned for release 2009-04). This is alpha software and is not suitable for production use.

All of the AMIs are available in both the US and European regions.

  • The Ubuntu 6.10 Edgy and 7.04 Feisty AMIs are obsolete and unsupported.

Ubuntu AMIs available in Europe (eu-west-1)

The Ubuntu and Debian images listed on https://alestic.com are now available in both the US (us-east-1) and Europe (eu-west-1) EC2 regions.

Click on the “Europe” tab at the top of the table to see the new AMI ids for Europe.

Only the most recent images have been copied over to the Europe region. Let me know if you have specific older images which you would like to run in Europe.

New releases of Ubuntu AMIs for Amazon EC2 2008-11-30

New updates have been released for all* of the Ubuntu and Debian AMIs listed on:

https://alestic.com

The primary enhancements in this release are:

  • The kernel modules for Amazon’s old 2.6.16 kernel are no longer included in the AMI. The images have been running for a while on Amazon’s 2.6.21 kernel by default and kernel modules are included for this version. If you need the 2.6.16 kernel modules they are available for download or you can build a new AMI with the kernel modules using the ec2ubuntu-build-ami script on the above site.

  • The EC2 AMI tools have been upgraded to 1.3-26357. This has reduced the number of required patches down to one as Amazon continues to improve support for Ubuntu and Debian in these tools.

  • A new --arch option has been added based on a patch submitted by Don Spaulding II. This may help in building images for both 32- and 64-bit on the same system.

  • Note that the Ubuntu 6.10 Edgy release is marked as obsolete and is no longer being maintained or updated. The Ubuntu 7.04 Feisty AMI was able to be updated, but it is also past its end of life.

There have been no volunteers for building an automated AMI testing framework so not all of these 24 images have been tested. Please report quickly any problems you encounter and hold off a bit on upgrading to these in production environments.

Ubuntu 8.10 Intrepid Ibex AMIs released for Amazon EC2 2008-10-30

The big news in the Ubuntu community today is that Ubuntu 8.10 Intrepid Ibex has been released right on schedule:

http://ubuntu.com

If you’d like to take it for a spin on EC2, brand new AMIs are available for 32-bit and 64-bit instances in both base install and desktop flavors.

You can find the latest AMIs for all Ubuntu releases on the following site with links to the public AMI documents on Amazon:

https://alestic.com

Ubuntu 8.10 is not an LTS (long-term support) release. It will be supported for 18 months. If you would like a bit more longevity, Ubuntu 8.04 LTS (Hardy) still has 4.5 years of life left in the server edition and is available from the same site above.

Final release of Ubuntu 7.04 Feisty AMI for Amazon EC2 2008-10-20 (Feisty is now obsolete)

A new and final update has been released today for the Ubuntu 7.04 Feisty AMI on

https://alestic.com

Since Ubuntu 7.04 has reached its end-of-life (see forwarded attachment) there will be no further updates to the Ubuntu 7.04 Feisty base install AMI.

Anybody who might have been using the Feisty AMI series should have moved on by now to either the Ubuntu 7.10 Gutsy or 8.04 Hardy AMIs.

Please note: Ubuntu 7.10 Gutsy will reach its end-of-life in 6 months during which time everybody using that release should be upgrading to 8.04 Hardy.

Ubuntu 8.04 Hardy is LTS. The LTS versions of Ubuntu receive long-term support: 3 years for desktop versions and 5 years for server versions.

New releases of Ubuntu AMIs for Amazon EC2 2008-09-24 (emergency fix for /etc/fstab and /mnt)

New updates have been released for all of the Ubuntu and Debian AMIs listed on:

https://alestic.com

The primary enhancement in this release is an emergency correction for a defect in the previous AMIs, to wit: The /etc/fstab file was missing and /mnt was not mounted as ephemeral storage.

Thanks to Garrett Smith and Ken Lim for identifying and reporting this problem. I had apparently missed the notice that the latest ec2-bundle-vol from Amazon requires --generate-fstab if you want to have an /etc/fstab when creating an AMI from scratch :-\

The defective AMIs are hereby deprecated and may eventually be withdrawn.