I will be attending the Ubuntu Developer Summit (UDS) next week in Oakland, CA.  This event brings people from around the world together in one place every six months to discuss and plan for the next release of Ubuntu.  The May 2012 UDS is for Ubuntu-Q which will eventually be named and become Ubuntu 12.10 when it is released in October (2012-10).

I’ve attended two UDS in person prior to this, one held at Google (Mountain View) for Ubuntu Jaunty (9.04) and one in Dallas for Ubuntu Lucid (10.04). UDS wanders around the world to mix it up and get input from a wide variety of contributors. I’m not a fan of flying long distances, so I tend to wait until UDS comes to within a couple hours of my home in Los Angeles.

My primary involvement at UDS is to contribute my perspectives to the plans for Ubuntu as it relates to running on Amazon EC2 and interacting with other features of AWS, though I also have interest in general Ubuntu server functionality.  I’ve been running Ubuntu on servers since 2005, and Ubuntu servers on EC2 since 2007.

I am grateful to Canonical for sponsoring my trip to and stay at UDS as they do for many community members.  I continue to be impressed by how Ubuntu is developed in such an open fashion with Canonical’s support.

All community members interested in learning about how Ubuntu is developed and/or interested in helping give input to the future of Ubuntu are welcome to participate in UDS. You can either attend in person as I will be, or you can participate online.  Be sure to register (free) at the UDS site.

Taking a full week off for UDS is a little much for me, so I’ll be attending three full days (Wed-Fri). Will I see you there or online? What feedback and suggestions would you have for running Ubuntu on EC2?

The ssh protocol uses two different keys to keep you secure:

  1. The user ssh key is the one we normally think of. This authenticates us to the remote host, proving that we are who we say we are and allowing us to log in.

  2. The ssh host key gets less attention, but is also important. This authenticates the remote host to our local computer, proving that the encrypted ssh session ends at the server we intended to reach and not at somebody listening in between.

Every time you see a prompt like the following, ssh is checking the host key and asking you to make sure that your session is going to be encrypted securely.

The authenticity of host 'ec2-...' can't be established.
ECDSA key fingerprint is ca:79:72:ea:23:94:5e:f5:f0:b8:c0:5a:17:8c:6f:a8.
Are you sure you want to continue connecting (yes/no)?

If you answer “yes” without verifying that the remote ssh host key fingerprint is the same, then you are basically saying:

I don’t need this ssh session encrypted. It’s fine for any man-in-the-middle to intercept the communication.

Ouch! (But a lot of people do this.)

Note: If you have a line like the following in your ssh config file, then you are automatically answering “yes” to this prompt for every ssh connection.

# DON'T DO THIS!
StrictHostKeyChecking false
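
To find out whether one of your own ssh config files contains a setting like this, a small check along the following lines may help. This is only a minimal sketch: the function name check_strict_host_key is made up for illustration, it looks only at the single file you pass in, and it treats the values no, false, and off as the ones that disable checking.

```shell
# Hypothetical helper: report whether an ssh config file disables host key
# checking. Prints "unsafe" if StrictHostKeyChecking is set to no, false,
# or off anywhere in the file, and "ok" otherwise (including when the file
# cannot be read).
check_strict_host_key() {
  config=$1
  if grep -Eiq '^[[:space:]]*StrictHostKeyChecking[[:space:]=]+(no|false|off)[[:space:]]*$' "$config" 2>/dev/null
  then
    echo "unsafe"
  else
    echo "ok"
  fi
}

# Usage (assumes the default per-user config location):
#   check_strict_host_key "$HOME/.ssh/config"
```

The check is deliberately simple. It does not tell you which Host block the setting belongs to, so treat an "unsafe" result as a prompt to go and read the file.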

Care about security

Since you do care about security and privacy, you want to verify that you are talking to the right server using encryption and that no man-in-the-middle can intercept your session.

There are a couple approaches you can take to check the fingerprint for a new Amazon EC2 instance. The first is to wait for the console output to be available from the instance, retrieve it, and verify that the ssh host key fingerprint in the console output is the same as the one which is being presented to you in the prompt.

Scott Moser has written a blog post describing how to verify ssh keys on EC2 instances. It’s worth reading so that you understand the principles and the official way to do this.
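
As a rough illustration of the console output approach, the comparison can be scripted. This sketch assumes that the console output contains the fingerprints between BEGIN and END marker lines, as Ubuntu images print them at boot. The function name console_has_fingerprint is hypothetical, so adjust the markers if your AMI prints something different.

```shell
# Hypothetical helper: read EC2 console output on stdin and succeed only if
# the given fingerprint appears inside the host key fingerprint block.
# Assumes the BEGIN/END marker lines printed by Ubuntu images at boot.
console_has_fingerprint() {
  fingerprint=$1
  sed -n '/BEGIN SSH HOST KEY FINGERPRINTS/,/END SSH HOST KEY FINGERPRINTS/p' |
    grep -F -q -- "$fingerprint"
}

# Usage, with the fingerprint copied from the ssh prompt into a variable:
#   ec2-get-console-output "$instanceid" |
#     console_has_fingerprint "$fingerprint_from_prompt" &&
#     echo "fingerprint matches"
```

Only answer "yes" at the ssh prompt if the fingerprint matches. If the console output is not available yet, the function simply fails and you can try again a little later.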

The rest of this article is going to present a different approach that lets you in to your new instance quickly and securely.

Passing ssh host key to new EC2 instance

Instead of letting the new EC2 instance generate its own ssh host key and waiting for it to communicate the fingerprint through the EC2 console output, we can generate the new ssh host key on our local system and pass it to the new instance.

Using this approach, we already know the public side of the ssh key so we don’t have to wait for it to become available through the console (which can take minutes).

Generate a new ssh host key for the new EC2 instance.

tmpdir=$(mktemp -d /tmp/ssh-host-key.XXXXXX)
keyfile=$tmpdir/ssh_host_ecdsa_key
ssh-keygen -q -t ecdsa -N "" -C "" -f $keyfile

Create the user-data script that will set the ssh host key.

userdatafile=$tmpdir/set-ssh-host-key.user-data
cat <<EOF >$userdatafile
#!/bin/bash -xeu
cat <<EOKEY >/etc/ssh/ssh_host_ecdsa_key
$(cat $keyfile)
EOKEY
cat <<EOKEY >/etc/ssh/ssh_host_ecdsa_key.pub
$(cat $keyfile.pub)
EOKEY
EOF

Run an EC2 instance, say Ubuntu 11.10 Oneiric, passing in the user-data script. Make a note of the new instance id.

ec2-run-instances --key $USER --user-data-file $userdatafile ami-4dad7424
instanceid=i-...

Wait for the instance to get a public DNS name and make a note of it.

ec2-describe-instances $instanceid
host=ec2-...compute-1.amazonaws.com
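
If you would rather not copy the instance id and host name by hand, they can probably be pulled out of the command output. This is a sketch that assumes the tab-delimited INSTANCE line format of the EC2 command line tools, with the instance id in field 2 and the public DNS name in field 4. The helper names are made up for this example.

```shell
# Hypothetical helpers: pull fields out of the tab-delimited INSTANCE line
# printed by ec2-run-instances and ec2-describe-instances.
# Assumes field 2 is the instance id and field 4 is the public DNS name.
instance_id_from() {
  grep '^INSTANCE' | cut -f2
}
public_dns_from() {
  grep '^INSTANCE' | cut -f4
}

# Usage:
#   instanceid=$(ec2-run-instances --key $USER \
#     --user-data-file $userdatafile ami-4dad7424 | instance_id_from)
#   host=$(ec2-describe-instances $instanceid | public_dns_from)
```

The public DNS field stays empty until the instance has been assigned a name, so check that $host is not empty before using it.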

Add new public ssh host key to our local ssh known_hosts after removing any leftover key (e.g., from previous EC2 instance at same IP address).

knownhosts=$HOME/.ssh/known_hosts
ssh-keygen -R $host -f $knownhosts
ssh-keygen -R $(dig +short $host) -f $knownhosts
(
  echo -n "$host "; cat $keyfile.pub
  echo -n "$(dig +short $host) "; cat $keyfile.pub
) >> $knownhosts

When the instance starts running and the user-data script has executed, you can ssh in to the server without being prompted to verify the fingerprint:

ssh ubuntu@$host

Don’t forget to clean up and to terminate your test instance.

rm -rf $tmpdir
ec2-terminate-instances $instanceid

Caveat

There is one big drawback in the above sample implementation of this approach. We have placed secret information (the private ssh host key) into the EC2 user-data, which I generally recommend against.

Any user who can log in to the instance or who can cause the instance to request a URL and get the output, can retrieve the user-data. You might think this is unlikely to happen, but I’d rather avoid or minimize unnecessary risk.

In a production implementation of this approach, I would take steps like the following:

  1. Upload the new ssh host key to S3 in a private object.

  2. Generate an authenticated URL to the S3 object and have that URL expire in, say, 10 minutes.

  3. In the user-data script, download the ssh host key with the authenticated, expiring S3 URL.

Now, there is a short window of exposure and you don’t have to worry about protecting the user-data after the URL has expired.
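
The third step might look something like the sketch below. It assumes you have already uploaded the private key and generated the authenticated, expiring URL with whatever S3 tool you prefer. The function name make_userdata_for_url and the use of wget on the instance are illustrative choices, not a tested production setup.

```shell
# Hypothetical helper: print a user-data script that downloads the private
# ssh host key from an expiring, authenticated URL instead of embedding the
# key itself. The public key is derived from the private key on the instance.
make_userdata_for_url() {
  keyurl=$1
  cat <<EOF
#!/bin/bash -xeu
umask 077
wget -q -O /etc/ssh/ssh_host_ecdsa_key '$keyurl'
ssh-keygen -y -f /etc/ssh/ssh_host_ecdsa_key >/etc/ssh/ssh_host_ecdsa_key.pub
chmod 644 /etc/ssh/ssh_host_ecdsa_key.pub
EOF
}

# Usage (keyurl holds the authenticated, expiring S3 URL):
#   make_userdata_for_url "$keyurl" >$userdatafile
```

The user-data now contains only the URL, which stops working once it expires.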

Amazon Web Services is such a huge, complex service with so many products and features that sometimes very simple but powerful features fall through the cracks when you’re reading the extensive documentation.

One of these features, which has been around for a very long time, is the ability to use AWS to seed (serve) downloadable files using the BitTorrent™ protocol. You don’t need to run EC2 instances and set up software. In fact, you don’t need to do anything except upload your files to S3 and make them publicly available.

Any file available for normal HTTP download in S3 is also available for download through a torrent. All you need to do is append the string ?torrent to the end of the URL and Amazon S3 takes care of the rest.

Steps

Let’s walk through uploading a file to S3 and accessing it with a torrent client using Ubuntu as our local system. This approach uses s3cmd to upload the file to S3, but any other S3 software can get the job done, too.

  1. Install the useful s3cmd tool and set up a configuration file for it. This is a one time step:

    sudo apt-get install s3cmd
    s3cmd --configure
    

    The configure phase will prompt for your AWS access key id and AWS secret access key. These are stored in $HOME/.s3cfg which you should protect. You can press [Enter] for the encryption password and GPG program. I prefer “Yes” for using the HTTPS protocol, especially if I am using s3cmd from outside of EC2.

  2. Create an S3 bucket and upload the file with public access:

    bucket=YOURBUCKETNAME
    filename=FILETOUPLOAD
    basename=$(basename $filename)
    s3cmd mb s3://$bucket
    s3cmd put --acl-public $filename s3://$bucket/$basename
    
  3. Display the URLs which can be used to access the file through normal web download and through a torrent:

    cat <<EOM
    web:     http://$bucket.s3.amazonaws.com/$basename
    torrent: http://$bucket.s3.amazonaws.com/$basename?torrent
    EOM
    

Notes

  1. The above process makes your file publicly available to anybody in the world. Don’t use this for anything you wish to keep private.

  2. You will pay standard S3 network charges for all downloads from S3 including the initial torrent seeding. You do not pay for network transfers between torrent peers once folks are serving the file chunks to each other.

  3. You cannot throttle the rate or frequency of downloads from S3. You can turn off access to prevent further downloads, but monitoring accesses and usage is not entirely real time.

  4. If your file is not popular enough for other torrent peers to be actively serving it, then every person who downloads it will transfer the entire content from S3’s torrent servers.

  5. There is no way to force people to use the Torrent URL. If they know what they are doing, they can easily remove “?torrent” and download the entire file direct from S3, perhaps resulting in a higher cost to you.

CloudCamp

There are a number of CloudCamp events coming up in cities around the world. These are free events, organized around the various concepts, technologies, and services that fall under the “cloud” term.

There’s always some discussion about my favorite topic, Amazon AWS and EC2, but there are sure to be experts and beginners for every other cloud-related flavor as well. You can attend presentations, join in discussions, or hang out in the hallway and make connections with local folks who are interested in the same things you are.

CloudCamp follows somewhat of an unconference format, though the couple I’ve been to in LA tended to have more pre-planned elements than, say, a BarCamp. Glancing through the schedules, it looks like each city also has its own twist and personality for CloudCamp.

Here are two upcoming CloudCamps that are of particular interest to me:

  • CloudCamp Los Angeles - I plan to attend this short event to hang out with old friends and make some new acquaintances. (Do we really need 12 organizers for 3 hours?)

  • CloudCamp Rochester - A full day event with some big clouderati names in attendance including Jeff Barr, Mitch Garnaat, David Kavanagh, Chris Moyer.

If you have a business related to “cloud” (and who doesn’t these days) why not pitch in with a little support as a sponsor? These are small events, so it doesn’t take much to help out. Plus you get your brand in front of your target market.

I went ahead and tossed in a bit personally to sponsor CloudCamp Rochester as they make it easy to contribute and know what you’re getting in return. I couldn’t find any sponsorship information for CloudCamp LA, but am still looking if anybody knows how that works.

A few hours ago, Amazon AWS announced that all EC2 instance types can now run 64-bit AMIs.

Though t1.micro, m1.small, and c1.medium will continue to also support 32-bit AMIs, it is my opinion that there is virtually no reason to use 32-bit instances on EC2 any more.

This is fantastic news!

Sticking with 64-bit instances everywhere all the time gives you the most flexibility to switch the instance type of your running instances, reduces the choices and work necessary when building your own AMIs, and just makes life simpler.

In fact, to celebrate this occasion, I have dropped my listing of 32-bit Ubuntu AMI ids at the top of Alestic.com. The new simplified AMI id table listing only 64-bit Ubuntu AMIs now fits into the right sidebar.

Simply pick an EC2 region in the pulldown in the right sidebar, and you’ll get a clean listing of current available Ubuntu AMIs. Click on the orange arrow to the right of the AMI id to launch an instance of Ubuntu in your AWS console.

Note that reserved instances only specify an instance type not an architecture, so if you have already purchased reserved instances for m1.small or c1.medium, you can switch from 32-bit to 64-bit and still have your new instance be covered by the reserved instance pricing.

Do you have reasons why you might still need to run 32-bit instances on EC2? How much work is it going to take you to convert your existing instances and AMIs from 32-bit to 64-bit?

The source for ec2-consistent-snapshot has historically been available here:

ec2-consistent-snapshot on Launchpad.net using Bazaar

For your convenience, it is now also available here:

ec2-consistent-snapshot on GitHub using Git

You are welcome to fork ec2-consistent-snapshot under the liberal terms of the Apache License, Version 2.0.

I welcome patch submissions, especially if:

  1. The patch accomplishes a single enhancement or bug fix, changing as little as possible to accomplish the goals, while still performing appropriate error checks.

  2. The patch includes relevant updates to the documentation.

  3. The patch does not add functionality outside of the narrow goal of ec2-consistent-snapshot (initiate EBS snapshots with consistent filesystems and application data).

  4. The patch is created against the latest version of the source.

I also recommend submitting a bug/feature in Launchpad.net to track adding the patch to ec2-consistent-snapshot:

https://bugs.launchpad.net/ec2-consistent-snapshot/+filebug

Not all patches will be accepted, but you are encouraged to maintain forks on GitHub so that others can include your patches if they find them useful.

New Release

Two user-contributed patches have been incorporated (with minor adjustments) into ec2-consistent-snapshot with the latest release of version 0.43:

  1. Ability to freeze multiple file systems by specifying --freeze-filesystem multiple times (thanks to Bobb Crosbie).

  2. Ability to specify commands to run just before freezing the file system(s) and just after thawing the file systems (thanks to Craig Tracey).
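
To illustrate the first item, here is a small sketch that builds a command line with one --freeze-filesystem option per mount point. The wrapper name snapshot_command, the volume id, and the mount points are placeholders. It assumes the volume id is passed as the final argument, and it only prints the command so you can review it first.

```shell
# Hypothetical wrapper: build an ec2-consistent-snapshot command line that
# freezes every mount point given after the volume id. It only prints the
# command; nothing is executed. Mount points containing spaces are not
# handled.
snapshot_command() {
  volume_id=$1
  shift
  cmd="ec2-consistent-snapshot"
  for mountpoint in "$@"; do
    cmd="$cmd --freeze-filesystem $mountpoint"
  done
  echo "$cmd $volume_id"
}

# Usage (volume id and mount points are placeholders):
#   snapshot_command vol-00000000 /data /logs
```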

The new release of ec2-consistent-snapshot is available in the Alestic PPA for easy installation and upgrading in Ubuntu. You can install it on Ubuntu using:

sudo add-apt-repository ppa:alestic &&
sudo apt-get update &&
sudo apt-get install -y ec2-consistent-snapshot

ec2-consistent-snapshot is usable on other Linux distros as long as you install dependencies like xfsprogs, perl, libnet-amazon-ec2-perl, libfile-slurp-perl, libwww-perl, libdigest-hmac-perl, libparams-validate-perl, libxml-simple-perl, libmoose-perl, libcrypt-ssleay-perl.

EBS boot vs. instance-store

If you are just getting started with Amazon EC2, then use EBS boot instances and stop reading this article. Forget that you ever heard about instance-store and accept my apology that I just mentioned it. Once you are completely comfortable with using EBS boot instances on EC2, you may (or may not) want to come back here and read why you made a good decision.

EC2 experts may find that there are specific cases, few and far between, where instance-store might make sense, but they don’t attempt to use instance-store without understanding and accounting for all the serious drawbacks and dangers that go with making this choice. For example, experts using instance-store don’t mind losing all of the data on the instance as they have designed the system so that the data is stored elsewhere and so that a new instance can easily and automatically be rebuilt from scratch.

One of the challenges for beginners is that many of the benefits of EBS boot don’t necessarily seem like something you’ll need to use right away. Then they get down the road and into situations where they realize that they would have been much better off if they had gone with EBS boot in the first place and may find it takes some work to make the transition.

Big benefits of EBS boot instances

Here are some of the reasons I use and recommend EBS boot instances. None of these benefits are available with instance-store, so even a single one of these can be an overriding factor for choosing EBS boot.

  1. EBS boot instances store the root file system on an EBS volume which is persistent storage. If the instance hardware fails, the EBS volume remains accessible. It is also possible to request the EBS volume to persist beyond the termination of an EC2 instance. When an instance-store instance fails or is shut down, all of the data on the root disk is lost forever and can never be retrieved. Read more about protecting EC2 instances from accidental termination and loss of data.

  2. EBS boot instances can be stopped and restarted at will. The “stopped” state suspends the hourly instance billing charges, preserving the information on the EBS volumes. The stopped instance can be started again a few minutes later or months later, restoring state just as if the instance was rebooted. An instance-store instance can only be left running with full charges or terminated, which causes you to lose all data on the disk. Read more about how stop/start of an EBS boot instance is similar to and different from a simple reboot.

  3. When something goes wrong with an EBS boot instance so that it can’t be booted, or you lose access through ssh (e.g., lost keys, bad ssh config change), you can still view and modify or fix the EBS root volume by attaching it to another running instance. With an instance-store instance, everything on the root disk is lost and cannot be recovered. Read more about fixing files on the root EBS volume of an EC2 instance.

  4. EBS boot instances can be run with a root EBS volume size from the default specified by the AMI (often 8GB) up to 1,000GB (1TB). The instance-store AMIs have a max root disk size of 10GB with no way to increase it. Read more about increasing the root disk size of an EBS boot AMI.

  5. It is possible to grow the size of the root disk after an EBS boot instance has been started. An instance-store instance has no way to grow the root disk size. Read more about resizing the root disk on a running EBS boot EC2 instance.

  6. It is possible to change the instance type for a running EBS boot instance without needing to start a new instance. For example, you can scale up from an m1.large to an m1.xlarge and then a few hours or days later, scale back down. An instance-store instance is stuck with the type on which it was originally run. Read more about changing the instance type of a running EBS boot instance.

  7. You can easily replace the hardware for your instance if you are running an EBS boot instance. This is extremely valuable if your instance is having problems that you suspect may be related to the underlying hardware. An instance-store instance is bound to the hardware it started on and cannot be moved. Read more about using stop/start to replace EC2 hardware.

  8. EBS boot AMIs are simpler and faster to create than instance-store AMIs. In fact, you can trigger the creation of an EBS boot AMI from a running instance in one command, API call, or console click. You need to copy sensitive AWS credentials to the instance when creating an instance-store AMI.

  9. Amazon has stated that EBS boot AMIs boot up faster than S3 based AMIs (instance-store). In my recent experience, the difference is negligible, especially when testing popular AMIs that are likely to be cached, but we might as well chalk this up as another benefit.

  10. The t1.micro instance type released recently by Amazon only supports EBS boot instances. This move is like a sign from Amazon that you really don’t want to run the legacy instance-store instances.

  11. Some versions of Windows (Server 2008) only run on EBS boot instances. I believe this may be related to the disk size limitations of instance-store, but I don’t use Windows, so am not an expert in that area.
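
As an example of benefit 6, changing the instance type comes down to a stop, a modify, and a start. The sketch below only prints the commands, using the EC2 command line tools as I understand them. The function name resize_commands and the instance id are placeholders.

```shell
# Hypothetical helper: print the commands that stop an EBS boot instance,
# change its instance type, and start it again. Nothing is executed here.
resize_commands() {
  instance_id=$1
  new_type=$2
  echo "ec2-stop-instances $instance_id"
  echo "ec2-modify-instance-attribute $instance_id --instance-type $new_type"
  echo "ec2-start-instances $instance_id"
}

# Usage (instance id is a placeholder):
#   resize_commands i-00000000 m1.xlarge
```

Run the printed commands one at a time, waiting for the instance to reach the “stopped” state before modifying it.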

Possible benefits of instance-store instances

EBS volumes and EBS boot instances aren’t perfect. Running an instance-store instance might be preferable in some very specific cases where you don’t care so much about losing the data you are storing on the root disk.

I’m going to list some of the possible benefits of instance-store, but each of these may not be as beneficial as they appear at first glance and you must remember that running instance-store loses all of the above benefits.

  • There is a negligible cost savings with instance-store, as there is no charge for an EBS volume or its I/O transactions. Note, however, that the cost for an 8GB root EBS volume is only around 80 cents per month (depending on the region) and you get about a billion I/O requests for a dollar, billed by the penny increment. The cost savings to run instance-store is generally not worth the increased risk of losing your valuable data, especially when you compare it as a percentage of the cost of running the instance hours in the first place (tens or hundreds of dollars per month).

  • There may be some applications that perform somewhat better when running against ephemeral or instance-store disks than against EBS volumes. Some high end users have also reported inconsistency in the performance they see from EBS volumes at high I/O transaction rates for long periods of time. EBS volumes perform better for some applications and are generally good enough for most applications. If you don’t care about losing your application data, and you have tested to see that instance-store performs better than EBS volumes, then you could either run instance-store instances, or you could run EBS boot instances and drop your data on the ephemeral disks available to all instances above t1.micro. It is also becoming increasingly popular to run multiple EBS volumes in a mdadm RAID-0 or LVM striping configuration to improve performance and smooth out some experiences of performance volatility.

  • There are a couple historical failure modes that have happened with EBS volumes that could not happen for instance-store disks. You may hear people say this is a reason not to use EBS volumes, but EBS volumes are still far more reliable, persistent, and protected against failure than instance-store. The fact that it is possible for EBS to fail is not a reason to use the less persistent instance-store. It is a reason to create regular snapshots of your EBS volumes to improve reliability and act as backups.

  • If you are creating a “paid AMI” you can only do it as an instance-store AMI, not EBS boot. Only two of the thousands of people reading this article are creating paid AMIs and they already know this fact.
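
The storage cost mentioned in the first bullet above is easy to estimate for other root volume sizes. This is a back-of-the-envelope sketch only: the default rate of 0.10 dollars per GB-month is an assumption derived from the "80 cents for 8GB" figure, and the function name ebs_monthly_cost is made up. Check current pricing for your region.

```shell
# Hypothetical helper: estimate the monthly storage cost in dollars of an
# EBS root volume. The default rate (0.10 per GB-month) is an assumption
# derived from the "80 cents for 8GB" figure; pass your own rate as the
# second argument.
ebs_monthly_cost() {
  size_gb=$1
  rate=${2:-0.10}
  awk -v gb="$size_gb" -v rate="$rate" 'BEGIN { printf "%.2f\n", gb * rate }'
}

# Usage:
#   ebs_monthly_cost 8        # default rate
#   ebs_monthly_cost 100 0.10 # explicit rate
```

This covers storage only and ignores I/O request charges.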

Why did I write this article?

I regularly provide consulting services in public communities like serverfault, the ##aws IRC channel on Freenode, the ec2ubuntu Google group, and Amazon’s EC2 forum. It’s a rare week that passes where I am not telling somebody that they would not have the problem they’re having if they had been using EBS boot instances.

The saddest response I can give to a plea for help is that the customer’s valuable data has been lost because they were running instance-store instead of EBS boot and they did not have real-time streaming backups.

Historical background

When Amazon launched EC2 in 2006, only instance-store AMIs were available (though they weren’t called instance-store at the time as they were the only kind). For years, Amazon customers learned to work with the server limitations of risking the loss of all data at any point in time.

In August of 2008, Amazon introduced the concept of EBS volumes, and there was much rejoicing. Data could finally be stored on persistent disks even though the root disk remained on ephemeral storage (another name for instance-store).

In December of 2009, Amazon released the ability to launch instances with EBS volumes as the root disk, or EBS boot instances. Now, the entire server could be persistent and all of the above benefits were realized.

Notes

Just because EBS boot volumes are “persistent” does not mean that they do not ever fail. Amazon has released figures about their failure rates, which are proportional to the number of blocks modified since the last EBS snapshot was created. Regular snapshots improve the intrinsic reliability of your EBS volumes in addition to acting as backups. I also recommend creating regular off-site backups (outside of Amazon) to eliminate your AWS account as a single point of failure.

Even if you use an EBS boot instance, I still recommend keeping your data on a separate EBS volume. This has a number of benefits that could perhaps form the core of a followup article. Read an example of how to set up a MySQL database on a separate EBS volume. I still follow this approach with EBS boot instances, storing data on a second volume.

[Update 2011-01-04: Added benefit #11 (Windows Server 2008). Added note about potential (in)consistency of EBS IO at sustained, high rates.]

Retrieve Public ssh Key From EC2

A serverfault poster had a problem that I thought was a cool challenge. I had so much fun coming up with this answer, I figured I’d share it here as it demonstrates a few handy features of EC2.

Challenge

The basic need is to get the public ssh key from a keypair that exists inside of EC2. You don’t have access to the private key at the moment (but somebody else does or you will at a different location).

The AWS console and EC2 API do not let you ask for the public ssh key associated with a keypair. However, EC2 does pass the public ssh key to a new EC2 instance when you run it with a specific keypair.

The problem is that we don’t currently have the private key, so we can’t log in to the EC2 instance to get the public key. (Besides, if we did have the private key, we could extract the public key from it directly.)

Solution

I proposed creating a user-data script that sends the public ssh key to the EC2 instance console output. You can retrieve the console output without logging in to the EC2 instance.

Save the following code to a file named output-ssh-key.userdata on your local computer. DO NOT RUN THESE COMMANDS LOCALLY!

#!/bin/bash -ex
exec> >(tee /var/log/user-data.log|logger -t user -s 2>/dev/console) 2>&1
adminkey=$(GET instance-data/latest/meta-data/public-keys/ | 
  perl -ne 'print $1 if /^0=[^a-z0-9]*([-.@\w]*)/i')
cat <<EOF
SSHKEY:===================================================================
SSHKEY:HERE IS YOUR PUBLIC SSH KEY FOR KEYPAIR "$adminkey":
SSHKEY:$(cat /home/ubuntu/.ssh/authorized_keys)
SSHKEY:===================================================================
SSHKEY:Halting in 50min ($(date --date='+50 minutes' +"%Y-%m-%d %H:%M UTC"))
EOF
sleep 3000
halt

Run a stock Ubuntu 10.04 LTS instance with the above file as a user-data script. Specify the keypair for which you want to retrieve the public ssh key:

ec2-run-instances \
  --key YOURKEYPAIRHERE \
  --instance-type t1.micro \
  --instance-initiated-shutdown-behavior terminate \
  --user-data-file output-ssh-key.userdata \
  ami-ab36fbc2

Keep requesting the console output from the instance until it shows your public ssh key. Specify the instance id returned from the run-instances command:

ec2-get-console-output YOURINSTANCEID | grep SSHKEY: | cut -f3- -d:

Repeat the above command a couple times a minute and within 2-10 minutes you will get output like this:

===================================================================
HERE IS YOUR PUBLIC SSH KEY FOR KEYPAIR "erich":
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA6rn8cl41CkzaH4ZBhczOJZaR4xBBDI1Kelc2ivzVvCB
THcdJRWpDd5I5hY5W9qke9Tm4fH3KaUVndlcP0ORGvS3PAL4lTpkS4D4goMEFrwMO8BG0NoE8sf2U/7g
aUkdcrDC7jzKYdwleRCI3uibNXiSdeG6RotClAAp7pMflDVp5WjjECDZ+8Jzs2wasdTwQYPhiWSiNcfb
fS97QdtROf0AcoPWElZAgmabaDFBlvvzcqxQRjNp/zbpkFHZBSKp+Sm4+WsRuLu6TDe9lb2Ps0xvBp1F
THlJRUVKP2yeZbVioKnOsXcjLfoJ9TEL7EMnPYinBMIE3kAYw3FzZZFeX3Q== erich
===================================================================
Halting in 50min (2011-12-20 05:58 UTC)

The temporary instance will automatically terminate itself in under an hour, but you can terminate it yourself if you’d like to make sure that you aren’t charged more than the two cents this will cost to run.
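
Rather than repeating the console output command by hand, you could wrap it in a small polling loop. This is a generic sketch: the function name poll_until_match and its arguments (pattern, seconds between tries, maximum number of tries, then the command to run) are made up for this example.

```shell
# Hypothetical helper: run a command repeatedly until its output contains
# a pattern, then print the matching lines. Gives up (returning 1) after
# max_tries attempts. Errors from the command itself are hidden.
poll_until_match() {
  pattern=$1
  interval=$2
  max_tries=$3
  shift 3
  try=0
  while [ "$try" -lt "$max_tries" ]; do
    if output=$("$@" 2>/dev/null | grep -- "$pattern"); then
      printf '%s\n' "$output"
      return 0
    fi
    try=$((try + 1))
    sleep "$interval"
  done
  return 1
}

# Usage: check every 30 seconds, for up to 10 minutes:
#   poll_until_match SSHKEY: 30 20 ec2-get-console-output YOURINSTANCEID |
#     cut -f3- -d:
```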

Notes

  • If you currently have access to the private ssh key (not true in the above challenge) you can extract the public ssh key using a command like:

    ssh-keygen -y -f KEYFILE.pem
    

    but that’s obviously not as fun.

  • There is no way to retrieve the private ssh key if you have lost it. To protect your security, Amazon EC2 does not store a copy of this. If you are looking to get access to an EC2 instance where you have lost the private ssh key, I recommend following the approach I wrote about in this article: http://alestic.com/2011/02/ec2-fix-ebs-root

  • In seemingly-related-but-not news, Scott Moser is working on an enhancement to cloud-init (used by Ubuntu on EC2, Amazon Linux, and perhaps others) so that the public ssh host keys are output to the console output on instance startup. This cool feature will allow us to add the ssh host keys to our local known_hosts files, safely avoiding that pesky “Are you sure you want to continue connecting (yes/no)?” warning.

Do you want to run short jobs on Amazon EC2 on a recurring schedule, but don’t want to pay for an instance running all the time?

Would you like to do this using standard Amazon AWS services without needing an external server to run and terminate the instance?

Amazon EC2 Auto Scaling is normally used to keep a reasonable number of instances running to handle measured or expected load (e.g., web site traffic, queue processing).

In this article I walk through the steps to create an Auto Scaling configuration that runs an instance on a recurring schedule (e.g., four times a day) starting up a pre-defined task and letting that instance shut itself down when it is finished. We tweak the Auto Scaling group so that this uses the minimum cost in instance run time, even though we may not be able to predict in advance exactly how long it will take to complete the job.

Here’s a high level overview for folks familiar with Auto Scaling:

  • The instance is started using a recurring schedule action that raises the min and max to 1.

  • The launch configuration is set to pass in a user-data script that runs the desired job on first boot. There’s no need to build our own AMI unless the software installation takes too long.

  • The user-data script runs shutdown at the end to move the instance to the “stopped” state, suspending hourly run charges.

  • Another recurring schedule action lowers the min and max to 0. This is done long after the job should have completed, just to clean up the stopped instance or to terminate an instance that perhaps failed to shut itself down. It also resets the min/max count so a new instance will be started the next time they are raised.

The missing key I found through experimentation is that we need to also suspend Auto Scaling’s (usually valid) desire to replace any unhealthy instances. If we don’t turn off this behavior, Auto Scaling will start another instance as soon as the first one shuts itself down, causing the job to be run over and over.

Prerequisites

The examples in this article assume that you have:

  1. Installed and set up the EC2 command line tools

  2. Installed and set up the EC2 Auto Scaling command line tools

  3. Told EC2 about your ssh keys using the approach described here:

    Uploading Personal ssh Keys to Amazon EC2

User-data Script

The first step in setting up this process is to create a user-data script that performs the tasks you want your scheduled instance to execute each time it runs.

This user-data script can do anything you want as long as it runs this command after all the work has been completed:

shutdown -h now
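One thing to watch for is a job that fails partway through, so that the script never reaches the shutdown line. Here is a minimal sketch of one way to guard against that. The run_then_halt helper and the /usr/local/bin/my-job path are hypothetical names used for illustration; they are not part of the demo script.

```shell
# Hypothetical helper (not part of the demo script): run a job, then
# always run the halt command, even when the job itself fails.
run_then_halt() {
    halt_cmd=$1          # command to run afterwards, e.g. "shutdown -h now"
    shift
    job_status=0
    "$@" || job_status=$?   # remember a failure instead of aborting
    $halt_cmd
    return "$job_status"
}

# In a real user-data script the last line might look like this
# (the job path is made up for this example):
#   run_then_halt "shutdown -h now" /usr/local/bin/my-job
```

Passing the halt command as an argument makes it possible to try the helper on a local computer with a harmless command such as "echo halted" instead of a real shutdown.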

I have created a demo user-data script for this article which you can download to your local computer now, saving it as demo-user-data-script.sh

wget -q https://raw.github.com/alestic/demo-ec2-schedule-instance/master/user-data/demo-user-data-script.sh

WARNING: Do not attempt to run this user-data script on your local computer!

Edit the downloaded demo-user-data-script.sh file and change this line at the top of the script to reflect your email address:

EMAIL=youraddress@example.com

This demo user-data script:

  • Upgrades the Ubuntu instance

  • Installs Postfix with a generic configuration so that it can send email

  • Sends an informative demo message about the instance to the email address you set in the script

  • Sleeps for 5 minutes giving the email a chance to be delivered

  • Shuts down the EC2 instance

The first time you run this demo, try using the script as it stands, only changing your email address. Then, try adjusting the script little by little to make it do tasks that you would find more useful.

If you’d like, you can test the user-data script by itself by running an instance of Ubuntu 11.10. Please update the AMI id to use the latest release:

ami_id=ami-a7f539ce # Ubuntu 11.10 Oneiric server
ec2-run-instances   --key $USER   --instance-type t1.micro   --instance-initiated-shutdown-behavior terminate   --user-data-file demo-user-data-script.sh   $ami_id

You should see an email from the instance and then it should terminate itself about 5 minutes later. Make sure you terminate it manually if it stays running after 10 minutes.

Auto Scaling Group

With the user-data script in hand, we are now ready to create the Auto Scaling setup.

Set some variables used in the commands below. Make sure you are using the latest release of the appropriate AMI.

ami_id=ami-a7f539ce # Ubuntu 11.10 Oneiric server
region=us-east-1    # Region for running the demo

zone=${region}a     # A zone in that region
export EC2_URL=https://$region.ec2.amazonaws.com
export AWS_AUTO_SCALING_URL=https://autoscaling.$region.amazonaws.com
launch_config=demo-launch-config
auto_scale_group=demo-auto-scale-group

This launch configuration describes how we want our instance run each time, including the AMI id, instance type, ssh key, and most importantly, the user-data script we edited above:

as-create-launch-config   --key $USER   --instance-type t1.micro   --user-data-file demo-user-data-script.sh   --image-id $ami_id   --launch-config "$launch_config"

The Auto Scaling group keeps track of many things including how many instances we want to have running, how they should be run (launch config above), and what instances are currently running.

as-create-auto-scaling-group   --auto-scaling-group "$auto_scale_group"   --launch-configuration "$launch_config"   --availability-zones "$zone"   --min-size 0   --max-size 0

Here’s a non-obvious but key part of this approach! Tell the Auto Scaling group that we don’t want it to restart our instance right after the instance intentionally shuts down (or fails):

as-suspend-processes   "$auto_scale_group"   --processes ReplaceUnhealthy

Now we’re finally ready to tell EC2 Auto Scaling when we want to run the instance launch configuration in the Auto Scaling group.

Here’s an example that starts one instance four times a day to run the above user-data script:

# UTC: 1:00, 7:00, 13:00, 19:00
as-put-scheduled-update-group-action   --name "demo-schedule-start"   --auto-scaling-group "$auto_scale_group"   --min-size 1   --max-size 1   --recurrence "0 01,07,13,19 * * *"

And, we need to create a matching action to make sure the instance is terminated at some point after the longest time the job could take. For this demo, we’ll trigger it 55 minutes later, but it could just as easily be 3 hours and 55 minutes later:

# UTC: 1:55, 7:55, 13:55, 19:55
as-put-scheduled-update-group-action   --name "demo-schedule-stop"   --auto-scaling-group "$auto_scale_group"   --min-size 0   --max-size 0   --recurrence "55 01,07,13,19 * * *"

The recurrence value is in a cron format using UTC.
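Keeping the stop recurrence in step with the start recurrence by hand is easy to get wrong. As a rough sketch, the stop value can be derived from the start hours and an offset in minutes. The shifted_recurrence function is a hypothetical helper written for this article, not part of the Auto Scaling tools, and it assumes the start actions fire at minute 0 of each listed hour.

```shell
# Hypothetical helper: given a comma-separated list of UTC start hours
# and an offset in minutes, print a cron-style recurrence for the
# matching stop action. Assumes starts happen at minute 0 of each hour.
shifted_recurrence() (
    add_h=$(( $2 / 60 ))     # whole hours in the offset
    min=$(( $2 % 60 ))       # leftover minutes
    out=""
    IFS=,
    for h in $1; do
        h=${h#0}             # drop one leading zero so 08/09 are not read as octal
        h=${h:-0}
        new=$(( (h + add_h) % 24 ))
        out="${out:+$out,}$(printf '%02d' "$new")"
    done
    printf '%s %s * * *\n' "$min" "$out"
)

# 55 minutes after each start:
#   shifted_recurrence 01,07,13,19 55
# 3 hours and 55 minutes after each start:
#   shifted_recurrence 01,07,13,19 235
```

The result could then be passed as the --recurrence value when creating the stop action.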

You are welcome to change the specs in the above commands to any time you want to run the demo, especially if you don’t want to wait up to six hours for it to trigger.

Before setting new schedules, make sure you delete the existing schedule (see the next section). Don’t forget to make the stop time(s) match up appropriately with the start time(s).

Clean up

Once you’re done with this demo, you can delete the AWS resources we created by following these steps:

Delete the schedule start and stop actions:

as-delete-scheduled-action   --force   --name "demo-schedule-start"   --auto-scaling-group "$auto_scale_group"
as-delete-scheduled-action   --force   --name "demo-schedule-stop"   --auto-scaling-group "$auto_scale_group"

Scale the Auto Scaling group down to zero instances. This will terminate any running instances:

as-update-auto-scaling-group   "$auto_scale_group"   --min-size 0   --max-size 0

Delete the Auto Scaling group:

as-delete-auto-scaling-group   --force-delete   --auto-scaling-group "$auto_scale_group"

Delete the Auto Scaling launch config:

as-delete-launch-config   --force   --launch-config "$launch_config"

You might now want to check to make sure nothing was left over. This works best in a wide terminal:

as-describe-launch-configs --headers
as-describe-auto-scaling-groups --headers
as-describe-scheduled-actions --headers
as-describe-auto-scaling-instances --headers

Timing

Everything takes a little time to filter through the system including:

  • scheduled action to raise min/max

  • triggering the start of an instance after a min/max is raised

  • starting an instance

  • booting an instance, installing software

  • running your job

  • shutting down an instance

  • scheduled action to lower min/max

  • triggering the termination of an instance after a min/max is lowered

When you set up the schedules, remember to make room for these things. For example, don’t schedule the termination of your instance too early or it could kill your job before it has a chance to complete.
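One simple way to budget for this is to add up a rough estimate for each step plus a safety margin. The sketch below does only that arithmetic; the stop_offset_minutes name and the numbers in the example are made up for illustration and are not measurements.

```shell
# Hypothetical helper: first argument is a safety margin in minutes,
# remaining arguments are per-step estimates in minutes. Prints the
# total, which is the earliest offset to consider for the stop action.
stop_offset_minutes() (
    total=$1
    shift
    for step in "$@"; do
        total=$(( total + step ))
    done
    echo "$total"
)

# Made-up example: 15 minute margin, then 2 (schedule and trigger),
# 3 (start and boot), 10 (install software), 20 (job), 5 (shutdown):
#   stop_offset_minutes 15 2 3 10 20 5
```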

Notes

Here are some great resources from Amazon for getting started with and learning about Auto Scaling:

The user-data script logging uses the approach described here:

Watching the output of the user-data script lets you monitor its progress as well as debug where things might be going wrong.

I haven’t run the above approach except in testing, and would welcome any pointers or improvements folks might have to offer.

Amazon just announced that AWS MFA (multi-factor authentication) now supports virtual or software MFA devices in addition to physical hardware MFA devices like the one that’s been taking up unwanted space in my pocket for two years.

Multi-factor authentication means that in order to log in to my AWS account using the AWS console or portal (including the AWS forums) you not only need my secret password, you also need access to a device that I carry around with me.

Before, this was a physical device attached to my key ring. Now, this is my smart phone which has the virtual (software) MFA device on it. I already carry my phone with me, so the software doesn’t take up any additional space.

To log in to AWS, I enter my password and then the current 6 digit access code displayed by the Android app on my phone. These digits change every 30 seconds in an unguessable pattern, so this enhances the security of my AWS account.
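For the curious, here is a small sketch of why the digits change on a fixed 30 second rhythm. It assumes the standard TOTP defaults from RFC 6238 (a 30 second time step counted from Unix time 0); I am not claiming this is exactly how Amazon's service is implemented. The code is derived from a shared secret plus a counter, and only the counter part is shown here. Both function names are made up for this example.

```shell
# Assumes RFC 6238 defaults: 30 second step, counting from Unix time 0.
# Argument to both functions: a Unix timestamp in seconds.

# Which 30 second window a timestamp falls in. The app and the server
# compute the same counter, so they agree on the same code.
totp_counter() {
    echo $(( $1 / 30 ))
}

# How many seconds remain before the displayed code changes.
totp_seconds_left() {
    echo $(( 30 - $1 % 30 ))
}

# Example with the current time:
#   now=$(date +%s); totp_counter "$now"; totp_seconds_left "$now"
```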

I started by using Amazon’s AWS Virtual MFA app for my Android phone, but had some complaints about it including:

  • You have to click on an account name to see the current digits instead of just having them shown when the app is run. There’s nothing else for the app to do but show me these digits. Just do it!

  • The digits disappear from the screen too fast. Sometimes I want to glance back and see if I typed them in correctly, but they’re gone and I have to click again, hoping that they haven’t changed yet.

  • It’s hard to choose your own account names so that you know which entry to use for different AWS accounts.

I then noticed some cryptic information in the announcements: the new feature will work with “any application that supports the open OATH TOTP standard”.

Hmmm…

Sure ‘nuff!

I already use the Google Authenticator app on my Android phone so that my Google logins can use MFA. As it turns out, Google Authenticator also works seamlessly with AWS Virtual MFA.

  • Google Authenticator shows the codes as soon as it is run with a little timer showing me when they will change.

  • Google Authenticator lets me easily edit the displayed name so that I know at a glance which code is for my personal AWS account and which one is for my company AWS account.

This also means that I only have to run one app to get the codes for both my Google accounts and my AWS accounts. Amazon may improve their Android app over time, but because the service is built on open standards, users can pick whatever works best for them at the time.

I love the fact that Amazon now supports Virtual MFA. I’ve already thrown away my hardware token and my pocket feels less full.

I love the fact that Amazon implemented this as a service based on existing standards so that I can use Google’s Android app to access my account.

I love open standards.

Update: I just found this great starting page which even links to Google Authenticator as a client for Android and iPhone:

http://aws.amazon.com/mfa/