Recently in PlanetUbuntu Category

I was chatting with Kevin Boyd (aka Beryllium) on the ##aws Freenode IRC channel about the challenge of invalidating a large number of CloudFront objects (35,000) due to a problem where the cached copies of the objects were out of date and the system had not been designed with versioning in the object path or name.

In addition to the work to perform all of these invalidations (in batches of up to 1,000 in each request with at most 3 requests outstanding) there is also the issue of cost. The first thousand CloudFront invalidations are free in a month, but the remainder of the invalidations in this case would cost $170 (at $0.005 for each object).
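The arithmetic behind that figure can be sketched in a few lines of shell. This is only an illustration: the helper names invalidation_cost and invalidation_batches are made up here, and the free allowance and per-object price are simply the numbers quoted above, which may not match current pricing.

```shell
# Sketch: estimate the CloudFront invalidation bill for a given object count.
# Assumptions: 1,000 free invalidations per month and $0.005 per object
# after that, as quoted in the text. The function names are hypothetical.
invalidation_cost() {
  awk -v n="$1" 'BEGIN {
    free = 1000; price = 0.005
    billable = (n > free) ? n - free : 0
    printf "%.2f\n", billable * price
  }'
}

# Number of invalidation requests needed at up to 1,000 objects per request.
invalidation_batches() {
  echo $(( ($1 + 999) / 1000 ))
}

echo "cost: \$$(invalidation_cost 35000)"
echo "batches: $(invalidation_batches 35000)"
```

For the 35,000 objects in this case, that works out to $170.00 spread over 35 requests.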

It occurred to me that one could take advantage of the on-demand nature of AWS by using the following approach:

  1. Create a new CloudFront distribution, set up exactly like the existing distribution (except that the new distribution caches would be empty).

  2. Change the application to point to the new CloudFront distribution domain when referring to the objects.

Step 2 consists of a simple DNS change, assuming that you use your own domain name (e.g., cdn.example.com) when referring to the CloudFront objects in your web site or application, and where that domain name is a CNAME reference to the actual CloudFront distribution.

As soon as this is completed (preferably with a short DNS TTL) then the new CloudFront distribution will be hit by clients and will be filled up with the new versions of the objects.

After a while, you would then destroy and stop paying for the original CloudFront distribution that is no longer being referenced or used.

Simply replacing the CloudFront distribution effectively “invalidates” all of the objects at once, with no charges for invalidation requests and very little effort.

Once again, AWS wins with the principles of on-demand, pay for what you use, throw away what you don’t need.

using CloudWatch and SNS to send yourself email messages when AWS costs accrue past limits you define

The Amazon documentation describes how to use the AWS console to monitor your estimated charges using Amazon CloudWatch and includes some pointers for folks using the command line. Unfortunately, that article leaves out the commands to set up the SNS (Simple Notification Service) topics and SNS subscriptions, so I present here the complete steps I use.

I like using the command line tools as they let me automate and repeat actions without having to do lots of pointing, clicking, and re-entering data. For example, I want to set up a number of billing alerts in each new account, sometimes at $10 increments, and sometimes at $100 or $1000 increments. The steps below let me do this in seconds with a simple copy and paste.

When I get one of these billing alert emails, I glance at the day of the month to see if that account’s charges are progressing at an appropriate pace or if they require further investigation.

This was the mechanism that recently alerted me to the extra charges involved in automating the transition of S3 objects to Glacier.

Once you’ve installed the AWS command line tools, here are the steps to set up automated billing alert emails.

Billing Alerts

Create an SNS topic where billing alert notifications will be sent.

snstopic=$(sns-create-topic BillingAlert)
echo snstopic=$snstopic

Subscribe your email address to the SNS topic so you receive email messages when your AWS billing estimates exceed each trigger point.

email=YOURNAME@YOURDOMAIN.com
sns-subscribe "$snstopic" --protocol email --endpoint "$email"

Check your mailbox for a confirmation email from Amazon and click on the link to complete your subscription to this SNS topic.

Create CloudWatch monitor alarms for the AWS billing estimated charges at each dollar figure where you want to be alerted. This example sets alarms at every $100 increment up to $1000, but you can change this to any values you’d like.

for amount in 100 200 300 400 500 600 700 800 900 1000
do
  mon-put-metric-alarm "awsbilling-$amount" \
    --alarm-description "AWS billing alarm: \$$amount" \
    --namespace AWS/Billing \
    --metric-name EstimatedCharges \
    --evaluation-periods 1 \
    --period 21600 \
    --statistic Maximum \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --dimensions "Currency=USD" \
    --threshold "$amount" \
    --actions-enabled true \
    --alarm-actions "$snstopic"
done
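
If you want a different spacing between alerts (say $10 increments), the list of amounts does not have to be typed by hand. The following is a small sketch that builds the list with seq and only prints the alarm names, so it can be tried without touching AWS. The helper name billing_thresholds is hypothetical.

```shell
# Sketch: build the list of alarm thresholds from a step and a maximum
# instead of typing each value. billing_thresholds is a made-up helper.
billing_thresholds() {
  step=$1
  max=$2
  seq "$step" "$step" "$max"
}

# Dry run: print the alarm names that would be created for $10 increments
# up to $50, without calling any AWS tool.
for amount in $(billing_thresholds 10 50)
do
  echo "awsbilling-$amount"
done
```

Replacing the fixed list in the loop with $(billing_thresholds 100 1000) should produce the same ten alarms as the example above.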

See the CloudWatch monitor alarms you have created:

mon-describe-alarms --headers

Now spend lots of money with AWS and monitor your inbox for email alerts.

Cleanup

The above sample commands may incur minimal charges in your account (SNS Pricing, CloudWatch Pricing). If you don’t want to keep these alerts in place, you will need to undo what was set up.

Delete the alarms you created (replace with your specific trigger values used above).

for amount in 100 200 300 400 500 600 700 800 900 1000
do
  mon-delete-alarms "awsbilling-$amount" --force
done

Delete the SNS topic.

sns-delete-topic "$snstopic" --force

Notes

  • In order to follow these examples, you will need to install the AWS command line tools, at least for SNS and CloudWatch.

  • It may take half a day or more for a billing alarm to be triggered based on how AWS collects billing data and how the alarms are set.

  • Make sure you confirm your subscription to the SNS topic by clicking on the link in the confirmation email AWS sends to you, or Amazon will send no email billing alerts.

  • Each alert email will have an unsubscribe link in it for your convenience. This will unsubscribe you from all of the alerts, not just the specific cost level in that particular email.

  • Amazon’s documentation states that you must first “enable the monitoring of estimated charges” in each account. I just tested this with a new account and found that this was not necessary, so the documentation may be a bit out of date.

how I was surprised by a large AWS charge and how to calculate the break-even point

Glacier Archival of S3 Objects

Amazon recently introduced a fantastic new feature where S3 objects can be automatically migrated over to Glacier storage based on the S3 bucket, the key prefix, and the number of days after object creation.

This makes it trivially easy to drop files in S3, have fast access to them for a while, then have them automatically saved to long-term storage where they can’t be accessed as quickly, but where the storage charges are around a tenth of the price.

…or so I thought.

S3 Lifecycle Rule

My first use of this feature was on some buckets where I store about 350 GB of data that fits the Glacier use pattern perfectly: I want to save it practically forever, but expect to use it rarely.

It was straightforward to use the S3 Console to add a lifecycle rule to the S3 buckets so that all objects are archived to Glacier after 60 days:

[Screenshot: S3 Lifecycle Rule]

(Long time readers of this blog may be surprised I didn’t list the command lines to accomplish this task, but Amazon has not yet released useful S3 tools that include the required functionality.)

Since all of the objects in the buckets were more than 60 days old, I expected them to be transitioned to Glacier within a day, and true to Amazon’s documentation, this occurred on schedule.

Surprise Charge

What I did not expect was an email alert from my AWS billing alarm monitor on this account letting me know that I had just passed $200 for the month, followed a few hours later by an alert for $300, followed by an alert for a $400 trigger.

This is one of my personal accounts, so a rate of several hundred dollars a day is not sustainable. Fortunately, a quick investigation showed that this increase was due to one time charges, so I wasn’t about to run up a $10k monthly bill.

The line item on the AWS Activity report showed the source of the new charge:

$0.05 per 1,000 Glacier Requests x 5,306,220 Requests = $265.31

It had not occurred to me that there would be much of a charge for transitioning the objects from S3 to Glacier. I should have read the S3 Pricing page, where Amazon states:

Glacier Archive and Restore Requests: $0.05 per 1,000 requests

This is five times as expensive as the initial process of putting objects into S3, which is $0.01 per 1,000 PUT requests.

There is one “archive request” for each S3 object that is transitioned from S3 to Glacier, and I had over five million objects in these buckets, something I didn’t worry about previously because my monthly S3 charges were based on the total GB, not the number of objects.
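
As a quick check before enabling a lifecycle rule, the request charge can be estimated from the object count alone. This is a minimal sketch using the $0.05 per 1,000 requests price quoted above; the function name glacier_transition_cost is made up for illustration.

```shell
# Sketch: one archive request is billed per object transitioned to Glacier.
# Assumes the $0.05 per 1,000 requests price quoted in the text.
# glacier_transition_cost is a hypothetical helper name.
glacier_transition_cost() {
  awk -v objects="$1" 'BEGIN { printf "%.2f\n", objects / 1000 * 0.05 }'
}

glacier_transition_cost 5306220
```

With the 5,306,220 requests from the bill this prints 265.31, matching the line item on the activity report.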

Overhead per Glacier Object

josh.monet has pointed out in the comments that Amazon has documented some Glacier storage overhead:

For each S3 object migrated to Glacier, Amazon adds “an additional 32 KB of Glacier data plus an additional 8 KB of S3 standard storage data”.

Storage for this overhead is charged at standard Glacier and S3 prices. This makes Glacier completely unsuitable for small objects.

Break-even Point

After stopping to think about it, I realized that I was still saving money in the long term by moving objects in these S3 buckets to Glacier storage. This one-time up-front cost was going to be compensated for slowly by my monthly savings, because Glacier is cheap, even compared to the reasonably cheap S3 storage costs, at least for larger files.

Here are the results of my calculations:

  • Monthly cost of storing in S3: 350 GB x $0.095/GB = $33.25

  • Monthly cost of storing in Glacier: $8.97

    • 350 GB x $0.01/GB = $3.50
    • Glacier overhead: 5.3 million * 32 KB * $0.01/GB = $1.62
    • S3 overhead: 5.3 million * 8 KB * $0.095/GB = $3.85
  • One time cost to transition 5.3 million objects from S3 to Glacier: $265

  • Months until I start saving money by moving to Glacier: 11

  • Savings per year after first 11 months: $291 (73%)
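
The figures in this list can be reproduced with a short awk calculation. This is a sketch under a few assumptions: it uses the exact object count from the bill (5,306,220) rather than the rounded 5.3 million, treats 1 GB as 1024 * 1024 KB, and uses the per-GB prices quoted in this article. The function name glacier_breakdown is hypothetical.

```shell
# Sketch: reproduce the monthly cost comparison from the list above.
# Assumptions: prices per GB per month as quoted in the article,
# 1 GB = 1024 * 1024 KB, and 32 KB + 8 KB of overhead per object.
glacier_breakdown() {
  awk -v gb="$1" -v objects="$2" 'BEGIN {
    s3_price = 0.095; glacier_price = 0.01
    kb_per_gb = 1024 * 1024

    s3_monthly       = gb * s3_price
    glacier_data     = gb * glacier_price
    glacier_overhead = objects * 32 / kb_per_gb * glacier_price
    s3_overhead      = objects * 8 / kb_per_gb * s3_price
    glacier_monthly  = glacier_data + glacier_overhead + s3_overhead

    transition = objects / 1000 * 0.05      # one-time request charge
    savings    = s3_monthly - glacier_monthly

    printf "s3_monthly=%.2f\n", s3_monthly
    printf "glacier_monthly=%.2f\n", glacier_monthly
    printf "transition=%.2f\n", transition
    printf "breakeven_months=%.1f\n", transition / savings
    printf "yearly_savings=%.0f\n", savings * 12
  }'
}

glacier_breakdown 350 5306220
```

Passing a different total size and object count gives a rough idea of whether another bucket is worth transitioning.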

For this data’s purpose, everything eventually works out to an advantage, so thanks, Amazon! I will, however, think twice before doing this with other types of buckets, just to make sure that the data is large enough and is going to be sitting around long enough in Glacier to be worth the transition costs.

As it turns out, the primary factor in how long it takes to break even is the average size of the S3 objects. If the average size of my data files were larger, then I would start saving money sooner.

Here’s the formula… The number of months to break even and start saving money when transferring S3 objects to Glacier is:

break-even months = 631,613 / (average S3 object size in bytes - 13,011)

(units apologies to math geeks)

In my case, the average size of the S3 objects was 70,824 bytes (about 70 KB). Applying the above formula:

631,613 / (70,824 - 13,011) = 10.9

or about 11 months until the savings in Glacier over S3 covers the cost of moving my objects from S3 to Glacier.

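If you are storing 1 KB data records in S3, transitioning them to Glacier never pays for itself. Even ignoring the per-object storage overhead it would take over 50 years to recover the request charges, and once the overhead is included the monthly cost in Glacier is higher than in S3.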

Looking closely at the above formula, you can see that any object 13 KB or smaller is going to cost more to transition to Glacier rather than leaving it in S3. Files approaching that size are going to save too little money to justify the transfer costs.
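
The formula can also be computed from the underlying prices rather than the pre-calculated constants, which makes it easier to try other object sizes. The sketch below assumes the prices and per-object overhead quoted in this article and 1 GB = 2^30 bytes; breakeven_months is a hypothetical helper name, and the optional second argument is the S3 price per GB.

```shell
# Sketch: months until Glacier savings cover the transition request charge,
# for a given average object size in bytes.
# Assumptions: $0.01/GB Glacier, $0.05 per 1,000 archive requests,
# 32768 bytes of Glacier overhead and 8192 bytes of S3 overhead per object.
breakeven_months() {
  awk -v size="$1" -v s3_price="${2:-0.095}" 'BEGIN {
    glacier_price = 0.01
    request_cost  = 0.05 / 1000            # dollars per transitioned object
    gb            = 1024 * 1024 * 1024     # bytes per GB

    # Monthly saving per object, scaled by bytes-per-GB.
    saving = (s3_price - glacier_price) * size \
             - 32768 * glacier_price - 8192 * s3_price

    if (saving <= 0) { print "never"; exit }
    printf "%.1f\n", request_cost * gb / saving
  }'
}

breakeven_months 70824
breakeven_months 1024
```

The first call prints 10.9 and the second prints "never", since a 1 KB object is below the roughly 13 KB threshold. Running breakeven_months 70824 0.08 shows how the lower-priced S3 tier stretches the break-even point (about 13.5 months with these assumptions).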

The above formula assumes an S3 storage cost of $0.095 per GB per month in us-east-1. If you are storing more than a TB, then you’re into the $0.08 tier or lower, so your break-even point will take longer and you’ll want to do more calculations to find your savings.

[Update 2012-12-19: Included additional S3 and Glacier storage overhead per item. Thanks to josh.monet for pointing us to this information buried in the S3 FAQ.]

Amazon has announced a new AWS region in Sydney, Australia with the name ap-southeast-2.

The official Ubuntu AMI lookup pages (1, 2) don’t seem to be showing the new location yet, but the official Ubuntu AMI query API does seem to be working, so the new ap-southeast-2 Ubuntu AMIs are available for lookup on Alestic.com.

[Update 2012-11-13: Canonical has fixed the primary Ubuntu AMI lookup page and I understand it should remain more up to date going forward, but the other page is still missing ap-southeast-2]

Point and Click

At the top right of most pages on Alestic.com is an “Ubuntu AMIs” section. Simply select the EC2 region from the pulldown (say “ap-southeast-2” for Sydney, Australia) and you will see a list of the official 64-bit Ubuntu AMI ids for the various active Ubuntu releases.

Both EBS boot and instance-store AMI ids are listed, but I recommend you start with EBS boot AMIs.

To launch a listed Ubuntu AMI, simply click on the orange arrow to the right of the AMI id and you will be taken to the EC2 section of the AWS console with the AMI id selected:

[Screenshot: Ubuntu AMI ids on Alestic.com]

The AWS console walks you through setting up required ssh keys and security groups and even has a point and click way to ssh to your instance, provided you have Java in your browser (I disable that).

Command Line

You can also launch Ubuntu AMIs with the EC2 command line tools. First, make sure you upload your ssh key to the new Sydney, Australia EC2 region using something like:

ec2-import-keypair --region ap-southeast-2 --public-key-file $HOME/.ssh/id_rsa.pub $USER

If you haven’t already, open the ssh port on your default security group:

ec2-authorize --region ap-southeast-2 default -p 22

Then, to launch Ubuntu 12.04 LTS Precise EBS boot, you would use a command like:

ec2-run-instances --region ap-southeast-2 --key $USER --instance-type t1.micro ami-fb8611c1

where you should always look up and use the most recent AMI id. Make a note of the instance id.

Wait a few seconds for the instance to be assigned an IP address and to start booting, then find out what the IP address was with:

ec2-describe-instances --region ap-southeast-2 <INSTANCEID>

If you ran the instance with your uploaded personal ssh key, you can then access the Ubuntu server using

ssh ubuntu@<IPADDRESS>

where <IPADDRESS> is the public IP address of the instance (does not start with “10.”).

Always remember to terminate your temporary EC2 instances when you are done with them so you don’t keep paying charges:

ec2-terminate-instances --region ap-southeast-2 <INSTANCEID>

Note: The official Ubuntu AMI ids listed on Alestic.com are created, published, and supported by Canonical, an official sponsor of Ubuntu. (Alestic.com/Eric Hammond used to publish community Ubuntu AMIs for EC2 starting in 2007, but that fun job was transferred to Canonical back in 2009.)

You may be able to save on future EC2 expenses by selling an unused Reserved Instance for less than its true value or even $0.01, provided it is in the “Heavy Utilization” class.

The description of the Heavy Utilization Reserved Instance includes this statement:

you pay […] a significantly lower hourly usage fee, and you’re charged that lower hourly rate for every hour in the Reserved Instance term you purchase [emphasis added]

What may not be clear to the casual reader is the fact that when you purchase a Heavy Utilization Reserved Instance, you commit not only to paying the one-time up front cost, but you are also committing to paying the hourly charge for every hour of every month, even if you are not running a matching instance!

The Light Utilization and Medium Utilization descriptions state:

Light [and Medium] Utilization RIs allow you to turn off your instance at any point and not pay the hourly fee.

This statement is conspicuously missing from the Heavy Utilization description.

If you buy a 12 month Heavy Utilization Reserved Instance and you only run the matching instance for 5 months and then terminate it, you still pay hourly instance charges each of the next 7 months as if you were still running the instance.

How to Save

Amazon just opened the Reserved Instance Marketplace, which lets you sell the remaining months on a Reserved Instance that you no longer need. After you list a Reserved Instance, it will be shown to EC2 customers when they look to purchase a Reserved Instance, right along with Amazon’s normal offerings.

When you list a Reserved Instance for sale, Amazon suggests a price that matches what the Reserved Instance would be worth to somebody purchasing the remaining months.

[Screenshot: AWS Reserved Instance Marketplace purchase listing]

Though you would surely prefer to get that value (minus Amazon’s 12% commission) for the Reserved Instance, you might consider listing it for substantially less, especially while the Marketplace has fewer buyers and listings take longer to sell.

For every month that your Heavy Utilization Reserved Instance does not sell on the Reserved Instance Marketplace, you end up paying Amazon the hourly instance charges for the month, even if you are not running a matching instance.

If you give away the Heavy Utilization Reserved Instance by listing it for $0.01 it should sell fairly quickly in a rational marketplace, and you will not have to pay the hourly instance charges for the remaining months.

Thus, giving away a Heavy Utilization Reserved Instance could save you a bundle if you no longer need it.
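
To put rough numbers on this, here is a small sketch of the two quantities involved: the hourly charges still owed for the unused months, and what a seller keeps after the 12% commission. The hourly rate in the example is purely hypothetical, 730 hours is only an approximate average month, and both function names are made up for illustration.

```shell
# Sketch: hourly charges still owed on an unused Heavy Utilization
# Reserved Instance. Assumptions: the rate passed in is hypothetical and
# 730 hours is used as an approximate month.
remaining_hourly_charges() {
  awk -v months="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", months * 730 * rate }'
}

# Seller proceeds after the 12% Marketplace commission mentioned above.
proceeds_after_commission() {
  awk -v price="$1" 'BEGIN { printf "%.2f\n", price * 0.88 }'
}

# 7 unused months at a hypothetical $0.10 per hour.
remaining_hourly_charges 7 0.10

# What the seller keeps from a hypothetical $100 listing.
proceeds_after_commission 100
```

Comparing the first number with whatever the listing might fetch is one way to decide how far below Amazon’s suggested price it is worth going.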

If there is a non-zero chance you may want to run a matching instance before the end of the term, then the Reserved Instance has some positive value to you and you might consider holding on to it or listing it for a higher price, even though you need to pay when the instance is not running.

Figuring out the optimal listing price between $0.01 and a price near Amazon’s suggested listing price is left as an exercise for the reader.

When you need an AWS command line toolset not provided by Ubuntu packages, you can download the tools directly from Amazon and install them locally.

In a previous article I provided instructions on how to install AWS command line tools using Ubuntu packages. That method is slightly easier to set up and easier to upgrade when Ubuntu releases updates. However, the Ubuntu packages aren’t always up to date with the latest from Amazon and there are not yet Ubuntu packages published for every AWS command line tool you might want to use.

Unfortunately, Amazon does not have one single place where you can download all the command line tools for the various services, nor are all of the tools installed in the same way, nor do they all use the same format for accessing the AWS credentials.

The following steps show how I install and configure the AWS command line tools provided by Amazon when I don’t use the packages provided by Ubuntu.

Prerequisites

Install required software packages:

sudo apt-get update
sudo apt-get install -y openjdk-6-jre ruby1.8-full rubygems \
  libxml2-utils libxml2-dev libxslt-dev \
  unzip cpanminus build-essential
sudo gem install uuidtools json httparty nokogiri

Create a directory where all AWS tools will be installed:

sudo mkdir -p /usr/local/aws

Now we’re ready to start downloading and installing all of the individual software bundles that Amazon has released and made available in scattered places on their web site and various S3 buckets.

Download and Install AWS Command Line Tools

These steps should be done from an empty temporary directory so you can afterwards clean up all of the downloaded and unpacked files.
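
One way to get such a directory, sketched here with mktemp and a shell trap so the cleanup happens automatically when the shell exits (the variable name workdir is arbitrary):

```shell
# Sketch: work in a throwaway directory that is removed when the shell exits.
# The variable name workdir is arbitrary.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1
echo "working in $workdir"
```

Note that the trap removes everything under that directory, so anything worth keeping needs to be copied elsewhere (as the rsync commands below do) before the shell exits.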

Note: Some of these download URLs always get the latest version and some tools have different URLs every time a new version is released. Click through on the tool link to find the latest [Download] URL.

EC2 API command line tools:

wget --quiet http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip
unzip -qq ec2-api-tools.zip
sudo rsync -a --no-o --no-g ec2-api-tools-*/ /usr/local/aws/ec2/

EC2 AMI command line tools:

wget --quiet http://s3.amazonaws.com/ec2-downloads/ec2-ami-tools.zip
unzip -qq ec2-ami-tools.zip
sudo rsync -a --no-o --no-g ec2-ami-tools-*/ /usr/local/aws/ec2/

IAM (Identity and Access Management) command line tools:

wget --quiet http://awsiammedia.s3.amazonaws.com/public/tools/cli/latest/IAMCli.zip
unzip -qq IAMCli.zip
sudo rsync -a --no-o --no-g IAMCli-*/ /usr/local/aws/iam/

RDS (Relational Database Service) command line tools:

wget --quiet http://s3.amazonaws.com/rds-downloads/RDSCli.zip
unzip -qq RDSCli.zip
sudo rsync -a --no-o --no-g RDSCli-*/ /usr/local/aws/rds/

ELB (Elastic Load Balancer) command line tools:

wget --quiet http://ec2-downloads.s3.amazonaws.com/ElasticLoadBalancing.zip
unzip -qq ElasticLoadBalancing.zip
sudo rsync -a --no-o --no-g ElasticLoadBalancing-*/ /usr/local/aws/elb/

AWS CloudFormation command line tools:

wget --quiet https://s3.amazonaws.com/cloudformation-cli/AWSCloudFormation-cli.zip
unzip -qq AWSCloudFormation-cli.zip
sudo rsync -a --no-o --no-g AWSCloudFormation-*/ /usr/local/aws/cfn/

Auto Scaling command line tools:

wget --quiet http://ec2-downloads.s3.amazonaws.com/AutoScaling-2011-01-01.zip
unzip -qq AutoScaling-*.zip
sudo rsync -a --no-o --no-g AutoScaling-*/ /usr/local/aws/as/

AWS Import/Export command line tools:

wget --quiet http://awsimportexport.s3.amazonaws.com/importexport-webservice-tool.zip
sudo mkdir /usr/local/aws/importexport
sudo unzip -qq importexport-webservice-tool.zip -d /usr/local/aws/importexport

CloudSearch command line tools:

wget --quiet http://s3.amazonaws.com/amazon-cloudsearch-data/cloud-search-tools-1.0.0.1-2012.03.05.tar.gz
tar xzf cloud-search-tools*.tar.gz
sudo rsync -a --no-o --no-g cloud-search-tools-*/ /usr/local/aws/cloudsearch/

CloudWatch command line tools:

wget --quiet http://ec2-downloads.s3.amazonaws.com/CloudWatch-2010-08-01.zip
unzip -qq CloudWatch-*.zip
sudo rsync -a --no-o --no-g CloudWatch-*/ /usr/local/aws/cloudwatch/

ElastiCache command line tools:

wget --quiet https://s3.amazonaws.com/elasticache-downloads/AmazonElastiCacheCli-2012-03-09-1.6.001.zip
unzip -qq AmazonElastiCacheCli-*.zip
sudo rsync -a --no-o --no-g AmazonElastiCacheCli-*/ /usr/local/aws/elasticache/

Elastic Beanstalk command line tools:

wget --quiet https://s3.amazonaws.com/elasticbeanstalk/cli/AWS-ElasticBeanstalk-CLI-2.1.zip
unzip -qq AWS-ElasticBeanstalk-CLI-*.zip
sudo rsync -a --no-o --no-g AWS-ElasticBeanstalk-CLI-*/ /usr/local/aws/elasticbeanstalk/

Elastic MapReduce command line tools:

wget --quiet http://elasticmapreduce.s3.amazonaws.com/elastic-mapreduce-ruby.zip
unzip -qq -d elastic-mapreduce-ruby elastic-mapreduce-ruby.zip
sudo rsync -a --no-o --no-g elastic-mapreduce-ruby/ /usr/local/aws/elasticmapreduce/

Simple Notification Service (SNS) command line tools:

wget --quiet http://sns-public-resources.s3.amazonaws.com/SimpleNotificationServiceCli-2010-03-31.zip
unzip -qq SimpleNotificationServiceCli-*.zip
sudo rsync -a --no-o --no-g SimpleNotificationServiceCli-*/ /usr/local/aws/sns/
sudo chmod 755 /usr/local/aws/sns/bin/*

Route 53 (DNS) command line tools:

sudo mkdir -p /usr/local/aws/route53/bin
for i in dnscurl.pl route53tobind.pl bindtoroute53.pl route53zone.pl; do
  sudo wget --quiet --directory-prefix=/usr/local/aws/route53/bin \
    http://awsmedia.s3.amazonaws.com/catalog/attachments/$i
  sudo chmod +x /usr/local/aws/route53/bin/$i
done
cpanm --sudo --notest --quiet Net::DNS::ZoneFile NetAddr::IP \
  Net::DNS Net::IP Digest::HMAC Digest::SHA1 Digest::MD5

CloudFront command line tool:

sudo wget --quiet --directory-prefix=/usr/local/aws/cloudfront/bin \
  http://d1nqj4pxyrfw2.cloudfront.net/cfcurl.pl
sudo chmod +x /usr/local/aws/cloudfront/bin/cfcurl.pl

S3 command line tools:

wget --quiet http://s3.amazonaws.com/doc/s3-example-code/s3-curl.zip
unzip -qq s3-curl.zip
sudo mkdir -p /usr/local/aws/s3/bin/
sudo rsync -a --no-o --no-g s3-curl/ /usr/local/aws/s3/bin/
sudo chmod 755 /usr/local/aws/s3/bin/s3curl.pl

AWS Data Pipeline command line tools:

wget --quiet https://s3.amazonaws.com/datapipeline-us-east-1/software/latest/DataPipelineCLI/datapipeline-cli.zip
unzip -qq datapipeline-cli.zip
sudo rsync -a --no-o --no-g datapipeline-cli/ /usr/local/aws/datapipeline/

Now that we have all of the software installed under /usr/local/aws we need to set up the AWS credentials and point the tools to where they can find everything.

Set up AWS Credentials and Environment

Create a place to store the secret AWS credentials:

mkdir -m 0700 $HOME/.aws-default/

Copy your AWS X.509 certificate and private key to this subdirectory. These files will have names that look something like this:

$HOME/.aws-default/cert-7KX4CVWWQ52YM2SUCIGGHTPDNDZQMVEF.pem
$HOME/.aws-default/pk-7KX4CVWWQ52YM2SUCIGGHTPDNDZQMVEF.pem

Create the file $HOME/.aws-default/aws-credential-file.txt with your AWS access key id and secret access key in the following format:

AWSAccessKeyId=<insert your AWS access id here>
AWSSecretKey=<insert your AWS secret access key here>

Create the file $HOME/.aws-default/aws-credentials.json in the following format:

{
  "access-id": "<insert your AWS access id here>",
  "private-key": "<insert your AWS secret access key here>",
  "key-pair": "<insert the name of your Amazon ec2 key-pair here>",
  "key-pair-file": "<insert the path to the .pem file for your Amazon ec2 key pair here>",
  "region": "<The region where you wish to launch your job flows. Should be one of us-east-1, us-west-1, us-west-2, eu-west-1, ap-southeast-1, or ap-northeast-1, sa-east-1>",
  "use-ssl": "true",
  "log-uri": "s3://yourbucket/datapipelinelogs"
}

Create the file $HOME/.aws-secrets in the following format:

%awsSecretAccessKeys = (
  'default' => {
    id => '<insert your AWS access id here>',
    key => '<insert your AWS secret access key here>',
  },
);

Create a symbolic link for s3curl to find its hardcoded config file and secure the file permissions:

ln -s $HOME/.aws-secrets $HOME/.s3curl
chmod 600 $HOME/.aws-default/* $HOME/.aws-secrets

Add the following lines to your $HOME/.bashrc file so that the AWS command line tools know where to find themselves and the credentials. We put the new directories in the front of the $PATH so that we run these instead of any similar tools installed by Ubuntu packages.

export JAVA_HOME=/usr
export EC2_HOME=/usr/local/aws/ec2
export AWS_IAM_HOME=/usr/local/aws/iam
export AWS_RDS_HOME=/usr/local/aws/rds
export AWS_ELB_HOME=/usr/local/aws/elb
export AWS_CLOUDFORMATION_HOME=/usr/local/aws/cfn
export AWS_AUTO_SCALING_HOME=/usr/local/aws/as
export CS_HOME=/usr/local/aws/cloudsearch
export AWS_CLOUDWATCH_HOME=/usr/local/aws/cloudwatch
export AWS_ELASTICACHE_HOME=/usr/local/aws/elasticache
export AWS_SNS_HOME=/usr/local/aws/sns
export AWS_ROUTE53_HOME=/usr/local/aws/route53
export AWS_CLOUDFRONT_HOME=/usr/local/aws/cloudfront

for i in $EC2_HOME $AWS_IAM_HOME $AWS_RDS_HOME $AWS_ELB_HOME \
  $AWS_CLOUDFORMATION_HOME $AWS_AUTO_SCALING_HOME $CS_HOME \
  $AWS_CLOUDWATCH_HOME $AWS_ELASTICACHE_HOME $AWS_SNS_HOME \
  $AWS_ROUTE53_HOME $AWS_CLOUDFRONT_HOME /usr/local/aws/s3
do
  PATH=$i/bin:$PATH
done
PATH=/usr/local/aws/elasticbeanstalk/eb/linux/python2.7:$PATH
PATH=/usr/local/aws/elasticmapreduce:$PATH
PATH=/usr/local/aws/datapipeline:$PATH

export EC2_PRIVATE_KEY=$(echo $HOME/.aws-default/pk-*.pem)
export EC2_CERT=$(echo $HOME/.aws-default/cert-*.pem)
export AWS_CREDENTIAL_FILE=$HOME/.aws-default/aws-credential-file.txt
export ELASTIC_MAPREDUCE_CREDENTIALS=$HOME/.aws-default/aws-credentials.json
export DATA_PIPELINE_CREDENTIALS=$HOME/.aws-default/aws-credentials.json
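
One side effect of prepending to $PATH in .bashrc is that every time the file is sourced again, the same directories are added once more. A possible refinement, sketched here with a hypothetical helper named path_prepend, is to add a directory only when it is not already listed:

```shell
# Sketch: prepend a directory to PATH only if it is not already there, so
# that sourcing .bashrc repeatedly does not keep growing PATH.
# path_prepend is a hypothetical helper name.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;                  # already present, nothing to do
    *) PATH="$1:$PATH" ;;
  esac
}

path_prepend /usr/local/aws/ec2/bin
path_prepend /usr/local/aws/ec2/bin    # second call leaves PATH unchanged
echo "$PATH"
```

In the loop above, the line PATH=$i/bin:$PATH could then be written as path_prepend "$i/bin".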

Set everything up in your current shell:

source $HOME/.bashrc

Test

Make sure that the command line tools are installed and have credentials set up correctly. These commands should not return errors:

ec2-describe-regions
ec2-ami-tools-version
iam-accountgetsummary
rds-describe-db-engine-versions
elb-describe-lb-policies
cfn-list-stacks
cs-describe-domain
mon-version
elasticache-describe-cache-clusters
eb --version
elastic-mapreduce --list --all
sns-list-topics
dnscurl.pl --keyname default https://route53.amazonaws.com/2010-10-01/hostedzone | xmllint --format -
cfcurl.pl --keyname default https://cloudfront.amazonaws.com/2008-06-30/distribution | xmllint --format -
s3curl.pl --id default http://s3.amazonaws.com/ | xmllint --format -
datapipeline --list-pipelines
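
If one of these fails with "command not found", the problem is the $PATH rather than the credentials. A quick way to tell the two apart is to first check which commands the shell can find at all. This is only a sketch; missing_tools is a hypothetical helper, and the list of names passed to it is just a sample of the commands above.

```shell
# Sketch: print the names of any expected commands that are not on PATH.
# missing_tools is a hypothetical helper name.
missing_tools() {
  for tool in "$@"
  do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing_tools ec2-describe-regions iam-accountgetsummary mon-version \
  sns-list-topics s3curl.pl
```

No output means every listed command was found; any name that is printed points at a tool whose bin directory is missing from $PATH.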

Are you aware of any other command line tools provided by Amazon? Let other readers know in the comments on this article.

[Update 2012-09-06: New URL for ElastiCache tools. Thanks iknewitalready]

[Update 2012-12-21: Added AWS Data Pipeline command line tools. May break Elastic MapReduce due to Ruby version conflict.]

Amazon just announced two related features for getting super-fast, consistent performance with EBS volumes: (1) Provisioned IOPS EBS volumes, and (2) EBS-Optimized Instances.

Starting new instances and EBS volumes with these features is fine, but what if you already have some running instances you’d like to upgrade for faster and more consistent disk performance?

Given the two AWS features, there are two separate powers that need to be engaged to take full advantage:

  1. Convert the EBS volume(s) from standard EBS volumes into new Provisioned IOPS EBS volume(s).

  2. Convert the standard EC2 instance into an EBS-Optimized instance.

This article demonstrates how to take an existing EBS boot instance that is already running and convert it to use both of these two EBS performance features. Note that there will be some increased costs; please study Amazon’s published pricing before attempting.

Demo Setup

For this demo, we start a temporary EBS boot instance (Ubuntu 12.04 LTS). Save the instance id and EBS volume id:

    zone=us-east-1d
    ec2-run-instances --availability-zone $zone --key $USER --instance-type m1.small ami-013f9768
    instance_id=...

    ec2-describe-instances $instance_id
    volume_id=...

Steps

Here are the steps to take a running EBS boot instance and convert it into an EBS-Optimized Instance with a Provisioned IOPS EBS volume.

  1. Stop the EC2 instance (and wait for it to stop):

    ec2-stop-instances $instance_id
    
  2. Detach the original (non-Provisioned IOPS) EBS volume(s):

    ec2-detach-volume $volume_id
    
  3. Snapshot the original EBS volume(s) and save the snapshot ids:

    ec2-create-snapshot $volume_id
    snapshot_id=...
    
  4. Create the Provisioned IOPS EBS volume(s) from the snapshot(s) in the same availability zone as the instance, specifying the new size in GB, and specifying the IOPS level that you require:

    ec2-create-volume --availability-zone $zone --size 10 --type io1 --iops 100 --snapshot $snapshot_id
    new_volume_id=...
    
  5. Attach the new Provisioned IOPS EBS volume(s) to the instance:

    ec2-attach-volume --instance $instance_id --device /dev/sda1 $new_volume_id
    
  6. If the instance type is not already one of the ones that supports EBS-Optimized instances, then you’ll need to change it to one that is. For this example, we’ll use m1.large:

    ec2-modify-instance-attribute --instance-type m1.large $instance_id
    
  7. Convert the EC2 instance to EBS-Optimized:

    ec2-modify-instance-attribute --ebs-optimized True $instance_id
    
  8. Start the EBS-Optimized EC2 instance with its new, attached Provisioned IOPS EBS volume(s):

    ec2-start-instances $instance_id
    

If you had an Elastic IP address associated with the instance before you stopped it, now’s the time to re-associate it.

Cleanup

When you’re comfortable with the new provisioned IOPS EBS volume, delete the original EBS volume and its snapshot:

ec2-delete-volume $volume_id
ec2-delete-snapshot $snapshot_id

Terminate any test instance you started to experiment with in this demo:

ec2-terminate-instances $instance_id

Since you manually created the new EBS volume that was attached to the test instance, it will not be automatically deleted when the instance is terminated, so you must delete it manually:

ec2-delete-volume $new_volume_id

Notes

  • In order for these commands to work with these features, you must be running the latest version of the EC2 API command line tools (or at least v1.6.1.1).

  • The new Provisioned IOPS EBS volume must be at least 10 GB.

  • The size of the new Provisioned IOPS EBS volume in GB must be at least 1/10th the value of the IOPS you are requesting. For example, 1000 IOPS requires an EBS volume of at least 100 GB.

  • If you’re running an official Ubuntu AMI, then your root file system will automatically be extended to the new size of the EBS volume. Other distros might need a little resize2fs or xfs_growfs to get the benefit.
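
The two size rules in the notes above are easy to trip over when picking a --size value. Here is a minimal sketch of a shell helper that works out the smallest allowed volume size for a given IOPS level. The function name min_piops_volume_gb is made up for this example, and the 10 GB floor and 10:1 ratio are simply the limits listed above, which Amazon may change over time.

```shell
# Hypothetical helper (not part of the EC2 API tools): print the smallest
# Provisioned IOPS volume size in GB for the requested IOPS level, assuming
# the limits noted above (10 GB minimum, size in GB >= IOPS / 10).
min_piops_volume_gb() {
  piops_requested=$1
  # Divide by 10, rounding up to the next whole GB.
  piops_size=$(( (piops_requested + 9) / 10 ))
  # Enforce the 10 GB floor.
  if [ "$piops_size" -lt 10 ]; then
    piops_size=10
  fi
  echo "$piops_size"
}

# Example:
#   min_piops_volume_gb 1000
```

The result could then be passed as the --size value when creating the volume, as long as it is not smaller than the snapshot you are restoring from.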

Did you know that Amazon includes status messages about the health of availability zones in the output of the ec2-describe-availability-zones command, the associated API call, and the AWS console?

Right now, Amazon is restoring power to a “large number of instances” in one availability zone in the us-east-1 region due to “electrical storms in the area”.

Since the names used for specific availability zones differ between AWS accounts, Amazon can’t just say that the affected zone is us-east-1c as it might be us-east-1e in another account.

During this outage, you can find out what the name of the affected availability zone is in your AWS account by running this command (installation instructions):

ec2-describe-availability-zones

Here is the output for one of my accounts showing that the zone is named us-east-1b.

AVAILABILITYZONE    us-east-1a  available   us-east-1   
AVAILABILITYZONE    us-east-1b  impaired    us-east-1   EC2 and EBS APIs are once again operating normally. We are continuing to recover impacted instances and volumes.
AVAILABILITYZONE    us-east-1c  available   us-east-1   
AVAILABILITYZONE    us-east-1d  available   us-east-1   
AVAILABILITYZONE    us-east-1e  available   us-east-1

and here is the output for another account, showing that the zone is named us-east-1c.

AVAILABILITYZONE    us-east-1a  available   us-east-1   
AVAILABILITYZONE    us-east-1b  available   us-east-1   
AVAILABILITYZONE    us-east-1c  impaired    us-east-1   EC2 and EBS APIs are once again operating normally. We are continuing to recover impacted instances and volumes.
AVAILABILITYZONE    us-east-1d  available   us-east-1   
AVAILABILITYZONE    us-east-1e  available   us-east-1
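
If you only care about the zones that are in trouble, the output above can be filtered down to just their names. This is a rough sketch, assuming the column layout shown in the listings above (zone name in the second column, state in the third). The function name impaired_zones is my own invention for this example.

```shell
# Hypothetical helper: read ec2-describe-availability-zones output on stdin
# and print the name of every zone whose state is anything other than
# "available". Assumes the column layout shown in the listings above.
impaired_zones() {
  awk '$1 == "AVAILABILITYZONE" && $3 != "available" { print $2 }'
}

# Example (needs the EC2 API tools and credentials set up):
#   ec2-describe-availability-zones | impaired_zones
```

Because it prints nothing when every zone is available, the output is easy to test for in a cron job or monitoring script.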

If you’re not a command line person, you can also check on the AWS console, which for one of my accounts, shows this right now:

AWS console snapshot

You can generally find more details on the progression of Amazon’s investigation and repair on the AWS Status page. That page also has links for RSS feeds like this one: EC2 us-east-1 Service Status

Since the availability zone status information is available through the command line and API, has anybody written plugins for Nagios or similar monitoring software so that we can send alerts to our teams when Amazon marks availability zones as impaired?

Update 2012-06-30: Jim Browne has taken the challenge and created a Nagios plugin for ec2-describe-availability-zones.

Update 2012-06-30: It looks like the ec2-describe-availability-zones messages are not updated nearly as frequently as the AWS status page. An hour ago the AWS status page changed to say “EC2 instances and EBS volumes are operating normally”, but ec2-describe-availability-zones still says “We are continuing to work to recover the remaining EC2 instances, EBS volumes and ELBs.”

Here are the steps for installing the AWS command line tools that are currently available as Ubuntu packages. These include:

  • EC2 API tools
  • EC2 AMI tools
  • IAM - Identity and Access Management
  • RDS - Relational Database Service
  • CloudWatch
  • Auto Scaling
  • ElastiCache

Starting with Ubuntu 12.04 LTS Precise, these are also available:

  • CloudFormation
  • ELB - Elastic Load Balancer

Install Packages

Enable the multiverse repository. This can be done through the Ubuntu Update Manager or by editing /etc/apt/sources.list. Here are some commands that will enable multiverse on a new installation:

# 12.04 LTS Precise, 11.10 Oneiric
sudo perl -pi.orig -e   'next if /-backports/; s/^# (deb .* multiverse)$/$1/'   /etc/apt/sources.list

# 10.04 LTS Lucid
sudo perl -pi.orig -e   's/^(deb .* universe)$/$1 multiverse/'   /etc/apt/sources.list
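
Since the perl one-liners above depend on what the stock sources.list looks like, it may be worth confirming that multiverse actually ended up enabled. Here is a small sketch of a check. The function name has_multiverse is hypothetical, and the pattern assumes the usual one-line "deb URL suite components" format.

```shell
# Hypothetical helper: succeed if the given sources.list file contains at
# least one uncommented "deb" line that lists the multiverse component.
has_multiverse() {
  grep -Eq '^[[:space:]]*deb[[:space:]].*[[:space:]]multiverse([[:space:]]|$)' "$1"
}

# Example:
#   has_multiverse /etc/apt/sources.list && echo "multiverse is enabled"
```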

Enable the awstools PPA and update the apt package index:

sudo apt-add-repository ppa:awstools-dev/awstools
sudo apt-get update

Install available AWS command line tool packages:

sudo apt-get install ec2-api-tools ec2-ami-tools iamcli rdscli moncli ascli elasticache

# Also available on Ubuntu 12.04 LTS Precise
sudo apt-get install aws-cloudformation-cli elbcli

Some of these tools support passing in credentials on the command line, but for regular use, you will want to store the AWS credentials in files.

Set up AWS Credentials

Create a place to store the AWS credentials:

mkdir -m 0700 $HOME/.aws/

Copy your AWS X.509 certificate and private key to this subdirectory. These files will have names that look something like this:

$HOME/.aws/cert-7KX4CVWWQ52YM2SUCIGGHTPDNDZQMVEF.pem
$HOME/.aws/pk-7KX4CVWWQ52YM2SUCIGGHTPDNDZQMVEF.pem

Create the file $HOME/.aws/aws-credential-file.txt with your AWS access key id and secret access key in the following format:

AWSAccessKeyId=YOURACCESSKEYIDHERE
AWSSecretKey=YOURPRIVATEACCESSKEYHERE

Add the following lines to your $HOME/.bashrc file so that the AWS command line tools know where to find the above files:

# AWS credentials
export EC2_PRIVATE_KEY=$(echo $HOME/.aws/pk-*.pem)
export EC2_CERT=$(echo $HOME/.aws/cert-*.pem)
export AWS_CREDENTIAL_FILE=$HOME/.aws/aws-credential-file.txt

Make sure these are set in your current shell(s):

source $HOME/.bashrc
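
One thing to watch for with the wildcard approach above: if there is no matching .pem file, or more than one, the variable ends up holding something that is not a real file name. Here is a minimal sketch of a sanity check for the three variables. The function name check_aws_credential_env is made up for this example.

```shell
# Hypothetical helper: confirm that each credential variable set in .bashrc
# names exactly one readable file. Prints one line per problem found and
# returns non-zero if any check fails.
check_aws_credential_env() {
  cred_status=0
  for cred_var in EC2_PRIVATE_KEY EC2_CERT AWS_CREDENTIAL_FILE; do
    # Look up the value of the variable whose name is in $cred_var.
    eval "cred_value=\${$cred_var:-}"
    if [ -z "$cred_value" ]; then
      echo "$cred_var is not set"
      cred_status=1
    elif [ ! -f "$cred_value" ] || [ ! -r "$cred_value" ]; then
      # Also catches an unmatched wildcard (a literal pk-*.pem) and several
      # matches joined by spaces, neither of which is a real file name.
      echo "$cred_var does not point to a readable file: $cred_value"
      cred_status=1
    fi
  done
  return "$cred_status"
}

# Example:
#   check_aws_credential_env && echo "credential files look OK"
```

This only checks that the files exist and can be read. It does not verify that the keys inside them are valid, which is what the test commands in the next section are for.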

Test

Make sure that the command line tools are installed and have credentials set up correctly. These commands should not return errors:

ec2-describe-regions 
ec2-ami-tools-version
iam-accountgetsummary
rds-describe-db-engine-versions
mon-version
as-version

# Ubuntu 12.04 LTS Precise and higher
cfn-list-stacks
elb-describe-lb-policies
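
If one of these fails with "command not found", the package that provides it probably did not install. As a quick first pass before running the commands themselves, a sketch like the following lists which of them are missing from the PATH. The function name missing_commands is hypothetical.

```shell
# Hypothetical helper: print each named command that cannot be found on
# the PATH. Prints nothing if all of them are installed.
missing_commands() {
  for cmd_name in "$@"; do
    command -v "$cmd_name" >/dev/null 2>&1 || echo "$cmd_name"
  done
}

# Example, using the command names from the test list above:
#   missing_commands ec2-describe-regions ec2-ami-tools-version \
#     iam-accountgetsummary rds-describe-db-engine-versions \
#     mon-version as-version
```

This only tells you whether the tools are installed. You still need to run them to find out whether the credentials are set up correctly.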

AWS Command Line Tools

The table below shows some of the various AWS products, whether Amazon publishes command line tools, and whether these are available in key Ubuntu releases. Some of the packages are available in the standard apt repositories, some require adding multiverse, and some are published in the awstools PPA. The awstools PPA also has newer versions of some of the packages released by Amazon after the official Ubuntu release.

AWS Service | Amazon Command Line Tools | Ubuntu 12.04 LTS Precise | Ubuntu 11.10 Oneiric | Ubuntu 10.04 LTS Lucid
EC2 API Tools | AWS CLI | multiverse | multiverse, PPA updates | multiverse, PPA updates
EC2 AMI Tools | AWS CLI | multiverse | multiverse, PPA updates | multiverse, PPA updates
IAM - Identity and Access Management | AWS CLI | main | main | PPA
RDS - Relational Database Service | AWS CLI | main | main | PPA
CloudWatch | AWS CLI | PPA | PPA | PPA
Auto Scaling | AWS CLI | PPA | PPA | PPA
ElastiCache | AWS CLI | PPA | PPA | PPA
ELB - Elastic Load Balancing | AWS CLI | PPA | - | -
AWS CloudFormation | AWS CLI | PPA | - | -
AWS Import/Export | AWS CLI | - | - | -
CloudFront | AWS CLI | - | - | -
CloudSearch | AWS CLI | - | - | -
Elastic Beanstalk | AWS CLI | - | - | -
SNS - Simple Notification Service | AWS CLI | - | - | -
EMR - Elastic MapReduce | AWS CLI | - | - | -
Route 53 | AWS CLI | - | - | -
S3 - Simple Storage Service | AWS CLI | - | - | -
SES - Simple Email Service | | - | - | -
Direct Connect | - | - | - | -
DynamoDB | - | - | - | -
SimpleDB | - | - | - | -
SQS - Simple Queue Service | - | - | - | -
Storage Gateway | - | - | - | -
SWF (Simple Workflow Service) | - | - | - | -
VPC (Virtual Private Cloud) | - | - | - | -

As you can see, there are a number of command line tools that are not (yet) packaged in Ubuntu. These can be downloaded directly from Amazon and installed manually.

There are also a number of AWS services that do not have command line tools available from Amazon, though some third parties have provided helpful alternatives.

[Update 2012-09-03: Added links to command line tools for S3, SNS]
[Update 2013-03-10: Added CloudWatch, Auto Scaling, ElastiCache]

I will be attending the Ubuntu Developer Summit (UDS) next week in Oakland, CA.  This event brings people from around the world together in one place every six months to discuss and plan for the next release of Ubuntu.  The May 2012 UDS is for Ubuntu-Q which will eventually be named and become Ubuntu 12.10 when it is released in October (2012-10).

I’ve attended two UDS events in person prior to this, one held at Google (Mountain View) for Ubuntu Jaunty (9.04) and one in Dallas for Ubuntu Lucid (10.04). UDS wanders around the world to mix it up and get input from a wide variety of contributors. I’m not a fan of flying long distances, so I tend to wait until UDS comes to within a couple of hours of my home in Los Angeles.

My primary involvement at UDS is to contribute my perspectives to the plans for Ubuntu as it relates to running on Amazon EC2 and interacting with other features of AWS, though I also have interest in general Ubuntu server functionality.  I’ve been running Ubuntu on servers since 2005, and Ubuntu servers on EC2 since 2007.

I am grateful to Canonical for sponsoring my trip to and stay at UDS as they do for many community members.  I continue to be impressed by how Ubuntu is developed in such an open fashion with Canonical’s support.

All community members interested in learning about how Ubuntu is developed and/or interested in helping give input to the future of Ubuntu are welcome to participate in UDS. You can either attend in person as I will be, or you can participate online.  Be sure to register (free) at the UDS site.

Taking a full week off for UDS is a little much for me, so I’ll be attending three full days (Wed-Fri). Will I see you there or online? What feedback and suggestions would you have for running Ubuntu on EC2?

More Entries

Uploading Known ssh Host Key in EC2 user-data Script
The ssh protocol uses two different keys to keep you secure: The user ssh key is the one we normally…
Seeding Torrents with Amazon S3 and s3cmd on Ubuntu
Amazon Web Services is such a huge, complex service with so many products and features that sometimes very simple but…
CloudCamp
There are a number of CloudCamp events coming up in cities around the world. These are free events, organized around…
Use the Same Architecture (64-bit) on All EC2 Instance Types
A few hours ago, Amazon AWS announced that all EC2 instance types can now run 64-bit AMIs. Though t1.micro, m1.small,…
ec2-consistent-snapshot on GitHub and v0.43 Released
The source for ec2-consistent-snapshot has historically been available here: ec2-consistent-snapshot on Launchpad.net using Bazaar For your convenience, it is now…
You Should Use EBS Boot Instances on Amazon EC2
EBS boot vs. instance-store If you are just getting started with Amazon EC2, then use EBS boot instances and stop…
Retrieve Public ssh Key From EC2
A serverfault poster had a problem that I thought was a cool challenge. I had so much fun coming up…
Running EC2 Instances on a Recurring Schedule with Auto Scaling
Do you want to run short jobs on Amazon EC2 on a recurring schedule, but don’t want to pay for…
AWS Virtual MFA and the Google Authenticator for Android
Amazon just announced that the AWS MFA (multi-factor authentication) now supports virtual or software MFA devices in addition to the…
Updated EBS boot AMIs for Ubuntu 8.04 Hardy on Amazon EC2 (2011-10-06)
Canonical has released updated instance-store AMIs for Ubuntu 8.04 LTS Hardy on Amazon EC2. Read Ben Howard’s announcement on the…
New Release of Alestic Git Server
New AMIs have been released for the Alestic Git Server. Major upgrade points include: Base operating system upgraded to Ubuntu…
Using ServerFault.com for Amazon EC2 Q&A
The Amazon EC2 Forum has been around since the beginning of EC2 and has always been a place where you…
Rebooting vs. Stop/Start of Amazon EC2 Instance
When you reboot a physical computer at your desk it is very similar to shutting down the system, and booting…
Upper Limits on Number of Amazon EC2 Instances by Region
[Update: As predicted, these numbers are already out of date and Amazon has added more public IP address ranges for…
Unavailable Availability Zones on Amazon EC2
I’m taking a class about using Chef with EC2 by Florian Drescher today and Florian mentioned that he noticed one…
Desktop AMI login security with NX
Update 2011-08-04: Amazon Security did more research and investigated the desktop AMIs. They have confirmed that their software incorrectly flagged…
Updated EBS boot AMIs for Ubuntu 8.04 Hardy on Amazon EC2
For folks still using the old, reliable Ubuntu 8.04 LTS Hardy from 2008, Canonical has released updated AMIs for use…
Creating Public AMIs Securely for EC2
Amazon published a tutorial about best practices in creating public AMIs for use on EC2 last week: How To Share…
Canonical Releases Ubuntu 11.04 Natty for Amazon EC2
As steady as clockwork, Ubuntu 11.04 Natty is released on the day scheduled at least eleven months ago; and thanks…
EC2 Reserved Instance Offering IDs Change Over Time
This article is a followup to Matching EC2 Availability Zones Across AWS Accounts written back in 2009. Please read that…