I was chatting with Kevin Boyd (aka Beryllium) on the ##aws Freenode IRC channel about the challenge of invalidating a large number of CloudFront objects (35,000). The cached copies of the objects were out of date, and the system had not been designed with versioning in the object path or name.

In addition to the work to perform all of these invalidations (in batches of up to 1,000 in each request with at most 3 requests outstanding) there is also the issue of cost. The first thousand CloudFront invalidations are free in a month, but the remaining 34,000 invalidations in this case would cost $170 (at $0.005 for each object).
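As a rough check of those numbers, here is a small Python sketch of the arithmetic. The function names are made up for illustration, and the free allowance, per-object price, and batch size are simply the figures quoted above, not values looked up from AWS.

```python
from decimal import Decimal

FREE_PER_MONTH = 1000                # invalidations that are free each month (figure quoted above)
PRICE_PER_OBJECT = Decimal("0.005")  # dollars per additional invalidated object
MAX_BATCH = 1000                     # objects per invalidation request


def invalidation_cost(objects):
    """Dollar cost of invalidating `objects` paths within one month."""
    billable = max(0, objects - FREE_PER_MONTH)
    return billable * PRICE_PER_OBJECT


def invalidation_requests(objects):
    """Number of requests needed at MAX_BATCH objects per request."""
    return -(-objects // MAX_BATCH)  # ceiling division


print(invalidation_cost(35000))      # 170.000
print(invalidation_requests(35000))  # 35
```

So the 35,000 objects would need 35 full-size requests, submitted a few at a time, on top of the $170.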

It occurred to me that one could take advantage of the on-demand nature of AWS by using the following approach:

  1. Create a new CloudFront distribution, set up exactly like the existing distribution (except that the new distribution caches would be empty).

  2. Change the application to point to the new CloudFront distribution domain when referring to the objects.

Step 2 consists of a simple DNS change, assuming that you use your own domain name (e.g., cdn.example.com) when referring to the CloudFront objects in your web site or application, and where that domain name is a CNAME reference to the actual CloudFront distribution.

As soon as this is completed (preferably with a short DNS TTL), clients will start hitting the new CloudFront distribution, and it will fill up with the new versions of the objects.

After a while, you would then destroy and stop paying for the original CloudFront distribution that is no longer being referenced or used.

Simply replacing the CloudFront distribution effectively “invalidates” all of the objects at once, with no charges for invalidation requests and very little effort.

Once again, AWS wins with the principles of on-demand, pay for what you use, throw away what you don’t need.

using CloudWatch and SNS to send yourself email messages when AWS costs accrue past limits you define

The Amazon documentation describes how to use the AWS console to monitor your estimated charges using Amazon CloudWatch and includes some pointers for folks using the command line. Unfortunately, that article leaves out the commands to set up the SNS (Simple Notification Service) topics and SNS subscriptions, so I present here the complete steps I use.

I like using the command line tools as they let me automate and repeat actions without having to do lots of pointing, clicking, and re-entering data. For example, I want to set up a number of billing alerts in each new account, sometimes at $10 increments, and sometimes at $100 or $1000 increments. The steps below let me do this in seconds with a simple copy and paste.

When I get one of these billing alert emails, I glance at the day of the month to see if that account’s charges are progressing on an appropriate pace or if they require further investigation.
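One way to turn that glance into a number is a simple linear projection. This is only a sketch: the function name is hypothetical, and it assumes charges accrue evenly through the month, which one-time charges will break.

```python
import calendar
from datetime import date


def projected_month_total(charges_so_far, on_date):
    """Linear projection of the month-end bill from charges accrued so far."""
    days_in_month = calendar.monthrange(on_date.year, on_date.month)[1]
    return charges_so_far * days_in_month / on_date.day


# Hypothetical example: the $300 alert arrives on November 10th.
print(projected_month_total(300, date(2012, 11, 10)))  # 900.0
```

If the projection is far above what the account normally costs, that is the cue to investigate.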

This was the mechanism that recently alerted me to the extra charges involved in automating the transition of S3 objects to Glacier.

Once you’ve installed the AWS command line tools, here are the steps to set up automated billing alert emails.

Billing Alerts

Create an SNS topic where billing alert notifications will be sent.

snstopic=$(sns-create-topic BillingAlert)
echo snstopic=$snstopic

Subscribe your email address to the SNS topic so you receive email messages when your AWS billing estimates exceed each trigger point.

email=YOURNAME@YOURDOMAIN.com
sns-subscribe "$snstopic" --protocol email --endpoint "$email"

Check your mailbox for a confirmation email from Amazon and click on the link to complete your subscription to this SNS topic.

Create CloudWatch monitor alarms for the AWS billing estimated charges at each dollar figure where you want to be alerted. This example sets alarms at every $100 increment up to $1000, but you can change this to any values you’d like.

for amount in 100 200 300 400 500 600 700 800 900 1000
do
  mon-put-metric-alarm "awsbilling-$amount" \
    --alarm-description "AWS billing alarm: \$$amount" \
    --namespace AWS/Billing \
    --metric-name EstimatedCharges \
    --evaluation-periods 1 \
    --period 21600 \
    --statistic Maximum \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --dimensions "Currency=USD" \
    --threshold "$amount" \
    --actions-enabled true \
    --alarm-actions "$snstopic"
done
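If you want different increments per account ($10, $100, or $1000 steps), the list of thresholds can be generated instead of typed. The Python sketch below only builds the command lines as text, using the same mon-put-metric-alarm options as the loop above; the function name and the topic ARN are placeholders I made up for illustration.

```python
import shlex


def billing_alarm_commands(step, maximum, sns_topic):
    """Return one mon-put-metric-alarm command line per dollar threshold."""
    commands = []
    for amount in range(step, maximum + 1, step):
        argv = [
            "mon-put-metric-alarm", f"awsbilling-{amount}",
            "--alarm-description", f"AWS billing alarm: ${amount}",
            "--namespace", "AWS/Billing",
            "--metric-name", "EstimatedCharges",
            "--evaluation-periods", "1",
            "--period", "21600",
            "--statistic", "Maximum",
            "--comparison-operator", "GreaterThanOrEqualToThreshold",
            "--dimensions", "Currency=USD",
            "--threshold", str(amount),
            "--actions-enabled", "true",
            "--alarm-actions", sns_topic,
        ]
        commands.append(shlex.join(argv))  # quotes arguments safely for the shell
    return commands


# Placeholder ARN; use the value of $snstopic from the earlier step.
topic = "arn:aws:sns:us-east-1:123456789012:BillingAlert"
for line in billing_alarm_commands(10, 50, topic):
    print(line)
```

The printed lines can be reviewed and then pasted into a shell that has the CloudWatch command line tools installed.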

See the CloudWatch monitor alarms you have created:

mon-describe-alarms --headers

Now spend lots of money with AWS and monitor your inbox for email alerts.

Cleanup

The above sample commands may incur minimal charges in your account (SNS Pricing, CloudWatch Pricing). If you don’t want to keep these alerts in place, you will need to undo what was set up.

Delete the alarms you created (replace with your specific trigger values used above).

for amount in 100 200 300 400 500 600 700 800 900 1000
do
  mon-delete-alarms "awsbilling-$amount" --force
done

Delete the SNS topic.

sns-delete-topic "$snstopic" --force

Notes

  • In order to follow these examples, you will need to install the AWS command line tools, at least for SNS and CloudWatch.

  • It may take half a day or more for a billing alarm to be triggered based on how AWS collects billing data and how the alarms are set.

  • Make sure you confirm your subscription to the SNS topic by clicking on the link in the confirmation email AWS sends to you, or Amazon will send no email billing alerts.

  • Each alert email will have an unsubscribe link in it for your convenience. This will unsubscribe you from all of the alerts, not just the specific cost level in that particular email.

  • Amazon’s documentation states that you must first “enable the monitoring of estimated charges” in each account. I just tested this with a new account and found that this was not necessary, so the documentation may be a bit out of date.

how I was surprised by a large AWS charge and how to calculate the break-even point

Glacier Archival of S3 Objects

Amazon recently introduced a fantastic new feature where S3 objects can be automatically migrated over to Glacier storage based on the S3 bucket, the key prefix, and the number of days after object creation.

This makes it trivially easy to drop files in S3, have fast access to them for a while, then have them automatically saved to long-term storage where they can’t be accessed as quickly, but where the storage charges are around a tenth of the price.

…or so I thought.

S3 Lifecycle Rule

My first use of this feature was on some buckets where I store about 350 GB of data that fits the Glacier use pattern perfectly: I want to save it practically forever, but expect to use it rarely.

It was straightforward to use the S3 Console to add a lifecycle rule to the S3 buckets so that all objects are archived to Glacier after 60 days:

[Screenshot: S3 Lifecycle Rule]

(Long time readers of this blog may be surprised I didn’t list the command lines to accomplish this task, but Amazon has not yet released useful S3 tools that include the required functionality.)

Since all of the objects in the buckets were more than 60 days old, I expected them to be transitioned to Glacier within a day, and true to Amazon’s documentation, this occurred on schedule.

Surprise Charge

What I did not expect was an email alert from my AWS billing alarm monitor on this account letting me know that I had just passed $200 for the month, followed a few hours later by an alert for $300, followed by an alert for a $400 trigger.

This is one of my personal accounts, so a rate of several hundred dollars a day is not sustainable. Fortunately, a quick investigation showed that this increase was due to one time charges, so I wasn’t about to run up a $10k monthly bill.

The line item on the AWS Activity report showed the source of the new charge:

$0.05 per 1,000 Glacier Requests x 5,306,220 Requests = $265.31

It had not occurred to me that there would be much of a charge for transitioning the objects from S3 to Glacier. I should have read the S3 Pricing page, where Amazon states:

Glacier Archive and Restore Requests: $0.05 per 1,000 requests

This is five times as expensive as the initial process of putting objects into S3, which is $0.01 per 1,000 PUT requests.

There is one “archive request” for each S3 object that is transitioned from S3 to Glacier, and I had over five million objects in these buckets, something I didn’t worry about previously because my monthly S3 charges were based on the total GB, not the number of objects.
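The line item is easy to reproduce. Here is a short Python sketch using only the request count and the per-1,000 prices quoted above; the function and constant names are mine.

```python
from decimal import Decimal

ARCHIVE_PRICE_PER_1000 = Decimal("0.05")  # Glacier archive and restore requests
PUT_PRICE_PER_1000 = Decimal("0.01")      # S3 PUT requests


def request_cost(requests, price_per_1000):
    """Dollar cost of `requests` requests, rounded to cents."""
    cost = Decimal(requests) * price_per_1000 / 1000
    return cost.quantize(Decimal("0.01"))


objects = 5306220
print(request_cost(objects, ARCHIVE_PRICE_PER_1000))  # 265.31
print(request_cost(objects, PUT_PRICE_PER_1000))      # 53.06
```

The second figure is what the same number of PUT requests would cost, which is where the "five times as expensive" comparison comes from.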

Overhead per Glacier Object

josh.monet has pointed out in the comments that Amazon has documented some Glacier storage overhead:

For each S3 object migrated to Glacier, Amazon adds “an additional 32 KB of Glacier data plus an additional 8 KB of S3 standard storage data”.

Storage for this overhead is charged at standard Glacier and S3 prices. This makes Glacier completely unsuitable for small objects.

Break-even Point

After stopping to think about it, I realized that I was still saving money in the long term by moving objects in these S3 buckets to Glacier storage. This one-time up front cost was going to be compensated for slowly by my monthly savings, because Glacier is cheap, even compared to the reasonably cheap S3 storage costs, at least for larger files.

Here are the results of my calculations:

  • Monthly cost of storing in S3: 350 GB x $0.095/GB = $33.25

  • Monthly cost of storing in Glacier: $8.97

    • 350 GB x $0.01/GB = $3.50
    • Glacier overhead: 5.3 million * 32 KB * $0.01/GB = $1.62
    • S3 overhead: 5.3 million * 8 KB * $0.095/GB = $3.85
  • One time cost to transition 5.3 million objects from S3 to Glacier: $265

  • Months until I start saving money by moving to Glacier: 11

  • Savings per year after first 11 months: $291 (73%)
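For anyone who wants to plug in their own numbers, the calculation above can be sketched in a few lines of Python. The variable names are mine, and it assumes a GB is 2^30 bytes, which is the assumption that reproduces the figures in the list.

```python
GB = 2 ** 30                 # bytes per GB (assumption; it reproduces the figures above)

OBJECTS = 5_306_220          # objects transitioned to Glacier
DATA_GB = 350                # total data stored
S3_PRICE = 0.095             # $ per GB-month
GLACIER_PRICE = 0.01         # $ per GB-month
ARCHIVE_PRICE = 0.05 / 1000  # $ per archive request

s3_monthly = DATA_GB * S3_PRICE
glacier_monthly = (
    DATA_GB * GLACIER_PRICE                       # the data itself
    + OBJECTS * 32 * 1024 / GB * GLACIER_PRICE    # 32 KB Glacier overhead per object
    + OBJECTS * 8 * 1024 / GB * S3_PRICE          # 8 KB S3 overhead per object
)
transition_cost = OBJECTS * ARCHIVE_PRICE
monthly_savings = s3_monthly - glacier_monthly
break_even_months = transition_cost / monthly_savings

print(f"S3 per month:      ${s3_monthly:.2f}")       # $33.25
print(f"Glacier per month: ${glacier_monthly:.2f}")  # $8.97
print(f"Break-even months: {break_even_months:.1f}")  # 10.9
print(f"Savings per year:  ${monthly_savings * 12:.0f} "
      f"({monthly_savings / s3_monthly:.0%})")       # $291 (73%)
```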

For this data’s purpose, everything eventually works out to an advantage, so thanks, Amazon! I will, however, think twice before doing this with other types of buckets, just to make sure that the data is large enough and is going to be sitting around long enough in Glacier to be worth the transition costs.

As it turns out, the primary factor in how long it takes to break even is the average size of the S3 objects. If the average size of my data files were larger, then I would start saving money sooner.

Here’s the formula… The number of months to break even and start saving money when transferring S3 objects to Glacier is:

break-even months = 631,613 / (average S3 object size in bytes - 13,011)

(units apologies to math geeks)

In my case, the average size of the S3 objects was 70,824 bytes (about 70 KB). Applying the above formula:

631,613 / (70,824 - 13,011) = 10.9

or about 11 months until the savings in Glacier over S3 covers the cost of moving my objects from S3 to Glacier.

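If you are storing 1 KB data records in S3, transitioning them to Glacier never pays off. Even ignoring the per-object storage overhead, it would take over 50 years to recoup the request charges; with the overhead included, each object costs more per month in Glacier than it did in S3.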

Looking closely at the above formula, you can see that any object 13 KB or smaller is going to cost more to transition to Glacier rather than leaving it in S3. Files approaching that size are going to save too little money to justify the transfer costs.

The above formula assumes an S3 storage cost of $0.095 per GB per month in us-east-1. If you are storing more than a TB, then you’re into the $0.08 tier or lower, so your break-even point will take longer and you’ll want to do more calculations to find your savings.
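The two constants in the formula fall out of the prices and overheads quoted earlier, so they can be recomputed for a different S3 price tier. The following Python sketch shows one way to do that; the function names are hypothetical and a GB is again assumed to be 2^30 bytes.

```python
GB = 2 ** 30                  # bytes per GB (assumption)
REQUEST_PRICE = 0.05 / 1000   # $ per archive request, one per object
GLACIER_PRICE = 0.01          # $ per GB-month
GLACIER_OVERHEAD = 32 * 1024  # extra Glacier bytes stored per object
S3_OVERHEAD = 8 * 1024        # extra S3 bytes stored per object


def break_even_constants(s3_price=0.095):
    """Return (numerator, size threshold in bytes) of the break-even formula."""
    saving_per_byte = (s3_price - GLACIER_PRICE) / GB  # $ saved per byte-month
    overhead_cost = (GLACIER_OVERHEAD * GLACIER_PRICE
                     + S3_OVERHEAD * s3_price) / GB    # $ per object-month
    return REQUEST_PRICE / saving_per_byte, overhead_cost / saving_per_byte


def break_even_months(avg_object_bytes, s3_price=0.095):
    """Months until Glacier savings cover the transition cost, or None if never."""
    numerator, threshold = break_even_constants(s3_price)
    if avg_object_bytes <= threshold:
        return None  # overhead eats all of the savings
    return numerator / (avg_object_bytes - threshold)


numerator, threshold = break_even_constants()
print(round(numerator), round(threshold))             # 631613 13011
print(round(break_even_months(70824), 1))             # 10.9
print(break_even_months(1024))                        # None
print(round(break_even_months(70824, s3_price=0.08), 1))  # 13.5
```

The last line shows the effect of the cheaper $0.08 tier: the same 70,824-byte average object takes about 13.5 months to break even instead of about 11.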

[Update 2012-12-19: Included additional S3 and Glacier storage overhead per item. Thanks to josh.monet for pointing us to this information buried in the S3 FAQ.]

Amazon has announced a new AWS region in Sydney, Australia with the name ap-southeast-2.

The official Ubuntu AMI lookup pages (1, 2) don’t seem to be showing the new location yet, but the official Ubuntu AMI query API does seem to be working, so the new ap-southeast-2 Ubuntu AMIs are available for lookup on Alestic.com.

[Update 2012-11-13: Canonical has fixed the primary Ubuntu AMI lookup page and I understand it should remain more up to date going forward, but the other page is still missing ap-southeast-2]

Point and Click

At the top right of most pages on Alestic.com is an “Ubuntu AMIs” section. Simply select the EC2 region from the pulldown (say “ap-southeast-2” for Sydney, Australia) and you will see a list of the official 64-bit Ubuntu AMI ids for the various active Ubuntu releases.

Both EBS boot and instance-store AMI ids are listed, but I recommend you start with EBS boot AMIs.

To launch a listed Ubuntu AMI, simply click on the orange arrow to the right of the AMI id and you will be taken to the EC2 section of the AWS console with the AMI id selected:

[Screenshot: Ubuntu AMI ids on Alestic.com]

The AWS console walks you through setting up required ssh keys and security groups and even has a point and click way to ssh to your instance, provided you have Java in your browser (I disable that).

Command Line

You can also launch Ubuntu AMIs with the EC2 command line tools. First, make sure you upload your ssh key to the new Sydney, Australia EC2 region using something like:

ec2-import-keypair --region ap-southeast-2 --public-key-file $HOME/.ssh/id_rsa.pub $USER

If you haven’t already, open the ssh port on your default security group:

ec2-authorize --region ap-southeast-2 default -p 22

Then, to launch Ubuntu 12.04 LTS Precise EBS boot, you would use a command like:

ec2-run-instances --region ap-southeast-2 --key $USER --instance-type t1.micro ami-fb8611c1

where you should always look up and use the most recent AMI id. Make a note of the instance id.

Wait a few seconds for the instance to be assigned an IP address and to start booting, then find out what the IP address was with:

ec2-describe-instances --region ap-southeast-2 <INSTANCEID>

If you ran the instance with your uploaded personal ssh key, you can then access the Ubuntu server using

ssh ubuntu@<IPADDRESS>

where <IPADDRESS> is the public IP address of the instance (does not start with “10.”).

Always remember to terminate your temporary EC2 instances when you are done with them so you don’t keep paying charges:

ec2-terminate-instances --region ap-southeast-2 <INSTANCEID>

Note: The official Ubuntu AMI ids listed on Alestic.com are created, published, and supported by Canonical, an official sponsor of Ubuntu. (Alestic.com/Eric Hammond used to publish community Ubuntu AMIs for EC2 starting in 2007, but that fun job was transferred to Canonical back in 2009.)

You may be able to save on future EC2 expenses by selling an unused Reserved Instance for less than its true value or even $0.01, provided it is in the “Heavy Utilization” class.

The description of the Heavy Utilization Reserved Instance includes this statement:

you pay […] a significantly lower hourly usage fee, and you’re charged that lower hourly rate for every hour in the Reserved Instance term you purchase [emphasis added]

What may not be clear to the casual reader is the fact that when you purchase a Heavy Utilization Reserved Instance, you commit not only to paying the one-time up front cost, but you are also committing to paying the hourly charge for every hour of every month, even if you are not running a matching instance!

The Light Utilization and Medium Utilization descriptions state:

Light [and Medium] Utilization RIs allow you to turn off your instance at any point and not pay the hourly fee.

This statement is conspicuously missing from the Heavy Utilization description.

If you buy a 12 month Heavy Utilization Reserved Instance and you only run the matching instance for 5 months and then terminate it, you still pay hourly instance charges each of the next 7 months as if you were still running the instance.

How to Save

Amazon just opened the Reserved Instance Marketplace, which lets you sell the remaining months on a Reserved Instance that you no longer need. After you list a Reserved Instance, it will be shown to EC2 customers when they look to purchase a Reserved Instance, right along with Amazon’s normal offerings.

When you list a Reserved Instance for sale, Amazon suggests a price that matches what the Reserved Instance would be worth to somebody purchasing the remaining months.

[Screenshot: AWS Reserved Instance Marketplace purchase listing]

Though you would surely prefer to get that value (minus Amazon’s 12% commission) for the Reserved Instance, you might consider listing it for substantially less, especially while the Marketplace has fewer buyers and listings take longer to sell.

For every month that your Heavy Utilization Reserved Instance does not sell on the Reserved Instance Marketplace, you end up paying Amazon the hourly instance charges for the month, even if you are not running a matching instance.

If you give away the Heavy Utilization Reserved Instance by listing it for $0.01 it should sell fairly quickly in a rational marketplace, and you will not have to pay the hourly instance charges for the remaining months.

Thus, giving away a Heavy Utilization Reserved Instance could save you a bundle if you no longer need it.
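To put a rough number on "a bundle", here is a Python sketch. The hourly rate and months remaining are hypothetical (look up the actual rate for your Reserved Instance), 730 hours per month is an average I am assuming, and the sketch assumes the listing sells right away and that you would otherwise leave the reservation unused.

```python
HOURS_PER_MONTH = 730          # assumption: average hours in a month (8,760 / 12)
MARKETPLACE_COMMISSION = 0.12  # Amazon's cut of the sale price, as noted above


def remaining_hourly_charges(hourly_rate, months_left):
    """Hourly charges owed for the rest of a Heavy Utilization term."""
    return hourly_rate * HOURS_PER_MONTH * months_left


def benefit_of_selling(list_price, hourly_rate, months_left):
    """Sale proceeds after commission plus the hourly charges no longer owed."""
    proceeds = list_price * (1 - MARKETPLACE_COMMISSION)
    return proceeds + remaining_hourly_charges(hourly_rate, months_left)


# Hypothetical numbers: $0.05/hour with 7 months left on the term.
print(round(remaining_hourly_charges(0.05, 7), 2))  # 255.5
print(round(benefit_of_selling(0.01, 0.05, 7), 2))  # 255.51
```

In this made-up case, nearly all of the benefit comes from the avoided hourly charges rather than from the sale price itself.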

If there is a non-zero chance you may want to run a matching instance before the end of the term, then the Reserved Instance has some positive value to you and you might consider holding on to it or listing it for a higher price, even though you need to pay when the instance is not running.

Figuring out the optimal listing price between $0.01 and a price near Amazon’s suggested listing price is left as an exercise for the reader.
