runurl - A Tool and Approach for Simplifying user-data Scripts on EC2

| 6 Comments

Many Ubuntu and Debian images for Amazon EC2 include a hook where scripts passed as user-data will be run as root on the first boot.

At Campus Explorer, we’ve been experimenting with an approach where the actual user-data is a very short script which downloads and runs other scripts. This idea is not new, but I have simplified the process by creating a small tool named runurl which adds a lot of flexibility and convenience when configuring new servers.

Usage

The basic synopsis looks like:

runurl URL [ARGS]...

The first argument to the runurl command is the URL of a script or program which should be run. All following options and arguments are passed verbatim to the program as its options and arguments. The exit code of runurl is the exit code of the program.

The runurl command is a very short and simple script, but it makes the user-data startup scripts even shorter and simpler themselves.

Example 1

If the following content is stored at http://run.alestic.com/demo/echo

#!/bin/bash
echo "$@"

then this command:

runurl run.alestic.com/demo/echo "hello, world"

will itself output:

hello, world

You can specify the “http://” in the URLs, but since it’s using wget to download them, the specifier is not necessary and the code might be easier to read without it.

Example 2

Here’s a more substantial sample user-data script which invokes a number of other remote scripts to upgrade the Ubuntu packages, install the munin monitoring software, install and run the Folding@Home application using origami with credit going to Team Ubuntu. It finally sends an email back home that it’s active.

This sample assumes that runurl is installed on the AMI (e.g., Ubuntu AMIs published on http://alestic.com>). For other AMIs, see below for additional commands to add to the start of the script.

#!/bin/bash -ex
runurl run.alestic.com/apt/upgrade
runurl run.alestic.com/install/munin
cd /root
runurl run.alestic.com/install/folding@home -u ec2 -t 45104 -b small
runurl run.alestic.com/email/start youremail@example.com

Note that the last command passes a parameter to the script, identifying where the email should be sent. Please change this if you test the script.

With the above content stored in a file named folding.user-data, you could start 5 new c1.medium instances running the Folding@Home software using the command:

ec2-run-instances   --user-data-file folding.user-data   --key [KEYPAIR]   --instance-type c1.medium  --instance-count 5   ami-ed46a784

You can log on to an instance and monitor the installation with

tail -f /var/log/syslog

Once the Folding@Home application is running, you can monitor its progress with:

/root/origami/origami status

and after 15 minutes, check out the Munin system stats at

http://ec2-HOSTNAME/munin/

Expiring URLs

One of the problems with normal user-data scripts is that the contents exist as long as the instance is running and any user on the instance can read the contents of the user-data. This puts any private or confidential information in the user-data at risk.

If you put your actual startup code in private S3 buckets, you can pass runurl a URL to the contents, where the URL expires shortly after it is run. Or, the script could even delete the contents itself if you set it up correctly. This reduces the exposure to the time it takes for the instance to start up and does not let anybody else access the URL during that time.

Updating

Another benefit of keeping the actual startup code separate from the user-data content itself is that you can modify the startup code stored at the URL without modifying the user-data content.

This can be useful with services like EC2 Auto Scaling, where the specified user-data cannot be dynamically changed in a launch configuration without creating a whole new launch configuration.

If you modify the runurl scripts, the next server to be launched will automatically pick up the new instructions.

Bootstrapping

The runurl tool is pre-installed in the latest Ubuntu AMIs published on http://alestic.com. If you are using an Ubuntu image which does not include this software, you can install it from the Alestic PPA using the following commands at the top of your user-data script:

sudo add-apt-repository ppa:alestic/ppa &&
sudo apt-get update &&
sudo apt-get install -y runurl

If you are using an Ubuntu release without the add-apt-repository command or a Linux distro other than Ubuntu, you can install runurl using the following commands:

sudo wget -qO/usr/bin/runurl run.alestic.com/runurl
sudo chmod 755 /usr/bin/runurl

The subsequent commands in the user-data script can then use the runurl command as demonstrated in the above example.

SSL

To improve your certainty that you are talking to the right server and getting the right data, you could use SSL (https) in your URLs. If you are talking to S3 buckets, however, you’ll need to use the old style S3 bucket access style like:

runurl https://s3.amazonaws.com/run.alestic.com/demo/echo "hello, mars"

This is probably not as critical when accessing it from an EC2 instance as you’re operating over Amazon’s trusted network.

Caveats

There are a number of things which can go wrong when using a tool like runurl. Here are some to think about:

  • Only run content which you control or completely trust.

  • Just because you like the content of a URL when you look at it in your browser does not mean that it will still look like that when your instance goes to run it. It could change at any point to something that is broken or even malicious unless it is under your control.

  • If you depend on this approach for serious applications, you need to make sure that the content you are downloading is coming from a reliable server. S3 is reasonable (with retries) but you also need to consider the DNS server if you are depending on a non-AWS hostname to access the S3 bucket.

The name run.alestic.com points to an S3 bucket, but the DNS for this name is not redundant or worthy of use by applications with serious uptime requirements. This particular service should be considered my playground for ideas and there is no commitment on my part to make sure that it is up or that the content remains stable.

If you like what you see, please feel free to copy any of the open source content on run.alestic.com and store it on your own reliable and trusted servers. It is all published under the Apache2 license.

Project

I’m using this simple script as an opportunity to come up to speed with hosting projects on Launchpad. You can access the source code and submit bugs at

https://launchpad.net/runurl

You can also use launchpad and bazaar to branch the source into parallel projects and/or submit requests to merge patches into the main development branch.

[Update 2009-10-11: Document use of Alestic PPA]
[Update 2010-01-25: Simplify boostrap instructions for Ubuntu]
[Update 2010-08-17: Switch to using “add-apt-repository” for bootstrap instructions]

6 Comments


I just tried this with ami-714ba518, which is the 32bit ubuntu ebs image for Lucid 10.04. Didn't seem to work.

Upon login, runurl is not available after "updatedb" and then "locate runurl". I also looked in /etc/init.d and ec2-run-user-data seems to be missing.

http://ec2ubuntu.googlecode.com/svn/trunk/etc/init.d/ec2-run-user-data

Is this an oversight, or are some ami's built with these and some without?

Thanks,
Sean

Sean: The Ubuntu images published by Canonical (which I recommend using) do not come with runurl installed. You can install it on bootup inside a user-data script. See "Bootstrapping" above for ideas to get started.

ec2-run-user-data is legacy code and is replaced by other software in the ec2-init package.


So user data scripts in general should start with this line: (ec2-init or cloud-init seems to already be there)

wget -O- run.alestic.com/install/runurl | bash


Also with permissions on S3 objects set to Authenticated Users, only pre-signed urls seem to work. Are all of your server spinup install scripts public?

Thanks again for help & suggestions. Great software!

-Sean

Sean: run.alestic.com is my sandbox for fooling around with runurl. I recommend against depending on it for production purposes. You can install runurl from the PPA on Launchpad if you want to use it to run your own startup programs.

For some situations I do use pre-signed URLs and even expiring URLs.

Eric:

This is great stuff. Interestingly, I have over the winter spent some time implementing something similar to this and was about to publish it when I found this. Good that it already exists. Runurl is probably more general purpose than what I had done.

The Ubuntu hook that runs user-data as a shell script. How common is that behavior outside the Ubuntu world? With runurl, I can essentially quit building my own images and just use a short user-data script plus scripts on my web server. But for other Linux flavors, will I still have to create custom images of, say, CentOS to put in the ec2-init or cloud-init functionality? Or can I find such images readily of any Linux distribution?

hingo:

Support for user-data scripts is growing across distributions. Amazon Linux now supports user-data scripts with the inclusion of the Ubuntu cloud-init package in their AMIs. I believe the new RightScale AMIs also support this.

I'm not aware of any distributions that include runurl software in the AMIs yet.

Leave a comment

Ubuntu AMIs

Ubuntu AMIs for EC2:


More Entries

EC2 create-image Does Not Fully "Stop" The Instance
The EC2 create-image API/command/console action is a convenient trigger to create an AMI from a running (or stopped) EBS boot instance. It takes a snapshot of the instance’s EBS volume(s)…
Finding the Region for an AWS Resource ID
use concurrent AWS command line requests to search the world for your instance, image, volume, snapshot, … Background Amazon EC2 and many other AWS services are divided up into various…
Changing The Default "ubuntu" Username On New EC2 Instances
configure your own ssh username in user-data The official Ubuntu AMIs create a default user with the username ubuntu which is used for the initial ssh access, i.e.: ssh ubuntu@<HOST>…
Default ssh Usernames For Connecting To EC2 Instances
Each AMI publisher on EC2 decides what user (or users) should have ssh access enabled by default and what ssh credentials should allow you to gain access as that user.…
New c3.* Instance Types on Amazon EC2 - Nice!
Worth switching. Amazon shared that the new c3.* instance types have been in high demand on EC2 since they were released. I finally had a minute to take a look…
Query EC2 Account Limits with AWS API
Here’s a useful tip mentioned in one of the sessions at AWS re:Invent this year. There is a little known API call that lets you query some of the EC2…
Using aws-cli --query Option To Simplify Output
My favorite session at AWS re:Invent was James Saryerwinnie’s clear, concise, and informative tour of the aws-cli (command line interface), which according to GitHub logs he is enhancing like crazy.…
Reset S3 Object Timestamp for Bucket Lifecycle Expiration
use aws-cli to extend expiration and restart the delete or archive countdown on objects in an S3 bucket Background S3 buckets allow you to specify lifecycle rules that tell AWS…
Installing aws-cli, the New AWS Command Line Tool
consistent control over more AWS services with aws-cli, a single, powerful command line tool from Amazon Readers of this tech blog know that I am a fan of the power…
Using An AWS CloudFormation Stack To Allow "-" Instead Of "+" In Gmail Email Addresses
Launch a CloudFormation template to set up a stack of AWS resources to fill a simple need: Supporting Gmail addresses with “-” instead of “+” separating the user name from…
New Options In ec2-expire-snapshots v0.11
The ec2-expire-snapshots program can be used to expire EBS snapshots in Amazon EC2 on a regular schedule that you define. It can be used as a companion to ec2-consistent-snapshot or…
Replacing a CloudFront Distribution to "Invalidate" All Objects
I was chatting with Kevin Boyd (aka Beryllium) on the ##aws Freenode IRC channel about the challenge of invalidating a large number of CloudFront objects (35,000) due to a problem…
Email Alerts for AWS Billing Alarms
using CloudWatch and SNS to send yourself email messages when AWS costs accrue past limits you define The Amazon documentation describes how to use the AWS console to monitor your…
Cost of Transitioning S3 Objects to Glacier
how I was surprised by a large AWS charge and how to calculate the break-even point Glacier Archival of S3 Objects Amazon recently introduced a fantastic new feature where S3…
Running Ubuntu on Amazon EC2 in Sydney, Australia
Amazon has announced a new AWS region in Sydney, Australia with the name ap-southeast-2. The official Ubuntu AMI lookup pages (1, 2) don’t seem to be showing the new location…
Save Money by Giving Away Unused Heavy Utilization Reserved Instances
You may be able to save on future EC2 expenses by selling an unused Reserved Instance for less than its true value or even $0.01, provided it is in the…
Installing AWS Command Line Tools from Amazon Downloads
When you need an AWS command line toolset not provided by Ubuntu packages, you can download the tools directly from Amazon and install them locally. In a previous article I…
Convert Running EC2 Instance to EBS-Optimized Instance with Provisioned IOPS EBS Volumes
Amazon just announced two related features for getting super-fast, consistent performance with EBS volumes: (1) Provisioned IOPS EBS volumes, and (2) EBS-Optimized Instances. Starting new instances and EBS volumes with…
Which EC2 Availability Zone is Affected by an Outage?
Did you know that Amazon includes status messages about the health of availability zones in the output of the ec2-describe-availability-zones command, the associated API call, and the AWS console? Right…
Installing AWS Command Line Tools Using Ubuntu Packages
See also: Installing AWS Command Line Tools from Amazon Downloads Here are the steps for installing the AWS command line tools that are currently available as Ubuntu packages. These include:…
Ubuntu Developer Summit, May 2012 (Oakland)
I will be attending the Ubuntu Developer Summit (UDS) next week in Oakland, CA. ┬áThis event brings people from around the world together in one place every six months to…
Uploading Known ssh Host Key in EC2 user-data Script
The ssh protocol uses two different keys to keep you secure: The user ssh key is the one we normally think of. This authenticates us to the remote host, proving…
Seeding Torrents with Amazon S3 and s3cmd on Ubuntu
Amazon Web Services is such a huge, complex service with so many products and features that sometimes very simple but powerful features fall through the cracks when you’re reading the…
CloudCamp
There are a number of CloudCamp events coming up in cities around the world. These are free events, organized around the various concepts, technologies, and services that fall under the…
Use the Same Architecture (64-bit) on All EC2 Instance Types
A few hours ago, Amazon AWS announced that all EC2 instance types can now run 64-bit AMIs. Though t1.micro, m1.small, and c1.medium will continue to also support 32-bit AMIs, it…
ec2-consistent-snapshot on GitHub and v0.43 Released
The source for ec2-conssitent-snapshot has historically been available here: ec2-consistent-snapshot on Launchpad.net using Bazaar For your convenience, it is now also available here: ec2-consistent-snapshot on GitHub using Git You are…
You Should Use EBS Boot Instances on Amazon EC2
EBS boot vs. instance-store If you are just getting started with Amazon EC2, then use EBS boot instances and stop reading this article. Forget that you ever heard about instance-store…
Retrieve Public ssh Key From EC2
A serverfault poster had a problem that I thought was a cool challenge. I had so much fun coming up with this answer, I figured I’d share it here as…
Running EC2 Instances on a Recurring Schedule with Auto Scaling
Do you want to run short jobs on Amazon EC2 on a recurring schedule, but don’t want to pay for an instance running all the time? Would you like to…
AWS Virtual MFA and the Google Authenticator for Android
Amazon just announced that the AWS MFA (multi-factor authentication) now supports virtual or software MFA devices in addition to the physical hardware MFA devices like the one that’s been taking…