November 2009 Archives

Ubuntu Karmic Desktop on EC2


As Thilo Maier pointed out in comments on my request for UDS input, I have been publishing both server and desktop AMIs for running Ubuntu on EC2 up through Jaunty, but the official Karmic AMIs on EC2 only support server installations by default.

Ubuntu makes it pretty easy to install the desktop software on a server, and NX from NoMachine makes it pretty easy to access that desktop remotely, with near real-time interactivity even over slowish connections.

Here’s a quick guide to setting this up, starting with an Ubuntu 9.10 Karmic AMI on Amazon EC2:

  1. Create a user-data script which installs runurl (not on Karmic AMIs by default) and then runs a simple desktop and NX server installation script. Examine the desktop script to see what it’s doing to install the software.

    cat <<EOM >install-desktop
    #!/bin/bash -ex
    wget -qO/usr/bin/runurl run.alestic.com/runurl
    chmod 755 /usr/bin/runurl
    runurl run.alestic.com/install/desktop
    EOM
  2. Start an instance on EC2 telling it to run the above user-data script on first boot. The following example uses the current 32-bit Karmic server AMI. Make sure you’re using the latest AMI id.

    ec2-run-instances \
      --key YOURKEY \
      --user-data-file install-desktop \
      ami-1515f67c
  3. Connect to the new instance and wait for it to complete the desktop software installation (when sshd is restarted). This takes about 30 minutes on an m1.small instance and 10 minutes on a c1.medium instance. Then generate and set a secure password for the ubuntu user using copy/paste from the pwgen output. Save the secure password so you can enter it into the NX client later.

    ssh -i YOURKEY.pem ubuntu@THEHOST
    tail -f /var/log/syslog | egrep --line-buffered user-data:
    pwgen -s 16 1
    sudo passwd ubuntu

    If anybody knows how to use ssh keys with NX, I’d love to do this instead of using passwords.

  4. Back on your local system, install and run the NX client. For computers not running Ubuntu, download the appropriate software from NoMachine.

    sudo dpkg -i nxclient_3.4.0-5_i386.deb
    /usr/NX/bin/nxclient --wizard &

    Point the NX Client to the external hostname of your EC2 instance. Enter the Login “ubuntu” and the Password from above. Choose the “Gnome” desktop.

If all goes well, you should have a complete and fresh Ubuntu desktop filling most of your screen, available for you to mess around with and then throw away.

    ec2-terminate-instances INSTANCEID

If you want to have a persistent desktop with protection from crashes, you’ll need to learn how to do things like placing critical directories on EBS volumes.
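
The full topic deserves its own article, but the basic mechanics look something like this (volume id, instance id, device, size, and zone are all placeholder examples):

```shell
# From your local machine: create a volume in the instance's
# availability zone and attach it to the running instance.
ec2-create-volume --size 10 -z us-east-1a
ec2-attach-volume vol-VVVVVVVV -i i-IIIIIIII -d /dev/sdh

# On the instance: create a filesystem (first use only) and mount it.
sudo mkfs.ext3 /dev/sdh
sudo mkdir -p /vol
sudo mount /dev/sdh /vol
```

You would then move or symlink the directories you care about (home directories, databases) onto the mounted volume so they survive instance termination.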

If you’d like to run KDE on EC2, replace the package “ubuntu-desktop” with “kubuntu-desktop” in the installation script.

Ubuntu Developer Summit - EC2 Lucid


For the last year I have been working with Canonical and the Ubuntu server team, helping to migrate over to a more official process what I’ve been doing for the community in supporting Ubuntu on EC2. The Ubuntu 9.10 Karmic EC2 images are a fantastic result of this team’s work: An Ubuntu image running on real Ubuntu kernels with official support available.

For the next two days (Nov 14-15) I will be participating in the Ubuntu Developer Summit (UDS) in Dallas, Texas. Developers from around the world will be gathering this week to scope and define the next version of Ubuntu which will be released in April 2010.

As part of this, I think it would be helpful to gather input from the community. Please use the comment section of this article to share what you would like to see happen with the direction of Ubuntu on EC2 in the coming release(s).

Are there any features which you find missing? Functionality which would be helpful? Problems which keep getting in your way?

Feel free to brainstorm and toss out ideas big and small, even if they are not completely thought through or if they would also take help from Amazon to complete. It may already be too late to start off some of the proposals for the Lucid release cycle, but having ideas to think about for future releases never hurts.

The ec2-consistent-snapshot software tries its best to flush and lock a MySQL database on an EC2 instance while it initiates the EBS snapshot, and for many environments it does a pretty good job.
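
The flush-and-lock sequence looks roughly like this (a simplified sketch; the real tool holds a single MySQL connection open so the read lock stays in effect while the snapshot is initiated, and the volume id here is a placeholder):

```shell
# What ec2-consistent-snapshot does, conceptually:
#
#   FLUSH TABLES WITH READ LOCK;   -- block writes, flush table data
#   ec2-create-snapshot vol-VVVVVVVV
#   UNLOCK TABLES;                 -- resume normal activity
#
# All wrapped up in one command:
ec2-consistent-snapshot --mysql vol-VVVVVVVV
```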

However, there are situations where the database may spend time performing crash recovery from the log file when it is started from a copy of the snapshot. We are seeing this behavior in an environment where the database is constantly active and we have innodb_log_file_size set (probably too) high. The delay is doubtless exacerbated by the fact that the blocks on the new EBS volume are being recovered from S3 as it is being built from the snapshot.

Google has created an innodb_disallow_writes MySQL patch which I think points out the problem we may be hitting.

“Note that it is not sufficient to run FLUSH TABLES WITH READ LOCK as there are background IO threads used by InnoDB that may still do IO.”

It would be very nice to have this patch incorporated in MySQL on Ubuntu. It looks like the OurDelta folks have already incorporated the patch. [Update: See rsimmons’ comment below which explains why this particular patch might not be the answer.]

In any case, when we bring up a database using an EBS volume created from an EBS snapshot of an active database, it can take up to 45 minutes recovering before it lets normal clients connect. This is too long for us so we’re trying a new approach.

The ec2-consistent-snapshot now has a --mysql-stop option which shuts down the MySQL server, initiates the snapshot, and then restarts the database. Our hope is that this will get us a snapshot which can be restored and run without delay. If any MySQL experts can point out the potential flaws in this, please do.
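
For example, on the snapshot host (volume id and description are placeholders; check the tool's documentation for the exact option names in your version):

```shell
# Stop mysqld cleanly, initiate the EBS snapshot, then restart mysqld:
ec2-consistent-snapshot --mysql-stop \
  --description "hourly snapshot of backup slave" \
  vol-VVVVVVVV
```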

Since we obviously can’t stop and start our production database every hour, we are performing this snapshot activity on a replication slave that is dedicated to snapshots and backups.

We continue to perform occasional snapshots on the production database EBS volume just to help keep it reliable per Amazon’s instructions, but we don’t expect to be able to restore it without crash recovery.

If you’d like to test the new --mysql-stop option, please upgrade your ec2-consistent-snapshot package from the Alestic PPA and let me know how it goes.
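
On Karmic, the upgrade looks roughly like this (assuming the PPA is registered under the alestic user on Launchpad):

```shell
sudo add-apt-repository ppa:alestic
sudo apt-get update
sudo apt-get install ec2-consistent-snapshot
```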

Understanding Access Credentials for AWS/EC2


Amazon Web Services (AWS) has a dizzying proliferation of credentials, keys, ids, usernames, certificates, passwords, and codes which are used to access and control various account and service features and functionality. I have never met an AWS user who, when they started, did not have trouble figuring out which ones to use when and where, much less why.

Amazon is fairly consistent across the documentation and interfaces in the specific terms they use for the different credentials, but nowhere have I found these all listed in one place. (Update: Shlomo pointed out Mitch Garnaat’s article on this topic which, upon examination, may even have been my subconscious inspiration for this. Mitch goes into a lot of great detail in his two part post.)

Pay close attention to the exact names so that you use the right credentials in the right places.

(1) AWS Email Address and (2) Password. This pair is used to log in to your AWS account on the AWS web site. Through this web site you can access and change information about your account, including billing information. You can view the account activity. You can control many of the AWS services through the AWS console. And you can generate and view a number of the other important access keys listed in this article. You may also be able to order products with this account, so be careful. You should obviously protect your password. What you do with your email address is your business. Both of these values may be changed as needed.

(3) MFA Authentication Code. If you have ordered and activated a multi-factor authentication device, then parts of the AWS site will be protected not only by the email address and password described above, but also by an authentication code. This is a 6 digit code displayed on your device which changes every 30 seconds or so. The AWS web site will prompt you for this code after you successfully enter your email address and password.

(4) AWS Account Number. This is a 12 digit number separated with dashes in the form 1234-5678-9012. You can find your account number under your name on the top right of most pages on the AWS web site (when you are logged in). This number is not secret and may be available to other users in certain circumstances. I don’t know of any situation where you would use the number in this format with dashes, but it is needed to create the next identifier:

(5) AWS User ID. This is a 12 digit number with no dashes. In fact, it is simply the previously mentioned AWS Account Number with the dashes removed (e.g., 123456789012). Your User ID is needed by some API and command line tools, for example when bundling a new image with ec2-bundle-vol. It can also be entered into the ElasticFox plugin to help display the owners of public AMIs. Again, your User ID does not need to be kept private. It is shown to others when you publish an AMI and make it public, though it might take some detective work to figure out who the number really belongs to if you don’t publicize that, too.

(6) AWS Access Key ID and (7) Secret Access Key. This is the first of two pairs of credentials which can be used to access and control basic AWS services through the API including EC2, S3, SimpleDB, CloudFront, SQS, EMR, RDS, etc. Some interfaces use this pair, and some use the next pair below. Pay close attention to the names requested. The Access Key ID is 20 alpha-numeric characters like 022QF06E7MXBSH9DHM02 and is not secret; it is available to others in some situations. The Secret Access Key is 40 alpha-numeric-slash-plus characters like kWcrlUX5JEDGM/LtmEENI/aVmYvHNif5zB+d9+ct and must be kept very secret.

You can change your Access Key ID and Secret Access Key if necessary. In fact, Amazon recommends regular rotation of these keys by generating a new pair, switching applications to use the new pair, and deactivating the old pair. If you forget either of these, they are both available from AWS.
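
Many command line tools and libraries read this pair from environment variables (variable names vary by tool; these are common ones, and the values below are the fake examples from above):

```shell
# Fake example values from this article; substitute your own.
export AWS_ACCESS_KEY_ID=022QF06E7MXBSH9DHM02
export AWS_SECRET_ACCESS_KEY=kWcrlUX5JEDGM/LtmEENI/aVmYvHNif5zB+d9+ct
```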

(8) X.509 Certificate and (9) Private Key. This is the second pair of credentials that can be used to access the AWS API (SOAP only). The EC2 command line tools generally need these as might certain 3rd party services (assuming you trust them completely). These are also used to perform various tasks for AWS like encrypting and signing new AMIs when you build them. These are the largest credentials, taking the form of short text files with long names like cert-OHA6ZEBHMCGZ66CODVHEKKVOCYWISYCS.pem and pk-OHA6ZEBHMCGZ66CODVHEKKVOCYWISYCS.pem respectively.

The Certificate is supposedly not secret, though I haven’t found any reason to publicize it. The Private Key should obviously be kept private. Amazon keeps a copy of the Certificate so they can confirm your requests, but they do not store your Private Key, so don’t lose it after you generate it. Two of these pairs can be associated with your account at any one time, so they can be rotated as often as you rotate the Access Key ID and Secret Access Key. Keep records of all historical keys in case you have a need for them, like unencrypting an old AMI bundle which you created with an old Certificate.
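
The EC2 command line tools find this pair through environment variables, for example (file locations are just a common convention):

```shell
# Point the EC2 API tools at your Private Key and Certificate files:
export EC2_PRIVATE_KEY=$HOME/.ec2/pk-OHA6ZEBHMCGZ66CODVHEKKVOCYWISYCS.pem
export EC2_CERT=$HOME/.ec2/cert-OHA6ZEBHMCGZ66CODVHEKKVOCYWISYCS.pem
```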

(10) Linux username. When you ssh to a new EC2 instance you need to connect as a user that already exists on that system. For almost all public AMIs, this is the root user, but on Ubuntu AMIs published by Canonical, you need to connect using the ubuntu user. Once you gain access to the system, you can create your own users.

(11) public ssh key and (12) private ssh key. These are often referred to as a keypair in EC2. The ssh keys are used to make sure that only you can access your EC2 instances. When you run an instance, you specify the name of the keypair and the corresponding public key is provided to that instance. When you ssh to the above username on the instance, you specify the private key so the instance can authenticate you and let you in.

You can have multiple ssh keypairs associated with a single AWS account; they are created through the API or with tools like the ec2-add-keypair command. The private key must be protected as anybody with this key can log in to your instances. You generally never see or deal with the public key as EC2 keeps this copy and provides it to the instances. You download the private key and save it when it is generated; Amazon does not keep a record of it.
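
A typical flow with the command line tools looks something like this (the keypair name is an example):

```shell
# Create a new keypair; EC2 prints the private key to stdout.
ec2-add-keypair mykey
# Save the private key material (the BEGIN...END block) to a file,
# say mykey.pem, then protect it:
chmod 600 mykey.pem
# Later, connect to an instance launched with --key mykey:
ssh -i mykey.pem ubuntu@THEHOST
```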

(13) ssh host key. Just to make things interesting, each EC2 instance which you run will have its own ssh host key. This is a private file generated by the host on first boot which is used to protect your ssh connection to the instance so it cannot be intercepted and read by other people. In order to make sure that your connection is secure, you need to verify that the (14) ssh host key fingerprint, which is provided to you on your first ssh attempt, matches the fingerprint listed in the console output of the EC2 instance.
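
One way to do the comparison (instance id is a placeholder):

```shell
# Dump the console output and look for the host key fingerprints
# printed during the instance's first boot:
ec2-get-console-output i-IIIIIIII | grep -i fingerprint
# Compare against the fingerprint ssh reports on your first connection.
```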

At this point you may be less or more confused, but here are two rules which may help:

A. Create a file or folder where you jot down and save all of the credentials associated with each AWS account. You don’t want to lose any of these, especially the ones which Amazon does not store for you. Consider encrypting this information with (yet another) secure passphrase.

B. Pay close attention to what credentials are being asked for by different tools. A “secret access key” is different from a “private key” is different from a “private ssh key”. Use the above list to help sort things out.

Use the comment section below to discuss what I missed in this overview.
