Ubuntu 10.10 Maverick Released for Amazon EC2


Does anybody really need me to tell them that you can now run a copy of the newly released Ubuntu 10.10 Maverick on Amazon EC2 with official AMIs published by Canonical?

Or, by now, perhaps you have come to expect—like I have—that the smoothly oiled machine will naturally pump out Ubuntu AMIs for EC2 on the same pre-scheduled date that the larger Ubuntu machine churns out yet another smooth launch of yet another clean Ubuntu release.

The bigger question, I guess, might be:

Should I upgrade to Ubuntu 10.10 on EC2?

The first release after an LTS (Ubuntu 10.04 Lucid) is always a tough choice for me.

If I’m on a desktop, I like to upgrade to the latest a month or so after each semi-annual release. However, upgrading servers every six months tends to get tiresome, so I generally stick with the LTS until a newer release contains a software version that I need to use, or until the next LTS comes out two years later.

There have been a few minor problems with Ubuntu 10.04 Lucid on EC2, but the important ones for me have either been fixed with the release of updated AMIs or have simple workarounds. Other than that, I am pleased with Lucid on EC2 and don’t see an urgent need to upgrade beyond this LTS just yet.

I’d love to hear what you think.

20 Comments

The software in Lucid is still cutting edge for my needs. I don't plan on upgrading my instances until the next LTS release.

I agree on not wanting to refresh every 6 months. :) But I have a question back at you.

If you are about to embark on your major refresh today, would you start with 10.04 or 10.10?

I only ask because it is about time I refresh my base image, and I'm wondering what your thoughts are on which one to base it on.

I have many 8.04 servers in production serving web or DNS every day. I am sure that I will be installing 10.04 server throughout 2011 too. "Production server" means "if it works, do not touch it", so take an LTS if you can.

jedberg: I guess it depends on whether you intend to upgrade frequently. If you haven't refreshed your base image in a long time, then one might suspect that you won't want to for a long time again, so perhaps 10.04 LTS would be best for you.

Christopher Tozzi asked the same question about 8.10 after 8.04 LTS. It is an interesting reflection to read now that we know what happened, though some questions are still open: http://www.thevarguy.com/2008/11/10/ubuntu-server-edition-810-nice-but-who-uses-it/

I'm sticking with Lucid for servers for a while. I have 3 machines I'm managing now and there's no need to upgrade.

If I were just starting from scratch however, I would probably use 10.10 - why not.

adamn: The primary reason "why not" to use 10.10 on EC2 servers would be that it is only going to be supported with security patches for 18 months. For folks who don't like upgrading that often, 10.04 will be supported on the server for 60 months (5 years).
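To put those two support windows on a calendar, here is a rough sketch. It assumes release dates of 29 April 2010 for 10.04 and 10 October 2010 for 10.10 (neither date is stated in this post), and it simply adds the month counts quoted above. The `add_months` helper is illustrative, and the end-of-life dates actually announced by Canonical may differ by a few days.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month if needed.
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Assumed release dates (not taken from the post itself).
MAVERICK_RELEASE = date(2010, 10, 10)
LUCID_RELEASE = date(2010, 4, 29)

maverick_end = add_months(MAVERICK_RELEASE, 18)   # 18 months of support
lucid_server_end = add_months(LUCID_RELEASE, 60)  # 60 months on the server

print("10.10 support ends around", maverick_end)
print("10.04 server support ends around", lucid_server_end)
```

Under those assumptions, a 10.10 server started today would need to be upgraded roughly three years earlier than a 10.04 server.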

My company still has production servers running Ubuntu 8.04 LTS Hardy, 30 months after its release. We are slowly migrating over to 10.04 LTS Lucid, but haven't needed to hurry because Hardy is LTS (Long Term Support).

Assuming one wanted to upgrade 10.04 to 10.10 on EC2 (running on EBS, of course) would the proper course be to run "sudo do-release-upgrade"?

Thanks in advance, Jay

Scott provided some information on upgrading EBS boot instances:

http://ubuntu-smoser.blogspot.com/2010/04/upgrading-ebs-instance.html

When upgrading from a pre-10.10 release, you need to worry about restarting the instance with the appropriate kernel image.

I'd also recommend testing on a dev instance, perhaps started from a snapshot AMI, just to make sure it goes smoothly.

One reason for migrating to Maverick from Lucid is the OpenSSL library: Maverick includes a newer version, which supports secure TLS renegotiation. So if you run an https-enabled web server on Lucid, you can get warnings from browsers like Opera or Chrome: "This server does not support secure TLS renegotiation. The site owner should upgrade the server." It looks like this message is gone after upgrading to Maverick, because of the newer openssl library version (at least that's what my early tests indicate).

Looks like you may safely ignore my previous comment about the openssl/libssl version. From what I see, the change was backported to Lucid in version 0.9.8k-7ubuntu8.1. I've just upgraded those libraries and it seems to have done the trick: Opera doesn't complain anymore.

grzegorzbor: Thanks for following up.

(BTW: I'm a long time fan of your site. Thank you /so much/ for all you've done for this community!)

So, I run a website that has over a million unique users per day, tons of page views per user visit, and numerous backend APIs used by developers to access the data on behalf of users I'm not counting in the above figures. All of my backend webservers run on Ubuntu EC2 Cloud Edition.

*I cannot recommend more strongly that people do not use these AMIs: they are prone to locking up while using them.*

Unfortunately, I do not have the time or the (possibly EC2-specific) kernel debugging skill to figure out /exactly/ what is going on, but I currently do have a theory (after spending numerous hours in the middle of the night staring at this and trying to isolate whatever causes I could).

Specifically, I believe that the move from ext3 (used in all previous AMIs) to ext4 (in these new Maverick AMIs) is the culprit: the JBD2 journal writer locks up, causing other processes relying on I/O to get stuck with it.

When booting these up, you start to see a bunch of errors from Xen before the kernel even starts.

Failed to read /local/domain/0/backend/vbd/.../.../feature-barrier.
Failed to read /local/domain/0/backend/vbd/.../.../feature-flush-cache.

Then, during operation you will see error messages like the following.

JBD: barrier-based sync failed on sda1-8 - disabling barriers

Finally, during use, the system will lock up occasionally, and if you are "lucky" the kernel will detect this and scream at you.

INFO: task jbd2/sda1-8:235 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/sda1-8 D ffff880003bcd980 0 235 2 0x00000000
ffff8801b5f49b20 0000000000000246 0000000000000000 0000000000015980
ffff8801b5f49fd8 0000000000015980 ffff8801b5f49fd8 ffff8801b571db80
0000000000015980 0000000000015980 ffff8801b5f49fd8 0000000000015980

All of these are related to the new JBD2 journaling backend of ext4. Other processes then end up stalling as well, and they do so inside I/O routines for journal commits and writes.
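For anyone who wants to check whether their own instances are hitting the same stall, here is a minimal sketch that scans kernel log text for the hung-task warning shown above. The function name `find_hung_tasks` is made up for illustration, the regular expression is based only on the one message format quoted in this comment, and it assumes task names without spaces.

```python
import re

# Matches the kernel's hung-task warning, e.g.
# "INFO: task jbd2/sda1-8:235 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (task name, pid, seconds) for each hung-task warning found."""
    return [
        (m.group("name"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK.finditer(log_text)
    ]


# Sample text copied from the messages quoted above.
sample = """\
JBD: barrier-based sync failed on sda1-8 - disabling barriers
INFO: task jbd2/sda1-8:235 blocked for more than 120 seconds.
"""

for name, pid, secs in find_hung_tasks(sample):
    print(f"{name} (pid {pid}) blocked for more than {secs}s")
```

On a real instance the text would come from the kernel log (for example the output of `dmesg`) rather than a hard-coded sample.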

(For the record, I cannot recommend the Lucid images either, including the most recently posted ones: there is a bug in the kernel that they are bundled with that causes poorly distributed load, leading to twice as much CPU time being used to handle the same work and doubled latency. My recommendation currently is to install Karmic with a runtime dist-upgrade to Lucid.)

(Also, unrelatedly: is there a good reason for the ephemeral drives to now be mounted with "nobootwait"? I'd imagine most people would want to be storing things like the website that they are serving on those drives, which means they have to be ready before the web server is started.)

Jay: Thanks for your comments. I haven't experienced these problems, but perhaps it depends on the server usage patterns. The best way to report issues with Ubuntu is to file bugs in Launchpad, perhaps using the ubuntu-bug tool.

Eric: Interesting! From having followed your blog for a while I know you switched to using EBS boot a while ago. I'm still using S3-backed instances. (For the record, on c1.xlarge hardware.)

(I have absolutely no data on these machines I care about: I build them with a simple shell script that takes a couple minutes to run; not using EBS seemed simpler, and possibly cheaper; maybe I'm wrong, but that's a matter for another day.)

So, if I'm right about this being an issue with ext4 on the root partition, we are going to be looking at rather fundamentally different setups: maybe the EBS drives do not have problems with the JBD2 barrier.

I am going to do some testing today with an EBS-backed device to see if I get similar behavior. I'm betting I don't. ;P

Also, I really don't think this is a "specific kind of load issue": most of the errors I'm showing happen on bootup without any load, and after futzing with it for a while I've seen these lockups occur in situations where I haven't even introduced load yet.

(In case anyone is curious: I'm not installing any custom daemons or software on the device; I set up pgbouncer, postfix, and Apache2/mod_python, make absolutely minimal configuration changes, and keep a git checkout of my website in a folder on the ephemeral drive.)

Hi, very nice blog.
A good reason to use 10.10 could be that it's effortless to upgrade kernels on it.
The 'nobootwait' thing on ephemeral drives could be related to this https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/634102

I'm having the same problems with maverick on ec2 and created an issue on launchpad:
https://bugs.launchpad.net/ubuntu/+source/linux-meta/+bug/666211

Jay: Thank you for your explanation – it seems we have exactly the same errors.

I'm quite hesitant to respond here, because this isn't a medium where we can reasonably have a conversation. The "right place" to do this is in launchpad (thank you teemow for opening the bug).

I'll answer a couple of the straightforward questions here, though.

Regarding "nobootwait", that is now present in fstab entries to address the possibility of the user starting an instance as an m1.large (that has /dev/sdb) and then stopping the instance and starting it as a t1.micro (where that device isn't present). Without 'nobootwait', the boot on a t1.micro will hang eternally waiting for the appearance of /dev/sdb, which is never going to show up.

In the case where the device is an ephemeral device and it is going to show up, it is (at least in all my experiences) going to be there on boot, so there is no harm in 'nobootwait'.
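To see which fstab entries on an instance carry this option, a small sketch like the following could be used. The function name `nobootwait_mounts` and the sample fstab contents are hypothetical and only illustrate the standard six-column fstab layout; they are not copied from an actual AMI.

```python
def nobootwait_mounts(fstab_text):
    """Return (device, mount point) pairs whose options include 'nobootwait'."""
    found = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        device, mount_point, options = fields[0], fields[1], fields[3].split(",")
        if "nobootwait" in options:
            found.append((device, mount_point))
    return found


# Hypothetical fstab contents, for illustration only.
sample_fstab = """\
# <device>  <mount>  <type>  <options>            <dump> <pass>
/dev/sda1   /        ext4    defaults             0      0
/dev/sdb    /mnt     auto    defaults,nobootwait  0      2
"""

for device, mount_point in nobootwait_mounts(sample_fstab):
    print(f"{device} on {mount_point} will not block boot if it is missing")
```

Anything that must be on such a drive before a service starts would need its own check, since the boot will proceed whether or not the device appears.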

Regarding the following messages:
Failed to read /local/domain/0/backend/vbd/.../.../feature-barrier.
Failed to read /local/domain/0/backend/vbd/.../.../feature-flush-cache.

I'm almost certain that those are coming from pv-grub, which does the loading of the Maverick kernels. pv-grub is an Amazon offering that allows Ubuntu to manage its own kernels via /boot/grub/menu.lst rather than registering explicit AKI ids.

Your input is appreciated, and there are very real kernel issues exposed above. Please do move the conversation to launchpad.

Does anyone know the status of the problems described above? I've had 2 instances lock up and I'm wondering if I can continue to use 10.10 on ec2.

In the first case, I was using an ebs backed small instance with another ebs volume mounted for data as described here:

http://aws.amazon.com/articles/1663?_encoding=UTF8&jiveRedirect=1

In the second case, I was using an ephemeral small instance and everything was going fine for a week. When everything seemed to be going ok, I tried mounting an ebs volume and moving my data onto it and within an hour got the same crash.

Are others still having this problem? How are you dealing with it? Thanks.

edoornav: Comments on this blog are not the right place to discuss Ubuntu bugs. Please check on Launchpad and submit a new bug if you are having problems not identified yet.
