Amazon EC2 currently limits EBS (Elastic Block Store) volumes to 1,000 GB (1 TB) each. You can create file systems larger than this limit by using RAID 0 across multiple EBS volumes. RAID 0 can also improve file system performance by reducing total IO wait, as demonstrated in a number of published EBS performance tests.
The following instructions walk through one way to set up RAID 0 across multiple EBS volumes. Note that 32-bit instances limit the size of a file system, while file systems on 64-bit instances can grow far larger than you are likely to need. This test was run with 40 EBS volumes of 1,000 GB each for a total of 40,000 GB (40 TB) in the resulting file system.
Actual command line output showing the size of the RAID:
# df /vol
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/md0 41942906368 1312 41942905056 1% /vol
# df -h /vol
Filesystem Size Used Avail Use% Mounted on
/dev/md0 40T 1.3M 40T 1% /vol
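As a rough cross-check of the figures above, the 1K-blocks value reported by df can be converted back into GiB and compared against 40 volumes of 1,000 GB each. The sketch below assumes each EBS "GB" is 2^30 bytes, which is what the df numbers suggest; the variable names are only illustrative.

```shell
# Numbers copied from the df output above
blocks_1k=41942906368        # 1K-blocks reported for /dev/md0
volumes=40
size=1000                    # GB per volume (assumed here to mean GiB)

expected_1k=$((volumes * size * 1024 * 1024))   # raw capacity in 1K blocks
overhead_1k=$((expected_1k - blocks_1k))        # space df does not count as part of the file system
reported_gib=$((blocks_1k / 1024 / 1024))       # integer division, rounds down

echo "expected 1K-blocks: $expected_1k"
echo "reported GiB:       $reported_gib"
echo "difference:         $overhead_1k KiB"
```

The difference works out to roughly 130 MiB, which is negligible next to 40 TB and is presumably metadata overhead.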
The commands below can run in less than 10 minutes, and that time could probably be reduced further by parallelizing the creation and attachment of the EBS volumes.
Note that the default limit is 20 EBS volumes per EC2 account. You can request an increase from Amazon if you need more.
Caution: 40 TB of EBS storage on EC2 will cost $4,000 per month plus usage charges.
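The caution above implies a storage price of $0.10 per GB-month ($4,000 / 40,000 GB). Here is a small sketch of that arithmetic, which may be handy for estimating other array sizes. The rate is an assumption derived from the article's own figures and may not match current pricing; usage (I/O) charges are not included.

```shell
volumes=40
size=1000                 # GB per volume
cents_per_gb_month=10     # assumption: $0.10 per GB-month, implied by the figures above

total_gb=$((volumes * size))
monthly_dollars=$((total_gb * cents_per_gb_month / 100))

echo "$volumes volumes x $size GB = $total_gb GB"
echo "estimated storage cost: \$$monthly_dollars per month (plus usage charges)"
```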
Instructions
Start a 64-bit instance (say, Ubuntu 8.04 Hardy from http://alestic.com). Use your own KEYPAIR:
ec2-run-instances --key KEYPAIR --instance-type c1.xlarge --availability-zone us-east-1a ami-0772946e
Configurable parameters (set on both local host and on EC2 instance):
instanceid=i-XXXXXXXX
volumes=40
size=1000
mountpoint=/vol
On the local host (with EC2 API tools installed)…
Create and attach EBS volumes:
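# Generate device names (/dev/sdh, /dev/sdh1 ... /dev/sdk15) and keep the first $volumes
devices=$(perl -e 'for$i("h".."k"){for$j("",1..15){print"/dev/sd$i$j\n"}}'|
head -n $volumes)
devicearray=($devices)
volumeids=
i=1
while [ $i -le $volumes ]; do
  # The volume id is the second tab-separated field of the output
  volumeid=$(ec2-create-volume -z us-east-1a --size $size | cut -f2)
  echo "$i: created $volumeid"
  device=${devicearray[$(($i-1))]}
  ec2-attach-volume -d $device -i $instanceid $volumeid
  volumeids="$volumeids $volumeid"
  i=$((i+1))
done
# Save this output: the volume ids are needed again for the teardown steps
echo "volumeids='$volumeids'"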
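The loop above creates and attaches volumes one at a time. As noted earlier, this could probably be sped up by running the steps in parallel. The following is only a sketch of one way to do that with shell background jobs; it has not been timed, and Amazon may not appreciate many simultaneous API calls. The DRY_RUN switch and the list_devices and create_and_attach helpers are hypothetical names introduced here, and the device list is generated in plain shell rather than Perl. It defaults to a dry run so that nothing is created or billed by accident.

```shell
instanceid=i-XXXXXXXX
volumes=40
size=1000
DRY_RUN=${DRY_RUN:-1}    # hypothetical switch: set DRY_RUN=0 to really call the EC2 tools

# Same names as the Perl one-liner: /dev/sdh, /dev/sdh1 ... /dev/sdk15
list_devices() {
  for i in h i j k; do
    for j in "" 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15; do
      echo "/dev/sd$i$j"
    done
  done | head -n "$1"
}

# Create one volume and attach it; prints "device volumeid" on success
create_and_attach() {
  device=$1
  if [ "$DRY_RUN" = 1 ]; then
    echo "$device would-create-${size}GB-volume"
    return 0
  fi
  volumeid=$(ec2-create-volume -z us-east-1a --size "$size" | cut -f2)
  ec2-attach-volume -d "$device" -i "$instanceid" "$volumeid" >/dev/null &&
    echo "$device $volumeid"
}

devices=$(list_devices "$volumes")

# Start one background job per device, then wait for all of them
results=$(
  for device in $devices; do
    create_and_attach "$device" &
  done
  wait
)

# Jobs finish in no particular order, so sort by device name for readability.
# In a real run the second column holds the volume ids needed for teardown.
echo "$results" | sort
```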
On the EC2 instance (after setting parameters as above)…
Install software:
sudo apt-get update &&
sudo apt-get install -y mdadm xfsprogs
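Before building the array in the next step, it may be worth confirming that every attached volume is actually visible on the instance as a block device, since mdadm cannot create the array if any are missing. This is a minimal sketch: missing_devices is a hypothetical helper name, and the short device list is only an example (in practice, pass the full $devices list).

```shell
# Print every argument that is not a block device
missing_devices() {
  for d in "$@"; do
    [ -b "$d" ] || echo "$d"
  done
}

# Example list; substitute the real device names for your array
devices="/dev/sdh /dev/sdh1 /dev/sdh2"

missing=$(missing_devices $devices)
if [ -n "$missing" ]; then
  echo "not attached yet (or named differently on this kernel):"
  echo "$missing"
else
  echo "all devices present"
fi
```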
Set up the RAID 0 device:
# Same device list as was generated on the local host
devices=$(perl -e 'for$i("h".."k"){for$j("",1..15){print"/dev/sd$i$j\n"}}'|
head -n $volumes)
yes | sudo mdadm --create /dev/md0 --level 0 --metadata=1.1 --chunk 256 --raid-devices $volumes $devices
echo DEVICE $devices | sudo tee /etc/mdadm.conf
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf
Create the file system (pick your preferred file system type):
sudo mkfs.xfs /dev/md0
Mount:
echo "/dev/md0 $mountpoint xfs noatime 0 0" | sudo tee -a /etc/fstab
sudo mkdir $mountpoint
sudo mount $mountpoint
Check it out:
df -h $mountpoint
When you’re done with it and want to destroy the data and stop paying for storage, tear it down:
sudo umount $mountpoint
sudo mdadm --stop /dev/md0
Terminate the instance:
sudo shutdown -h now
On the local host (with EC2 API tools installed)…
Detach and delete volumes:
for volumeid in $volumeids; do
  ec2-detach-volume $volumeid
done
for volumeid in $volumeids; do
  ec2-delete-volume $volumeid
done
Credits
This article was originally posted on the EC2 Ubuntu group.
Thanks to M. David Peterson for the basic mdadm instructions:
[Update 2012-01-21: Added --chunk 256 based on community-recognized best practices.]



Lovely! I'm thinking of an incremental storage system using LVM2 (for adding storage capacity) and RAID (for IO optimization and fault tolerance).
Using resources Amazon provides, such as EBS, this storage system could be resized on demand.
Do you have any experience with LVM and RAID implementations using EBS?
I'd love to share experiences on such implementations. :)
ramses: I've set up LVM on EBS as a test but not in production. With this approach, RAID isn't even needed as LVM can do the striping across EBS volumes.
To do a snapshot, can one just do "xfs_freeze -f /vol", then snap each EBS vol, and then unfreeze? Or does one need to run some other command?
Thanks,
Don
Don: Effectively, yes. You can read about EBS volumes and snapshots here:
http://ec2ebs-mysqlnotlong.com
I've also written a tool which can be used to generate consistent snapshots with RAID (especially when using XFS and optionally MySQL):
http://alestic.com/2009/09/ec2-consistent-snapshot
Tracking snapshots and reassembling the RAID array is left as an exercise for the reader and should be practiced before needed.
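The sequence discussed in this thread (freeze the file system, snapshot every volume in the array, unfreeze) could be sketched roughly as follows. This is not the ec2-consistent-snapshot tool itself, just an illustration: the run helper and DRY_RUN switch are hypothetical, the volume ids are placeholders, and it assumes the EC2 API tools are usable from wherever the script runs. It defaults to printing the commands instead of executing them.

```shell
mountpoint=/vol
volumeids="vol-XXXXXXXX vol-YYYYYYYY"   # placeholders: the ids saved when the volumes were created
DRY_RUN=${DRY_RUN:-1}                   # hypothetical switch: set DRY_RUN=0 to execute

# Print the command in dry-run mode, otherwise execute it
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

snapshot_raid() {
  # Block writes so all volumes are captured at the same point in time
  run sudo xfs_freeze -f "$mountpoint" || return 1
  for volumeid in $volumeids; do
    run ec2-create-snapshot "$volumeid"
  done
  # Unfreeze even if one of the snapshot requests failed
  run sudo xfs_freeze -u "$mountpoint"
}

plan=$(snapshot_raid)
echo "$plan"
```

Keeping the frozen window short matters, because writes to the file system block until it is unfrozen.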
Hi Eric,
Thanks for all your useful articles!
I have posted a question in the AWS dev forum (http://developer.amazonwebservices.com/connect/thread.jspa?threadID=45444) regarding LVM over RAID setup on EC2 issues.
As it seems that you have been able to do it in a test environment I would like to know what I'm missing in order to make everything work after an image rebundling and rebooting.
Regards,
David
Is it possible to grow a RAID0 using EBS? My reading suggests you would have to set things up as a RAID1. I set up a 2 TB RAID0 following the instructions here. Is there any way to grow this by adding additional 1 TB EBS vols?
basscakes:
That would be a question for a RAID forum. Since EBS volumes are just like hard drives to the RAID, you can do whatever is normally possible with your RAID software.
Hi Eric,
First of all, thanks for such a nice writeup and all the good stuff you have been giving back to the community to help noobs like me catch up with the latest trends and tech.
Second, I tried a setup like this and after a reboot the RAID would just disappear. I had discussed it with a couple of people on ##aws, and later someone else on ##aws asked you almost the same question while I was away.
I have posted my research results at:
error404notfound.posterous.com/experience-with-raid-and-ephemeral-devices
If it's any help, it would be a pleasure. Let me know if you figure out why this happens and what a possible workaround might be.
Thanks
Hi Eric,
Have you or anyone else solved the reboot issue in Ubuntu Lucid? It seems to be an issue with UUIDs, but I cannot find a solution. On reboot, the instance is "stuck" with a console message and you are unable to ssh into the instance. The message is:
"the disk for /mnt/md0 is not ready yet or not present. Continue to wait; or Press S to skip mount or M for manual recover"
AItOawn3dGSacFn5XCITiDKWmwzrKAV5760yzyY:
What is the launchpad bug id for your reboot issue? If there isn't one, then it's a good bet nobody is working on it.
Hi Eric,
This really shed light on how to scale EBS volumes, but my concern is EBS outages while the RAID is in use. Is it possible to take EBS snapshots of all the EBS volumes at the same RAID state, at the same moment, and then recover from the snapshots successfully?
Unni:
If you can quiesce and freeze the file system, flushing blocks to the RAID volumes, you can then snapshot all of the volumes at the same point in time. In theory, this should give you a consistent snapshot across volumes that could be re-assembled later.
My ec2-consistent-snapshot tool supports these actions and is used by some folks for creating consistent snapshots across RAID volumes:
http://alestic.com/2009/09/ec2-consistent-snapshot
I don't guarantee that this will work for your particular RAID setup, but it might be worth testing.