New --mysql-stop option for ec2-consistent-snapshot


The ec2-consistent-snapshot software tries its best to flush and lock a MySQL database on an EC2 instance while it initiates the EBS snapshot, and for many environments it does a pretty good job.

However, there are situations where the database may spend time performing crash recovery from the log file when it is started from a copy of the snapshot. We are seeing this behavior at CampusExplorer.com where the database is constantly active and we have innodb_log_file_size set (probably too) high. The delay is doubtless exacerbated by the fact that the blocks on the new EBS volume are being recovered from S3 as it is being built from the snapshot.

Google has created an innodb_disallow_writes MySQL patch which I think points out the problem we may be hitting.

“Note that it is not sufficient to run FLUSH TABLES WITH READ LOCK as there are background IO threads used by InnoDB that may still do IO.”

It would be very nice to have this patch incorporated in MySQL on Ubuntu. It looks like the OurDelta folks have already incorporated the patch. [Update: See rsimmons’ comment below which explains why this particular patch might not be the answer.]

In any case, when we bring up a database using an EBS volume created from an EBS snapshot of an active database, it can spend up to 45 minutes in crash recovery before it lets normal clients connect. This is too long for us, so we’re trying a new approach.

The ec2-consistent-snapshot tool now has a --mysql-stop option which shuts down the MySQL server, initiates the snapshot, and then restarts the database. Our hope is that this will get us a snapshot which can be restored and run without delay. If any MySQL experts can point out the potential flaws in this, please do.
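
The order of operations described above can be sketched in a few lines of Python. This is only an illustration of the stop, snapshot, restart sequence, not the actual Perl code in ec2-consistent-snapshot. The function name and the three callables are hypothetical stand-ins that the caller would supply (for example, wrappers around the init script and the EC2 API).

```python
from typing import Callable


def snapshot_with_mysql_stopped(
    stop_mysql: Callable[[], None],
    start_snapshot: Callable[[], str],
    start_mysql: Callable[[], None],
) -> str:
    """Stop MySQL, initiate the snapshot, then restart MySQL.

    All three callables are hypothetical stand-ins provided by the caller.
    Returns whatever identifier start_snapshot() returns.
    """
    # A clean shutdown should leave nothing for crash recovery to replay.
    stop_mysql()
    try:
        # Only the *initiation* of the snapshot happens while MySQL is down.
        return start_snapshot()
    finally:
        # Restart the database even if initiating the snapshot failed,
        # so that a snapshot error does not leave the server down.
        start_mysql()
```

The try/finally is the main design point: once the server has been stopped, it is restarted whether or not the snapshot call succeeds.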

Since we obviously can’t stop and start our production database every hour, we are performing this snapshot activity on a replication slave that is dedicated to snapshots and backups.

We continue to perform occasional snapshots on the production database EBS volume just to help keep it reliable per Amazon’s instructions, but we don’t expect to be able to restore it without crash recovery.

If you’d like to test the new --mysql-stop option, please upgrade your ec2-consistent-snapshot package from the Alestic PPA and let me know how it goes.

11 Comments

It would be great to get an option for mysql_socket. I have a non-standard socket location in my setup and I've had to modify ec2-consistent-snapshot so that mysql_host was "localhost:mysql_socket=/path/to/mysql.sock". I don't know enough Perl, or I'd do it myself.

Thanks for all the great tools and articles
-David

I don't think there's a way for you to avoid doing a recovery or clean shutdown. If I understand correctly, the innodb log is basically a way of borrowing against the future, by doing fast sequential writes to the end of a log file instead of random writes to the data files. But eventually the writes need to be applied to the data files. A typical busy mysql server will have plenty of dirty blocks in the buffer pool, which means they have been written to the log but not the data files. At some point you will need to apply these changes to the data files, whether through clean shutdown or recovery. As far as I know the standard options for a _hot_ consistent backup are:

1) Filesystem-level snapshot and then recovery. You can do the recovery on a spare host immediately after the snapshot so you can quickly restore later in case of disaster.
2) Percona XtraBackup. This cleverly avoids the need for a filesystem-level snapshot, but requires a "log apply" step (basically recovery) before the backup can be used.

You should definitely experiment with reducing innodb_log_file_size as far as you can before it has a negative performance impact. Your recovery time should be directly proportional to it.
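
As a rough illustration of that proportionality, here is a small Python sketch. It assumes recovery time scales linearly with total redo log capacity, which is the claim above and not a measured result. The function names and the example sizes are hypothetical; only the 45-minute figure comes from the post. Total capacity is innodb_log_file_size multiplied by innodb_log_files_in_group, which defaults to 2.

```python
def redo_log_capacity_mb(log_file_size_mb: float, files_in_group: int = 2) -> float:
    """Total InnoDB redo log capacity: per-file size times number of files."""
    return log_file_size_mb * files_in_group


def scaled_recovery_minutes(
    observed_minutes: float,
    old_capacity_mb: float,
    new_capacity_mb: float,
) -> float:
    """Scale an observed recovery time linearly with redo log capacity.

    This is only the proportionality assumption written out; real recovery
    time also depends on workload and disk speed.
    """
    if old_capacity_mb <= 0:
        raise ValueError("old_capacity_mb must be positive")
    return observed_minutes * new_capacity_mb / old_capacity_mb


# Hypothetical sizes: shrinking two 512 MB log files to two 128 MB files.
old_capacity = redo_log_capacity_mb(512)
new_capacity = redo_log_capacity_mb(128)
estimate = scaled_recovery_minutes(45, old_capacity, new_capacity)
```

Under these assumed sizes the estimate is a quarter of the observed time. Treat it as a starting point for experiments, not a prediction.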

As far as I understand, the innodb_disallow_writes patch is just an alternative when you can't do a filesystem level snapshot. It allows you to get a clean copy of the data and logs without shutting down the server, but since it doesn't flush dirty pages you would still need to do a recovery after copying the files.

Performing backups from a slave is fine, but there are some bugs I have encountered with replication that can cause the slave to accumulate small differences from the master over time. So I would recommend checking for that (mk-table-checksum) and/or periodically resyncing the slave from the master.
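
A minimal sketch of the kind of check meant here, assuming you have already collected one checksum per table from each server (for example with MySQL's CHECKSUM TABLE statement) at a moment when the slave has applied everything the master had written. The function name and the dictionary layout are hypothetical. mk-table-checksum handles the hard parts, such as checksumming in chunks and comparing at a consistent replication position, far more carefully than this.

```python
from typing import Dict, List, Optional

# Sentinel so that "table missing" is distinct from a NULL (None) checksum.
_MISSING = object()


def tables_out_of_sync(
    master: Dict[str, Optional[int]],
    slave: Dict[str, Optional[int]],
) -> List[str]:
    """Return sorted names of tables whose checksums differ between the
    two servers, including tables present on only one side."""
    names = set(master) | set(slave)
    return sorted(
        name
        for name in names
        if master.get(name, _MISSING) != slave.get(name, _MISSING)
    )
```

If the checksums are taken at different replication positions, this will report differences that are only replication lag, so an empty result is more meaningful than a non-empty one.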

David: Can you just set the complete value using the --mysql-host option? Please add comments to this ticket: https://bugs.launchpad.net/ec2-consistent-snapshot/+bug/481477

rsimmons: Thanks. A lot of useful information! We'll try reducing innodb_log_file_size and will keep an eye on the replication issues which we've also heard about.

Without the work Eric has done, a lot of us would not be using AWS as much as we do because Eric has made it so much easier.

One thing I am still struggling with is how to set up some form of MySQL replication, Multi-Master and/or Master-Slave configuration on EC2.

It would be wonderful if you could share some of the techniques / best practices you have discovered if not some code!

In any case, thanks for all the stuff you do to help all of us.
Rob

Rob: I learned most of what I know about setting up master-slave replication here:
http://dev.mysql.com/doc/refman/5.0/en/replication-howto.html

Eric,

I got the latest version of the ec2-consistent-snapshot to use with the --mysql-stop option.

I am using the following command:

ec2-consistent-snapshot \
  --aws-access-key-id key \
  --aws-secret-access-key secret \
  --region us-east-1 \
  --mysql \
  --mysql-host ec2-xxx-yyy-zzz-www.compute-1.amazonaws.com \
  --mysql-username madmin \
  --mysql-password madminpassword \
  --xfs-filesystem /vol \
  vol-123456789


However, I am getting the following error:

DBI connect(';host=localhost','',...) failed: Access denied for user 'root'@'localhost' (using password: NO) at /usr/bin/ec2-consistent-snapshot line 179
Use of uninitialized value $mysql_username in concatenation (.) or string at /usr/bin/ec2-consistent-snapshot line 179.
ec2-consistent-snapshot: ERROR: Unable to connect to MySQL on localhost as at /usr/bin/ec2-consistent-snapshot line 179.

As you can see I am NOT using the root user to login to the db, so I am not sure where it is picking up the credentials.

Is this a known bug or am I doing something so obviously silly that it escapes me? Apologies if the latter...

- Bill

Eric,

I figured it out...The latest version of the ec2-consistent-snapshot seems to ignore the mysql command line options. If I provide the options in the $HOME/.my.cnf file, it seems to work.

Thanks,

- Bill

Bill: Please upgrade to the latest version of ec2-consistent-snapshot and see if that fixes things. Rod Vagg submitted a patch which fixes a bug when you don't have a .my.cnf file.

By the way, it is rare that --mysql-host should be anything but localhost, as you are almost always snapshotting a local file system.

Eric, is there a good reason why snapshots should not be run from an external source? I have been thinking about the creation of a snapshot console I can run on my local system that lets me execute consistent snapshots, without having to install and maintain all of the perl, EC2 API, and keypair bits (as well as your awesome utility) on each EC2 instance. Obviously, if there is an exceptionally good reason why snapshots should only be performed locally, then it will override the convenience of keeping the snapshot functionality in one place. But from a configuration management point of view, a localized utility that manages remote instances is ideal.

Thanks,

Earl

Earl: It's ok to run a snapshot from outside the instance, except that this makes it difficult to flush/freeze the file system and lock a database while the snapshot is started. This means that the snapshot might not be a consistent representation of the file system and application data. If you really don't want credentials stored on the instance, you might be able to work out a separate process with an ssh to the instance to perform the locking steps, but it starts to get a little complicated to implement.
