Several of my clients, and a few of my own projects, have local instances of relational databases, usually MySQL (or MariaDB/Percona derivatives) and there is growing interest in moving these into the cloud. Amazon’s almost-drop-in replacement is the obvious candidate. You can find plenty of detailed information online about AWS RDS, typically addressing matters that concern large database installations. Maybe such matters sound more exciting than moving a small database that has probably been ticking away nicely on that small box in the corner for years. So what about the small scale cases? Most of my consultancy clients would be using single database instances with just a few gig of data. Small fry. Would RDS suit them and what are the pros/cons?
There are several reasons why a small business with a modest database would be looking to move it to the cloud, which mainly come down to avoiding the need to worry about:
- Maintaining hardware for the local database.
- Checking for security patches or other software updates.
- Making regular backups of the live database.
- Restoring the database in case of disaster (e.g. server power failure).
- Testing the restore process, rather than waiting for the disaster.
- Tuning the database.
- Adding space or additional processing capacity.
- Protecting the database against intrusion.
This is a lot for a small business, especially those with only a few (or just one!) IT staff. While there are obvious costs in the above list, there are also obvious risks, so moving all this to a cloud service might not do much for costs but could do a lot of good for the risks.
The obvious cost savings come from:
- No need to own or operate hardware to support the database.
- No need to purchase or license software.
- Avoiding time-consuming and costly maintenance work.
The obvious new costs are:
- Ongoing fees for the replacement cloud service.
- Training associated with the introduction of the service.
- Increased Internet traffic.
At the time of writing, Amazon can offer a basic RD instance for about €$£50 per month giving you 2Gb RAM, 10Gb storage, low intensity network usage, enough to keep 2k of data on every single citizen of New Zealand; woman, man and child (but not sheep). A modest database in the cloud isn’t going to break the budget.
Migrating to the cloud can be a bit of a hit-and-miss affair. Let’s suppose you’re only maintaining 1Gb of data on your local MySQL instance, and you want to move it to a similar instance in RDS (also known as a homogeneous migration). You have two main options:
- Shut down your server, do a mysqldump to create an SQL file representing your entire database, then pump that into RDS, connect your application to the database in RDS and go back online.
- Use the AWS Data Migration Service to replicate your data to RDS without shutting down your current system, and when done just reconfigure your application to point to RDS instead.
I recently tried the mysqldump route. I loaded a compressed version of the dump onto an EC2 instance, and then piped the uncompressed record stream through the mysql tool to restore the database to a freshly minted RD instance. To my horror, this experiment with an image under 1Gb took over a day to complete! There’s no way I could be expected to shut down a system for a day while the data is moved.
The alternative is to move the data while it is still live, while making sure that any changes made to the live system during the migration are also copied over. The AWS DMS can do this. It will use a connection to your current live database and a connection to an RDS instance and gradually copy across. This approach might take a little longer since the data isn’t compressed (and presumably is also going via an encrypted connection) but the only downtime you will need is at the end when you have to switch your application from using the current local database to using the RDS instance in the cloud.
Once your data is in the cloud, you get additional features such as automatic backups, a hot standby server in case your main instance gets into trouble, or you can introduce data replication, clusters, geographic distribution and more. You scale according to your needs, up or down, and often just by making some configuration adjustments via a Web interface.
Are there any gotchas? Sure there are. Here’s a few you might want to consider, though for modest database situations these probably won’t be much to worry about:
- If you need BLOBs over 25Mb, think again (suggestion: put them into S3 and replace the database fields with S3 references).
- If you regularly use MySQL logs written to the file system, be prepared to survive without them.
- If you need SUPER privileges then you will need to figure out alternative ways to achieve what you want. You need SUPER, for example, to kill an ongoing query, which can be a problem when you have some unintentionally compute-intensive select that just keeps going, and going, and going… (Note: Amazon knows about this particular limitation, which is why they offer an alternative via rds_kill_query.)
- The underlying file system is case sensitive, so beware if you have been relying on case insensitivity (e.g. your current MySQL server has a Windows OS).
- Backups to places outside AWS have to be done via mysqldump because you don’t have access to the underlying MySQL files. AWS snapshots are efficient, and can be automated to occur at whatever rate you want, but they stay in Amazon. If you are used to the idea of restoring a MySQL database by dropping the raw files into place, be advised that you can’t do this with the snapshots.
- Only the InnoDB engine can be used if you want to take advantage of Amazon’s crash recovery.
- There are plenty of other limitations (like the 6Tb size limit, the 10,000 table limit, the lack of Federated Storage Engine support etc.) but none of these are going to be of any interest to someone running a modest database.
If you want an encrypted RDS database, make sure to set it up as encrypted when creating it, before you put any data into it. The nice thing about using this approach is that your application doesn’t need to know that the database is encrypted. You just need to manage your access keys properly, and naturally there’s a cloud service for managing keys. Again, these are benefits that the staff-challenged small business might bear in mind.
So, why Amazon? In my case, partly because I’ve been using their cloud services for years, and have been impressed by how the service has constantly improved. Objectively, I note the Forrester report (PDF, download from Amazon) earlier this year (2017) that put AWS RDS on top of the pile.
Of course, if you are going to have your database in the cloud, you might as well put your applications there too, but that’s another story.