I need to transfer a rather big MySQL dump from one host to another, and I need it to be as fast as possible, since the service relies on this database and I cannot take another route (like setting up a replica).
So I wondered what the fastest way to generate, transfer and restore the dump could be. The classic steps I would take are:
- dump the db;
- gzip the files;
- copy them via scp/rsync;
But then I thought: is all of that necessary? Take the gzip step, for example: is it really required? It costs CPU and I/O on both the source and the destination, but of course you don't want to move gigabytes' worth of text files over the internet uncompressed.
rsync and scp are both capable of compressing data in transit, so why not let them deal with it?
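The classic approach can be sketched as follows (hostnames, paths and the database name are placeholders, not the ones used in the tests below):

```shell
# 1. dump the db
mysqldump --single-transaction mydb > /tmp/mydb-dump.sql
# 2. gzip the file
gzip /tmp/mydb-dump.sql
# 3. copy it over scp
scp /tmp/mydb-dump.sql.gz root@dest:/tmp/
# ...and on the destination host:
#   gunzip /tmp/mydb-dump.sql.gz
#   mysql mydb < /tmp/mydb-dump.sql
```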
So I ran some tests on an SSD VPS at Contabo.
Dumping the db
This part cannot be skipped, but we can make it a lot faster by using the --tab parameter (thanks to Percona): instead of producing a single SQL dump it creates an SQL file just for the schema (and triggers, if any), plus a tab-separated text file per table for the data. It's incredibly faster!
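A sketch of the --tab variant (the dump directory is a hypothetical path and must be writable by the MySQL server process; mysqlimport reloads the per-table data files on the destination):

```shell
# Dump schema (.sql) and per-table data (.txt) files into /tmp/dump
mysqldump --tab=/tmp/dump --quick --single-transaction zabbix

# On the destination: recreate the schema, then bulk-load the data
#   cat /tmp/dump/*.sql | mysql zabbix
#   mysqlimport --use-threads=4 zabbix /tmp/dump/*.txt
```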
time mysqldump --quick --skip-lock-tables --single-transaction zabbix > /tmp/zabbix-dump.sql

real    10m44.663s
user    1m15.380s
sys     0m27.624s
So the dump takes ten minutes.
gzipping the dump
Following my traditional approach, the next step is to gzip the file.
time gzip /tmp/zabbix-dump.sql

real    7m11.232s
user    5m21.996s
sys     0m20.096s
Seven minutes, nearly as long as the dump itself. In my experience, dumping directly to gzip (that is, mysqldump | gzip > destfile) is slower than performing the two tasks separately.
scping the compressed dump
Once compressed, let's copy the file over scp:
time scp /tmp/zabbix-dump.sql.gz root@dest:/tmp

1314MB  10.0MB/s  02:12

real    2m12.440s
user    0m8.356s
sys     0m7.092s
I copied it to another Contabo VPS, so bandwidth is pretty good. We reached a decent speed, moving over 1GB in a little more than two minutes.
Compressing with scp
scp can take an optional -C flag that compresses the transmitted data. Let's see if we gain anything when transferring the already-gzipped file. It makes little sense to compress compressed files, much like zipping a jpg: you may get some improvement, but it's usually not worth the effort.
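You can see why on your own machine: already-compressed data has almost no redundancy left, so gzip can't shrink it further. A quick demo, using random bytes as a stand-in for compressed data:

```shell
# High-entropy input (like a .gz file) barely shrinks when gzipped again
head -c 1000000 /dev/urandom > /tmp/sample.bin
gzip -kf /tmp/sample.bin
# The .gz ends up about the same size as the original (or slightly larger)
wc -c /tmp/sample.bin /tmp/sample.bin.gz
```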
time scp -P30022 -C -o 'IPQoS throughput' -c aes256-ctr /tmp/zabbix-dump.sql.gz firstname.lastname@example.org:/tmp

1314MB  10.8MB/s  02:02

real    2m2.637s
user    1m6.144s
sys     0m13.892s
As you can see, we gained about 8% in transfer speed. Not much, but we're on the right path.
We've added some parameters: -C enables compression, -o 'IPQoS throughput' sets QoS for throughput, and finally -c forces a specific cipher.
For the cipher, ssh offers several options. In the past, people used arcfour for its speed. While one could benchmark all the available options oneself, I relied on this blog post.
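If you want to run your own benchmark, OpenSSH can list every cipher your client build supports:

```shell
# List the ciphers supported by the local OpenSSH client
ssh -Q cipher
```

Any name from that list can be passed to scp's -c flag.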
Transferring the uncompressed dump
The last test I did was transferring the uncompressed dump file, letting scp -C do all the work.
time scp -C -o 'IPQoS throughput' -c aes256-ctr /tmp/zabbix-dump.sql root@dest:/tmp

5358MB  11.7MB/s  07:40

real    7m41.099s
user    5m47.464s
sys     0m48.364s
Did you notice? In roughly the time the gzip step alone took, we transferred the whole dump: faster overall than compressing and then copying! And there's nothing to uncompress on the destination host, which means faster recovery!
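Restoring on the destination then becomes a single step (a sketch, assuming the same paths and database name as above):

```shell
# No gunzip needed: restore straight from the plain-text dump
mysql zabbix < /tmp/zabbix-dump.sql
```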
Undoubtedly, I'll go for this last option!