Logical Versus Physical (Raw) Backups
Logical backups save information represented as logical database structure (CREATE DATABASE, CREATE TABLE statements) and content (INSERT statements or delimited-text files). Physical backups consist of raw copies of the directories and files that store database contents.
Logical backup methods have these characteristics:
- The backup is done by querying the MySQL server to obtain database structure and content information.
- Backup is slower than with physical methods because the server must access database information and convert it to logical format. If the output is written on the client side, the server must also send it to the backup program.
- Output is larger than for physical backup, particularly when saved in text format.
- Backup and restore granularity is available at the server level (all databases), database level (all tables in a particular database), or table level. This is true regardless of storage engine.
- The backup does not include log or configuration files, or other database-related files that are not part of databases.
- Backups stored in logical format are machine independent and highly portable.
- Logical backups are performed with the MySQL server running. The server is not taken offline.
- Logical backup tools include the mysqldump program and the SELECT ... INTO OUTFILE statement. These work for any storage engine, even MEMORY.
- To restore logical backups, SQL-format dump files can be processed using the mysql client. To load delimited-text files, use the LOAD DATA INFILE statement or the mysqlimport client.
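For example, a SQL-format dump file can be reloaded by feeding it to the mysql client, and a delimited-text file can be loaded with mysqlimport; the database, file, and path names here are placeholders:
shell> mysql db_name < dump_file.sql
shell> mysqlimport db_name /path/to/tbl_name.txt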
Physical backup methods have these characteristics:
- The backup consists of exact copies of database directories and files. Typically this is a copy of all or part of the MySQL data directory. Data from MEMORY tables cannot be backed up this way because their contents are not stored on disk.
- Physical backup methods are faster than logical because they involve only file copying without conversion.
- Output is more compact than for logical backup.
- Backup and restore granularity ranges from the level of the entire data directory down to the level of individual files. This may or may not provide for table-level granularity, depending on storage engine. (Each MyISAM table corresponds uniquely to a set of files, but an InnoDB table shares file storage with other InnoDB tables.)
- In addition to databases, the backup can include any related files such as log or configuration files.
- Backups are portable only to other machines that have identical or similar hardware characteristics.
- Backups can be performed while the MySQL server is not running. If the server is running, it is necessary to perform appropriate locking so that the server does not change database contents during the backup.
- Physical backup tools include file system-level commands (such as cp, scp, tar, rsync), mysqlhotcopy for MyISAM tables, ibbackup for InnoDB tables, or START BACKUP for NDB tables.
- For restore, files copied at the file system level or with mysqlhotcopy can be copied back to their original locations with file system commands; ibbackup restores InnoDB tables, and ndb_restore restores NDB tables.
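As a simple illustration, a raw backup of the entire data directory can be made with ordinary file system commands while the server is stopped; the data directory and backup paths shown here are assumptions that depend on your installation:
shell> tar czf /backups/mysql-data-backup.tar.gz /var/lib/mysql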
Online Versus Offline Backups
Online backups take place while the MySQL server is running so that the database information can be obtained from the server. Offline backups take place while the server is stopped. This distinction can also be described as “hot” versus “cold” backups; a “warm” backup is one where the server remains running but locked against modifying data while you access database files externally.
Online backup methods have these characteristics:
- The backup is less intrusive to other clients, which can connect to the MySQL server during the backup and may be able to access data depending on what operations they need to perform.
- Care must be taken to impose appropriate locking so that data modifications do not take place that would compromise backup integrity.
Offline backup methods have these characteristics:
- Clients can be affected adversely because the server is unavailable during backup.
- The backup procedure is simpler because there is no possibility of interference from client activity.
A similar distinction between online and offline applies for recovery operations, and similar characteristics apply. However, it is more likely that clients will be affected for online recovery than for online backup because recovery requires stronger locking. During backup, clients might be able to read data while it is being backed up. Recovery modifies data and does not just read it, so clients must be prevented from accessing data while it is being restored.
Local Versus Remote Backups
A local backup is performed on the same host where the MySQL server runs, whereas a remote backup is done from a different host. For some types of backups, the backup can be initiated from a remote host even if the output is written locally on the server host.
- mysqldump can connect to local or remote servers. For SQL output (CREATE and INSERT statements), local or remote dumps can be done and generate output on the client. For delimited-text output (with the --tab option), data files are created on the server host.
- mysqlhotcopy performs only local backups: It connects to the server to lock it against data modifications and then copies local table files.
- SELECT ... INTO OUTFILE can be initiated from a local or remote client host, but the output file is created on the server host.
- Physical backup methods typically are initiated locally on the MySQL server host so that the server can be taken offline, although the destination for copied files might be remote.
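For example, mysqldump can produce a remote backup by connecting to the server over the network; the host name, account, and file names here are placeholders:
shell> mysqldump --host=db1.example.com --user=backup_user -p db_name > dump_file.sql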
Full Versus Incremental Backups
A full backup includes all data managed by a MySQL server at a given point in time. An incremental backup consists of the changes made to the data during a given time span (from one point in time to another). MySQL has different ways to perform full backups, such as those described earlier in this section. Incremental backups are made possible by enabling the server's binary log, which the server uses to record data changes.
Full Versus Point-in-Time (Incremental) Recovery
A full recovery restores all data from a full backup. This restores the server instance to the state that it had when the backup was made. If that state is not sufficiently current, a full recovery can be followed by recovery of incremental backups made since the full backup, to bring the server to a more up-to-date state.
Incremental recovery is recovery of changes made during a given time span. This is also called point-in-time recovery because it makes a server's state current up to a given time. Point-in-time recovery is based on the binary log and typically follows a full recovery from the backup files that restores the server to its state when the backup was made. Then the data changes written in the binary log files are applied as incremental recovery to redo data modifications and bring the server up to the desired point in time.
Making Backups by Copying Table Files
For storage engines that represent each table using its own files, tables can be backed up by copying those files. For example, MyISAM tables are stored as files, so it is easy to do a backup by copying files (*.frm, *.MYD, and *.MYI files). To get a consistent backup, stop the server or do a LOCK TABLES on the relevant tables followed by FLUSH TABLES for the tables. See Section 12.4.5, “LOCK TABLES and UNLOCK TABLES Syntax”, and Section 12.5.6.2, “FLUSH Syntax”. You need only a read lock; this allows other clients to continue to query the tables while you are making a copy of the files in the database directory. The FLUSH TABLES statement is needed to ensure that all active index pages are written to disk before you start the backup.
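A minimal sketch of this procedure for a single MyISAM table follows; the database, table, and directory names are placeholders, and the file copy is performed from a shell on the server host while the read lock is held:
mysql> LOCK TABLES db_name.tbl_name READ;
mysql> FLUSH TABLES db_name.tbl_name;
shell> cp /var/lib/mysql/db_name/tbl_name.* /backups/
mysql> UNLOCK TABLES;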
Making Delimited-Text File Backups
To create a text file containing a table's data, you can use SELECT * INTO OUTFILE 'file_name' FROM tbl_name. The file is created on the MySQL server host, not the client host. For this statement, the output file cannot already exist because allowing files to be overwritten constitutes a security risk. See Section 12.2.8, “SELECT Syntax”. This method works for any kind of data file, but saves only table data, not the table structure.
Another way to create text data files (along with files containing CREATE TABLE statements for the backed up tables) is to use mysqldump with the --tab option.
To reload the output data file, use LOAD DATA INFILE or mysqlimport.
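For example, the following statements dump and later reload a single table; the path and table name are placeholders, and the output file must not already exist on the server host:
mysql> SELECT * INTO OUTFILE '/tmp/tbl_name.txt' FROM tbl_name;
mysql> LOAD DATA INFILE '/tmp/tbl_name.txt' INTO TABLE tbl_name;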
Making Backups with mysqldump or mysqlhotcopy
The mysqldump program and the mysqlhotcopy script can make backups. mysqldump is more general because it can back up all kinds of tables. mysqlhotcopy works only with some storage engines.
Create a full backup of your database using mysqldump:
shell> mysqldump db_name > dump_file
shell> mysqldump --tab=/path/to/some/dir db_name
The first command dumps the database to the named file as CREATE TABLE and INSERT statements. The second command creates two files per table in the named output directory. One file contains the table contents as tab-delimited text. The other contains a CREATE TABLE statement for the table.
Create a full backup of your database using mysqlhotcopy:
shell> mysqlhotcopy db_name /path/to/some/dir
You can also create a binary backup simply by copying all table files, as long as the server isn't updating anything. (The mysqlhotcopy script uses this method.) Note, however, that table file copying methods do not work if your database contains InnoDB tables: mysqlhotcopy does not work for InnoDB tables because InnoDB does not necessarily store table contents in database directories. Also, even if the server is not actively updating data, InnoDB may still have modified data cached in memory and not flushed to disk.
For InnoDB tables, it is possible to perform an online backup that takes no locks on tables using the --single-transaction option to mysqldump.
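For example (the database and file names are placeholders):
shell> mysqldump --single-transaction db_name > dump_file.sql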
Making Incremental Backups by Enabling the Binary Log
MySQL supports incremental backups: start the server with the --log-bin option to enable binary logging. The binary log files provide you with the information you need to replicate changes to the database that are made subsequent to the point at which you performed a backup. At the moment you want to make an incremental backup (containing all changes that happened since the last full or incremental backup), rotate the binary log by using FLUSH LOGS. Then copy to the backup location all binary log files, from the one written at the time of the last full or incremental backup up to, but not including, the newly created one. These binary logs are the incremental backup; at restore time, you apply them as explained in Section 6.4, “Point-in-Time (Incremental) Recovery Using the Binary Log”. The next time you do a full backup, you should also rotate the binary log using FLUSH LOGS, mysqldump --flush-logs, or mysqlhotcopy --flushlog.
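A sketch of this procedure follows; the log file names and backup directory are placeholders, and the two files copied are assumed to be the logs written since the previous backup, not the newly created one:
mysql> FLUSH LOGS;
shell> cp /var/lib/mysql/host-bin.000003 /var/lib/mysql/host-bin.000004 /backups/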
Making Backups Using Replication Slaves
If you are backing up a slave replication server, you should back up its master.info and relay-log.info files when you back up the slave's databases, regardless of the backup method you choose. These information files are always needed to resume replication after you restore the slave's data. If your slave is replicating LOAD DATA INFILE commands, you should also back up any SQL_LOAD-* files that may exist in the directory specified by the --slave-load-tmpdir option. (This location defaults to the value of the tmpdir system variable if not specified.) The slave needs these files to resume replication of any interrupted LOAD DATA INFILE operations.
If you have performance problems with your master server while making backups, one strategy that can help is to set up replication and perform backups on the slave rather than on the master.
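For example, on the slave host the replication metadata files can be copied along with the data backup; the paths here are assumptions that depend on your configuration:
shell> cp /var/lib/mysql/master.info /var/lib/mysql/relay-log.info /backups/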
Recovering Corrupt Tables
If you have to restore MyISAM tables that have become corrupt, try to recover them using REPAIR TABLE or myisamchk -r first. That should work in 99.9% of all cases. If myisamchk fails, try the following procedure. It is assumed that you have enabled binary logging by starting MySQL with the --log-bin option.
- Restore the table from a mysqldump backup or binary backup.
- Execute the following command to re-run the updates in the binary logs:
shell> mysqlbinlog binlog.[0-9]* | mysql
In some cases, you may want to re-run only certain binary logs, from certain positions (usually you want to re-run all binary logs from the date of the restored backup, excepting possibly some incorrect statements).
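For example, to re-run only the events starting from a given position within a particular log, you might use mysqlbinlog's --start-position option; the file name and position here are placeholders:
shell> mysqlbinlog --start-position=368312 binlog.000007 | mysql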
Making Backups Using a File System Snapshot
If you are using a Veritas file system, you can make a backup like this:
- From a client program, execute FLUSH TABLES WITH READ LOCK.
- From another shell, execute mount vxfs snapshot.
- From the first client, execute UNLOCK TABLES.
- Copy files from the snapshot.
- Unmount the snapshot.
Similar snapshot capabilities may be available in other file systems, such as LVM or ZFS.
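For example, with LVM the same pattern might look like the following sketch; the volume group, logical volume, mount point, and backup directory names are assumptions, and FLUSH TABLES WITH READ LOCK should be held in a client session until the snapshot has been created, then released with UNLOCK TABLES:
shell> lvcreate --snapshot --size=1G --name=mysql-snap /dev/vg0/mysql-data
shell> mount /dev/vg0/mysql-snap /mnt/mysql-snap
shell> cp -a /mnt/mysql-snap /backups/
shell> umount /mnt/mysql-snap
shell> lvremove -f /dev/vg0/mysql-snap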
Backup Policy
Assume that we make a full backup of all our InnoDB tables in all databases using the following command on Sunday at 1 p.m., when load is low:
shell> mysqldump --single-transaction --all-databases > backup_sunday_1_PM.sql
The resulting .sql file produced by mysqldump contains a set of SQL INSERT statements that can be used to reload the dumped tables at a later time.
This backup operation acquires a global read lock on all tables at the beginning of the dump (using FLUSH TABLES WITH READ LOCK). As soon as this lock has been acquired, the binary log coordinates are read and the lock is released. If long updating statements are running when the FLUSH statement is issued, the backup operation may stall until those statements finish. After that, the dump becomes lock-free and does not disturb reads and writes on the tables.
It was assumed earlier that the tables to back up are InnoDB tables, so --single-transaction uses a consistent read and guarantees that data seen by mysqldump does not change. (Changes made by other clients to InnoDB tables are not seen by the mysqldump process.) If the backup operation includes nontransactional tables, consistency requires that they do not change during the backup. For example, for the MyISAM tables in the mysql database, there must be no administrative changes to MySQL accounts during the backup.
Full backups are necessary, but it is not always convenient to create them. They produce large backup files and take time to generate. They are not optimal in the sense that each successive full backup includes all data, even that part that has not changed since the previous full backup. It is more efficient to make an initial full backup, and then to make incremental backups. The incremental backups are smaller and take less time to produce. The tradeoff is that, at recovery time, you cannot restore your data just by reloading the full backup. You must also process the incremental backups to recover the incremental changes.
To make incremental backups, we need to save the incremental changes. In MySQL, these changes are represented in the binary log, so the MySQL server should always be started with the --log-bin option to enable that log. With binary logging enabled, the server writes each data change into a file while it updates data. Looking at the data directory of a MySQL server that was started with the --log-bin option and that has been running for some days, we find these MySQL binary log files:
-rw-rw---- 1 guilhem guilhem 1277324 Nov 10 23:59 gbichot2-bin.000001
-rw-rw---- 1 guilhem guilhem 4 Nov 10 23:59 gbichot2-bin.000002
-rw-rw---- 1 guilhem guilhem 79 Nov 11 11:06 gbichot2-bin.000003
-rw-rw---- 1 guilhem guilhem 508 Nov 11 11:08 gbichot2-bin.000004
-rw-rw---- 1 guilhem guilhem 220047446 Nov 12 16:47 gbichot2-bin.000005
-rw-rw---- 1 guilhem guilhem 998412 Nov 14 10:08 gbichot2-bin.000006
-rw-rw---- 1 guilhem guilhem 361 Nov 14 10:07 gbichot2-bin.index
Each time it restarts, the MySQL server creates a new binary log file using the next number in the sequence. While the server is running, you can also tell it to close the current binary log file and begin a new one manually by issuing a FLUSH LOGS SQL statement or with a mysqladmin flush-logs command. mysqldump also has an option to flush the logs. The .index file in the data directory contains the list of all MySQL binary logs in the directory.
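For example, either of the following rotates the binary log manually:
mysql> FLUSH LOGS;
shell> mysqladmin flush-logs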
The MySQL binary logs are important for recovery because they form the set of incremental backups. If you make sure to flush the logs when you make your full backup, the binary log files created afterward contain all the data changes made since the backup. Let's modify the previous mysqldump command a bit so that it flushes the MySQL binary logs at the moment of the full backup, and so that the dump file contains the name of the new current binary log:
shell> mysqldump --single-transaction --flush-logs --master-data=2 \
--all-databases > backup_sunday_1_PM.sql
After executing this command, the data directory contains a new binary log file, gbichot2-bin.000007, because the --flush-logs option causes the server to flush its logs. The --master-data option causes mysqldump to write binary log information to its output, so the resulting .sql dump file includes these lines:
-- Position to start replication or point-in-time recovery from
-- CHANGE MASTER TO MASTER_LOG_FILE='gbichot2-bin.000007', MASTER_LOG_POS=...;
Because the mysqldump command made a full backup, those lines mean two things: the dump file contains all changes made before any change written to the gbichot2-bin.000007 binary log file or newer, and any data change logged after the backup is not in the dump file but is present in gbichot2-bin.000007 or a newer binary log file.
Using Backups for Recovery
Now, suppose that we have a catastrophic crash on Wednesday at 8 a.m. that requires recovery from backups. To recover, first we restore the last full backup we have (the one from Sunday 1 p.m.). The full backup file is just a set of SQL statements, so restoring it is very easy:
shell> mysql < backup_sunday_1_PM.sql
At this point, the data is restored to its state as of Sunday 1 p.m. To restore the changes made since then, we must use the incremental backups; that is, the gbichot2-bin.000007 and gbichot2-bin.000008 binary log files. Fetch the files if necessary from where they were backed up, and then process their contents like this:
shell> mysqlbinlog gbichot2-bin.000007 gbichot2-bin.000008 | mysql
We now have recovered the data to its state as of Tuesday 1 p.m., but still are missing the changes from that date to the date of the crash. To avoid losing them, we would have needed to have the MySQL server store its MySQL binary logs into a safe location (RAID disks, SAN, ...) different from the place where it stores its data files, so that these logs were not on the destroyed disk. (That is, we can start the server with a --log-bin option that specifies a location on a different physical device from the one on which the data directory resides. That way, the logs are safe even if the device containing the directory is lost.) If we had done this, we would have the gbichot2-bin.000009 file (and any subsequent files) at hand, and we could apply them using mysqlbinlog and mysql to restore the most recent data changes with no loss up to the moment of the crash.
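For example, the server could be started with its binary logs directed to a directory on a separate physical device; the path here is an assumption:
shell> mysqld_safe --log-bin=/mnt/logdisk/mysql-binlogs/gbichot2-bin &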