PostgreSQL Point-in-time Recovery (using WAL)

PostgreSQL is an opensource and free-to-use Object-Relational Database Management System (ORBMS) and is controlled by a group of developers and development companies. PostgeSQL has three modes of backup, i.e. dump, dump all and Write Ahead Log (WAL). I would not focus here on the difference between the three, but will introduce WAL of postgresql. Write Ahead Log is a section of Postgresql where all the psql commands that have been executed are stored. Creating a copy of these logs and executing them again will restore any crashed database. This method is called Point-in-Time recovery, where WAL’s from a certain point in time are re-run to restore (crashed or lost)¬†databases. The tutorial will walk you through the important steps for both backup and recovery of postgresql database using this method.

Backup

In Point-in-Time recovery we need to first get a backup of segment files of WAL and then run the server in recovery mode. For the backup part we will use a simple linux copy command for copying the files to a folder outside the data directory of postgresql.

Now In order to move Postgres recovery mode, we have to alter its configuration file — postgresql.conf.

I worked on different versions of postgres; sometimes with the same version with different installation options. The location of different directories is heavily dependent on the individual setup.

I found mine at /etc/postgresql/8.4/main and on one of the other version of postgresql it was located at /opt/Postgres/8.4/data. In general, you can use locate or find commands to find the configuration files and then follow the rest of the tutorial.

Now, we have to edit postgresql.conf (through sudo, if needed). Go to the archiving portion of it and uncomment a lines. You will also need to make some changes. At the end, it should look something like this:

<br /><br />
# - Archiving -<br /><br />
archive_mode = on        # allows archiving to be done</p>
<p># (change requires restart)<br /><br />
archive_command = 'cp %p /mnt/backup/%f'        # command to use to archive a logfile segment<br /><br />
archive_timeout = 30          # force a logfile segment switch after this</p>
<p># number of seconds; 0 disables<br /><br />

Note: /mnt/backup is the directory i want the backup to be stored. The two Variables %p and %f will be replaced automatically by the server when backup is performed: %p would be the directory of data cluster and %f would be the name of backup file.

We are done with the prerequisites and now in order start the backup, we have to switch to psql console and commit following commands.

<br /><br />
# SELECT pg_start_backup('MY BACKUP_1')<br /><br />

it will will show something like this

<br /><br />
pg_start_backup<br /><br />
-----------------<br /><br />
0/7000020<br /><br />
(1 row)<br /><br />

The backup of WAL segment files is started and the files from pg_xlog are been copied to /mnt/backup/

In order to stop the backup:

<br /><br />
# SELECT pg_stop_backup();<br /><br />
pg_stop_backup<br /><br />
----------------<br /><br />
0/7000080<br /><br />
(1 row)<br /><br />

This shows the backup process is successfully closed.

Important Note: The directory /mnt/backup should have permission for postgres user to read and write (owner). Otherwise, you get an error like this at runtime (as seen in the log file):

<br /><br />
2010-08-11 21:54:46 PKST LOG:  archive command failed with exit code 1<br /><br />
2010-08-11 21:54:46 PKST DETAIL:  The failed archive command was: cp pg_xlog/000000010000000000000000 /mnt/backup/000000010000000000000000<br /><br />
cp: cannot create regular file `/mnt/backup/000000010000000000000000': Permission denied<br /><br />

Recovery

So now comes the recovery part, much easier than i thought but a little bit tricky. In order to continue with Point-in-Time recovery we first create a file named ‘Recovery.conf’ and place it in the data cluster, which in my case was /var/lib/postgresql/8.4/main

So what does the recovery file contains. Actually it has just another copy command for copying back files to data cluster from backup directory(this will be opposite to the previous backup command).

restore_command = 'cp /mnt/backup/%f "%p"'

When the recovery gets completed, postgresql will rename ‘recovery.conf’ to ‘recovery.done’.

Now stop the server by /init.d/postgresql-8.4 stop. We presume that there was a database crash down all the information in the data directory is lost. After which we will let postgresql to recover the files from the backup directory.

The next step is to delete files from the pg_xlog directory.

Now start the server by writing in terminal /init.d/postgresql-8.4 start which will automatically trigger a recovery. So what happened is that postgresql server found a recovery.conf thus shifted to its recovery mode.

You can assure proper recovery by checking the name of the recovery.conf file and by looking at the log of postgresql server. The log would Look Like this:

<br /><br />
2010-08-12 12:29:06 PKST LOG:  database system is shut down<br /><br />
2010-08-12 12:29:48 PKST LOG:  database system was shut down at 2010-08-12 12:29:06 PKST<br /><br />
2010-08-12 12:29:48 PKST LOG:  creating missing WAL directory "pg_xlog/archive_status"<br /><br />
2010-08-12 12:29:48 PKST LOG:  starting archive recovery<br /><br />
2010-08-12 12:29:48 PKST LOG:  restore_command = 'cp /mnt/backup/%f %p'<br /><br />
2010-08-12 12:29:48 PKST LOG:  automatic recovery in progress<br /><br />
2010-08-12 12:29:48 PKST LOG:  record with zero length at 0/A000064<br /><br />
2010-08-12 12:29:48 PKST LOG:  redo is not required<br /><br />
2010-08-12 12:29:48 PKST LOG:  selected new timeline ID: 2<br /><br />
2010-08-12 12:29:49 PKST LOG:  archive recovery complete<br /><br />
2010-08-12 12:29:49 PKST LOG:  autovacuum launcher started<br /><br />
2010-08-12 12:29:49 PKST LOG:  database system is ready to accept connections<br /><br />

Some useful links you might want to refer to:

http://developer.postgresql.org/pgdocs/postgres/continuous-archiving.html#BACKUP-ARCHIVING-WAL

http://scale-out-blog.blogspot.com/2009/02/simple-ha-with-postgresql-point-in-time.html

http://www.mkyong.com/database/postgresql-point-in-time-recovery-incremental-backup/

About Author:

Mudassar Ali Khan is a young researcher working in the field of security of computer and Internet. His research interests include exploration of innovative frameworks in services oriented architectures along with keen interest in programming the web. He has a Bachelors (Hons) in Computer Sciences and is currently enrolled for M. Phil. in Computer Sciences at QAU, Islamabad. He is also a member of a production house, Mash Productions providing graphic and web services. He blogs at madilator.wordpress.com and is accessible at madilator(at)gmail(dot)com.

Leave a comment

1 Comments.

Leave a Reply

Your email address will not be published. Required fields are marked *


You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre lang="" line="" escaped="">