Method
Backup Method of the RAID Disks
Directory structure preservation
DLT S4 tapes are used to back up the Pulsar Archive disks. We aim to store the full content of the main directories on a RAID disk on the same tape. However, if the data on the disk is too large to fit on one tape, we split the content into 730-760 GB chunks and store them on different tapes.
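As a rough illustration of the split described above, the minimum number of tapes for a disk can be estimated by dividing its data size by the chunk size and rounding up. The function name and the example figures below are hypothetical, and the result is only a lower bound, since the real chunks are cut at directory boundaries.

```shell
# Hypothetical helper: estimate how many tapes a disk needs.
# $1 = data size on the disk in GB, $2 = chunk size per tape in GB (e.g. 750).
tapes_needed() {
    total_gb=$1
    chunk_gb=$2
    # Integer ceiling division: (a + b - 1) / b
    echo $(( (total_gb + chunk_gb - 1) / chunk_gb ))
}

# Example with made-up numbers: a disk holding 2000 GB, split into 750 GB chunks
tapes_needed 2000 750
```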
The index of the directory structure of the archive disks is preserved and stored in the /psr/backup/tape*.* files.
Tape archive
We use the tar command to write backup tapes, specifying the blocking factor and an input list of files. The tape drive that reads DLT S4 tapes is connected to the DOLBY workstation. We use a 4 Gb link, which currently provides the best write and read speed across the network.
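For orientation, tar counts its blocking factor in units of 512-byte blocks, so the factor determines the size of each record written to the tape. The short sketch below works out the record size for a given factor. The function name is hypothetical; 1024 is the factor used in the instructions on this page.

```shell
# Hypothetical helper: record size in bytes for a given tar blocking factor.
# tar blocks are 512 bytes each, so record size = factor * 512.
record_bytes() {
    echo $(( $1 * 512 ))
}

record_bytes 1024   # 524288 bytes, i.e. 512 KiB per record
```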
Dividing the data into tape files
The input files for the tar command are created with the Perl script 'chunker.pl', written by Vince McIntyre. The script splits any directory into blocks of files, each with a maximum total size specified by the user. These blocks define the separate tape files. The actual size of each block depends on where the split falls, because the script never subdivides a lowest-level directory of data files. For each block, chunker.pl writes an output file containing the list of files in that block, which is then used as the input file for the tar command. Currently the typical size of a tape file is close to 100 GB. However, smaller tape files may be used in the future if they prove more convenient. Smaller tape files save time when only a subset of the data needs to be read. They also make recovery faster after an unintended interruption in writing or reading, since typically only the last tape file is affected.
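The grouping idea can be illustrated with a simplified greedy sketch. This is not the actual chunker.pl algorithm, only an assumption about how such a split could work: directories are taken in order and a new block is started whenever the next directory would push the current block over the limit. The function name and the sample data are hypothetical, and the four-digit block numbers simply mirror the suffix style of the chunk list files.

```shell
# Simplified greedy grouping, NOT the actual chunker.pl algorithm.
# Reads "size<TAB>path" lines (the format printed by du -sk) on stdin and
# prints "block_number<TAB>path", never splitting a directory across blocks.
# $1 = maximum block size, in the same units as the input sizes (KB for du -sk).
group_dirs() {
    awk -F '\t' -v max="$1" '
        {
            # Start a new block if this directory would overflow the current one.
            # A directory larger than max still gets a block of its own.
            if (used > 0 && used + $1 > max + 0) {
                block++
                used = 0
            }
            used += $1
            printf "%04d\t%s\n", block, $2
        }'
}

# Made-up example: four directories, block limit of 100 units
printf '60\tdirA\n30\tdirB\n50\tdirC\n70\tdirD\n' | group_dirs 100
```

In this example dirA and dirB share block 0000, while dirC and dirD each start a new block, which shows why the actual block sizes fall below the requested maximum.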
Specific instructions
- Log in to DOLBY.
- cd /pulsar/psr/backup/chunks
- Run chunker.pl on the data you want to back up. For example:
chunker.pl -s 100G -b archive00 /pulsar/archive00
This divides the data in the directory /pulsar/archive00 into blocks with a maximum size of 100 GB. The lists of files in these blocks are written to the directory /pulsar/psr/backup/chunks, in files named archive00.0000, archive00.0001, archive00.0002, etc.
For further help on this script, type:
chunker.pl -h
- The script chunker.pl produces a summary output on the screen:
wrote archive10.0000, 61949 items, total 99.698 Gbytes
wrote archive10.0001, 51132 items, total 99.815 Gbytes
wrote archive10.0002, 51186 items, total 99.861 Gbytes
etc.
This output can be written to a file archivexx.out for future reference.
- Insert the DLT tape in the tape drive.
- Since backing up 780 GB takes between 24 and 30 hours, create an executable file containing a tar command for each input file, or pass all input files to a single tar command, whichever is more convenient.
- Type: tar cvpbf 1024 /dev/nst0 -T /pulsar/psr/backup/chunks/archive00.0000 | tee /pulsar/psr/backup/tape0.00
This sets the blocking factor to 1024 and takes the list of files to archive from the input file archive00.0000. The command also writes an index of the archived files to /pulsar/psr/backup/tape0.00.
To create a separate index file for each tape file, use an executable file such as /pulsar/psr/backup/comm_tar, which is simply a list of tar commands for the different tape files.
- Before you start a new backup, move all the existing index files (tape1.00, tape1.01, ..., tape14.00, tape14.01, etc.) to the directory previous.[date] (e.g. previous.08.2007). This directory can be removed after the new backup has completed successfully.
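A file of tar commands like the comm_tar file mentioned above could be generated rather than typed by hand. The sketch below is one possible way to do it, not the script actually used: the function name is hypothetical, and it assumes that the Nth chunk list written to a tape is indexed as tape<number>.<NN>, counting from 00, as in the tape0.00 example.

```shell
# Hypothetical generator for a comm_tar-style file: one tar command per chunk list.
# $1   = tape number used in the index file names (e.g. 0 for tape0.00)
# $2.. = chunk list files written by chunker.pl, in the order they go on the tape
# Assumption: the Nth list on a tape is indexed as tape<tape>.<NN>, counting from 00.
make_comm_tar() {
    tape=$1
    shift
    n=0
    for list in "$@"; do
        printf 'tar cvpbf 1024 /dev/nst0 -T %s | tee /pulsar/psr/backup/tape%s.%02d\n' \
            "$list" "$tape" "$n"
        n=$((n + 1))
    done
}

# Print the commands for the first two chunk lists of archive00
make_comm_tar 0 /pulsar/psr/backup/chunks/archive00.0000 \
                /pulsar/psr/backup/chunks/archive00.0001
```

The function only prints the commands. Redirecting its output to a file and making that file executable gives a script that writes the tape files one after another; because /dev/nst0 is the non-rewinding tape device, each tar archive is written after the previous one.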