[LUNI] Incremental Tar

Steven Lembark lembark at wrkhors.com
Mon Jun 9 04:48:01 CDT 2003



-- "Patrick R. White" <patwhite at interaccess.com>

> Steven Lembark wrote:
>
>>
>> -- "Matt T." <matt at abernackie.com>
>>
>>> Hi,
>>>
>>>     I was curious if anyone had experience with tar incremental backups?
>>> Basically the information I'm looking for is what/how does GNU tar
>>> define what and how an incremental backup is done? i.e., are only
>>> modified files updated in the tar archive? Or are the modified files
>>> appended to the current archive?
>>>
>>> Is cpio a better alternative?..
>>
>> If it's any indication, Sun's dump has been a front end to
>> cpio for 10+ years.
>>
> So isn't this a good reason to use the dump/restore utilities to begin
> with?  They allow a full level 0 dump and up to 9 different incremental
> levels with the added benefit of higher incremental levels including all
> changes since the last full level 0 backup, right?  So no matter how many
> incrementals are written, a full restore will only require the latest full
> level 0 backup and the highest incremental backup created since then!

Think again. You can lose data that way if something has
been mangled or lost between the level 0 and the incrementals.

In general, because data may have been accidentally deleted
since the last backup, you're stuck restoring all of the
incrementals for a guaranteed recovery -- unless, of course,
one of your files was mangled...

99% of all backups can avoid the complexity and tracking
of a towers-of-hanoi schedule anyway. Most sites use a monthly
level 0, weekly level 1 and daily level 2. The same can be done
with find / -mtime $age | ...
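
A rough sketch of what that find-based scheme might look like, assuming
a POSIX shell plus find and cpio. The helper name list_changed, the
age values and the /backup paths are made up for illustration, not
taken from any real setup:

```shell
#!/bin/sh
# list_changed DIR DAYS (hypothetical helper): print regular files under
# DIR whose contents were modified less than DAYS days ago.
list_changed() {
    find "$1" -type f -mtime "-$2" -print
}

# Possible schedule, left as comments so nothing runs by accident:
#   monthly (level 0):  find / -print      | cpio -o > /backup/full.cpio
#   weekly  (level 1):  list_changed / 31  | cpio -o > /backup/weekly.cpio
#   daily   (level 2):  list_changed / 7   | cpio -o > /backup/daily.cpio
```

Keep in mind that -mtime only looks at modification time, so a file
that was renamed or put back with an old timestamp can slip through.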

The main advantage to using cpio is that since it takes the
list of files from stdin you can tailor what gets backed up
a bit more finely. For example, I make full backups of the
root volume (/var and /usr are separate mount points) on every
run and filter out browser caches using File::Find (I've posted
various versions of this to the list in the past).
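
The same idea can be approximated in plain shell rather than
File::Find. This is only a sketch: list_filtered is a hypothetical
name and the cache patterns are guesses that would need adjusting to
whatever browsers are actually in use.

```shell
#!/bin/sh
# list_filtered ROOT (hypothetical helper): print every path on ROOT's
# own filesystem, skipping anything that looks like a browser cache.
# -xdev stops find from descending into other mounted filesystems.
# The regex below is an illustrative guess, not a complete list.
list_filtered() {
    find "$1" -xdev -print |
        grep -E -v '/(\.netscape/cache|\.cache|Cache)(/|$)'
}

# Example, commented out:
#   list_filtered / | cpio -o > /backup/root.cpio
```

Filtering the list with a regex keeps the exclusion rules in one
place, which is the part that is awkward to do with dump or tar.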

You can also make a more stable backup by using perl (or
just find) to pre-scan the filesystem(s) and then feed a
stable list into cpio. This can avoid some race conditions
that cause problems with dump on heavily used file systems.
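
One way to sketch the pre-scan step in shell: write the list to a file
first, then hand that file to cpio. Both function names here are
hypothetical, and the example assumes file names without embedded
newlines.

```shell
#!/bin/sh
# snapshot_list DIR LIST (hypothetical helper): scan DIR once and save
# a sorted list of paths in LIST.
snapshot_list() {
    find "$1" -xdev -print | sort > "$2"
}

# archive_from_list LIST ARCHIVE (hypothetical helper): feed the saved
# list to cpio in copy-out mode. A file that vanished after the scan
# makes cpio complain, so the problem is at least visible.
archive_from_list() {
    cpio -o < "$1" > "$2"
}

# Example, commented out:
#   snapshot_list / /var/tmp/root.list
#   archive_from_list /var/tmp/root.list /backup/root.cpio
```

The saved list also doubles as a record of what the backup was
supposed to contain.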

I prefer to keep my archives on disk and to validate
the backups when they are made. This is rather simple with
cpio since a tee (or write of a pre-made list) can be
compared with cpio -it. Being able to restore a file with
"cpio -id '*/somedir/basename' < /dev/tape" is convenient.

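The validation step could be sketched like this, assuming the list fed
to cpio -o was saved to a file. The function names are hypothetical.
Leading "./" is stripped on both sides because some cpio versions
appear to store names without it; that is an assumption worth checking
against the cpio on your own system.

```shell
#!/bin/sh
# normalize FILE (hypothetical helper): drop a leading "./" from each
# name and sort, so two lists can be compared regardless of order.
normalize() {
    sed 's|^\./||' "$1" | sort
}

# same_names A B: succeed only if both files name the same set of paths.
same_names() {
    [ "$(normalize "$1")" = "$(normalize "$2")" ]
}

# verify_archive LIST ARCHIVE: compare the saved list against the
# archive's table of contents as reported by cpio -it.
verify_archive() {
    toc=$(mktemp) || return 1
    cpio -it < "$2" > "$toc" 2>/dev/null
    same_names "$1" "$toc"
    rc=$?
    rm -f "$toc"
    return "$rc"
}

# Example, commented out:
#   verify_archive /var/tmp/root.list /backup/root.cpio || echo "mismatch"
```
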
There is also the issue of tape density. Quick, without
looking, how many feet long is a DLT3? DDS3? DDS4? If
your answers are less than the height of Mount Everest
in cubits you are wasting significant amounts of tape.
The dump tape spec hasn't been updated since 9600bpi
was "high". Fine if you really still use 9-track, but
DDS4 tapes run in the MB/mm range -- not KB/inch.

I've tried just about every tool there is and finally
settled on using File::Find with regexen to filter the
backup files with a pipe in the perl code into cpio.

enjoi

--
Steven Lembark                               2930 W. Palmer
Workhorse Computing                       Chicago, IL 60647
                                            +1 888 359 3508

