Is there any way to extract a tar.gz file faster than tar -zxvf filenamehere?
We have large files and are trying to optimize the operation.
pigz is a parallel version of gzip. Although it only uses a single thread for decompression, it starts 3 additional threads for reading, writing, and check calculation. Your results may vary, but we have seen significant improvements in decompression time for some of our datasets. Once you install pigz, the tar file can be extracted with:
pigz -dc target.tar.gz | tar xf -
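The pipeline above assumes pigz is installed. A minimal sketch of a wrapper that falls back to plain gzip when it is not; the function name extract_tgz and the throwaway demo archive are made up for illustration:

```shell
#!/bin/sh
# extract_tgz is a hypothetical helper, not part of pigz or tar.
# Usage: extract_tgz ARCHIVE [DEST_DIR]
extract_tgz() {
    archive=$1
    dest=${2:-.}
    # Prefer pigz when it is on PATH, otherwise use ordinary gzip.
    if command -v pigz >/dev/null 2>&1; then
        unzipper=pigz
    else
        unzipper=gzip
    fi
    mkdir -p "$dest" || return 1
    # -dc: decompress to stdout; tar reads the archive from stdin ("-").
    "$unzipper" -dc "$archive" | tar -xf - -C "$dest"
}

# Demo on a small throwaway archive so the sketch can be run anywhere.
workdir=$(mktemp -d)
mkdir "$workdir/src"
echo "hello" > "$workdir/src/a.txt"
tar -czf "$workdir/demo.tar.gz" -C "$workdir" src
extract_tgz "$workdir/demo.tar.gz" "$workdir/out"
```

Both tools accept -dc, which is what makes the swap possible without changing the rest of the pipeline.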
You can also use tar -xvf --use-compress-program=pigz filenamehere. (-z amounts to --use-compress-program=gzip.) Alternatively, you can even make gzip be a symlink to pigz, and keep using -zxvf.
– ruakh
Nov 19 '12 at 22:36
I had to put -xf after --use-compress-program=pigz, or I got an error. For some reason, it was no faster than using gzip, though.
– jonderry
Mar 07 '15 at 22:39
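The error mentioned above most likely comes from how tar parses a bundle of short options: the argument of f is the next word on the command line, so in -xvf --use-compress-program=pigz the long option is taken as the archive name. A sketch of the working order, assuming GNU tar; it falls back to gzip if pigz is missing, and the demo archive is made up:

```shell
#!/bin/sh
# Build a small throwaway archive to extract.
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/out"
echo "hello" > "$workdir/src/a.txt"
tar -czf "$workdir/demo.tar.gz" -C "$workdir" src

# Use pigz when available, otherwise plain gzip.
if command -v pigz >/dev/null 2>&1; then prog=pigz; else prog=gzip; fi

# Long option first, then the short-option bundle ending in f, so that
# the word right after f is the archive name.
tar --use-compress-program="$prog" -xf "$workdir/demo.tar.gz" -C "$workdir/out"
```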
For bzip2 there is pbzip2 (p for parallel): tar --use-compress-program=pbzip2 -xvf file.tar.bz2.
– alfC
Sep 22 '15 at 17:56
Is there a way to use the pv command to show progress, or an equivalent, while also using the --use-compress-program=pigz flag? During compression, I can do gnutar --use-compress-program="pigz | pv" -cf target.tar.gz YourData, but I'm not sure how to do this during untar/decompression.
– Stefan Lasiewski
Jul 11 '18 at 00:57
If the tarball contains a very large number of small files, drop the 'v' option and try again. Printing every file name can slow extraction noticeably.
You can use --checkpoint=NUMBER (display progress messages every NUMBERth record) instead of -v.
– Stefan Lasiewski
Jun 21 '18 at 18:42
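A small sketch of the --checkpoint approach, assuming GNU tar (the option is not available in every tar implementation). The file count, the checkpoint interval of 10, and the demo archive are arbitrary choices for illustration; --checkpoint-action=dot prints a dot instead of a full message at each checkpoint:

```shell
#!/bin/sh
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/out"

# Create a few hundred small files to mimic a tarball of many tiny entries.
i=1
while [ "$i" -le 300 ]; do
    echo "file $i" > "$workdir/src/f$i.txt"
    i=$((i + 1))
done
tar -czf "$workdir/many.tar.gz" -C "$workdir" src

# No -v, so file names are not printed. Instead, print one dot per
# 10 records read (a record is 10 KiB by default in GNU tar).
tar --checkpoint=10 --checkpoint-action=dot \
    -xzf "$workdir/many.tar.gz" -C "$workdir/out"
echo    # end the line of dots
```

For a real multi-gigabyte archive a much larger interval, such as --checkpoint=10000, keeps the output short.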
If you want to see progress use something like pv.
Here is an example:
pigz -dc mysql-binary-backup.tar.gz | pv | tar xf -
With pigz you can speed up extraction and with pv you can view the progress of the extraction. Storing the output of pv to a file allows you to run your command in the background and view the progress at any time.
pigz -dc mysql-binary-backup.tar.gz | pv --force 2> progress.txt | tar xf - &
pv typically will not output a visual display if stderr is not a terminal; --force makes it do so anyway. Then we simply redirect stderr using 2> to progress.txt and pipe stdout to the tar command. The & at the end runs the command in the background.
You can see the progress of the extraction by simply running tail -f progress.txt.
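In the pipelines above pv sits after pigz, so it sees a stream of unknown length and can report throughput but not a percentage. A common variant, sketched here, lets pv read the archive itself so that it knows the total size and can show a percentage and an ETA. The script falls back to cat and gzip when pv or pigz is not installed, and the archive is a throwaway demo:

```shell
#!/bin/sh
# Build a small throwaway archive to extract.
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/out"
echo "hello" > "$workdir/src/a.txt"
tar -czf "$workdir/demo.tar.gz" -C "$workdir" src

# pv FILE copies the file to stdout like cat does, but it knows the file size.
if command -v pv >/dev/null 2>&1; then reader=pv; else reader=cat; fi
if command -v pigz >/dev/null 2>&1; then unzipper=pigz; else unzipper=gzip; fi

# Progress is measured in compressed bytes read from disk.
"$reader" "$workdir/demo.tar.gz" | "$unzipper" -dc | tar -xf - -C "$workdir/out"
```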
Do you know whether your current tar -zxvf method is IO or CPU bound?
– EEAA
May 18 '11 at 04:07
vmstat 1 100 prints vmstat output every 1 second, for 100 seconds. pigz was really helpful: I decompressed a 108GB gz file in minutes that was taking over an hour previously.
– j0h
Dec 06 '20 at 04:25
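One rough way to approach the IO-or-CPU question is to time a plain read of the archive against a read plus decompression, both with output discarded. This is only a sketch: timeit is a hypothetical helper, the demo archive is far too small to give meaningful numbers, and on a real file the second run may be served from the page cache.

```shell
#!/bin/sh
# Build a small throwaway archive; substitute a real one for useful timings.
workdir=$(mktemp -d)
mkdir "$workdir/src"
echo "hello" > "$workdir/src/a.txt"
tar -czf "$workdir/demo.tar.gz" -C "$workdir" src
archive="$workdir/demo.tar.gz"

# timeit LABEL CMD...: run CMD with stdout discarded, print whole seconds.
timeit() {
    label=$1
    shift
    start=$(date +%s)
    "$@" > /dev/null
    end=$(date +%s)
    echo "$label: $((end - start))s"
}

timeit "raw read" cat "$archive"
timeit "read + decompress" gzip -dc "$archive"
```

If the second figure is much larger than the first, decompression is probably the limit; if the two are close, reading from disk is more likely the bottleneck. The cost of writing the extracted files is not measured here.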