If you use the official “GNU Zip” (the file compression tool widely known as “gzip”) for compressing files in GNU/Linux and have a powerful multi-core (and multithreaded) CPU, then just like me (though my CPU ain’t that powerful :D), you too might be frustrated by the fact that “gzip” only uses a single CPU core when running.
If it used all the available CPU power while compressing, it would speed up the process quite significantly. But it does not do that. Luckily though, just like with the previously discussed “pbzip2” utility, there’s a tool called “pigz” which is basically the multithreaded version of “gzip”.
So, if you have a multi-core CPU and need to compress files as fast as possible by using all of your available CPU cores, then “pigz” will come in handy.
Main features …
*. By default, “pigz” automatically detects your CPU cores and threads and adjusts itself, so all of your CPU cores are used while compressing.
However, if you’d rather not use all the available CPU cores (say, to keep the system responsive), then you can tell “pigz” to use only a certain number of threads instead, through its “-p” option.
*. According to its manual page, it’s an “almost compatible” version of “gzip”, meaning that nearly all the options used with “gzip” are usable with “pigz”. It also has a few additional ones.
*. Just like “gzip”, “bzip2” and many other compression tools in GNU/Linux, “pigz” only compresses single files. Thus, if you have a folder, you’ll have to put it into a single-file container such as ‘tar’ before you can compress it using “pigz”.
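As a rough sketch of limiting the number of cores, the “-p” option caps how many compression threads “pigz” starts. The file name “sample.txt” and the thread count of 2 are only illustrative, and the “nproc” command (part of GNU coreutils) is assumed to be available for showing the core count.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a sample file to compress (hypothetical name).
seq 1 200000 > sample.txt

# Show how many cores the system reports.
nproc

# Compress with at most 2 threads; -k keeps the original file.
pigz -p 2 -k sample.txt

# Both the source and the compressed file should be listed.
ls -l sample.txt sample.txt.gz
```

Leaving out “-p” lets “pigz” pick the thread count on its own, which is usually what you want on an otherwise idle machine.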
You can install “pigz” in Ubuntu 12.04 Precise Pangolin, 11.10 Oneiric Ocelot, 11.04 Natty Narwhal, 10.10 and 10.04 by using the below command in your Terminal window.
sudo apt-get install pigz
How to use it?
Now I’m assuming that you are already comfortable with ‘gzip’. As said before, “pigz” offers almost the same functionality as “gzip”, so you can use it as you would “gzip”.
But since it has a few options of its own, you can use the below command to get a list of the supported options.
pigz --help
If you want to read its manual (nothing much to read there), use the below one instead.
man pigz
A simple example ….
If you’re a bit new to this, below is a simple example of compressing a file named “testing”.
pigz --best -k testing
When it finishes, it’ll output a file named “testing.gz” in the same directory. Please remember to replace “testing” with your source file’s name and its path. The “-k” parameter tells “pigz” not to delete the source file, and “--best” asks for the best compression, the same as “-9”.
Decompression is pretty easy, as the default archive manager that comes with Ubuntu (called “File Roller”) and almost all the other GUI tools available on the GNU/Linux platform should be able to read and extract the files created by “pigz”, since it produces standard “gzip” files.
However, if you’d like to use “pigz” for decompressing as well, then …
To decompress (extract) the above mentioned compressed file, I’ll use the below command.
pigz -d testing.gz
Again, make sure to replace “testing.gz” with your source file name and its path.
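Since “pigz” writes ordinary gzip files, the two tools should be interchangeable for decompression. Below is a small sketch of that, using a file named “testing” that is created just for the demonstration.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A tiny sample file (hypothetical content).
printf 'hello from pigz\n' > testing

# Compress with pigz, keeping the source file.
pigz --best -k testing

# Check the archive's integrity with plain gzip.
gzip -t testing.gz && echo "gzip accepts the file"

# Decompress to standard output (-c) so testing.gz stays in place,
# then compare the result with the original file.
pigz -dc testing.gz | cmp - testing && echo "contents match"
```

Using “-dc” instead of “-d” is handy when you only want to look at the content, because the compressed file is left untouched.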
I can’t compress folders, why?
As mentioned before, you cannot compress folders with “pigz” as it only deals with single files.
So as a fix, you’ll first have to use the “tar” utility (it comes with your GNU/Linux distribution) to put that folder and its content into a single file (not compressing it, just copying the content into a single file, almost like another folder), and then you can compress that file with “pigz”.
Let’s say that I have a folder called “temp”. I’ll use the below command in my Terminal to put that folder and its content into a single file called “my-single-file”. Make sure to replace “temp” and “my-single-file” with your desired names and their locations. Don’t change anything else.
tar -cf my-single-file temp
Then later you can use the “my-single-file” with “pigz” and compress it. That’s it.
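Put together, the whole round trip might look like the sketch below. The folder “temp”, its two text files and the “restored” directory are made up for the demonstration.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A small sample folder (hypothetical names).
mkdir temp
printf 'one\n' > temp/a.txt
printf 'two\n' > temp/b.txt

# 1. Pack the folder into a single uncompressed file.
tar -cf my-single-file temp

# 2. Compress it; pigz replaces my-single-file with my-single-file.gz.
pigz my-single-file

# To get the folder back: decompress, then unpack into another directory.
pigz -d my-single-file.gz
mkdir restored
tar -xf my-single-file -C restored

# No output from diff means the restored copy is identical.
diff -r temp restored/temp
```

Naming the tar file something like “my-single-file.tar” is a common habit, as the compressed result then ends in “.tar.gz” and GUI archive tools recognize it more easily.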
10 thoughts on “pigz: Multithreaded File Compression tool for Ubuntu Linux”
There is a multi-threaded version of tar itself here: https://github.com/johnno1962/mtar
Thanks for sharing this. Been interested in the topic for quite a while, though I’m only now getting around to messing with it further. Much appreciated and imo, very well written. Thanks again …
You’re welcome, and thank you for the kind words 🙂 .
But when you decompress the pigz you get a single file, how do you use tar to get it back to multiple files again?
Nevermind! tar -xvf file-name did the trick!
If you have multiple files and you’d prefer not to make an interim temp file, you can have tar write to stdout instead of a file (GNU tar usually does this when the -f option is omitted) and pipe that as stdin to pigz, where the -p flag sets the number of compression threads to use, then redirect the output to a file like so:
tar -cv [file name(s)/dir name(s)] | pigz -p 4 > [filename.tar.gz]
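A variation on the pipe above, as a sketch: passing “-f -” tells tar explicitly to write to standard output (or read from standard input) rather than relying on its built-in default. The folder and file names are only illustrative.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A small sample folder (hypothetical names).
mkdir temp
printf 'one\n' > temp/a.txt

# Pack and compress in one go, without an interim file.
tar -cf - temp | pigz -p 4 > temp.tar.gz

# Reverse: decompress to stdout and let tar unpack from stdin.
mkdir restored
pigz -dc temp.tar.gz | tar -xf - -C restored

# Confirm the restored folder matches the original.
diff -r temp restored/temp && echo "round trip OK"
```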
is there a way to pipe pbzip2 and pigz or the other way?
Thanks for explaining pigz