‘bzip2’ is perhaps one of the most widely used file compression formats in GNU/Linux. It compresses data well, runs fast and does not need a lot of memory. However, because newer computers come with 4GB or more of RAM and powerful CPUs, there are other compression tools that, in comparison, can deliver noticeably higher compression ratios.
In that sense, you can try the ‘lrzip’ utility (‘Long Range ZIP’). By default it uses the ‘LZMA’ algorithm combined with ‘rzip’ preprocessing, a combination known for excellent compression ratios, and according to its creator, Con Kolivas (who is also the author of the BFS task scheduler), it is most effective when the source file is about 100MB or larger.
Another benefit of ‘lrzip’ is that it is multithreaded, meaning that if you have a CPU with multiple cores, all of them will be used by default (this can be changed) while compressing and decompressing, significantly reducing the time it takes to get the job done.
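As a hedged illustration of limiting that thread usage (the ‘-p’ flag is taken from lrzip’s man page, so verify it against your installed version; the file name here is just a throwaway example):

```shell
# Create a small throwaway file to compress
seq 1 50000 > sample.txt

# The -p flag (per lrzip's man page) overrides the thread count
if command -v lrzip >/dev/null 2>&1; then
    lrzip -p 2 sample.txt          # use only 2 threads instead of all cores
    ls -l sample.txt.lrz
else
    echo "lrzip is not installed (sudo apt-get install lrzip)"
fi
```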
I ran a few simple benchmarking tests against the ‘bzip2’ format and was really impressed: not only did ‘lrzip’ compress data far better than ‘bzip2’ did, but after a few tweaks (which reduce the compression ratio a bit), it ran about as fast as ‘bzip2’ did!
However, as mentioned earlier, there is a certain price to pay when using it: extremely high memory usage. While running, the ‘bzip2’ compressor kept its memory usage around 40MB or less, but ‘lrzip’ (on the same file) used as much as 1.1GB! This can be adjusted to reduce the memory usage, but then the compression ratio won’t be as impressive.
Details of the ‘Test’
For the test, I used a folder containing 20,043 items, totaling 973.7MB on disk. I had to put it into an uncompressed ‘tar’ archive first, because ‘pbzip2’, the multithreaded version of ‘bzip2’, does not support adding a folder directly.
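For anyone unfamiliar with that preparation step, here is a minimal sketch of packing a folder into an uncompressed tar archive (the folder and file names are made up for the example):

```shell
# Build a tiny stand-in folder
mkdir -p demo_folder
echo "hello" > demo_folder/a.txt
echo "world" > demo_folder/b.txt

# -c create an archive, -f give it a file name; no compression is applied
tar -cf demo_folder.tar demo_folder

# -t lists the archive's contents to confirm everything went in
tar -tf demo_folder.tar
```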
What I did was simple: I compressed and then decompressed this ‘tar’ archive, which holds 20,043 items, using both ‘pbzip2’ and ‘lrzip’ with their default settings.
To get more reliable results, I compressed the file 4 times and did the same for decompressing, under each utility. And between each test, I rebooted the computer to avoid any caching skewing the numbers.
My test machine has a Core i3-2330M CPU, 4GB of DDR3 RAM and a 320GB Toshiba SATA hard disk (7200 RPM). The tests were run under Ubuntu 12.10.
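A rough sketch of how one pass of such a timing run can be scripted is below. ‘gzip’ stands in for ‘pbzip2’/‘lrzip’ here since those may not be installed, and the cache-dropping note assumes Linux; in practice you would repeat several runs and average them:

```shell
# Generate some compressible sample data
seq 1 100000 > sample_bench.txt

# Time one compression pass (bash's `time` keyword)
time gzip -f sample_bench.txt

# Inspect the resulting size
ls -l sample_bench.txt.gz

# Instead of rebooting, cached reads can be flushed on Linux (as root) with:
#   sync; echo 3 > /proc/sys/vm/drop_caches
```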
Now, as you can see below, although ‘lrzip’ with its default settings took almost four times as long to compress the file as ‘pbzip2’ did, it was about 9% more efficient. That might not sound like much, but it amounts to a 90.6MB size reduction.
However, as mentioned above, ‘lrzip’ used as much as 1.1GB of my RAM (the usage rises and falls automatically), whereas ‘pbzip2’ kept it below 40MB. Nevertheless, according to its help page, ‘lrzip’ automatically adjusts the amount of memory it uses based on your computer’s hardware and the file being compressed.
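If that automatic sizing is still too hungry, lrzip’s man page lists a ‘-w’ flag to cap the compression window manually. A hedged sketch (flag semantics may differ between lrzip versions, so check `man lrzip` locally; the file name is made up):

```shell
# A small throwaway file to compress
seq 1 50000 > big_sample.txt

if command -v lrzip >/dev/null 2>&1; then
    # -w caps the compression window: smaller window => less RAM, lower ratio
    lrzip -w 1 big_sample.txt
    ls -l big_sample.txt.lrz
else
    echo "lrzip not installed; flags shown for illustration only"
fi
```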
I was also quite happy with its automatic tuning, as I did not encounter any stability issues.
Interestingly, while decompressing, ‘lrzip’ was actually a few seconds faster than ‘pbzip2’! Still, ‘lrzip’ used around 960MB of memory, where ‘pbzip2’ used very little.
After tweaking …
Because I thought that the default compression level used by ‘lrzip’ was a bit high (it uses ‘7’ by default; ‘9’ is the highest), I reduced it to ‘5’ and ran the same tests again to see whether it would affect the compression times.
And as you can see from the graph below, although the memory usage this time was slightly lower (960MB at maximum) and the compression was not as good as before, ‘lrzip’ still compressed data better than ‘pbzip2’ did (around 7% more efficiently).
I also tested the decompression times, and there were no differences.
I of course also tested it with files that are harder to compress, such as audio, and while the compression was not as good as above, it still compressed data better than ‘pbzip2’ did.
Wanna give it a go?
If interested, you can install ‘lrzip’ in Ubuntu 12.10, 12.04 Precise Pangolin, 11.10 Oneiric Ocelot, 11.04 Natty Narwhal … by using the below command in your Terminal window.
sudo apt-get install lrzip
How to use it?
You actually do not need to use ‘lrzip’ from the command line, as Ubuntu’s built-in archive manager (‘File-Roller’) supports both compressing and decompressing the ‘.lrz’ format.
However, because the ‘lrzip’ command lets you tweak things (such as the compression level and memory usage), and because it is a very simple and intuitive tool, I prefer to use it instead.
Few examples …
Let’s say that I have a file called ‘lib.tar’ (it could be anything: a text file, audio, video etc.) and I want to compress it with the default settings. Then I’ll enter the below command.

lrzip lib.tar

Simply replace ‘lib.tar’ with the path of your file.
If I wanted to speed up the process by reducing the compression level, then I’d use it in the below format.
lrzip -L 5 lib.tar
The compression level is set by the number, which ranges from 1 to 9 (higher values mean better compression).
Anyhow, once compressed, it will create a new file with the same name plus a ‘.lrz’ extension added to it. So in this case, it will be called ‘lib.tar.lrz’.
To decompress it, I’ll use the below command.
lrzip -d lib.tar.lrz
Again, make sure to replace ‘lib.tar.lrz’ with your file and its path.
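As a hedged extra, lrzip’s man page also lists a ‘-t’ flag for verifying an archive without extracting it. Here is a sketch of a full round trip with a made-up file, assuming ‘lrzip’ is installed (by default lrzip keeps the original file after compressing):

```shell
# A small throwaway file for the round trip
seq 1 20000 > roundtrip.txt

if command -v lrzip >/dev/null 2>&1; then
    lrzip roundtrip.txt            # creates roundtrip.txt.lrz, keeps the original
    lrzip -t roundtrip.txt.lrz     # integrity check without extracting
    rm roundtrip.txt
    lrzip -d roundtrip.txt.lrz     # restore the original file
fi

ls -l roundtrip.txt
```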
If you have a folder that needs to be compressed, then you will have to use another tool that comes with the ‘lrzip’ package, called ‘lrztar’. Let’s say that I want to compress a folder called ‘Documents’; then I’ll use it in the below format.

lrztar Documents
For decompressing it later, use the below one.
lrztar -d Documents.tar.lrz
All these details are listed in its manual. To read it, use the below command.

man lrzip
Few final words …
So, if you usually compress large files, are looking for a way to achieve high compression ratios, and have a computer that can live up to the requirements of the ‘lrzip’ tool, then it is certainly a handy utility.