DotNetZip vs. .NET Deflate

August 21, 2010

I compared two deflate algorithms, and found that DotNetZip compresses better, while .NET is faster.

I’ve been using the DotNetZip library to manage zip files. It’s also been called the Ionic library because that is the name of its DLL and namespace (although the origin of Ionic is unexplained, so maybe the authors just liked it). In addition to its zip functionality, DotNetZip provides a deflate/gzip stream implementation that is API-equivalent to .NET’s System.IO.Compression.DeflateStream.

The DotNetZip web page makes the claim: “These streams support compression levels and deliver much better performance that [sic] the built-in classes.” My project also uses deflate, so this made me wonder whether I could gain some performance by switching to DotNetZip deflate. I already depend on the library, so switching is simply a namespace change. I’d be crazy not to if the claim is correct, except I couldn’t find any data to back it up.

So before making the switch, I measured the difference between the two implementations by compressing a batch of 150 XML documents, typical of the data used in my project. Each document is roughly 3350 bytes, so the whole batch ends up just over 500 KB before compressing.

The .NET implementation compresses 502 KB of XML down to 108 KB, a 21.5% compression ratio (compressed size divided by original size, so lower is better). With DotNetZip, the same data compressed to 81 KB, an even better 16.2% compression ratio… not bad!
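For readers who want to reproduce this kind of measurement, here is a minimal sketch in Python rather than .NET. It uses Python's standard zlib module as a stand-in for both libraries, so the numbers it prints will not match the ones above. The sample documents are hypothetical, and compressing each document separately is an assumption about how the batch was handled.

```python
import zlib

def make_sample_docs(count=150):
    """Build hypothetical XML documents; stand-ins for the real project data."""
    docs = []
    for i in range(count):
        rows = "".join(
            f'<item id="{i}-{j}"><name>widget{j}</name><qty>{(i * j) % 97}</qty></item>'
            for j in range(40)
        )
        docs.append(f'<?xml version="1.0"?><order number="{i}">{rows}</order>'.encode("utf-8"))
    return docs

def batch_ratio(docs, level=6):
    """Compress each document separately and return (original, compressed, ratio)."""
    original = sum(len(d) for d in docs)
    compressed = 0
    for d in docs:
        # wbits=-15 asks zlib for a raw deflate stream with no header or checksum.
        c = zlib.compressobj(level, zlib.DEFLATED, -15)
        compressed += len(c.compress(d) + c.flush())
    # Ratio is compressed size over original size, so lower is better.
    return original, compressed, compressed / original

original, compressed, ratio = batch_ratio(make_sample_docs())
print(f"{original} bytes -> {compressed} bytes, ratio {ratio:.3f}")
```

The ratio is reported the same way as in this post: 108 KB out of 502 KB gives roughly 0.215.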

Now this seems great, but that extra compression can’t come for free.  So I measured the time it took to compress about 50 MB of data.  .NET completed in a zippy 3.0 seconds compared to the 5.5 seconds it took DotNetZip.

As the quote above mentions, DotNetZip also supports a selectable compression level (a feature .NET is lacking).  I figured the difference in performance must be due to compression level, so curious to determine what level is hard-coded in .NET, I repeated the test for all levels of DotNetZip.

Here are the results for all levels:

Lib         Level         Compression ratio   Rate (MB/s)
DotNetZip   0 (none)      1.00                20.3
            1             0.206               12.6
            2             0.192               12.8
            3             0.185               12.5
            4             0.181               9.29
            5             0.168               9.01
            6 (default)   0.162               8.66
            7             0.160               8.26
            8             0.157               5.93
            9             0.156               4.90
.NET        default       0.215               16.2

Compression level is a trade-off between compression ratio and speed: the more you want to shrink, the slower it goes. This is totally expected… what I find interesting is that the .NET compression level lies outside the range of DotNetZip. Even at its fastest compressing setting (level 1), DotNetZip produces smaller output than .NET, and .NET is faster than every DotNetZip level except 0, which does not compress at all. Apparently, the .NET authors decided that speed was much more important than size.
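A level sweep like this is easy to sketch. The version below is in Python and uses the standard zlib module, not DotNetZip or .NET, so it only illustrates the method; the payload is a hypothetical XML-like string, and absolute rates will depend on the machine.

```python
import time
import zlib

def sweep_levels(data, levels=range(10)):
    """Return a list of (level, ratio, rate_mb_per_s) for each zlib level."""
    results = []
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        ratio = len(compressed) / len(data)
        # Guard against a zero reading from the timer on very small inputs.
        rate = len(data) / 1e6 / elapsed if elapsed > 0 else float("inf")
        results.append((level, ratio, rate))
    return results

# Hypothetical XML-like payload, not the data from this post.
sample = "".join(
    f'<row id="{i}"><name>item{i % 50}</name><value>{(i * 37) % 1000}</value></row>'
    for i in range(20000)
).encode("utf-8")

for level, ratio, rate in sweep_levels(sample):
    print(f"level {level}: ratio {ratio:.3f}, {rate:.1f} MB/s")
```

A single timed run per level is noisy; repeating each level several times and taking the best or median time would give steadier rates.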

I should note that I also measured decompression time. .NET managed just around 85 MB/s, and DotNetZip measured between 75-95 MB/s (depending on the compression ratio). Decompression is much faster than compression, and I didn’t think the libraries differed enough to sway the comparison either way.

When DotNetZip claims they perform better, it appears that their benchmark is size, not speed. If you want fast compression, the .NET library fares better, but if you need more compression (with a knob to tune it a bit) DotNetZip is the way to go.

My application resides in the Azure cloud, where it gets billed for CPU, data transfer, and storage. In this model, it makes sense to use higher compression: I have to pay to keep the data, so storage is an ongoing expense… add to that the bandwidth to move the data when I need it, and compression saves there too. So I spend slightly more CPU? This is not going to hurt, as I pay whether it is idle or not… my CPUs are rarely going to be 100% occupied, because they must wait for I/O. The only real downside I see is higher latency waiting on the compression algorithm, but amongst other I/O and processing delays it will likely go unnoticed.

Note:  for these tests I used the DotNetZip release “v1.9 packed Thu-02-25-2010-222909.78” and .NET 4.0.

2 Comments
  1. October 7, 2010 10:05 AM

    You might be interested to read:

    http://www.zlib.net/manual.html#Advanced

    The C library for zlib provides several knobs in addition to the compression level:

    – level
    – windowBits
    – memLevel
    – strategy

    My guess would be that the .NET library is using different default values of some of these other parameters. The strategy in particular can make a big difference in the performance (both CPU and compression ratio) depending on the particular characteristics of the data set.

    Most wrappers provide some subset of these. For example in Java there is Deflater that lets you set the level and strategy:

    http://download.oracle.com/javase/6/docs/api/java/util/zip/Deflater.html

    boost::iostreams::zlib allows all of them to be adjusted:

    http://www.boost.org/doc/libs/1_44_0/libs/iostreams/doc/classes/zlib.html#constants

    At first glance it looks like DotNetZip allows the strategy and windowBits to be configured via the ZlibCodec class:

    http://cheeso.members.winisp.net/DotNetZipHelp/html/9d362320-a6cb-ce61-8864-8b720501b0e5.htm
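
The knobs listed in the comment above can be tried out quickly with Python's zlib module, which passes level, wbits, memLevel, and strategy through to the C library. This is only a sketch of the idea: it says nothing about which values .NET or DotNetZip actually use, and the payload is a hypothetical XML-like string.

```python
import zlib

def deflate(data, level=6, wbits=15, mem_level=8, strategy=zlib.Z_DEFAULT_STRATEGY):
    """Compress data with explicit zlib parameters and return the bytes."""
    c = zlib.compressobj(level, zlib.DEFLATED, wbits, mem_level, strategy)
    return c.compress(data) + c.flush()

# Hypothetical XML-like payload for illustration only.
sample = "".join(
    f'<row id="{i}"><name>item{i % 50}</name></row>' for i in range(5000)
).encode("utf-8")

strategies = [
    ("default", zlib.Z_DEFAULT_STRATEGY),
    ("filtered", zlib.Z_FILTERED),
    ("huffman_only", zlib.Z_HUFFMAN_ONLY),
]

for name, strategy in strategies:
    out = deflate(sample, strategy=strategy)
    # Every strategy still produces a valid stream that decompresses normally.
    assert zlib.decompress(out) == sample
    print(f"{name}: ratio {len(out) / len(sample):.3f}")
```

On repetitive text like this, the Huffman-only strategy skips string matching and so should come out noticeably larger than the default.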
