How much improvement can compression deliver? Measured download times on a very lightly loaded server show that the time to download the base page (the initial HTML file) improved by 1.3 to 1.6 seconds across a very slow connection when compression was used.

[Graph: Base Page Performance]
The server does respond slightly more slowly when a client requests a compressed page. Measurements show that the median server response time was 0.23 seconds for the uncompressed page and 0.27 seconds for the compressed page. However, most Web server administrators should be willing to accept a 0.04-second increase in response time to achieve a 1.5-second improvement in file transfer time.

[Graph: First Byte Performance]
Web pages are not made up of HTML alone, so how do improved HTML (and CSS) download times affect overall performance? The graph below shows that overall download times for the test page were 1 to 1.5 seconds better when the HTML files were compressed.

[Graph: Total Page Performance]
To further emphasize the value of compression, I ran a test on a Web server to see what the average compression ratio would be when requesting a very large number of files. I also wanted to determine the effect on server response time of requesting large numbers of compressed files simultaneously.
The test directory contained 1952 HTML files, and I gathered the results using cURL across my local LAN.[1]
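For illustration, timings like these can be collected with curl's -w reporting; the server name and file path below are placeholders, not the actual test setup:

    # Uncompressed request: no Accept-Encoding header is sent.
    curl -s -o /dev/null \
         -w "first_byte=%{time_starttransfer}s total=%{time_total}s bytes=%{size_download}\n" \
         http://testserver/ldp/SCSI-Programming-HOWTO.html

    # Compressed request: advertise gzip support so the server's compression
    # module responds with a gzip-encoded body; the byte count then reflects
    # the compressed transfer size.
    curl -s -o /dev/null -H "Accept-Encoding: gzip" \
         -w "first_byte=%{time_starttransfer}s total=%{time_total}s bytes=%{size_download}\n" \
         http://testserver/ldp/SCSI-Programming-HOWTO.html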
Large sample of File Requests (1952 HTML Files)

mod_gzip
                       | Uncompressed | Compressed
First Byte (seconds)   |              |
    Mean               | 0.091        | 0.084
    Median             | 0.030        | 0.036
Total Time (seconds)   |              |
    Mean               | 0.280        | 0.128
    Median             | 0.173        | 0.079
Bytes per Page         |              |
    Mean               | 6349         | 2416
    Median             | 3750         | 1543
Total Bytes            | 12392318     | 4716160
mod_deflate[2]
                       | Uncompressed | Compressed
First Byte (seconds)   |              |
    Mean               | 0.044        | 0.046
    Median             | 0.028        | 0.031
Total Time (seconds)   |              |
    Mean               | 0.241        | 0.107
    Median             | 0.169        | 0.050
Bytes per Page         |              |
    Mean               | 6349         | 2418
    Median             | 3750         | 1544
Total Bytes            | 12392318     | 4720735
Compression Ratio (compressed size / original size)
                    | mod_gzip | mod_deflate
Average Compression | 0.433    | 0.438
Median Compression  | 0.427    | 0.427
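Note that these are per-file averages. Dividing total compressed bytes by total uncompressed bytes instead (4716160 / 12392318, or roughly 0.38, for mod_gzip) gives a somewhat lower, byte-weighted figure, because the large files that dominate the byte count compress better than the small files that dominate the file count.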
As expected, the First Byte time was slightly higher for the compressed files than for the uncompressed files. But this difference was measured in milliseconds, and is hardly worth mentioning as the cost of on-the-fly compression. It is unlikely that any user, least of all one on a dial-up connection, would notice the difference.
That the delivered data was reduced to 43% of the original file size should make any Web administrator sit up and take notice. The compression ratio for the test files ranged from no compression for files smaller than 300 bytes, down to 15% of the original file size for two of the Linux SCSI Programming HOWTOs.
Compression ratios do not improve linearly with file size; rather, compression depends heavily on the repetition of content within a file to achieve its greatest gains. The SCSI Programming HOWTOs contain a great deal of repeated text, making them ideal candidates for extreme compression.
Smaller files did not compress as well as larger files for exactly this reason: fewer bytes mean a lower probability of repeated sequences, and therefore a lower compression ratio.
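This effect is easy to demonstrate on any Linux machine with gzip installed; a rough sketch:

    # ~100 KB of a single repeated byte collapses to a tiny fraction of its size...
    head -c 100000 /dev/zero | gzip -c | wc -c

    # ...while ~100 KB of random bytes is essentially incompressible.
    head -c 100000 /dev/urandom | gzip -c | wc -c

The first pipeline prints a count of a few hundred bytes at most; the second stays at roughly the original size.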
Average Compression by File Size
File Size (bytes) | mod_gzip | mod_deflate
0-999             | 0.713    | 0.777[3]
1000-4999         | 0.440    | 0.440
5000-9999         | 0.389    | 0.389
10000-19999       | 0.369    | 0.369
20000-49999       | 0.350    | 0.350
50000 and up      | 0.329    | 0.331
The data shows that compression works best on files larger than 5000 bytes; beyond that size, average compression gains are smaller unless a file contains a large amount of repeated content. Some people argue that compressing files below a certain size is a waste of CPU cycles. If you agree with them, 5000 bytes should be a good starting point for a compression floor. I am of the opposite mindset: I compress everything that comes off my servers, because I consider myself an HTTP overclocker, trying to squeeze every last bit of download performance out of the network.
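For those who do want such a floor, mod_gzip provides one directly. A minimal httpd.conf sketch, with the 5000-byte threshold taken from the analysis above and the include rule shown purely as an illustration:

    # Apache 1.3.x with mod_gzip
    mod_gzip_on                Yes
    # Skip responses too small to benefit from compression
    mod_gzip_minimum_file_size 5000
    # Compress text-based content types
    mod_gzip_item_include      mime ^text/.*

mod_deflate in the versions tested has no equivalent lower bound, as footnote [3] shows.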
Conclusion
With a few simple commands and a little bit of configuration, an Apache Web server can be configured to deliver a large amount of content in compressed form. These benefits are not limited to static pages; dynamic pages generated by PHP and other content generators can also be compressed using the Apache compression modules. When combined with other performance-tuning mechanisms and appropriate server-side caching rules, these modules can substantially reduce bandwidth usage at a very low cost.
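To make that "little bit of configuration" concrete, here is a minimal sketch for Apache 2.0 with mod_deflate; the module path and MIME type list are illustrative:

    # Apache 2.0.x httpd.conf
    LoadModule deflate_module modules/mod_deflate.so

    # Compress text-based responses; because the filter is keyed on the
    # response MIME type, PHP-generated HTML is covered as well.
    AddOutputFilterByType DEFLATE text/html text/plain text/css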
[1] The files were the top-level HTML files from the Linux Documentation Project. They were installed on an Apache 1.3.27 server running mod_gzip and an Apache 2.0.44 server running mod_deflate. The minimum file size was 80 bytes and the maximum was 99419 bytes.
[2] mod_deflate for Apache/2.0.44 and earlier ships with the compression level set for Best Speed, not Best Compression. This can be modified using the tips found here; starting with Apache/2.0.45, a configuration directive (DeflateCompressionLevel) will allow administrators to set the compression level they want.
For these tests, the compression level was set to 6.
[3] mod_deflate does not have a lower bound for file size, so it attempts to compress files that are too small to benefit from compression. This results in files smaller than approximately 120 bytes becoming larger when processed by mod_deflate.