No, not totally. The central directory at the end of the archive points backwards to the local headers, which in turn include all the necessary information: the compressed size within the archive, the compression method, the filename, and even a checksum.
If the archive isn't some recursive/polyglot nonsense like in the article, it's essentially just a tightly packed list of compressed blobs, each with a neat local header in front (one that even includes a magic number!); the central directory at the end is really just for quick access.
If your extraction program supports it (or you are sufficiently motivated to cobble together a small C program with zlib…), you can salvage what you have by linearly scanning and extracting the archive, somewhat like a fancy tarball.
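For the sufficiently motivated, that linear scan doesn't even need C; a rough Python sketch (my own code, not any particular tool's; entries with data descriptors or exotic compression methods are skipped):

```python
import struct
import zlib

def salvage(data):
    """Sketch: linearly scan a (possibly truncated) zip for local file
    headers and inflate whatever is intact, ignoring the central
    directory. Entries using data descriptors (flag bit 3) or methods
    other than stored/deflate are skipped; a real tool would handle them."""
    files = {}
    pos = 0
    while (pos := data.find(b"PK\x03\x04", pos)) != -1:
        try:
            (_, flags, method, _, _, crc, csize, _,
             nlen, elen) = struct.unpack_from("<HHHHHIIIHH", data, pos + 4)
        except struct.error:
            break  # the header itself is truncated
        name = data[pos + 30:pos + 30 + nlen].decode("utf-8", "replace")
        body = data[pos + 30 + nlen + elen:pos + 30 + nlen + elen + csize]
        if not (flags & 0x08) and len(body) == csize and method in (0, 8):
            blob = zlib.decompress(body, wbits=-15) if method == 8 else body
            if zlib.crc32(blob) == crc:  # checksum from the local header
                files[name] = blob
        pos += 4
    return files
```

Everything before the point of truncation comes back intact; the half-written last entry simply fails its length or CRC check and is dropped.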
This worked great on campus, but when everyone went remote during COVID it didn't anymore: the fetch went from about three minutes to twenty.
However, most files change only rarely. I don't need all the files, just the ones that differ. So I wrote a scanner that compares the size and checksum stored in the zip against those of the local file. If they match, we skip the entry; otherwise, we decompress it out of the zip. This cut the time to get the daily build from 20 minutes down to 4.
Obviously this isn't resilient against an attacker, since CRC-32 is not cryptographically secure, but as an internal tool it's awesome.
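The central directory already stores each entry's uncompressed size and CRC-32, so a scanner along those lines is short; a sketch using Python's `zipfile` (my own naming, not the author's actual tool):

```python
import os
import zipfile
import zlib

def sync_from_zip(archive_path, dest_dir):
    """Sketch of the incremental-extract idea: only inflate entries whose
    on-disk copy differs in size or CRC-32 from what the zip records."""
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            target = os.path.join(dest_dir, info.filename)
            if os.path.exists(target) and os.path.getsize(target) == info.file_size:
                with open(target, "rb") as f:
                    local_crc = zlib.crc32(f.read())
                if local_crc == info.CRC:
                    continue  # unchanged: skip the decompress entirely
            zf.extract(info, dest_dir)
```

The size check is just a cheap pre-filter so the CRC pass only reads files that could plausibly match.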
No, its purpose was to allow multi-disk (floppy) archives. You would insert the last disk first, then the other ones, one by one…
This redundant information has led to multiple vulnerabilities over the years: a maliciously crafted zip file with conflicting headers can be given two different interpretations by two different parsers.
The PKZIP tools came with PKZIPFIX.EXE, which would scan the file from the beginning and rebuild a missing central directory. You could then extract every file up to the point where your download was truncated.
[1]: https://forum.videohelp.com/threads/393096-Fixing-Partially-Download-MP4-Files

    unzip zbsm.zip
    Archive:  zbsm.zip
      inflating: 0
    error: invalid zip file with overlapped components (possible zip bomb)
This seems to have been done in a patch to address https://nvd.nist.gov/vuln/detail/cve-2019-13232: https://sources.debian.org/patches/unzip/6.0-29/23-cve-2019-...
Someone shared a link to that site in an HN conversation earlier this year. For a long time now, I've had a gzip bomb sitting on my server that I serve to clients making certain categories of malicious requests, such as attempts to log in to WordPress on a site that doesn't use WordPress. That post got me thinking about alternative types of bombs, particularly as newer compression standards have become ubiquitous and supported in browsers and HTTP clients.
I spent some time experimenting with brotli as a compression bomb to serve to malicious actors: https://paulgraydon.co.uk/posts/2025-07-28-compression-bomb/
Unfortunately, as best I can tell, malicious actors are all using clients that only accept gzip rather than brotli, and I'm the only one who has ever triggered the bomb, back when I was doing the initial setup!
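For reference, the single-shot gzip version of such a bomb is trivial to build; a minimal sketch (the helper name is mine, and the ratio is approximate):

```python
import gzip

def make_gzip_bomb(inflated_mb=100):
    # A long run of zeros is deflate's best case, roughly 1000:1, so
    # ~100 MB of "page" costs only ~100 KB on the wire when served with
    # Content-Encoding: gzip. Compressing is one pass; no trickery needed.
    return gzip.compress(b"\x00" * (inflated_mb * 1024 * 1024), compresslevel=9)
```

The asymmetry does the work: the server compresses a run of zeros once and caches the result, while every client that honors the `Content-Encoding` header has to materialize the full expansion.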
Like bomb the CPU time instead of memory.
That's how self-extracting archives and installers work, and they are also valid zip files. The extractor part is just a regular executable containing a zip decompressor that decompresses its own file.
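The structural trick is that zip readers locate the end-of-central-directory record by scanning backwards from the end of the file, so arbitrary data (such as an executable stub) can be prepended without invalidating the archive. That's easy to demonstrate with Python's `zipfile` (the stub below is a stand-in, not a real extractor):

```python
import io
import zipfile

# Stand-in for the executable stub of a self-extracting archive.
stub = b"#!/bin/sh\necho 'pretend I am an extractor'\n"

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("payload.txt", "hello")

# Prepend the stub; the result still opens as a perfectly normal zip,
# because the reader finds the central directory from the end.
combined = stub + buf.getvalue()
with zipfile.ZipFile(io.BytesIO(combined)) as zf:
    assert zf.read("payload.txt") == b"hello"
```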
This is specific to zip files, not the deflate algorithm.
import zlib
# A thousand empty non-final "stored" blocks (byte 00, then LEN=0x0000 /
# NLEN=0xffff), capped by an empty final fixed-Huffman block (03 00): the
# whole stream inflates to b"" yet the inflater must walk every header.
zlib.decompress(b"\x00\x00\x00\xff\xff" * 1000 + b"\x03\x00", wbits=-15)
If you want to spin more CPU, you'd probably want to define random Huffman trees and then never use them.

> A final plea
>
> It's time to put an end to Facebook. Working there is not ethically neutral: every day that you go into work, you are doing something wrong. If you have a Facebook account, delete it. If you work at Facebook, quit.
And let us not forget that the National Security Agency must be destroyed.
https://sources.debian.org/patches/unzip/6.0-29/23-cve-2019-...
The detection maintains a list of covered spans of the zip files
so far, where the central directory to the end of the file and any
bytes preceding the first entry at zip file offset zero are
considered covered initially. Then as each entry is decompressed
or tested, it is considered covered. When a new entry is about to
be processed, its initial offset is checked to see if it is
contained by a covered span. If so, the zip file is rejected as
invalid.
So effectively it seems as though it just keeps track of which parts of the zip file have already been 'used', and if a new entry in the zip file starts in a 'used' section then it fails.

I.e. an advanced compressor could abuse the zip file format to share base data between files which only change incrementally (get appended to, for instance).
And then this patch would disallow such a practice.
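The covered-span bookkeeping the patch describes can be approximated in a few lines; a sketch (not the real unzip implementation, and the span arithmetic is rough):

```python
import zipfile

def check_overlap(zippath):
    """Sketch of the covered-span idea: the central directory and every
    already-seen entry claim a span of the file; a new entry that starts
    inside a claimed span marks the archive as invalid."""
    covered = []  # list of (start, end) spans claimed so far
    with zipfile.ZipFile(zippath) as zf:
        cd_start = zf.start_dir  # central directory offset (undocumented attribute)
        for info in sorted(zf.infolist(), key=lambda i: i.header_offset):
            start = info.header_offset
            # Rough span: 30-byte local header + filename + compressed data.
            # Real unzip also accounts for extra fields and data descriptors.
            end = start + 30 + len(info.filename) + info.compress_size
            if start >= cd_start:
                return False  # entry claims to live inside the central directory
            for s, e in covered:
                if s <= start < e:
                    return False  # overlapped components: possible zip bomb
            covered.append((start, end))
    return True
```

An ordinary archive passes because each entry's span ends exactly where the next begins; the overlapping-quine construction fails because many central-directory records point into the same bytes.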
1. A exceeds some unreasonable threshold
2. A/B exceeds some unreasonable threshold
On the other hand, the zip bomb described in this blog post relies on decompressing the same data multiple times, so it wouldn't necessarily trigger your A/B heuristic.
Finally, A just means "you can't decompress more than X bytes from my file format", right? Not a desirable property to have. If the deflate authors had had this idea when they designed the algorithm, I bet files larger than an "unreasonable" 16 MB would be forbidden.
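Either check is cheap to enforce in a streaming inflate; a sketch assuming A is the decompressed size and B the compressed size (my reading of the thread), with made-up limits:

```python
import zlib

def bounded_decompress(data, max_out=10_000_000, max_ratio=1000):
    """Stream the inflate and bail out once the output exceeds an absolute
    cap ("A") or an output/input ratio ("A/B"). Limits here are arbitrary."""
    d = zlib.decompressobj(wbits=-15)  # raw deflate stream
    out = bytearray()
    chunk = data
    while True:
        out += d.decompress(chunk, 65536)  # never inflate more than 64 KiB at a time
        if len(out) > max_out or len(out) > max_ratio * max(len(data), 1):
            raise ValueError("inflation limit exceeded (possible bomb)")
        if d.eof or not d.unconsumed_tail:
            break
        chunk = d.unconsumed_tail  # input that was held back by the output cap
    return bytes(out)
```

The key point is that the caps are checked mid-stream, so a bomb is rejected after inflating only `max_out` bytes rather than after the full expansion.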
A better zip bomb [WOOT '19 Paper] [pdf] - https://news.ycombinator.com/item?id=20685588 - Aug 2019 (2 comments)
A better zip bomb - https://news.ycombinator.com/item?id=20352439 - July 2019 (131 comments)
I use zip bombs to protect my server - https://news.ycombinator.com/item?id=43826798 - April 2025 (452 comments)
How to defend your website with ZIP bombs (2017) - https://news.ycombinator.com/item?id=38937101 - Jan 2024 (75 comments)
The Most Clever 'Zip Bomb' Ever Made Explodes a 46MB File to 4.5 Petabytes - https://news.ycombinator.com/item?id=20410681 - July 2019 (5 comments)
Defending a website with Zip bombs - https://news.ycombinator.com/item?id=14707674 - July 2017 (183 comments)
Zip Bomb - https://news.ycombinator.com/item?id=4616081 - Oct 2012 (108 comments)
It is a much easier problem to solve than you would expect. No need to drag in a data centre when heuristics can get you close enough.
[0] https://sources.debian.org/patches/unzip/6.0-29/23-cve-2019-...