Happy Thanksgiving! 🦃

After putting 5 kids to bed (all fighting off the flu) on this grateful day, I've decided to pontificate about one of my open source projects.

This is mostly for posterity, but I'll Tweet it in case it is interesting to anyone.

At some point, I'd like to yap about the projects I've open sourced while working at Twitter, but for today, we'll stick with a project that predates my time at Twitter (both as an employee and as a user).

In the mid and late "oughts", I was working at a certain gigantic tech company developing (among other things) cross-platform frameworks and libraries in C++ that would run on Windows, Mac OS X, Linux and (in certain libraries) on embedded Linux-kernel-based firmware.
I learned a great deal from the talented and seasoned devs there. One thing that impressed me most was seeing how the most talented devs would make their solutions scalable and leverageable.
With up front intentionality, they could take a problem and decompose it into the fundamental parts, and then solve each part so robustly, it would never have to be dealt with again, unless there was an architecture change like moving to 64-bit.
It meant that the problem was just "solved" from then on, and since they controlled the stack, they could ensure component reuse was maximized and prevent the bespoke reimplementing of solutions for existing fundamental components.
I could go into the weeds on this philosophy itself and how it has shaped me professionally, but it would just be me rehashing things spoken of hundreds of times since dev practices were well defined in the 80s & 90s (before being reinvented again in the 00s 🤦‍♂️).
The tangent I'm going into is going to be around a specific need we had that seems to keep coming up and how I built something that _doesn't_ solve it, but at least resolves what I care about (for now).
Serializing a file, or groups of files, into a compressed archive file is a common need on computers. The ZIP archive became dominant very quickly after PKWARE introduced it for DOS and published the format spec in 1989.
In 1993, v2 of the spec was published, just in time for the internet to explode and take over the world with zip files being sent around the world on every operating system with a network connection.
As a side note, Phil Katz was the primary inventor of the ZIP file format, and I consider him a software genius who we lost far too young. I'm the age Phil was when he passed away in 2000, and I wish so much that I'd had a chance to meet him, just on a personal admiration level.
Anyway, needing to archive files with compression is a very common problem. ZIP archives were a system-independent file format that served this need well and were battle tested.
ZIP archives are also brilliantly extensible, so we could plug in any compression we needed even if unknown to the ZIP spec, and could have that archive work on all platforms that supported this extended compression method.
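To make that extensibility concrete: every entry in a ZIP archive starts with a local file header that carries a 16-bit compression method ID (0 is "stored", 8 is DEFLATE, 12 is bzip2), so a reader simply looks up the ID and picks a codec. Here is a minimal sketch in Python, using only the standard library, that reads the method ID of the first entry. The helper names `first_entry_method` and `build_archive` are mine for illustration and have nothing to do with the code we wrote back then.

```python
import io
import struct
import zipfile

# Fixed 30-byte part of a ZIP local file header (little-endian):
# signature, version needed, flags, compression method, mod time, mod date,
# CRC-32, compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<IHHHHHIIIHH")
LOCAL_HEADER_SIGNATURE = 0x04034B50


def first_entry_method(archive_bytes):
    """Return the compression method ID of the first entry in the archive."""
    fields = LOCAL_HEADER.unpack_from(archive_bytes, 0)
    signature, method = fields[0], fields[3]
    if signature != LOCAL_HEADER_SIGNATURE:
        raise ValueError("data does not start with a ZIP local file header")
    return method


def build_archive(compression):
    """Build a one-entry archive in memory with the given zipfile method."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as archive:
        archive.writestr("hello.txt", b"hello " * 100)
    return buffer.getvalue()


stored_id = first_entry_method(build_archive(zipfile.ZIP_STORED))
deflate_id = first_entry_method(build_archive(zipfile.ZIP_DEFLATED))
```

An archive written with a custom method just carries a different number in that field, which is why only readers that know the number can unpack it.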
This was precisely what we needed as we wanted to try some newer compression codecs other than DEFLATE (which ones are not really relevant). This is pre-iOS (at first pre iPhone, but then pre-iOS as my company owned the iOS name so it was iPhoneOS at the time).
Now, the systems we cared about all did have zip support natively, but none of them had support for custom compression codecs. All supported the default of DEFLATE, and some had bzip2, but none had the extensibility (neither binary nor library).
I figured, surely this is a solved problem and someone has open sourced something -- even if not extensible, at least a POSIX-based implementation is out there that I could fork and build the extensions into.
Well, this was where I was a little surprised. There were definitely projects out there, but none of them were written in a cross-platform manner (at this time).
There were a bunch that were DOS or Win16/32 based, a few that were OS/2 or Linux based that I thought I could convert to raw POSIX but that ended up being crazy rat's nest implementations, and even some unexpected implementations that I was not prepared to start digging into, like Perl.
So, we stuck with a DEFLATE-based solution to use the existing zip/zlib utils native to all our target platforms. This was terribly unsatisfying for me as I saw the gap and the need going unfilled -- if only for myself.
So, on the side, I started tinkering with a C++ based solution. It was somewhat unskilled though in how it managed I/O in a different way on every platform with macro delineations based on the platform -- but I did get something mostly working before I abandoned it.
At my next mega corporation job, the need arose again for a multi-platform zip utility. This time, we were in the heat of smartphones taking off, and something new called "minizip" came on the scene that iOS developers started adopting in droves.
minizip is a great little implementation (both library and binary) that is POSIX based and implemented portably (it can compile for Windows, Mac OS X, Linux/Unix or even iOS). Many people had already wrapped it in nice system-native interfaces (ObjC, C++, C#, etc).
We took the minizip implementation and used the ObjC interface on iOS, a JNI interface for Android and C++ interface for Windows and Linux. The need for extensibility wasn't there so we called it good and moved on.
I, personally, was not satisfied with the solution because it offered no extensibility and frankly only a very scoped down feature set of zip archiving was ever built for the needs we had -- far from the robust foundational component philosophy I wanted to live by.
Then came working at Twitter and I'll be damned if the need didn't come up again. This time, I could foresee a wider trajectory of needs though. We weren't just faced with needing a zip archival utility, but a robust pluggable interface for different compression algorithms.
In the years that had passed since last using minizip, nothing else had joined the fray for zip archival open source projects. Sure, there were dozens more "zip utilities" out there, but they were all built on minizip.
minizip was good at what it set out to do: be an interface for zlib that would work on multiple platforms as both a library and an executable binary. But it had some serious drawbacks too.
The implementation, though fine in the most common use cases, was really very buggy. It had no extensibility and even where I might want to add it, there were too many bugs from making assumptions about using DEFLATE.
It was also overly complex. It is understandable given how it evolved but it made the code very unmanageable for maintenance and revision.
I don't say this to disparage minizip or Gilles Vollant. The success of minizip was probably unexpected in the wake of it being open sourced, as it filled a need that nobody else was willing/able to fill. I commend Gilles profusely for what he built in a short time.
But I needed something other than minizip and I could not salvage the implementation (probably a more skilled developer could have iterated on minizip). So I set out to build something new with what I had learned.
This time, I would build something extensible and I would architect it for the needs I knew would be upcoming. However, since this was a side project, I was eager to build something quickly on my own, drawing on what I had already built, before a zip utility became necessary at work.
During my paternity leave for my 3rd child, I wrote ZipUtilities. I wrote it to spec (v2.0 ZIP format) from the ground up and validated it with tests every step of the way (which also exposed how minizip was fragile on a lot of edge cases).
ZipUtilities sacrificed all the plans I had for platform portability in the name of something that would be robust, extensible and to spec. I also punted on 64-bit file size support (max 4 GB, sorry) and encryption (which I think was the correct choice no matter my constraints).
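For anyone wondering where the 4 GB ceiling comes from: without the 64-bit extensions, the size fields in a ZIP header are unsigned 32-bit integers, so the largest value that fits is 2^32 - 1 bytes. A tiny Python sketch of that arithmetic follows; `pack_size_field` is a hypothetical helper for illustration, not something from ZipUtilities.

```python
import struct

# Largest value an unsigned 32-bit ZIP size field can hold: 4 GiB minus 1 byte.
MAX_ZIP32_SIZE = 0xFFFFFFFF


def pack_size_field(size):
    """Encode a file size as a little-endian 32-bit ZIP size field."""
    if not 0 <= size <= MAX_ZIP32_SIZE:
        raise OverflowError("size does not fit in a 32-bit ZIP size field")
    return struct.pack("<I", size)


largest_field = pack_size_field(MAX_ZIP32_SIZE)
```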
ZipUtilities builds for any Apple platform, and is extensible so that any compression codec can be plugged in with a very clear interface to accomplish it too. It also has a CLI binary (called `noz`), that is super useful.
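The general shape of a pluggable codec design is a registry keyed by the ZIP compression method ID, where each codec supplies a compress and a decompress function. The following is a rough Python sketch of that idea under my own assumptions; `Codec`, `register_codec` and `codec_for` are hypothetical names and this is not the actual ZipUtilities interface (which is Objective-C).

```python
import bz2
import zlib
from typing import Callable, Dict, NamedTuple


class Codec(NamedTuple):
    """A pair of functions that compress and decompress a byte string."""
    compress: Callable[[bytes], bytes]
    decompress: Callable[[bytes], bytes]


def _raw_deflate(data):
    # ZIP stores raw DEFLATE streams (no zlib header), hence wbits=-15.
    compressor = zlib.compressobj(wbits=-15)
    return compressor.compress(data) + compressor.flush()


def _raw_inflate(data):
    return zlib.decompress(data, wbits=-15)


# Hypothetical registry: ZIP compression method ID -> codec.
CODECS: Dict[int, Codec] = {}


def register_codec(method_id, codec):
    CODECS[method_id] = codec


def codec_for(method_id):
    try:
        return CODECS[method_id]
    except KeyError:
        raise ValueError(f"no codec registered for method {method_id}") from None


register_codec(0, Codec(bytes, bytes))                   # stored (no compression)
register_codec(8, Codec(_raw_deflate, _raw_inflate))     # DEFLATE
register_codec(12, Codec(bz2.compress, bz2.decompress))  # bzip2

payload = b"zip me up " * 50
round_trips = {
    method_id: codec_for(method_id).decompress(codec_for(method_id).compress(payload))
    for method_id in CODECS
}
```

Adding a new algorithm is then one more `register_codec` call with an agreed method ID, and the archive reading and writing code never has to change.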
One day I'll get the noz utility to add support for pluggable codecs so that anyone can just drop a new compression algorithm codec onto their system and it is instantly extended (no need to recompile noz), but that will have to wait.
In the following years, ZipUtilities was integrated in Twitter for iOS and we experimented many times around compression and improved performance thanks to ZipUtilities.
Most valuable was using ZipUtilities in conjunction with Twitter Network Layer to prove out Brotli as our API response encoding in 2016, with many companies following suit.
(POI: Apple added Brotli in 2017 to iOS 11 for network requests and didn't add their own LZFSE as one might have expected -- *wink*)
So that was what brought about ZipUtilities, and I feel confident I delivered on those goals. But, alas, it again is not the philosophically pure single implementation for all platforms I had dreamed of. As much as we all try, software is never finished.

Keep Current with Nolan O'Brien
