We are excited to announce today official support for 100 GB files on Skynet. This was a surprisingly difficult thing to support, and it is in fact NOT supported on the vast majority of centralized file-sharing services.

Here's why it's hard: 🧵/
Most of the web works using a protocol called "TCP". Beginner network engineers are told "UDP is unreliable, and TCP is reliable", which is only somewhat true.

A more accurate statement would be "UDP is highly unreliable, and TCP is mildly unreliable".
When you connect to a server over the internet, you are often bouncing through dozens of different routers, all of which are handling tons of connections at once. If at any point a router has more incoming messages than it can handle (which is extremely common), it'll just delete some.
This is called "dropping packets", and is the main reason that UDP is unreliable. UDP is a naive protocol that just sends packets to the destination, and doesn't worry if stuff gets dropped or reordered along the way.
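To make that concrete, here is a minimal sketch in Go (standard library only, with a made-up destination address) of what "doesn't worry" looks like: each Write just fires off a datagram, and a packet dropped somewhere along the path never comes back as an error.

```go
// A minimal sketch of UDP's fire-and-forget behavior. The destination
// address is hypothetical.
package main

import (
	"log"
	"net"
)

func main() {
	// "Dialing" UDP doesn't perform a handshake; it just records where
	// future datagrams should be sent.
	conn, err := net.Dial("udp", "example.com:9000")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Each Write sends one datagram. If a router along the path drops it,
	// nothing retransmits it and nothing tells us -- the Write still succeeds.
	for i := 0; i < 10; i++ {
		if _, err := conn.Write([]byte("tiny ship, hope you make it")); err != nil {
			log.Println("local send error:", err)
		}
	}
}
```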
TCP is considered "reliable" because it does care. The sender and the server work together to ensure that all the packets made it, and if some packets get dropped, TCP will resend them.
But TCP can only handle so much chaos. Occasionally, the number of packets that get dropped (among other sources of network turbulence) gets so high that TCP also fails.
On the open Internet today, those failures usually happen somewhere between 1 GB and 4 GB of data transferred. And that's why you can't just "upload a bigger file" on most centralized services. TCP isn't stable enough to make it work.
It helps to think about the Internet as a stormy sea. In fact, you will often see network engineers using weather metaphors for the Internet. Packet floods, request storms, etc. Packets are tiny ships crossing a vast and chaotic ocean.
So we got more reliability by adding another layer on top of TCP called "TUS": tus.io
TUS creates a longer-lasting connection between the client and the server that can detect when a TCP connection has failed and retry. TUS can also handle short disruptions like losing Internet access entirely, switching from wifi to cellular data, etc.
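For a sense of what that looks like in practice, here is a rough sketch of a TUS retry loop in Go. The header names (Tus-Resumable, Upload-Offset) come from the spec at tus.io, but everything else is simplified and invented for illustration; this is not Skynet's actual client code, and it assumes the upload URL has already been created on the server.

```go
// A simplified sketch of a TUS client loop, not Skynet's actual client.
// Assumes the upload URL has already been created on the server.
package tussketch

import (
	"bytes"
	"net/http"
	"os"
	"strconv"
	"time"
)

const chunkSize = 4 << 20 // 4 MiB per PATCH request, an arbitrary choice

// Upload keeps pushing bytes until the whole file is on the server. Whenever
// a request fails (a dead TCP connection, a dropped wifi link), it asks the
// server how much actually arrived and continues from that offset.
func Upload(client *http.Client, uploadURL string, f *os.File, size int64) error {
	var offset int64
	buf := make([]byte, chunkSize)
	for offset < size {
		n, readErr := f.ReadAt(buf, offset)
		if n == 0 && readErr != nil {
			return readErr
		}

		req, err := http.NewRequest("PATCH", uploadURL, bytes.NewReader(buf[:n]))
		if err != nil {
			return err
		}
		req.Header.Set("Tus-Resumable", "1.0.0")
		req.Header.Set("Content-Type", "application/offset+octet-stream")
		req.Header.Set("Upload-Offset", strconv.FormatInt(offset, 10))

		resp, err := client.Do(req)
		if err != nil {
			// The connection died mid-transfer. Ask the server where it is
			// and resume from there instead of starting over.
			time.Sleep(time.Second)
			if offset, err = currentOffset(client, uploadURL); err != nil {
				return err
			}
			continue
		}
		resp.Body.Close()

		// The server reports the new offset after a successful PATCH.
		offset, err = strconv.ParseInt(resp.Header.Get("Upload-Offset"), 10, 64)
		if err != nil {
			return err
		}
	}
	return nil
}

// currentOffset asks the server how many bytes of the upload it has stored.
func currentOffset(client *http.Client, uploadURL string) (int64, error) {
	req, err := http.NewRequest("HEAD", uploadURL, nil)
	if err != nil {
		return 0, err
	}
	req.Header.Set("Tus-Resumable", "1.0.0")
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	resp.Body.Close()
	return strconv.ParseInt(resp.Header.Get("Upload-Offset"), 10, 64)
}
```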
With TUS, we were able to get stable, reliable uploads to 100 GB and more... kinda.

Out of the box, TUS is a protocol that talks to a single server. So you can get a super reliable connection to that server, but...
For some of our users on slower connections, a 100 GB upload can take several days. And for maintenance purposes, we restart our servers about once a week. If you are mid-upload when a server restarts, your upload will fail because TUS itself will get interrupted.
That means users on slower connections still experience failure rates as high as 40% when trying to upload large files. Not at all acceptable when our brand is "decentralization can be both fast and reliable".
So we had to add another layer of robustness. We store the progress of the TUS upload in a MongoDB cluster. If a user is uploading to one server and that server restarts, the user will re-connect to another server, and thanks to MongoDB we are able to resume where we left off.
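A hedged sketch of that idea in Go, using the official MongoDB driver: the authoritative upload state lives in a shared collection instead of on any one server, so whichever server the client reconnects to can look up the offset and keep going. The struct and field names here are invented for illustration, not Skynet's actual schema.

```go
// Sketch of cross-server upload state kept in MongoDB (illustrative schema only).
package uploadstate

import (
	"context"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

// UploadState is the per-upload record shared by every server in the cluster.
type UploadState struct {
	UploadID string `bson:"uploadID"` // TUS upload identifier
	Offset   int64  `bson:"offset"`   // bytes confirmed written so far
	Size     int64  `bson:"size"`     // total upload size
}

// SaveProgress upserts the latest confirmed offset for an upload.
func SaveProgress(ctx context.Context, coll *mongo.Collection, s UploadState) error {
	_, err := coll.UpdateOne(ctx,
		bson.M{"uploadID": s.UploadID},
		bson.M{"$set": bson.M{"offset": s.Offset, "size": s.Size}},
		options.Update().SetUpsert(true),
	)
	return err
}

// LoadProgress returns the state for an upload, no matter which server the
// client originally talked to.
func LoadProgress(ctx context.Context, coll *mongo.Collection, uploadID string) (UploadState, error) {
	var s UploadState
	err := coll.FindOne(ctx, bson.M{"uploadID": uploadID}).Decode(&s)
	return s, err
}
```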
So now we're at a point where a user could spend 20 days uploading a single file, and live through 3+ server restarts, and their file will still complete without errors. It took a surprising amount of technology to pull this off.
And we're still not done. All of this works for 100 GB files, but why stop at 100 GB? Why not go to 20 TB? Or 200,000 TB?
We hit two more technological limitations. The first is the structure of Skynet's metadata. As of writing, we have a limit of 4 MB of metadata, and once you get above about 130 GB you run out of space for your metadata.
The other issue is MongoDB. MongoDB documents are capped at 16 MB, which means even after we fix the Skynet limitation, we're still capped at about 500 GB as a maximum file size.
(and no, GridFS doesn't work because you can't append to GridFS objects, and the large metadata object needs to be built incrementally)
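For a sense of where the "about 500 GB" figure comes from, here is a back-of-envelope calculation using only the numbers from this thread, assuming metadata grows roughly linearly with file size:

```go
// Back-of-envelope math using only the numbers above: if 4 MB of metadata
// covers roughly 130 GB of file, then a 16 MB MongoDB document caps out
// around 500 GB.
package main

import "fmt"

func main() {
	const (
		skynetMetadataCapMB = 4.0   // current Skynet metadata limit
		fileSizeAtCapGB     = 130.0 // file size where that limit is hit
		mongoDocCapMB       = 16.0  // MongoDB's per-document limit
	)
	metadataPerGB := skynetMetadataCapMB / fileSizeAtCapGB // MB of metadata per GB of file
	fmt.Printf("~%.0f GB max file size with a 16 MB document\n", mongoDocCapMB/metadataPerGB)
	// Output: ~520 GB max file size with a 16 MB document
}
```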
We expect to have solutions for both the Skynet limitation and the MongoDB limitation in about 2 weeks. We've got a plan and some code and we just need to finish review and testing. After that a 20 TB file limit should be reasonable, something we are excited to try out.
After all of this, we've gotten to a place where Skynet, the decentralized service, is faster and more reliable for large files (>10 GB) than almost every other centralized service. It's a hard problem, and one that we internally insisted on doing correctly.
So next time you have a 20 GB Adobe Premiere file to send to a friend, give Skynet a shot. It just might save you a lot of time and frustration.
