Jason Scott
Aug 19, 2019, 11 tweets
Sure, let's do it. (I'm at the Internet Archive headquarters all week, this is my job, I'm in an archives mood, etc.)
A frequently mentioned concern around digital materials is that retrieving the data isn't enough - the format specifics, and especially proprietary aspects, mean we might be able to get some of the data but a whole bunch of unique aspects will be forgotten or lost.
However, decades of improvement in storage and processing mean that we're moving away from that dark pronouncement, and one of the first big examples of this is the Applesauce FDC, an Apple II floppy reader with a huge amount of the heavy work being done by the software.
This hardware takes a 140 kilobyte floppy disk (140k) and reads it into a 20 megabyte (20,000 kilobyte) disk image. This means a LOT of the magnetic aspects of the floppy are read in for analysis. And they're pretty.
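To put a rough number on that, here is a minimal sketch of the arithmetic, using only the two sizes quoted above. Treating "kilobyte" as a plain decimal unit is an assumption, and the function name is made up for illustration.

```python
# Sizes as quoted in the thread (decimal kilobytes assumed).
DISK_KB = 140        # logical capacity of the floppy
IMAGE_KB = 20_000    # size of the raw capture


def capture_ratio(image_kb: float, disk_kb: float) -> float:
    """How many kilobytes of capture are stored per kilobyte of disk data."""
    return image_kb / disk_kb


ratio = capture_ratio(IMAGE_KB, DISK_KB)
print(f"The capture is roughly {ratio:.0f}x the size of the data it holds")
```

In other words, every byte on the floppy is backed by something like 140 bytes of raw capture, which is where the room for later analysis comes from.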
This doesn't just dupe the data; it also captures the copy protection, the unique track setup, and a bunch of variance around each byte on the floppy to make it easier to work with. The software can then do all sorts of analysis to give us excellent, bootable disk images.
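One way to picture why the oversized capture matters: if what you keep is the raw timing between magnetic transitions, then "decoding" is just software run over stored numbers, and it can be run again with different assumptions. The sketch below is a toy, not the Applesauce software or its file format. The interval values, the `decode` function and the bit-cell lengths are all made up for illustration.

```python
def decode(intervals_us: list[float], cell_us: float) -> str:
    """Toy decoder: turn gaps between flux transitions into a bit string.

    Each gap is rounded to a whole number of bit cells; a gap of n cells
    becomes (n - 1) zeros followed by a one.
    """
    bits = []
    for gap in intervals_us:
        cells = max(1, round(gap / cell_us))
        bits.append("0" * (cells - 1) + "1")
    return "".join(bits)


# Hypothetical raw capture: gaps in microseconds from a disk written
# on a drive that ran slow (values invented for this example).
raw_capture = [4.8, 9.6, 14.4, 4.8]

first_try = decode(raw_capture, cell_us=4.0)   # assumes a nominal bit cell
second_try = decode(raw_capture, cell_us=4.8)  # re-run with a longer cell

print(first_try)   # 10100011
print(second_try)  # 1010011
```

The two runs disagree about the third gap. With only the first decoded output saved, that disagreement would be invisible; with the raw timings saved, a second pass costs nothing but processing time.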
The advantage of the massive 20 megabyte image existing is that if we screw up and don't account for something, we go back to the huge image and we're able to recreate and re-analyze the data, resulting in a greater chance of success. We can also make out disk damage. Like:
What I was explaining to Frontalot on that chilly day of photography was that now that this method exists and is established, there's a great chance it could be used for other optical and magnetic formats. There are now intense efforts underway to read Laserdiscs this way:
On the left is a capture off a standard output from a Laserdisc player. On the right is the result of the hardware/software one-two punch of the Domesday Duplicator and ld-decode:
But why stop with Laserdiscs? Here's an image of a VHS tape going through the same process, before any further processing. It's ALL of the imaging of the videotape, the entire magnetic mess, kept for later software improvement.
My belief is that the future of a lot of digital and analog-recorded digital media is that we will wholesale pull it in, "do stuff" to it, and produce some very nice, better-than-it-ever-was versions. I also believe we will decode backup tape and proprietary formats similarly.
Also, I took some pleasant photos of @mc_frontalot - who is on tour!
