GATHER ROUND, IT’S DATA SCIENCE STORY TIME 📖 with me (Kareem!) ...
I used to work in the stats version of a special operations team. Researchers would come to us if they were stuck on a hard computational or statistical problem, and they needed help. We would basically parachute into their situation and get things done. We got people unstuck!
This might mean:
1. Teaching them how to think about a problem
2. Solving the super hard problems ourselves
3. Building prototype solutions so they could refine them
4. Training them and their people
5. Writing job ads and reviewing resumes so they could put together the right team
Our only limitation was time. We did what we could in the short time window we had to accelerate their research. The goal was to get them or their team to a point where they could take the project the rest of the way.
My favorite caper involved a Dutch file format mystery. There was a survey done in the 60s and all we knew was that it was stored in some strange file. We couldn't figure out the format. We tried everything. Nothing worked. We went through all the standard formats and got gibberish.
Out of desperation, we started working through old US government file formats because the survey was so ancient. It was a mix of researching options and finding code that implemented them. Nothing worked. Finally, I had the idea to start playing around with the binary itself.
Yup. Those 1s and 0s they always talk about. I was experimenting with the raw binary data and I realized something interesting...the number of binary digits was divisible by the number of people who took the survey. Interesting! This was the lead we needed.
The original form of this data was punch cards. That's right. Pieces of paper with holes in them. We had a PDF, translated from the original Dutch, with a scan of the survey itself and a lot of details about which holes meant what.
This was useless of course...or so we thought. Getting back to the raw binary, I had the insight that maybe...just maybe each 0 and 1 meant "hole" or "not a hole". I tried seeing if the number of binary digits was divisible by the number of holes on a card. IT WAS!!!!!
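Here's roughly what that divisibility check looks like in Python. This is a minimal sketch, not the code from the story: the function name `bits_per_record`, the 500 respondents, and the 80-column by 12-row card layout are all assumptions for illustration. The thread doesn't say how big the file was or what kind of card the survey used.

```python
def bits_per_record(n_bytes, n_records):
    """Return the number of bits per record if the file splits evenly, else None."""
    total_bits = n_bytes * 8
    if n_records <= 0 or total_bits % n_records != 0:
        return None
    return total_bits // n_records


# Hypothetical numbers: 500 respondents, one card each.
# Assumed layout: 80 columns x 12 rows = 960 hole positions per card.
HOLES_PER_CARD = 80 * 12
file_size = 500 * HOLES_PER_CARD // 8  # size in bytes of the made-up file

per_person = bits_per_record(file_size, 500)
# If both checks pass, "one bit per hole position" becomes a plausible guess.
print(per_person, per_person % HOLES_PER_CARD == 0)
```

A clean division proves nothing on its own, but it is a cheap test that rules a guess in or out before anyone writes a decoder.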
I ran over to share with the rest of the team. I tell them all my detective work. We all agree this sounds like it. We start reliving the trauma of trying pretty much everything possible and how weird and wonderful and fantastic it was that the file we'd been working with for days...
...had a file format that was based on technology that was invented more than a century ago. The things they expect us to do in this job! It was an amazing win for the team. The only thing left was to write a custom decoder from punch card format to text file. No sweat!
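For the curious, a decoder like that might be sketched as follows. Everything about the layout here is an assumption, since the thread doesn't describe the real one: bits stored card by card and column by column, 12 rows per column in the order 12, 11, 0 to 9, with 1 meaning "hole". The names `bytes_to_bits` and `decode_card` are made up, and only blank columns and single digit punches are decoded. A real survey card would need the hole-to-answer mapping from the documentation.

```python
# Assumed row order within each column, top of the card to the bottom.
ROWS = ["12", "11", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]


def bytes_to_bits(data):
    """Expand bytes into a flat list of 0/1 values, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def decode_card(bits, n_columns=80):
    """Turn one card's bits into text, one character per column."""
    chars = []
    for col in range(n_columns):
        column_bits = bits[col * len(ROWS):(col + 1) * len(ROWS)]
        punched = [ROWS[i] for i, bit in enumerate(column_bits) if bit]
        if not punched:
            chars.append(" ")          # no holes: blank column
        elif len(punched) == 1 and len(punched[0]) == 1:
            chars.append(punched[0])   # single punch in rows 0-9: that digit
        else:
            chars.append("?")          # zone or multi-punch: not handled here
    return "".join(chars)


# A made-up two-column card: a hole in row "3" of column 0, column 1 blank.
sample = bytes([0b00000100, 0b00000000, 0b00000000])
print(repr(decode_card(bytes_to_bits(sample), n_columns=2)))
```

Marking unhandled columns with "?" instead of guessing keeps the output honest, which matters when a second, independent decoder is going to be compared against it.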
My coworker and I were like OK. We gotta write a script to do this. He was an R coder. Like a *hardcore* R coder, an "R is the perfect language" R coder. I was the Python guy and obnoxious about it. I was an "It would probably be easier in Python" Python guy.
So we decided to make it a race. Which of us could craft a custom binary decoder from scratch so we could get these researchers their survey results as fast as possible! In our defense, this wasn't all just a macho waste of time.
His code would be a cross-check, an independent verification of the accuracy of my code! We were having fun but at least it was useful fun!
We were in his office having the group meeting when we decided what we were gonna do. He starts coding immediately. DIRTY TRICK! I grab my MacBook and dash off to my desk. I needed to be "in the zone" for this one. I get to my desk. I close the door. I put on my headphones.
I don't remember the song. Imagine something alt-techno. Here's me putting on my hipster headphones with the wood accents. Here's me setting down my Mac on my always pristine desk. Here's me looking at the frosted glass walls that diffused the light from the hallway. It was on.
I'm frantically googling packages and example code for working directly with binary. We never work with raw binary. I'm trying out examples. I can see the design in my head already. The Python is falling into place. "You got this", I whisper to myself, and I did.
I strutted out of that room an hour later with a solution. I'm not going to lie. It was delicious lording my win over my work colleague and friend. "Looks like Python got the job done," said with a straight face that said we were going to be howling about it in a minute.
There was a lot of congratulating of me by other members of the team with my R buddy first in line. We then had the obligatory discussion about how I might have lost or he could have won, the kind that helps center the idea that the competition is supposed to be fun and doesn't mean anything.
I loved those folks. We did great things together by working as a team. We all had different skills which complemented each other. We were diverse. We were international. We cared about each other. We supported each other. We got things done.
So that's it. That's my Dutch file format story. THANKS FOR LISTENING.
Cartoon profile pic by Ethan Kocak (@Blackmudpuppy)
Keep Current with 🔥Kareem Carr🔥
