Alex Gold
Jul 9 7 tweets 3 min read
If you're a #dataScientist working in #rstats, you've probably heard of #docker, but might not know why you'd care or how to get started...

Here's a thread summarizing my talk at #useR2022 a few weeks ago, and a link to my free online book📕!

🧵
Docker is a general tool for packaging code with its dependencies.

So in the data science world, Docker is a tool for reproducibility + portability. Docker can make it easy to share your work with others, or to keep it safe for later.
When you think about reproducing an R project, there are layers of reproducibility. The more layers you reproduce, the more work it takes.
The top layers of the reproducibility stack -- code, data, and R packages -- mostly have existing tooling: #git for code, {renv} for package libraries.

And who knows what for "reproducing" data. I don't actually have an answer on that one. It really depends on your data.
But for those middle layers -- R versions, system libraries, and operating system dependencies -- Docker is a ⭐STAR⭐.

You can create a container image with a simple Dockerfile, and then have the environment up and running in just moments (once you've downloaded the image).
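
To make that concrete, here's a rough sketch of what such a Dockerfile might look like. This isn't from the talk or the book. It assumes the `rocker/r-ver` base image from the Rocker project, an `renv.lock` file in your project, and a hypothetical entry script called `analysis.R`. The system libraries listed are only examples, so swap in whatever your packages actually need.

```dockerfile
# Assumption: base image from the Rocker project, pinned to one R version
# (this covers the "R version" and "operating system" layers)
FROM rocker/r-ver:4.2.1

# Illustrative system libraries; the right list depends on your R packages
RUN apt-get update && apt-get install -y --no-install-recommends \
    libcurl4-openssl-dev \
    libssl-dev \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /project

# Restore the R package library recorded by {renv}
COPY renv.lock renv.lock
RUN R -e "install.packages('renv'); renv::restore(prompt = FALSE)"

# Copy in the project code; analysis.R is a hypothetical script name
COPY . .
CMD ["Rscript", "analysis.R"]
```

With a file like this in the project root, `docker build -t my-project .` should build the image and `docker run --rm my-project` should run the script in a fresh container (`my-project` is just a placeholder tag).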
If you've never dealt with Docker before, here's a simple model of the states of containers and images.
Alright, you're on board, but how?

Check out my book DevOps 4 Data Science. It's currently in draft form, but the Docker chapter is reasonably complete. It'll be out in print...sometime...but there'll always be a free online copy. Enjoy!

do4ds.com/chapters/sec1/…
