I have shunned MATLAB for Python, sipped vintage FORTRAN 77, and published Java applets at the turn of the millennium. I want to offer a grounded perspective on the benefits of doing science online.
As a Master's student in a newly formed Department of Scientific Computing I learned from applied mathematicians, physicists, materials scientists, biologists, geologists, statisticians and engineers. My program was a survey of computational methods and how to apply them.
I was quickly drawn to the discipline of scientific communication, entranced by the papers, talks and posters that were able to convey new and fascinating concepts to me as well as frustrated by those that... didn't.
I also learned that scientists use many different environments, often incompatible with and even incomprehensible to each other. Demonstrating a result was one thing; getting someone else to repeat that result on their machine was entirely another.
The reasons for any given scientist using any given environment were often good ones: it was where the discipline's existing toolset was, and it was where the compute was. But if you wanted to use an idea that came from somewhere else, well, good luck.
As computing infrastructure has improved and standardized, the various scientific communities have come closer together. The state of the art in large-scale computation has become much more cosmopolitan.
The state of scientific communication is still firmly in the grips of the PDF.
The web was born of the desire to share scientific knowledge, yet has remained a frontier for scientific communication for decades. It is often seen as the lowest common denominator of programming environments, and for this very reason it is also the most accessible.
stay tuned for the full talk with links to a bunch of examples #d3js 🔬💻
SAEs unpack so much existing value and unlock exciting new capabilities. It's happening in text, images and even proteins.
This is a long thread with lots of links and quote tweets of the projects, articles and code that made me 🤯
First of all, what is a Sparse Autoencoder (SAE)?
I really liked @a_karvonen's intuitive explanation here:
It's at a nice level of abstraction that gives a sense for how they work and what is exciting about them without going deep into training / math adamkarvonen.github.io/machine_learni…
One way to think about SAEs is that they are like a prism: they separate the concepts a model has learned into components that can be studied and manipulated.
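To make the prism picture a little more concrete, here is a minimal sketch of one common SAE formulation (ReLU encoder, linear decoder, L1 sparsity penalty). Everything in it is an assumption for illustration: the class name, the tiny dimensions and the random, untrained weights are made up, and it is not the code from any of the linked projects.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SparseAutoencoder:
    """Toy, untrained SAE: model activation -> wide feature vector -> reconstruction."""

    def __init__(self, d_model, d_features, seed=0):
        rng = random.Random(seed)
        # Hypothetical random weights; a real SAE learns these from model activations.
        self.w_enc = [[rng.gauss(0, 0.5) for _ in range(d_model)] for _ in range(d_features)]
        self.b_enc = [0.0] * d_features
        self.w_dec = [[rng.gauss(0, 0.5) for _ in range(d_features)] for _ in range(d_model)]
        self.b_dec = [0.0] * d_model

    def encode(self, activation):
        """The 'prism': split one activation into many non-negative features."""
        pre = [p + b for p, b in zip(matvec(self.w_enc, activation), self.b_enc)]
        return [max(0.0, p) for p in pre]  # ReLU

    def decode(self, features):
        """Recombine the features into an approximation of the activation."""
        return [r + b for r, b in zip(matvec(self.w_dec, features), self.b_dec)]

    def loss(self, activation, l1_coeff=1e-3):
        """Reconstruction error plus an L1 penalty that pushes features toward zero."""
        features = self.encode(activation)
        recon = self.decode(features)
        mse = sum((a - r) ** 2 for a, r in zip(activation, recon)) / len(activation)
        return mse + l1_coeff * sum(features)  # features are >= 0 after ReLU

    def steer(self, activation, index, value):
        """'Manipulate' one component: overwrite a feature, then decode."""
        features = self.encode(activation)
        features[index] = value
        return self.decode(features)


if __name__ == "__main__":
    sae = SparseAutoencoder(d_model=4, d_features=16)
    x = [1.0, -0.5, 0.2, 0.0]
    print(len(sae.encode(x)), "features from", len(x), "activation dimensions")
```

The "studied" half of the sentence above corresponds to looking at which features are active in `encode`; the "manipulated" half corresponds to `steer`, which changes a single feature before decoding.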
@thesephist wrote a very accessible report on his experiments with SAEs at Notion
locally processes your tweets.js file (no uploading!)
with a little bit of processing we can easily load 10k+ tweets into a SQL database (DuckDB) for super fast queries
also easy to quickly make a searchable interface on top of some SQL results to visualize who you've been mentioning over time (and whether it's tweets, retweets or replies)
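A rough sketch of that processing, under several assumptions: that tweets.js is a JavaScript assignment followed by a JSON array of {"tweet": {...}} objects, that retweets start with "RT @", and that replies carry an in_reply_to_status_id field. It uses Python's built-in sqlite3 as a stand-in for DuckDB so it runs with no installs; the SQL itself should carry over with little change. The function names and the sample data are hypothetical.

```python
import json
import sqlite3
from datetime import datetime


def parse_tweets_js(text):
    """Drop the JavaScript assignment prefix and parse the JSON array after '='."""
    _, _, payload = text.partition("=")
    return json.loads(payload)


def classify(tweet):
    """Label a tweet as retweet, reply or tweet (heuristics are assumptions)."""
    if tweet.get("full_text", "").startswith("RT @"):
        return "retweet"
    if tweet.get("in_reply_to_status_id"):
        return "reply"
    return "tweet"


def load_mentions(text, conn):
    """Insert one row per (month, mentioned user, kind) and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mentions (month TEXT, screen_name TEXT, kind TEXT)"
    )
    rows = []
    for item in parse_tweets_js(text):
        tweet = item.get("tweet", item)
        created = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
        kind = classify(tweet)
        for mention in tweet.get("entities", {}).get("user_mentions", []):
            rows.append((created.strftime("%Y-%m"), mention["screen_name"], kind))
    conn.executemany("INSERT INTO mentions VALUES (?, ?, ?)", rows)
    return len(rows)


def mentions_over_time(conn):
    """Count mentions per month, user and kind."""
    return conn.execute(
        "SELECT month, screen_name, kind, COUNT(*) AS n FROM mentions "
        "GROUP BY month, screen_name, kind "
        "ORDER BY month, n DESC, screen_name, kind"
    ).fetchall()


def _sample_tweet(created_at, full_text, name, reply_to=None):
    """Build one made-up archive entry for the demo below."""
    tweet = {
        "created_at": created_at,
        "full_text": full_text,
        "entities": {"user_mentions": [{"screen_name": name}]},
    }
    if reply_to:
        tweet["in_reply_to_status_id"] = reply_to
    return {"tweet": tweet}


# Made-up sample in the assumed archive format.
SAMPLE = "window.YTD.tweets.part0 = " + json.dumps([
    _sample_tweet("Wed Oct 10 20:19:24 +0000 2018", "hello @alice", "alice"),
    _sample_tweet("Thu Oct 11 08:00:00 +0000 2018", "RT @alice: hi", "alice"),
    _sample_tweet("Fri Nov 02 09:30:00 +0000 2018", "@bob thanks", "bob", "123"),
    _sample_tweet("Sat Nov 03 10:00:00 +0000 2018", "@bob again", "bob", "456"),
])

if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load_mentions(SAMPLE, connection)
    for row in mentions_over_time(connection):
        print(row)
```

The grouped rows are the kind of result a searchable interface or a mentions-over-time chart could be built on.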
I drew this diagram 5 years ago about where I like to focus my energy in the product development process.
@observablehq notebooks basically let me stay in "the fun" all the time...
but the definition of fun is relative!
when I first started learning #d3js I banged my head against the wall for 2 reasons: 1) I didn't understand the browser (DOM, JS, CSS etc.) 2) I didn't know how to think with d3: declaratively and data-driven.
Once I got a handle on these 2, the power (and the fun) came out...
I remember when AJAX was the new hotness in 2005 (it had been in IE for 6 years at that point). IE6 still mattered. git was just invented. there were magazines about code.
Your struggle is real, here is a thread on why I think it's all gonna be ok.
This thread is not a personal response to the OP but an attempt to share my perspective with the community.
Anyway, the reality certainly is that there are a ton of ways to build things, and more are being added constantly. It can be hard to see where the fundamentals are underneath.
I did nothing but #reactjs for a solid year in 2015. I didn't really touch it again until a month ago.
It felt like nothing changed in 5 years, because fundamentally nothing did.
24. when I do datavis I iterate a lot and explore both the space of the data and the space of possible representations. mostly that means drawing a lot of small rectangles and seeing if anything pops out
25. t-sne, UMAP and dimensionality reduction will make that process much more fun and interesting
26. navigating, collecting and annotating representation spaces is a key challenge to tackle right now, as it's already a nexus for ML & vis