it all started 1.5+ years ago with @_willfalcon casually reading the DIM and CPC papers and talking about how he could come up with a better contrastive learning algo. instead of adding yet another novel, sota, simple, awesome, principled contrastive learning algo, ..
@_willfalcon sat down, painstakingly implemented an effective & efficient framework for ML experimentation (which ended up being @PyTorchLightnin), and talked with the authors of an ever-growing set of novel, sota, simple, awesome, principled contrastive learning algos, ..
reproduced them to the best of his ability in a unified software & conceptual framework and experimented with them patiently. along the way, @_willfalcon and i have learned a lot about these recent algos, and @_willfalcon is releasing all his implementations at github.com/PyTorchLightni….
i don't believe this would be accepted at any ML/CV venue due to the lack of novelty, lack of perfect reproduction of existing algos, lack of state-of-the-art accuracies, lack of theory, etc. but i feel this may be one of the more useful contributions to the community.
oh, and huge thanks to @philip_bachman and @devon_hjelm, who have helped us reproduce AMDIM (the only algo we can say we reproduced with some confidence) and are giving us feedback on the paper. we'll revise the paper accordingly soon!
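for a bit of context on what these algos share under the hood, here's a minimal InfoNCE-style contrastive loss sketch in pytorch. this is a simplification for illustration only, not the exact DIM/CPC/AMDIM objective and not the code in the repo:

```python
# a rough InfoNCE-style contrastive loss, for illustration only;
# not the exact DIM / CPC / AMDIM objective nor the released implementation
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    # z1, z2: (batch, dim) embeddings of two augmented views of the same inputs
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                      # pairwise cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)    # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# usage: loss = info_nce_loss(encoder(view_a), encoder(view_b))
```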
enjoying #ICML2024? already finished with the llama-3.1 tech report? if so, you must be concerned about the emptiness you'll feel on your flight back home in a couple of days.
do not worry! Wanmo and i have a new textbook on linear algebra for you to read, enjoy and cry on your long flight.
(1/5)
have you ever wondered why SVD comes so late in your linear algebra course?
both wanmo (math prof) and i (cs prof) began to question this a couple of years ago. after all, svd is one of the most widely used concepts from linear algebra in engineering, data science and AI. why wait until the end of the course?
(2/5)
we began to wonder further whether SVD could be introduced as early as possible. i mean ... even before introducing positive definite matrices, matrix determinants and even ... eigenvalues (gasp!), without compromising on mathematical rigor.
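to give a flavour of why svd can stand on its own: a tiny numpy sketch (illustrative only, not an excerpt from the book) that builds the best rank-k approximation of a matrix directly from its svd, without ever touching eigenvalues or determinants.

```python
# rank-k approximation straight from the SVD (Eckart-Young);
# illustrative only, not taken from the textbook
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 50))            # any real matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 10                                        # keep the top-k singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # best rank-k approximation of A

print(np.linalg.norm(A - A_k))                # approximation error in the Frobenius norm
```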
we all want to and need to be prepared to train our own large-scale language models from scratch.
why?
1. transparency or lack thereof
2. maintainability or lack thereof
3. compliance or lack thereof
and because we can, thanks to the amazing open-source and open-platform ecosystem.
(1/12)
we have essentially lost any transparency into pretraining data.
(2/12)
we are being force-fed the so-called values of silicon valley tech co's, ignoring the diversity of values across multiple geographies, multiple sectors and multiple groups.
this semester (spring 2024), i created and taught a new introductory course on causal inference in machine learning, aimed at msc and phd students in cs and ds. all of the material was created from scratch, including the lecture notes and lab materials;
now that the course is finally over, i've put all the lab materials, prepared by the amazing @taromakino, @Daniel_J_Im and @dmadaan_, into one @LightningAI studio, so that you can try them out yourselves without any hassle;
as i tweeted last week, Prescient Design Team at gRED within @genentech is hiring awesome people. in particular, we have the following positions already open and ready:
[Engineering Lead] we want you to work with us to build a team for creating an ML infrastructure that seamlessly integrates ML and bio: gene.com/careers/detail…
[Machine Learning Scientist] we have a ton of challenging problems inspired & motivated by biology, chemistry & medicine that are waiting for your creativity, knowledge and ingenuity in ML/AI: gene.com/careers/detail…