it all started 1.5+ years ago with @_willfalcon casually reading the DIM and CPC papers and talking about how he could come up with a better contrastive learning algo. instead of adding yet another novel, sota, simple, awesome, principled contrastive learning algo, ..
@_willfalcon sat down, painstakingly implemented an effective & efficient framework for ML experimentation (which ended up being @PyTorchLightnin), talked with the authors of an ever-growing set of novel, sota, simple, awesome, principled contrastive learning algos, ..
reproduced them in a unified software & conceptual framework to the best of his ability, and experimented with them patiently. along the way, @_willfalcon and i have learned a lot about these recent algos, and @_willfalcon is releasing all his implementations at github.com/PyTorchLightni….
i don't believe this would be accepted at any ML/CV venue due to the lack of novelty, lack of perfect reproduction of existing algos, lack of state-of-the-art accuracies, lack of theory, etc., but i feel this may be one of the more useful contributions to the community.
oh, and huge thanks to @philip_bachman and @devon_hjelm, who have helped us reproduce AMDIM (the only algo we can say we reproduced with some confidence) and are giving us feedback on the paper. we'll revise the paper accordingly soon!
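for anyone curious what these algos have in common: most of them (CPC, AMDIM, etc.) build on some variant of the InfoNCE objective. here's a minimal generic sketch, assuming two embedded "views" of the same batch of examples; this is my own toy illustration, not any single paper's exact loss:

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    # z1, z2: (batch, dim) embeddings of two "views" of the same examples.
    # each row of z1 should score highest against its matching row in z2.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # pairwise similarity matrix
    targets = torch.arange(z1.size(0))   # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# toy usage: random tensors standing in for encoder outputs of two views
z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
loss = info_nce(z1, z2)
```

the papers mostly differ in how the two views and the encoder are constructed; the temperature and the choice of negatives (here, simply the other rows in the batch) are the knobs that vary the most.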
we all want to and need to be prepared to train our own large-scale language models from scratch.
why?
1. transparency, or lack thereof
2. maintainability, or lack thereof
3. compliance, or lack thereof
and because we can, thanks to the amazing open-source and open-platform ecosystem.
(1/12)
we have essentially lost any transparency into pretraining data.
(2/12)
we are being force-fed the so-called values of silicon valley tech co's, ignoring the diversity of values across multiple geographies, multiple sectors and multiple groups.
this semester (spring 2024), i created and taught a new introductory course on causal inference in machine learning, aimed at msc and phd students in cs and ds. all the material was created from scratch, including the lecture notes and lab materials;
now that the course is finally over, i've put all the lab materials, prepared by the amazing @taromakino, @Daniel_J_Im and @dmadaan_, into one @LightningAI studio, so that you can try them out yourselves without any hassle;
as i tweeted last week, the Prescient Design team at gRED within @genentech is hiring awesome people. in particular, we have the following positions already open and ready:
[Engineering Lead] we want you to work with us to build a team for creating ML infrastructure that seamlessly integrates ML and bio: gene.com/careers/detail…
[Machine Learning Scientist] we have a ton of challenging problems inspired & motivated by biology, chemistry & medicine that are waiting for your creativity, knowledge and ingenuity in ML/AI: gene.com/careers/detail…
denoising of discrete inputs has fascinated me ever since i read jmlr.org/papers/volume1… by Vincent & @hugo_larochelle et al., and yoshua has motivated me to look into denoising for sequence modeling ever since 2013.
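to make "denoising of discrete inputs" concrete, here's a minimal sketch in the spirit of the denoising autoencoder idea: corrupt some tokens at random, then train a model to recover the clean sequence. the corruption scheme (uniform random replacement), the toy GRU model and all sizes are my own illustrative choices, not from the paper:

```python
import torch
import torch.nn as nn

VOCAB, DIM, NOISE_P = 100, 64, 0.15  # toy vocabulary, hidden size, noise rate

def corrupt(x, p=NOISE_P):
    # replace each token with a uniformly random token with probability p
    mask = torch.rand_like(x, dtype=torch.float) < p
    noise = torch.randint_like(x, VOCAB)
    return torch.where(mask, noise, x)

# a tiny sequence model standing in for whatever encoder you'd actually use
emb = nn.Embedding(VOCAB, DIM)
rnn = nn.GRU(DIM, DIM, batch_first=True)
head = nn.Linear(DIM, VOCAB)

x = torch.randint(0, VOCAB, (8, 20))   # a batch of toy token sequences
x_tilde = corrupt(x)                   # corrupted version of the input
h, _ = rnn(emb(x_tilde))               # encode the corrupted sequence
logits = head(h)                       # predict the clean token at each position
loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), x.reshape(-1))
loss.backward()
```

the interesting design space is all in `corrupt`: masking, dropping, shuffling or swapping tokens each give a different denoising objective over the same reconstruction loss.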
just watched the Social Dilemma netflix.com/title/81254224. to anyone who's been thinking about and following the various stories about social media and other "attention-grabbing" services, this documentary won't have much new stuff, ...
although it works as a great reminder that these services, which effectively surveil us 24/7 and profit by selling who we are, are embedded in every aspect of our lives. ...
it's an interesting watch after reading Steven Levy's Facebook: The Inside Story (amazon.com/dp/B07V8CL7RH/).
Two ironies:
1. i'm obviously writing about this documentary immediately after watching it on "Facebook" and "Twitter"