Kyunghyun Cho
Sep 16, 2020 · 8 tweets · 4 min read
denoising in a discrete input has always fascinated me ever since i read jmlr.org/papers/volume1… by Vincent & @hugo_larochelle et al., and yoshua has always motivated me to look into denoising for sequence modeling ever since 2013.
it took me 5 years to look at refinement in the discrete space with @jasondeanlee & @elmanmansimov arxiv.org/abs/1802.06901. it took another 2 years to look at refinement in the hybrid space with jason and @raphaelshu aaai.org/Papers/AAAI/20…
now, we have finally moved on to refinement in the continuous space for sequence modeling with discrete tokens, again with awesome @jasondeanlee & @raphaelshu arxiv.org/abs/2009.07177.
and, ofc, the journey continues!
it's a small step overall that took 7 years, but i like it!
tagged a wrong jason lee. it should’ve been @jasonleeinf
same here. it should’ve been @jasonleeinf
here's the new link to the latent variable approach: arxiv.org/abs/1908.07181

