Volodymyr Kuleshov 🇺🇦
Dec 16, 2022
#NeurIPS2022 is now over---here is what I found exciting this year. Interesting trends include creative ML, diffusion models, language models, LLMs + RL, and some interesting theoretical work on conformal prediction, optimization, and more.
Two best paper awards went to work in creative ML---Imagen and LAION---in addition to many papers on improving generation quality, extending generation beyond images (e.g., molecules), and more.
There was a lot of talk about ethics in creative ML (even an entire workshop on it), but I also saw fun applications in art, music & science (note: all workshops are recorded). Companies from Google to RunwayML had a big presence.
Diffusion models are another huge topic. Below is our DM circle 🙂 The panel at the DM workshop was great---key problems identified by panelists include discrete models, scalability, and going beyond Gaussian noising.
Several best paper awards went to diffusion model research, including Imagen, "Elucidating the Design Space of Diffusion-Based Generative Models", Riemannian score-based methods, and more.
Language models are obviously a big deal. Some papers reported interesting counterintuitive phenomena, others reported interesting connections to RL. Bonus: a best paper award for Chinchilla.
LLM+RL is getting a lot of attention. Phil Blunsom gave a great workshop talk on interpreting in-context learning as adaptive computation. Some best paper awards went to large-scale RL work---MineDojo and ProcTHOR.
ChatGPT also happened 🤯
I also found there was a lot of interesting theory work. Emmanuel Candes' keynote was on conformal prediction---a field that has really blown up in the last 2-3 years. It turns out you can get distribution-free confidence intervals for ML predictions, even on non-IID data.
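The core idea is simple enough to fit in a few lines. Here is a minimal sketch of split conformal prediction (my own illustration, not from the keynote): given any fitted model, a held-out calibration set yields prediction intervals with guaranteed coverage. The toy data, the fixed linear "model", and alpha = 0.1 are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 2x + noise
x = rng.uniform(0, 1, 500)
y = 2 * x + rng.normal(0, 0.1, 500)

# "Model": a fixed linear predictor, standing in for any fitted model
predict = lambda t: 2 * t

# Split conformal: a held-out calibration set gives a distribution-free
# interval at level 1 - alpha, regardless of how good the model is.
alpha = 0.1
x_cal, y_cal = x[:250], y[:250]
scores = np.abs(y_cal - predict(x_cal))  # nonconformity scores
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Interval for a new point: [prediction - q, prediction + q]
x_new = 0.5
lo, hi = predict(x_new) - q, predict(x_new) + q
```

The guarantee only needs exchangeability of the calibration and test points, which is what makes the recent extensions beyond the IID setting interesting.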
Also, there was lots of interesting theory on optimization and SGD, including two best paper awards. Learned optimizers might be making a comeback too!
My predictions for next year: lots of new extensions of diffusion models (discreteness, new types of diffusions). I also think LLMs will soon be smaller and easier to use. I'm excited for 2023!

More from @volokuleshov

May 11, 2023
Do you know what's cooler than running LLMs on consumer GPUs? Finetuning large 65B+ LLMs on consumer GPUs! 🤖

Check out my new side project: LLMTune. It can finetune 30B/65B LLaMA models on 24GB/48GB GPUs.

github.com/kuleshov-group…
Here is a demo of the largest LLaMA-65B model, quantized to 4 bits and finetuned on one A6000 GPU, writing the abstract of a machine learning paper:
Here is another fun demo in which the model generates the recipe for a blueberry lasagna dish.
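To give a flavor of what "quantized to 4 bits" means, here is a minimal sketch of group-wise 4-bit weight quantization in numpy. This is my own illustration of the general technique, not LLMTune's actual implementation; the group size of 64 and symmetric scaling are assumptions for the example.

```python
import numpy as np

def quantize_4bit(w, group_size=64):
    """Group-wise 4-bit quantization: each group of weights shares one scale."""
    w = w.reshape(-1, group_size)
    # Symmetric scaling into the int4 range; we use [-7, 7] so +/-max map evenly
    scale = np.abs(w).max(axis=1, keepdims=True) / 7
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from 4-bit codes and per-group scales."""
    return (q * scale).reshape(-1)

w = rng_w = np.random.default_rng(0).normal(size=4096).astype(np.float32)
q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)
err = np.abs(w - w_hat).max()  # rounding error is at most half a scale step
```

Storing 4-bit codes plus a small number of per-group scales is what shrinks a 65B-parameter model enough to fit on a single consumer GPU.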
Jan 30, 2023
Here is an experiment: using ChatGPT to emulate a Jupyter notebook. You can even get it to run GPT inside ChatGPT.

And you can also train neural networks from scratch inside ChatGPT.🤯

Here's a walkthrough of how it works.
We start with a clever prompt that asks ChatGPT to be a Jupyter notebook.

It correctly prints out "hello", and can do basic arithmetic. So far, so good!

Let's see if it can run some numpy.
Oops! We forgot to import numpy.

Let's try again.
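The exact prompt used in the thread isn't reproduced here, but a typical prompt in this style looks something like the following (an illustrative reconstruction, not the author's exact wording):

```text
I want you to act as a Jupyter notebook. I will type Python code cells,
and you will reply with exactly what the notebook would output, inside
a single code block, and nothing else. Do not write explanations.
My first cell is: print("hello")
```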
Dec 11, 2022
How can deep learning be useful in causal inference?

In our #NeurIPS2022 paper, we argue that causal effect estimation can benefit from large amounts of unstructured "dark" data (images, sensor data) that can be leveraged via deep generative models to account for confounders.
Consider the task of estimating the effect of a medical treatment from observational data. The true effects are often confounded by unobserved factors (e.g., patient lifestyle). We argue that latent confounders can be discovered from unstructured data (e.g., clinical notes).
For example, suppose that we have access to raw data from wearable sensors for each patient. This data implicitly reveals whether each patient is active or sedentary—an important confounding factor affecting treatment and outcome. Thus, we can also correct for this confounder.
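The effect of an unobserved confounder like activity level can be seen in a tiny simulation. This is a hedged illustration of confounding and adjustment in general (via a simple linear regression), not the paper's deep generative model; all the coefficients below are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Latent confounder: patient activity level (in the paper's setting, this is
# what a deep generative model would recover from raw sensor data)
c = rng.normal(size=n)
# Active patients are more likely to receive the treatment...
t = (c + rng.normal(size=n) > 0).astype(float)
# ...and also have better outcomes. The true treatment effect is 1.0.
y = 1.0 * t + 2.0 * c + rng.normal(size=n)

# Naive estimate: difference in means, ignoring the confounder (biased upward)
naive = y[t == 1].mean() - y[t == 0].mean()

# Adjusted estimate: regress y on [1, t, c]; the coefficient on t deconfounds
X = np.column_stack([np.ones(n), t, c])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
adjusted = beta[1]
```

The naive estimate is far too large because active patients both get treated more often and do better anyway; conditioning on the recovered confounder restores the true effect.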
Dec 26, 2021
Imagine you build an ML model with 80% accuracy. There are many things you can try next: collect data, create new features, increase dropout, tune the optimizer. How do you decide what to try next in a principled way?
Here is an iterative process for developing ML models that can yield good performance even in domains where you have little expertise (e.g., classifying bird songs). These ideas are compiled from my Applied ML class at Cornell.
You start with an initial baseline and evaluate its performance on a held-out development set. Based on what you observe, you build a new model that fixes the actual problems you found. You retrain, re-analyze, and repeat the process as long as needed.
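The loop above can be sketched in a few lines. This is my own toy illustration of the process (synthetic data, a majority-class baseline, then a least-squares linear classifier), not code from the course.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: the label depends on two of five features
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# A held-out development set is used for every evaluation round
X_train, y_train = X[:800], y[:800]
X_dev, y_dev = X[800:], y[800:]

def dev_accuracy(predict):
    return (predict(X_dev) == y_dev).mean()

# Round 1: trivial baseline (always predict the majority class)
majority = int(y_train.mean() > 0.5)
baseline_acc = dev_accuracy(lambda A: np.full(len(A), majority))

# Round 2: error analysis shows the baseline ignores the features entirely,
# so try a model that uses them: a least-squares linear classifier
w, *_ = np.linalg.lstsq(X_train, 2 * y_train - 1, rcond=None)
model_acc = dev_accuracy(lambda A: (A @ w > 0).astype(int))
# Repeat: inspect dev-set errors, fix the biggest problem, retrain, re-evaluate.
```

The key discipline is that every change is judged against the same held-out set, so you always know whether a fix actually helped.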
Jan 24, 2021
Did you ever want to learn more about machine learning in 2021? I'm excited to share the lecture videos and materials from my Applied Machine Learning course at @Cornell_Tech! We have 20+ lectures on ML algorithms and how to use them in practice. [1/5]
One new idea we tried in this course was to make all the materials executable. Each set of slides is also a Jupyter notebook with programmatically generated figures. Readers can tweak parameters and generate the course materials from scratch. [2/5]
Also, whenever we introduce an important mathematical formula, we implement it in numpy. This helps establish connections between the math and how to apply it in code. [3/5]
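As an example of that style (my own illustration; the softmax function is my choice of formula, not necessarily one from the course), the formula softmax(z)_i = exp(z_i) / sum_j exp(z_j) translates almost line-for-line into numpy:

```python
import numpy as np

def softmax(z):
    """softmax(z)_i = exp(z_i) / sum_j exp(z_j), straight from the formula."""
    z = z - z.max()      # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

p = softmax(np.array([1.0, 2.0, 3.0]))
```

Seeing the stability trick (subtracting the max) next to the math is exactly the kind of connection executable slides make easy.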
