Discover and read the best of Twitter Threads about #ICLR2023

Most recent (8)

Thrilled to share new state-of-the-art 3D molecule generation results from our Geometry-Complete Diffusion Model (GCDM). I'll be presenting GCDM tomorrow at #ICLR2023 MLDD. Looking forward to seeing you there!
Paper: arxiv.org/abs/2302.04313
Code: github.com/BioinfoMachine…
We find that (1) the model's sensitivity to molecular chirality and (2) its use of attention over each molecular graph's edges each contribute uniquely to its success.
This model configuration lets us achieve significant improvements in generating 3D molecules with specific molecular properties.
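As a loose illustration of what per-edge attention on a molecular graph can look like, here is a toy NumPy sketch; the names, shapes, and weight matrices are invented for exposition and this is not GCDM's actual architecture.

```python
import numpy as np

def edge_attention_messages(h, edges, W_q, W_k, W_v):
    """Toy per-edge attention for a molecular graph (illustrative only).

    h:     (num_atoms, d_in) node features
    edges: list of directed bonds (src, dst)
    Every edge gets its own attention weight, normalized over the
    edges arriving at each destination atom.
    """
    d = W_q.shape[1]
    scores, values, dsts = [], [], []
    for s, t in edges:
        q = h[t] @ W_q                          # query from the receiving atom
        k = h[s] @ W_k                          # key from the sending atom
        scores.append(float(q @ k) / np.sqrt(d))
        values.append(h[s] @ W_v)
        dsts.append(t)
    scores = np.array(scores)
    out = np.zeros((h.shape[0], d))
    for t in set(dsts):
        idx = [i for i, u in enumerate(dsts) if u == t]
        w = np.exp(scores[idx] - scores[idx].max())
        w = w / w.sum()                         # softmax over edges into atom t
        out[t] = sum(w_i * values[i] for w_i, i in zip(w, idx))
    return out

# Toy usage: 4 atoms, 16-dim features, a small directed bond list.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 16))
bonds = [(0, 1), (1, 0), (1, 2), (2, 3)]
W_q, W_k, W_v = (0.1 * rng.normal(size=(16, 16)) for _ in range(3))
messages = edge_attention_messages(h, bonds, W_q, W_k, W_v)
```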
It is widely thought that neural networks generalize because of implicit regularization of gradient descent. Today at #ICLR2023 we show new evidence to the contrary. We train with gradient-free optimizers and observe generalization competitive with SGD.
openreview.net/forum?id=QC10R…
An alternative theory of generalization is the "volume hypothesis": good minima are flat and occupy more volume than bad minima. For this reason, optimizers are more likely to land in the large/wide basins around good minima, and less likely to land in small/sharp bad minima.
One of the optimizers we test is a “guess and check” (GAC) optimizer that samples parameters uniformly from a hypercube and checks whether the loss is low. If so, optimization terminates. If not, it throws away the parameters and samples again until it finds a low loss.
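For intuition, a guess-and-check optimizer in this spirit fits in a few lines. This is a minimal NumPy sketch, not the authors' implementation; the loss function, bounds, and threshold below are placeholders.

```python
import numpy as np

def guess_and_check(loss_fn, dim, threshold, bound=1.0, max_tries=100_000, seed=0):
    """Sample parameters uniformly from [-bound, bound]^dim and accept the
    first draw whose loss is below `threshold`."""
    rng = np.random.default_rng(seed)
    for _ in range(max_tries):
        theta = rng.uniform(-bound, bound, size=dim)  # guess
        if loss_fn(theta) < threshold:                # check
            return theta
    raise RuntimeError("no low-loss point found within the sampling budget")

# Toy usage: flat (high-volume) minima are hit far more often by uniform
# sampling, which is the intuition behind the volume hypothesis above.
theta = guess_and_check(lambda t: float(np.mean(t ** 2)), dim=5, threshold=0.05)
```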
Day 1 of #ICLR2023. Kigali is buzzing with Artificial Intelligence, Machine Learning and Data Science. This is a historic moment as one of the biggest ML conferences is held on the African continent for the first time. It's even better as there is a great local presence. 1/n
Our first #ICLR2023 keynote is Sofia Crespo (@soficrespo91) - "Her work brings into question the potential of AI in artistic practice and its ability to reshape our understandings of creativity."
sofiacrespo.com 2/n
I did mention local presence. We have many attendees and speakers at #ICLR2023 representing @DeepIndaba @MasakhaneNLP @dsa_org @black_in_ai and beyond. Let's all learn from each other and keep building the momentum. 3/n
We're excited to have a number of papers by co-authors at Prescient and @genentech Research Bio AI/ML at #ICLR2023, #ICML23, and #AISTATS2023 spanning topics from basic ML to applications in drug discovery and even on the properties of black holes! 🌌 Check out below 👇 1/9
"Towards Understanding and Improving GFlowNet Training" #ICML23

@maxwshen, @folinoid, @EhsanHRA, @loukasa_tweet, @kchonyc, @tbyanc

2/9
"Few-shot Learning of Abstract Geometric Reasoning by Infusing Lattice Symmetry Priors in Attention Mechanisms" #ICML23

Matti Atzeni, @mrinmayasachan, @loukasa_tweet

3/9
Excited to announce that our work, “Computational Language Acquisition with Theory of Mind”, has been accepted to #ICLR2023! We equip language-learning agents with Theory of Mind, which improves their performance on an image referential game. arxiv.org/abs/2303.01502 (1/5)
Many language acquisition theories involve children’s ability to ascribe mental states to others. We model this as an internal module trained to predict listener behavior. The speaker generates many candidates and uses an internal listener to rerank and choose between them. (2/5)
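In sketch form, that rerank step might look like the following; the function names and stand-in models are hypothetical, whereas the paper's speaker and listener are learned neural models.

```python
def speak_with_tom(target, generate_captions, listener_prob, n=16):
    """Generate n candidate captions, then pick the one the internal
    listener model is most likely to resolve to the target image."""
    candidates = generate_captions(target, n)
    scored = [(listener_prob(c, target), c) for c in candidates]
    return max(scored)[1]

# Toy usage with stand-in models: captions mentioning the target win.
best = speak_with_tom(
    "dog",
    lambda image, n: [f"a photo of a {w}" for w in ("dog", "cat", "car")][:n],
    lambda caption, image: 1.0 if image in caption else 0.1,
)
```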
We find significant gains in referential game accuracy, suggesting that equipping language-learning agents with ToM can improve performance independently of general captioning ability. We also observe improvements in the fluency and precision of captions. (3/5)
✨ New Paper ✨ on robust optimization to mitigate unspecified spurious features, accepted to #ICLR2023.
We present AGRO, a novel min-max optimization method that jointly finds coherent error-prone groups in training data and minimizes the worst expected loss over them.
🧵🔽 1/6
2/ Human evaluation of AGRO groups in popular benchmark datasets shows that they contain well-defined, yet ✨ previously unstudied ✨ spurious correlations, e.g., blondes wearing hats or sunglasses in CelebA, and MNLI entailment examples with antonyms. More examples in the paper.
3/ Group distributionally robust optimization (G-DRO) mitigates distributional shifts caused by spurious correlations in the training data by minimizing the worst expected loss over pre-identified groups in the data.
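The worst-group objective itself is simple. A minimal sketch, assuming the group labels are already given (AGRO's contribution is learning those groups jointly, rather than taking them as input as G-DRO does):

```python
import numpy as np

def worst_group_loss(per_example_loss, group_ids):
    """Average the loss within each group and return the worst group's
    average; G-DRO minimizes this instead of the plain mean loss."""
    losses = np.asarray(per_example_loss, dtype=float)
    groups = np.asarray(group_ids)
    return max(losses[groups == g].mean() for g in np.unique(groups))

# Toy usage: the error-prone group's high loss dominates the objective.
print(worst_group_loss([0.2, 0.1, 0.9, 0.8], [0, 0, 1, 1]))  # -> 0.85
```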
Attention is all you need... but how much of it do you need?

Announcing H3 - a new generative language model that outperforms GPT-Neo-2.7B with only *2* attention layers! Accepted as a *spotlight* at #ICLR2023! 📣 w/ @tri_dao

📜 arxiv.org/abs/2212.14052 1/n
One key point: SSMs are *linear* in sequence length instead of quadratic, and have no fixed context length. Long context for everyone!

We're super excited, so we're releasing our code and model weights today - up to 2.7B parameters!

github.com/HazyResearch/H3 2/n
In H3, we replace attention with a new layer based on state space models (SSMs) - with the right modifications, we find that it can outperform Transformers.

Two key ideas:
* Adapting SSMs to be able to do *comparison*
* Making SSMs as hardware-efficient as attention 3/n
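To see where the linear scaling comes from, here is a minimal diagonal SSM written as a recurrence. This is a NumPy sketch of the general idea only, not H3's actual layer, and the parameter values below are arbitrary.

```python
import numpy as np

def ssm_scan(u, A, B, C):
    """Diagonal linear state-space recurrence:
        x_t = A * x_{t-1} + B * u_t,   y_t = C . x_t
    One left-to-right pass, so the cost is O(length) rather than
    attention's O(length^2), and the recurrent state imposes no fixed
    context window."""
    x = np.zeros_like(A)
    ys = []
    for u_t in u:
        x = A * x + B * u_t    # state carries a summary of the whole prefix
        ys.append(C @ x)
    return np.array(ys)

# Toy usage: 8-dim state, scalar input/output per step.
d = 8
y = ssm_scan(np.random.randn(100), A=np.full(d, 0.9), B=np.ones(d), C=np.ones(d) / d)
```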
Five papers from my group have been accepted to #ICLR2023 (including one oral presentation), covering topics from combining pretrained LMs with GNNs to deep generative models and pretraining methods for drug discovery.
1) An oral presentation. We propose an effective and efficient method for combining pretrained LLMs and GNNs on large-scale text-attributed graphs via variational EM. It takes first place on 3 node property prediction tasks on the OGB leaderboards.

Paper: openreview.net/pdf?id=q0nmYci….
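In outline, a variational-EM alternation like this amounts to the two models repeatedly distilling each other on unlabeled nodes. The sketch below uses invented interfaces (`fit`, `predict`, `graph.node_texts`) purely for exposition and is not the paper's code.

```python
def variational_em(lm, gnn, graph, labels, rounds=10):
    """Alternate: fit the GNN to the LM's pseudo-labels (E-step-like),
    then fit the LM to the GNN's pseudo-labels (M-step-like)."""
    def combine(observed, pseudo):
        # Keep observed labels where available; fall back to pseudo-labels.
        return {n: observed.get(n, p) for n, p in pseudo.items()}

    for _ in range(rounds):
        gnn.fit(graph, combine(labels, lm.predict(graph.node_texts)))
        lm.fit(graph.node_texts, combine(labels, gnn.predict(graph)))
    return lm, gnn
```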
2) An end-to-end diffusion model for protein sequence and structure co-design, which iteratively refines sequences and structures through a denoising network.

Paper: openreview.net/pdf?id=pRCMXcf…
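The refinement loop in such co-design models is conceptually a reverse-diffusion pass over both modalities at once. Schematic only: `denoiser` is a placeholder for the paper's learned network.

```python
def co_design(denoiser, noisy_seq, noisy_coords, num_steps=100):
    """Iteratively refine a protein's sequence and 3D structure together:
    each step conditions the update of one modality on the current state
    of the other via a joint denoising network."""
    seq, coords = noisy_seq, noisy_coords
    for t in reversed(range(num_steps)):        # steps T-1, ..., 0
        seq, coords = denoiser(seq, coords, t)  # one joint denoising step
    return seq, coords
```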
