Bojan Tunguz
Jun 5 · 7 tweets · 3 min read
Large Language Models (LLMs) have emerged as the cornerstone of the current Generative AI revolution. The big problem with LLMs is that they are, well, large. Really, really large.

1/7
They require an enormous amount of high-quality data to train, and an even more unfathomably large amount of computational power.

2/7
For a while now there has been a hope that fine-tuning a *smaller* language model on the output of some of these large ones would be a computationally efficient way of bringing the promises of GenAI to a wider audience.

3/7
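To make the imitation setup concrete, here is a minimal, hypothetical sketch of how such an imitation dataset is assembled: prompts are sent to a proprietary "teacher" model, and its responses become supervised fine-tuning targets for a smaller open "student" model. The `query_teacher` function and its canned answers are purely illustrative stand-ins, not any real API or data from the paper.

```python
# Hypothetical sketch: building an imitation fine-tuning dataset from a
# proprietary "teacher" model's outputs. query_teacher is a stand-in for a
# real API call to a hosted LLM; the responses here are canned examples.

def query_teacher(prompt: str) -> str:
    """Placeholder for a call to a proprietary teacher model."""
    canned = {
        "What is an LLM?": "A large language model trained on vast text corpora.",
        "Why distill?": "Smaller student models are cheaper to train and serve.",
    }
    return canned.get(prompt, "I don't know.")

def build_imitation_dataset(prompts):
    """Pair each prompt with the teacher's response, in the usual
    instruction-tuning format: the student model is later fine-tuned to
    maximize the likelihood of `response` given `prompt`."""
    return [{"prompt": p, "response": query_teacher(p)} for p in prompts]

dataset = build_imitation_dataset(["What is an LLM?", "Why distill?"])
for example in dataset:
    print(example["prompt"], "->", example["response"])
```

The paper's finding, in these terms, is that fine-tuning a small student on such pairs mimics the teacher's style far more readily than its capabilities.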
However, a recent in-depth research project has seriously questioned that promise. It seems that, for genuinely general-purpose GenAI, the really large LLMs can’t be easily replaced.

4/7
“Overall, we conclude that model imitation is a false promise: there exists a substantial capabilities gap between open and closed LMs that, with current methods, can only be bridged using an unwieldy amount of imitation data or by using more capable base LMs.

5/7
In turn, we argue that the highest leverage action for improving open-source models is to tackle the difficult challenge of developing better base LMs, rather than taking the shortcut of imitating proprietary systems.”

6/7
Read more in the following article:

“The False Promise of Imitating Proprietary LLMs”, arxiv.org/abs/2305.15717

#ArtificialIntelligence #GenerativeAI #LargeLanguageModels #LLM #GenAI #AI

7/7

More from @tunguz

Mar 14
Would you like to win an RTX 4080? You are in luck, because at @nvidia we are giving away one (1) for GTC 2023. All you have to do is:

1. Like and share this tweet

2. Register for GTC: nvda.ws/3j6gw41

3. Post a screenshot of you in a session as a response below

1/7
A few points:

1. I am working with the NVIDIA marketing team to promote one giveaway; there are other influencers who are giving away more GPUs in their own giveaways.

2. GTC registration is completely free and open to the general public. All sessions are online.

2/7
3. You will need to take a screenshot of yourself in a session that you signed up for *while it's going on* and post it here as a response to the first tweet in this thread.

Below you will also find a few great sessions that I would recommend for this GTC.

3/7
Feb 8
Things seem to be moving at a breakneck speed in the world of generative AI and large language models. In a surprise press event yesterday, @Microsoft announced a wide integration of @OpenAI tools into a couple of their major products,

1/4
Bing search engine and Edge web browser. In particular, this seems to be the first public use anywhere of OpenAI's next-generation LLM, GPT-4. Most of the new features are still relatively limited, and you'll need to join the waitlist for full access. 2/4
This announcement is bringing a whole new level of interest and enthusiasm to Bing and Edge. I have used them only occasionally over the years, but these new capabilities might make me use them on a regular basis.

3/4
Feb 7
In a highly anticipated move, @Google yesterday announced that they are launching Bard, a conversational AI app that is based on their LaMDA model.

1/5
LaMDA - Language Model for Dialogue Applications - has been around for at least a year, but due to a variety of considerations it has never been accessible to the public.

2/5
Bard will be using a lighter, more computationally efficient, version of LaMDA. Bard’s rollout will proceed gradually, and in a month or so Google will be releasing an API for it to some trusted partners and developers.

3/5
Jan 30
Deep Learning and Neural Networks have become the default approaches to Machine Learning in recent years. However, despite their spectacular success in certain domains (vision and NLP in particular),

1/5
their use across the board for all ML problems and all datasets is problematic, to say the least. Oftentimes better and more robust results can be obtained with simpler classical ML algorithms that are easier to train and deploy.

2/5
One such “traditional” approach was recently used to reevaluate sleep scoring on a few publicly available datasets. The results were published in the journal Biomedical Signal Processing and Control.

3/5
Jan 29
There was nothing that shocked me more when I entered the industry from academia than this kind of attitude. I came from an environment where teaching and learning were the norm, to the one where giving help to “underperformers” was viewed with disdain as a liability.

1/5
Fortunately not all organizations and managers are this cutthroat, but this kind of mindset is pervasive, especially at startups. There is a widespread attitude that *it’s someone else’s responsibility to do the educating*: yours, your previous job’s, your college’s etc.

2/5
And in some ways this is a *rational* attitude to have: there are hardly *any* incentives to help others get better, as this is almost never part of your performance evaluation.

3/5
Dec 12, 2022
Last week @DeepMind’s research on AlphaCode - a competitive programming system - was published in Science. AlphaCode has been able to beat 54% of human participants in competitive coding challenges, putting it on par with many junior-level developers.

1/4
The original announcement from DeepMind came out in February, which in the fast-paced world of AI is already ancient history.

2/4
The explosive rise of generative AI over the past few months will almost certainly have a major impact, if it hasn’t already, on future versions of AlphaCode and similar AI-enabled coding tools.

3/4