Thomas Wolf
May 31, 2023
The license of the Falcon 40B model has just been changed to… Apache-2 which means that this model is now free for any usage including commercial use (and same for the 7B) 🎉

More from @Thom_Wolf

May 15
A thread of my favorite recent ultra small AI models that you can run locally (text, vision, speech, video) – most of them are <1B some up to 3-4B 👇
Running your small VLM in browser
Feb 25, 2023
Ok friends, it's the weekend, we have some time for ourselves. So let me tell you about a possible future for AI where the largest AI spending, in billions of dollars, in 2-3 years would be on...

... antitrust legal fees

A quick 🧵1/8
Let's go back to the early 2000s – Microsoft was at that time the archenemy of the free and open source movements and the fight between OSS and private software was going strong

2/8
With smart moves like its partnership with IBM bundling Windows in IBM computers, Microsoft was able to reach strong market dominance in the PC world

3/8
Dec 27, 2021
I read a lot of books this year to broaden my horizons in AI/ML with adjacent or complementary disciplines. It was a great pleasure so I’m sharing some of my reading list here with a couple of notes:
[1/12]
P. Miller – Theories of Developmental Psychology
A great introduction to the major theoretical schools of child development
Orienting yourself in a field is easier when you’re familiar with a few important researchers & how each brought new views & approaches to the field
[2/12]
In AI, we have G. Hinton, Y. LeCun or J. Pearl; in developmental and child psychology, similar pioneers are Jean Piaget, Lev Vygotsky or Eleanor Gibson (among many others) – most of today's research is built following or against some of their ideas
amazon.com/Theories-Devel…
[3/12]
Dec 2, 2021
In 2021, we've seen an explosion of grounded Language Models: from "image/video+text" to more embodied models in "simulation+text"

But, when tested on text benchmarks, these grounded models really struggle to improve over pure text LMs, e.g. T5/GPT3

Why?

>>
When we, humans, read sentences like "lemons are yellow" or "the dog is barking", we have the impression of recruiting visual and auditory experiences.

Why then is it so difficult for grounded models to use multimodal inputs like we do to improve on text processing benchmarks?

>>
One reason might be that multimodal inputs are harder to use efficiently

I want to talk about another option here which is:
*language is more abstract than we often think*

I'll summarise this fascinating cog. science opinion piece from Lupyan & Winter royalsocietypublishing.org/doi/10.1098/rs…

>>
Sep 17, 2021
I'm not sure many people know how easy it is to share a dataset with the new versions of the @huggingface hub and Datasets library

So I made a quick video 📺 about it 👇

This works both for public and for private (within an org) dataset sharing
More info: huggingface.co/docs/datasets/…
And soon here: github.com/huggingface/da…

And feel free to ping the awesome @qlhoest @avillanovamoral and our new documentation king @stevhliu about this!
You can do exactly the same with model weights by the way!
Jan 22, 2021
I like @_KarenHao's work and I think I can understand her point.

As someone a bit in the field, let me try to offer my angle on this.

I don't usually write complex/nuanced arguments on Twitter but let's try it again for once :-)

[Short thread]
In my view, this perception of large language models is mostly due to a narrative created by a few teams leveraging large language models as an instrument of power/showcase/business

In my view, these models:
1/ Are interesting artifacts to study and try to understand from a research NLP/Ethics/CL/AI point of view.

We are not talking about creating viruses or nuclear stuff here. Releasing a 1T-parameter model won't blow up :) it's actually a lot harder/costlier/less usable in practice.
