Thomas Wolf
May 31, 2023
The license of the Falcon 40B model has just been changed to… Apache 2.0, which means that this model is now free for any use, including commercial use (and the same goes for the 7B) 🎉

More from @Thom_Wolf

Feb 25, 2023
OK friends, it's the weekend and we have some time for ourselves. So let me tell you about a possible future for AI in which the largest AI spending, in billions of dollars, 2-3 years from now would be on...

... antitrust legal fees

A quick 🧵1/8
Let's go back to the early 2000s: Microsoft was at that time the archenemy of the free and open source movements, and the fight between OSS and proprietary software was going strong

2/8
With smart moves like its partnership with IBM, bundling Windows in IBM computers, Microsoft was able to achieve strong market dominance in the PC world

3/8
Read 8 tweets
Dec 27, 2021
I read a lot of books this year to broaden my horizons in AI/ML with adjacent or complementary disciplines. It was a great pleasure, so I’m sharing some of my reading list here with a couple of notes:
[1/12]
P. Miller – Theories of Developmental Psychology
A great introduction to the major theoretical schools of child development
Orienting yourself in a field is easier when you’re familiar with a few important researchers & how each brought new views & approaches to the field
[2/12]
In AI, we have G. Hinton, Y. LeCun or J. Pearl; in developmental and child psychology, similar pioneers are Jean Piaget, Lev Vygotsky or Eleanor Gibson (among many others). Most of today's research is built on or against some of their ideas
amazon.com/Theories-Devel…
[3/12]
Read 12 tweets
Dec 2, 2021
In 2021, we've seen an explosion of grounded Language Models: from "image/video+text" to more embodied models in "simulation+text"

But when tested on text benchmarks, these grounded models really struggle to improve over pure text LMs, e.g. T5/GPT-3

Why?

>>
When we humans read sentences like "lemons are yellow" or "the dog is barking", we have the impression of recruiting visual and auditory experiences.

Why then is it so difficult for grounded models to use multimodal inputs like we do to improve on text processing benchmarks?

>>
One reason might be that multimodal inputs are harder to use efficiently

I want to talk about another option here which is:
*language is more abstract than we often think*

I'll summarise this fascinating cog. science opinion piece from Lupyan & Winter royalsocietypublishing.org/doi/10.1098/rs…

>>
Read 9 tweets
Sep 17, 2021
I'm not sure many people know how easy it is to share a dataset with the new versions of the @huggingface hub and Datasets library

So I made a quick video 📺 about it 👇

This works for both public and private (within an org) dataset sharing
More info: huggingface.co/docs/datasets/…
And soon here: github.com/huggingface/da…

And feel free to ping the awesome @qlhoest @avillanovamoral and our new documentation king @stevhliu about this!
You can do exactly the same with model weights, by the way!
Read 4 tweets
Jan 22, 2021
I like @_KarenHao's work and I think I can understand her point.

As someone a bit in the field, let me try to offer my angle on this.

I don't usually write complex/nuanced arguments on Twitter but let's try it again for once :-)

[Short thread]
In my view, this perception of large language models is mostly due to a narrative created by a few teams leveraging large language models as an instrument of power/showcase/business

In my view, these models:
1/ Are interesting artifacts to study and try to understand from a research NLP/Ethics/CL/AI point of view.

We are not talking about creating viruses or nuclear stuff here. Releasing a 1T-parameter model won't blow anything up :) It's actually a lot harder/costlier/less usable in practice.
Read 6 tweets
Jan 14, 2020
I often meet research scientists interested in open-sourcing their code/research and asking for advice.

Here is a thread for you.

First: why should you open-source models along with your paper? Because science is a virtuous circle of knowledge sharing, not a zero-sum competition
1. Consider sharing your code as a tool to build on more than a snapshot of your work:
- others will build stuff that you can't imagine => give them easy access to the core elements
- don't overdo it => no need for one-liner abstractions that won't fit others' needs – clean & simple
2. Put yourself in the shoes of a master's student who has to start from scratch with your code:
- give them a ride up to the end with pre-trained models
- focus examples/code on open-access datasets (not everybody can pay for CoNLL-2003)
Read 8 tweets
