kache
Aug 24, 2023 · 15 tweets
I cannot believe zuck et al just beat gpt 3.5 at humaneval pass@1 and are approaching gpt4 with only 34b params

(47 pages, therefore reaction thread - code llama)
>trained on a 16k token context
pretty cool
>7B & 13B
>trained on infilling, instead of just prompt completion.
good for copilot replacement & custom local hacks
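(for anyone unfamiliar: infilling means the model completes code given both a prefix and a suffix, which is exactly what an editor plugin needs. a minimal sketch of what building a fill-in-the-middle prompt might look like — the sentinel strings here are illustrative placeholders, not necessarily the model's actual special tokens:)

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Build a fill-in-the-middle prompt in prefix-suffix-middle order.

    Infilling-trained models are prompted with the code before and after
    the cursor; the model generates the "middle". The <PRE>/<SUF>/<MID>
    markers below are illustrative stand-ins for the tokenizer's real
    special tokens.
    """
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"

# e.g. the cursor sits inside a function body:
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    ",
    suffix="\n    return result",
)
```

an editor extension would send `prompt` to the model and splice the generated middle back between prefix and suffix at the cursor.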
because gpt 3.5's first-token latency was so bad, I had to retire my custom vscode extension
Having options is good!
500B tokens, then 20B tokens for long context fine tuning. that's a lot of tokens
(for the foundational model that they release)
really hope they talk about the distributions of the data
The 500B tokens is:
- a "near deduplicated" dataset of public code
- 8% of the data is natural language related to code (likely code documentation & public Q/A)
- they prevent forgetting language understanding by mixing in a sample from a natural language dataset
Their instruct dataset makes me feel itchy. It's generated, and sized at 14k
They use self instruct by creating unit tests, and then running the solutions against them to select.
:<
The problem is that the functions are interview style questions, and too localized
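the selection loop is roughly: generate unit tests, generate candidate solutions, keep only solutions that pass their tests. a hedged sketch of that filter — the function names and candidates below are made up for illustration:

```python
def passes_tests(solution_src: str, test_src: str) -> bool:
    """Execute a generated solution, then run generated asserts against it.

    A candidate is kept only if every assertion passes. (A real pipeline
    would sandbox this and add timeouts; this sketch just uses exec.)
    """
    env: dict = {}
    try:
        exec(solution_src, env)  # define the candidate function
        exec(test_src, env)      # run the generated unit tests
        return True
    except Exception:
        return False

candidates = [
    "def double(x):\n    return x * 2",
    "def double(x):\n    return x + 2",  # wrong for most inputs
]
tests = "assert double(3) == 6\nassert double(0) == 0"

kept = [c for c in candidates if passes_tests(c, tests)]
# only the first candidate survives the generated tests
```

the itchy part is that this only filters for agreement with model-written tests on model-written interview-style problems, which is the localization complaint above.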
lol holy shit
free use unless you're google
open source software is going to actually beat gpt4 in a few months, guaranteed
this is crazy
also - interesting to note the improvement of code llama python
fwiw a lot doesn't get captured in evals
I expect models with good UX to have worse evals
needs galactica proofreading :>
(teasing, I make a ton of mistakes too)
interesting
code llama is best at C++ humaneval, vs other languages
wonder why
context up to 100k tokens shows decrease in ppl. very cool
you, also, learn to code after you learn to read and write, correct? therefore chart.
interesting
use low temperature for the first guess, increase temperature for subsequent guesses?
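a sketch of what that sampling schedule might look like — the bounds and the linear ramp here are my assumptions for illustration, not values from the paper:

```python
def temperature_schedule(k: int, t_min: float = 0.1, t_max: float = 0.8) -> list[float]:
    """Temperatures for k attempts at a problem: near-greedy first guess,
    then progressively hotter resamples to buy diversity for pass@k.

    The 0.1/0.8 bounds and linear interpolation are illustrative choices.
    """
    if k == 1:
        return [t_min]
    step = (t_max - t_min) / (k - 1)
    return [round(t_min + i * step, 3) for i in range(k)]

temps = temperature_schedule(5)
# first attempt is nearly deterministic, later attempts explore more
```

the intuition: the first sample should be the model's single best guess (low temperature), and once that misses, higher temperatures increase the chance that *some* later sample lands somewhere different.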
looooooool shade thrown
"where are the pretrained weights, sama? i though't ya'll were supposed to be open? hmmmmmmmm?" - zuck, probably Image
lol i knew it
>have access to one of the biggest compute clusters in the world
>overfit it on L1 interview questions
glad they ran the experiment, but I'm not going to bother downloading anything other than the foundation model
summary
- in the next month people are going to build pretty insane things on top of the 34b code foundational model
- the finetunes they created are of scientific interest, but don't bother downloading them; train your own instead
what a huge contribution from their team
it's not just compute
it's a lot of human hours and skill
thank you!

