Dzmitry Bahdanau
Feb 2 10 tweets 4 min read
I spent 1000s of hours on competitive programming (proof-link: codeforces.com/profile/rizar). This makes me qualified to comment on #AlphaCode by @DeepMind

The result is nice, the benchmark will be useful, some ideas are novel. But human level is still light years away.

1/n
The system ranks behind 54.3% of participants. Note that many participants are high-school or college students who are just honing their problem-solving skills. Most people reading this could train to outperform #AlphaCode, especially if the time pressure is removed...
Limited time (e.g. 3 hours to solve 6 problems) is a key difficulty in comp. programming. The baseline human is very constrained in this model-vs-human comparison. For #AlphaCode the pretraining data, the fine-tuning data, the model size, the sampling - all was nearly maxed out.
Importantly, the vast majority of the programs that #AlphaCode generates are wrong (Figure 8). It is the filtering using example tests that allows #AlphaCode to actually solve something. Example tests are part of the input (App. F), yet most sampled programs can't solve them.
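To make the filtering step concrete, here is a minimal sketch (my own hypothetical illustration, not AlphaCode's actual pipeline): run each sampled program on the problem's example tests and keep only the ones that pass. The candidate programs and example tests below are made up.

```python
# Hypothetical sketch: filter sampled programs by the example tests
# that are part of the problem statement.
import subprocess
import sys

def passes_examples(source, examples):
    """Run a candidate program on each (input, expected_output) pair."""
    for stdin_text, expected in examples:
        result = subprocess.run(
            [sys.executable, "-c", source],
            input=stdin_text, capture_output=True, text=True, timeout=2,
        )
        if result.returncode != 0 or result.stdout.strip() != expected.strip():
            return False
    return True

# Toy "samples" for an a+b problem: one correct, one wrong.
samples = [
    "print(sum(map(int, input().split())))",  # correct
    "print(int(input().split()[0]))",         # wrong: ignores second number
]
examples = [("2 3", "5"), ("10 -4", "6")]
survivors = [s for s in samples if passes_examples(s, examples)]
```

The point of the critique above: if most samples fail this cheap check, the model alone is far less capable than the end-to-end numbers suggest.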
Using example tests is a fair game for comp. programming and perhaps for some of real world backend development. But for much of the real-world code (e.g. code that defines front-end behavior) crafting tests is not much easier than coding itself.
The paper emphasizes the creative aspects of competitive programming, but in my experience it also involves writing lots of boilerplate code. Many problems come down to deploying standard algorithms: Levenshtein-style DP, DFS/BFS graph traversals, max-flow, and so on.
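For a sense of what "boilerplate" means here, the Levenshtein-style DP mentioned above is the kind of routine a contestant writes from muscle memory:

```python
def levenshtein(a, b):
    """Classic edit-distance DP: dp[i][j] = min edits turning a[:i] into b[:j]."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j  # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[m][n]
```

Reproducing this kind of pattern is exactly what large language models are good at, which tempers the "creativity" framing.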
Sec. 6.1 makes a point that #AlphaCode does not exactly copy sequences from training data. That’s a low bar for originality: change a variable name and this is no longer copying. It would be interesting to look at nearest neighbor solutions found using neural representations.
Let me also dilute these critical remarks with a note of appreciation. AlphaCode uses a very cool “clustering” method to marginalize out differently-written but semantically equivalent programs. I think forms of this approach can become a code generation staple.
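The clustering idea can be sketched roughly like this (my simplified reading, with made-up helper names; AlphaCode runs real generated programs on model-generated inputs, whereas here each "program" is a Python callable for illustration): group samples by their outputs on shared inputs, and treat the largest behavioural clusters as the most promising.

```python
# Hypothetical sketch of behavioural clustering: programs that produce
# identical outputs on a shared set of test inputs land in one cluster.
from collections import defaultdict

def cluster_by_behavior(programs, test_inputs):
    clusters = defaultdict(list)
    for prog in programs:
        # The output signature is the program's behaviour on all inputs.
        signature = tuple(prog(t) for t in test_inputs)
        clusters[signature].append(prog)
    # Largest clusters first: behaviours many independent samples agree
    # on are heuristically more likely to be correct.
    return sorted(clusters.values(), key=len, reverse=True)

# Two differently-written but equivalent programs, plus one outlier.
progs = [lambda x: x * 2, lambda x: x + x, lambda x: x ** 2]
groups = cluster_by_behavior(progs, [1, 3, 5])
```

Here the doubling programs cluster together while the squaring one stands alone, so one submission covers both equivalent samples.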
To sum up: AlphaCode is a great contribution, and AI for coding is a very promising direction with lots of great applications ahead. But this is not AlphaGo in terms of beating humans and not AlphaFold in terms of revolutionizing an entire field of science. We've got work to do.
Thanks for reading, and if you've read this far, consider submitting to, reviewing for, or simply attending our Deep Learning for Code Workshop at ICLR 2022 (@DL4Code, dl4c.github.io).

Call for papers: dl4c.github.io/callforpapers/
Call for reviewers: docs.google.com/forms/d/e/1FAI…

n/n
