I'm humbled by the recent advances in NLP. I was testing this Keras model on @huggingface (huggingface.co/keras-io/trans…) using the abstract of a random (but good) ML article:
arxiv.org/pdf/2002.09405…
Q: "Which examples of simulated environments are given in the text ?"
A: "fluids, rigid solids, and deformable materials"
👍 spot on
Q: "What does this new model do better than previous instances ?"
A: "advances the state-of-the-art in learned physical simulation"
👍👍 yep!
Q: "What is this article about ?"
A: "Graph Network-based Simulators"
👍👍👍 indeed
Q: "How does the model work ?"
A: "advances the state of the art"
❌ Nope! It did not get this one.
Q: "Which type of neural network is the model based on ?"
A: "Graph Network"
👍 good answer again
4/5 great answers without (much) cherry-picking. I did not ask yes/no questions because the model cannot give yes/no answers.
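Why no yes/no answers? A minimal sketch, assuming the model is an extractive question-answering model of the usual SQuAD style: it scores every token of the text as a possible answer start and answer end, and the answer is the best-scoring span copied from the text. The `best_span` helper, the tokens and the logit values below are all made up for illustration; they are not output from the actual model.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end, score) of the highest-scoring span with start <= end."""
    best = None
    for s, s_score in enumerate(start_logits):
        # Only consider end positions at or after the start, within the length limit.
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = s_score + end_logits[e]
            if best is None or score > best[2]:
                best = (s, e, score)
    return best


# Toy example: hypothetical tokens and invented logits, not real model output.
tokens = ["the", "model", "is", "based", "on", "a", "graph", "network"]
start_logits = [0.1, 0.0, 0.0, 0.2, 0.1, 0.3, 4.0, 1.0]
end_logits = [0.0, 0.1, 0.0, 0.1, 0.0, 0.2, 1.5, 5.0]

start, end, score = best_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
print(answer)
```

Under this assumption, the answer can only ever be a piece of the input text, so "yes" or "no" is out of reach unless those words happen to appear in the abstract.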
All in all, rather impressive. You can train this yourself on keras.io:
keras.io/examples/nlp/q…

More from @martin_gorner

Aug 2, 2021
Here is Mask R-CNN, the most popular architecture used for object detection and segmentation.
The conceptual principle of the R-CNN family is to use a two-step process for object detection:
1) A Region Proposal Network (RPN) identifies regions of interest (ROIs).
2) The ROIs are cut from the image and fed through a classifier.
In fact, the cutting is not done on the original image but directly on the feature maps extracted from the backbone. Since the feature maps have a much lower resolution than the image, the cropping requires some care: sub-pixel extraction and interpolation, aka "ROI alignment".
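The sub-pixel extraction can be sketched roughly as follows. This is a simplified illustration, not the Mask R-CNN implementation: `bilinear_sample` and `roi_crop` are hypothetical helper names, the feature map has a single channel, and each output cell takes one sample at its centre (real implementations typically average several samples per cell).

```python
import math


def bilinear_sample(feature_map, y, x):
    """Sample a 2D feature map (list of rows) at fractional coordinates (y, x)."""
    h, w = len(feature_map), len(feature_map[0])
    # Clamp to the valid range so the four neighbours always exist.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - dx) * feature_map[y0][x0] + dx * feature_map[y0][x1]
    bottom = (1 - dx) * feature_map[y1][x0] + dx * feature_map[y1][x1]
    return (1 - dy) * top + dy * bottom


def roi_crop(feature_map, box, out_size):
    """Crop box = (y_min, x_min, y_max, x_max), given in feature-map
    coordinates, into an out_size x out_size grid."""
    y_min, x_min, y_max, x_max = box
    cell_h = (y_max - y_min) / out_size
    cell_w = (x_max - x_min) / out_size
    return [
        [
            bilinear_sample(
                feature_map,
                y_min + (i + 0.5) * cell_h,  # centre of output cell, y
                x_min + (j + 0.5) * cell_w,  # centre of output cell, x
            )
            for j in range(out_size)
        ]
        for i in range(out_size)
    ]


# Toy 3x3 feature map; the box does not line up with the pixel grid.
fmap = [[0.0, 1.0, 2.0],
        [3.0, 4.0, 5.0],
        [6.0, 7.0, 8.0]]
crop = roi_crop(fmap, (0.5, 0.5, 1.5, 1.5), 2)
print(crop)
```

The point of the interpolation is that the box corners are never rounded to whole feature-map pixels, which would shift the crop by a large amount once mapped back to image resolution.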
Jul 19, 2021
The MobileNet family of convolutional architectures uses depth-wise convolutions where the channels of the input are convolved independently.
Their basic building block is called the "Inverted Residual Bottleneck", compared here with the basic blocks in ResNet and Xception (dw-conv for depth-wise convolution).
Here is MobileNetV2, optimized for low weight count and fast inference.
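A depth-wise convolution, where each input channel is convolved independently, can be sketched in plain Python as below. This is only an illustration of the idea, not MobileNet code: `depthwise_conv2d` is a hypothetical helper, it uses "valid" padding and stride 1, and like most deep learning frameworks it does not flip the kernel.

```python
def depthwise_conv2d(image, kernels):
    """image: [channels][height][width], kernels: [channels][kh][kw].

    Each channel is filtered with its own kernel only; channels are never
    mixed, so the output has as many channels as the input.
    """
    out = []
    for channel, kernel in zip(image, kernels):
        kh, kw = len(kernel), len(kernel[0])
        h, w = len(channel), len(channel[0])
        plane = []
        for i in range(h - kh + 1):
            row = []
            for j in range(w - kw + 1):
                acc = 0.0
                for di in range(kh):
                    for dj in range(kw):
                        acc += channel[i + di][j + dj] * kernel[di][dj]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Two 3x3 channels, each with its own 2x2 kernel.
image = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
]
kernels = [
    [[1, 1], [1, 1]],   # sums each 2x2 window of channel 0
    [[1, 0], [0, -1]],  # diagonal difference on channel 1
]
result = depthwise_conv2d(image, kernels)
print(result)
```

For C channels and a k x k kernel this needs C·k·k weights, versus C·C·k·k for a regular convolution with C output channels, which is where the low weight count comes from.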
Jun 28, 2021
I made a ton of ML architecture illustrations for an upcoming book. Starting with good old AlexNet.

The book: oreilly.com/library/view/p… by @lak_gcp, Ryan Gillard and myself.
and just as good and old VGG19:
Here is a SqueezeNet module. The paper calls them "fire 🔥 modules".
Nov 28, 2019
Now reading the ARC paper by @fchollet.
arxiv.org/abs/1911.01547 “On the Measure of Intelligence”, where he proposes a new benchmark for “intelligence” called the “Abstraction and Reasoning Corpus”.
Highlights below ->
Chess was considered the pinnacle of human intelligence … until a computer defeated Garry Kasparov in 1997. Today, it is hard to argue that a min-max algorithm with optimizations represents “intelligence”.
AlphaGo took this a step further. It became world champion at Go by using deep learning. Still, the program is narrowly focused on playing Go, and solving this task did not lead to breakthroughs in other fields.
Sep 13, 2018
Google Cloud Platform now has preconfigured deep learning images with TensorFlow, PyTorch, Jupyter, CUDA and cuDNN already installed. It took me some time to figure out how to start Jupyter on such an instance. Turns out it's a one-liner:
Detailed instructions:
1) Go to cloud.google.com/console and create an instance (pick the TensorFlow deep learning image and a powerful GPU).
2) SSH into your instance using the "gcloud compute ssh" command in the pic, replacing PROJECT_NAME and INSTANCE_NAME with your own values. On the first connection there will be additional install prompts to accept and a reboot; relaunch the command after that to reconnect.
Jan 19, 2017
I believe a dev can get up to speed on neural networks in 3h and then keep learning on their own. Ready for a crash course? /1
Got 3 more hours? The "TensorFlow without a PhD" series continues. First, a deep dive into modern convolutional architectures.
This session walks you through the construction of a neural network that can spot airplanes in aerial imagery. A good place to start for software devs who know some basics (relu, softmax, ...) and want to see a real model built from scratch.
