Martin Görner

Sep 13, 2018 • 7 tweets • 3 min read

Google Cloud Platform now has preconfigured deep learning images with TensorFlow, PyTorch, Jupyter, CUDA and cuDNN already installed. It took me some time to figure out how to start Jupyter on such an instance. Turns out it's a one-liner.

Detailed instructions:
1) Go to cloud.google.com/console and create an instance (pick the TensorFlow deep learning image and a powerful GPU)

2) SSH into your instance using the "gcloud compute ssh" command given below (there will be additional install prompts to accept and a reboot on the first connection. Relaunch the command after that to reconnect). Replace PROJECT_NAME and INSTANCE_NAME with your own values.

3) You are now SSH'ed into your instance. Type "jupyter notebook". Jupyter starts and gives you a URL. Copy-paste it into your browser. That's it. The -L param in the SSH command sets up ssh tunnelling from localhost:8888 on your laptop to localhost:8888 on your instance.

Once again in copy-paste friendly text:
gcloud compute --project "PROJECT_NAME" ssh "INSTANCE_NAME" -- -L 8888:localhost:8888

Oh, and JupyterLab is already running on port 8080 whenever a deep learning instance boots. You don't even need to start it. If you are into JupyterLab, start the instance and SSH right in:
gcloud compute ssh "INSTANCE_NAME" -- -L 8080:localhost:8080

It also works with multiple ports. You want Jupyter notebooks (8888) and TensorBoard (6006)? No problem:
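A minimal sketch of the multi-port form, assuming the same pattern as the commands above with one -L flag per port (ssh accepts the flag more than once). The tunnel_cmd helper is hypothetical and only prints the command for a given list of ports; INSTANCE_NAME is a placeholder for your own instance.

```shell
# Hypothetical helper: print a gcloud SSH command that forwards each
# given port from localhost on your laptop to the same port on the instance.
# Usage: tunnel_cmd INSTANCE_NAME PORT [PORT...]
tunnel_cmd() {
  instance="$1"; shift
  flags=""
  # One "-L port:localhost:port" flag per requested port.
  for port in "$@"; do
    flags="$flags -L $port:localhost:$port"
  done
  echo "gcloud compute ssh \"$instance\" --$flags"
}

# Jupyter notebook (8888) and TensorBoard (6006):
tunnel_cmd INSTANCE_NAME 8888 6006
```

Copy the printed command into your terminal to open both tunnels in a single SSH session.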
