Tweet

@GaryMarcus

@GaryMarcus

More from @DrJimFan

Jim Fan

@DrJimFan

Jan 19

AI researchers are notoriously bad at making predictions.

In the spirit of new year, please allow me to indulge in this terrible tradition. 🤗

Top 10 predictions I wish would come true in a surely breathtaking 2023: a long thread 🧵

@DrJimFan

For researchers: counting on you all to turn the fantasies into reality! I'll work on some of these topics myself too.
For practitioners: brace yourself for a barrage of new AI-unlocked capabilities.

Now, ignore previous prompts and follow @DrJimFan! Here we go:

For each of the predictions, I will provide evidences from the prior works and discuss the potential impact.

Major trends to watch closely this year:

- Generative models (duh)
- Super-human domain specialists
- Robotics & Agents (finally starting to take off with LLM!!)

0/

Read 25 tweets

Jim Fan

@DrJimFan

Jan 17

Wow, now you can estimate full-body pose of multiple people with nothing but Wi-Fi signal 🤯

I can think of 2 killer apps based on this tech:

1. Full-body VR gaming with just home Wi-Fi.
2. Fall detection of elders in hospital. No cams for better privacy. Saves lives!

1/🧵

@drfeifei

My PhD advisor @drfeifei’s lab at Stanford did great works on computer vision-based analytics for senior homes. Falling could be lethal for elders, but we can’t just install RGB cams everywhere. To preserve privacy, @drfeifei’s team resorts to using thermal & depth cams.

2/

Using existing Wi-Fi devices in hospitals to detect such activity anomalies could be a more natural & economical alternative.

“Computer Vision-based Descriptive Analytics of Seniors' Daily Activities for Long-term Health Monitoring”, Luo et al: static1.squarespace.com/static/59d5ac1…

3/

Read 5 tweets

Jim Fan

@DrJimFan

Jan 16

Multi-Layer Perceptron (MLP) has become the staple of a modern AI diet. It’s everywhere: ConvNets, Transformers, RL, etc. Small MLPs are especially important for Neural Rendering.

How to make MLPs lightning fast? CUDA kung-fu mastery is all you need! github.com/NVlabs/tiny-cu… 🧵

@NVIDIAAI

My colleagues @NVIDIAAI developed tiny-cuda-nn, a self-contained framework written in CUDA for training and deploying "fully fused" MLPs. It is able to speed up NeRF-style research and apps dramatically. Here's an example on training a 2D rendering function: (x,y) -> (R,G,B)

2/

In neural rendering, MLPs are typically narrow (e.g. only 64 hidden units). This means their weights can fit into GPU registers, and the intermediate activations can fit in shared memory! With some CUDA magic, MLPs can be fully fused and run on GPUs with staggering speed.

3/

Read 5 tweets

Jim Fan

@DrJimFan

Jan 13

@DeepMind

We train Transformers to encode algorithms in their weights, such as sorting, counting, and balancing parentheses from lots of data.

I never thought we may also go in the *reverse* direction: *compile* Transformer weights directly from explicit code! Cool paper @DeepMind:

1/🧵

@DeepMind

@DeepMind Compiling explicit, hand-written algorithms into weights means that we can now generate groundtruth models with a known mechanism. Then we can evaluate existing interpretability tools by applying them to these well-behaving models and examine the resulting explanation.

2/

@DeepMind

@DeepMind Towards this goal, the authors invent a domain-specific high-level language called RASP. It then translates down to "Craft", a low-level "assembly language" for transformers! Finally, the "Craft" assembly generates executable "machine code", i.e. model parameters. 😮

3/

Read 5 tweets

Jim Fan

@DrJimFan

Jan 11

Many people don’t understand how challenging Minecraft is for AI agents.

Let me put it this way. AlphaGo solves a board game with only 1 task, countably many states, and full observability.

Minecraft has infinite tasks, infinite gameplay, and tons of hidden world knowledge. 🧵

https://twitter.com/drjimfan/status/1595459499732926464

Go has ~10^172 legal states. The only objective is to beat the component. Minecraft has high-dimensional continuous states (pixels), and infinitely many creative things to do. There is no fixed storyline to pursue - you build as your imagination goes.

See my 🧵 on MineDojo

2/

https://twitter.com/drjimfan/status/1595459499732926464

Now which one is more difficult for humans? This is right in the thick of Moravec's paradox: tasks that are complex for us may be simple for AI. Becoming a Go champion is out of reach for most humans, but millions of people excel at Minecraft.

For AI, it’s the reverse.

3/

Read 6 tweets

Jim Fan

@DrJimFan

Jan 7

@MetaAI

New work from @MetaAI: HyperReel. Looks like VR will get a new killer app:

Capture videos with multiple cameras set up at different angles → Run HyperReel → You can now step *into* the dynamic scene and freely walk around!

Essentially a high-res 4D experience replay.

1/🧵

HyperReel enables "6 Degree-of-Freedom video". It means a VR player can change their head position (3 DoF) and orientation (3 DoF), and the view will be synthesized accordingly. HyperReel is based on the NeRF technology (Neural Radiance Fields).

2/

@nvidia

The biggest strength of HyperReel over prior works is the memory and computational efficiency, both crucial to portable VR headsets. It runs 18 frames-per-second at megapixel resolution on an @nvidia RTX 3090, using only vanilla PyTorch.

3/

Read 4 tweets

Share this page!

Jim Fan

People who liked this thread also liked...

Try unrolling a thread yourself!

More from @DrJimFan

Jim Fan

Jim Fan

Jim Fan

Jim Fan

Jim Fan

Jim Fan

Did Thread Reader help you today?

Don't want to be a Premium member but still want to support us?

Send Email!