elvis
Dec 14, 2018 · 16 tweets · 3 min read
Takeaways/Observations/Advice from my #NeurIPS2018 experience (thread):
❄️(1): deep learning seems stagnant in terms of impactful new ideas
❄️(2): on the flip side, deep learning is providing tremendous opportunities for building powerful applications (as seen in the creativity and value of the work presented in workshops such as ML for Health and Creativity)
❄️(3): the rise of deep learning applications is all thanks to the continued integration of software tools (open source) and hardware (GPUs and TPUs)
❄️(4): Conversational AI is important because it encompasses most subfields of NLP... also, embedding social capabilities into these types of AI systems is a challenging but very important task going forward
❄️(5): it's important to start to think about how to transition from supervised learning to problems involving semi-supervised learning and beyond. Reinforcement learning seems to be the next frontier. BTW, Bayesian deep learning is a thing!?
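The supervised-to-semi-supervised transition in (5) can be sketched with self-training (pseudo-labeling), a common baseline for the setting; the nearest-centroid classifier and margin threshold below are illustrative stand-ins, not from any specific talk:

```python
# Minimal self-training (pseudo-labeling) sketch with a nearest-centroid
# classifier -- purely illustrative; any probabilistic model would do.
import numpy as np

def fit_centroids(X, y):
    # One centroid per class (assumes binary labels 0/1).
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def confidence_and_label(centroids, X):
    # Distance-based "confidence": margin between the two centroid distances.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return np.abs(d[:, 0] - d[:, 1]), d.argmin(axis=1)

def self_train(X_lab, y_lab, X_unlab, margin_thresh=1.0, rounds=3):
    X, y = X_lab, y_lab
    for _ in range(rounds):
        centroids = fit_centroids(X, y)
        margin, labels = confidence_and_label(centroids, X_unlab)
        keep = margin >= margin_thresh
        if not keep.any():
            break
        # Promote confident unlabeled points to pseudo-labeled training data.
        X = np.vstack([X, X_unlab[keep]])
        y = np.concatenate([y, labels[keep]])
        X_unlab = X_unlab[~keep]
    return fit_centroids(X, y)

# Toy data: a few labeled points, many unlabeled points, two clusters.
rng = np.random.default_rng(0)
X_lab = np.vstack([rng.normal(-2, 0.3, (3, 2)), rng.normal(2, 0.3, (3, 2))])
y_lab = np.array([0, 0, 0, 1, 1, 1])
X_unlab = np.vstack([rng.normal(-2, 0.3, (20, 2)), rng.normal(2, 0.3, (20, 2))])
centroids = self_train(X_lab, y_lab, X_unlab)
_, preds = confidence_and_label(centroids, np.array([[-2.0, 0.0], [2.0, 0.0]]))
print(preds)  # [0 1]
```

The design point: unlabeled data only helps where the model is already confident, which is exactly why "beyond supervised learning" is harder than it sounds.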
❄️(6): we should not avoid drawing inspiration for our AI algorithms from biological systems just because people say it's a bad idea... there is still a whole lot to learn from neuroscience
❄️(7): when we use the word "algorithms" to refer to AI systems it seems to be used in negative ways by the media... what if we use the term "models" instead? (rephrased from Hanna Wallach)
❄️(8): we can embrace the gains of deep learning and revise our traditional learning systems based on what we have learned from modern deep learning techniques (this was my favorite piece of advice)
❄️(9): the ease of applying machine learning to different problems has sparked leaderboard chasing... let's all be careful of those short-term rewards
❄️(10): there is a ton of noise in the field of AI... when you read about AI papers, systems and technologies just be aware of that
❄️(11): causal reasoning needs close attention... especially as we begin to rely heavily on AI systems to make important decisions in our lives
❄️(12): efforts in diversification seem to have amplified healthy interactions between young and prominent members of the AI community
❄️(13): we can expect to see more multimodal systems and environments being used and leveraged to help with learning in various settings (e.g., conversation, simulations, etc.)
❄️(14): let's get serious about reproducibility... this goes for all sub-disciplines in the field of AI
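A small concrete habit behind (14): pin every source of randomness and log the seed alongside the result. A minimal sketch using only the standard library and NumPy (framework seeds like `torch.manual_seed` would follow the same pattern):

```python
# Reproducibility sketch: seed all RNGs you use and record the seed
# with the reported metric, so a run can be repeated exactly.
import random
import numpy as np

def run_experiment(seed: int) -> dict:
    random.seed(seed)
    np.random.seed(seed)
    # ... model training would go here; we simulate a noisy metric ...
    score = np.random.normal(loc=0.8, scale=0.05)
    return {"seed": seed, "score": round(float(score), 6)}

# Same seed, same result -- which is the whole point.
a = run_experiment(42)
b = run_experiment(42)
print(a == b)  # True
```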
❄️(15): more effort needs to be invested in finding ways to properly evaluate different types of machine learning systems... this was a resonant theme at the conference, from the NLP people to the statisticians to the reinforcement learning people... it's a serious problem
I will formalize and expound on all of these observations, takeaways, and advice learned from my NeurIPS experience in a future post (will be posted directly at @dair_ai)... at the moment, I am still trying to put together the resources (links, slides, papers, etc.)

More from @omarsar0

Mar 13
Prompt Engineering is NOT dead!

If you develop seriously with LLMs and are building complex agentic flows, you don't need convincing about this.

I've built the most comprehensive, up-to-date course on prompting LLMs, including reasoning LLMs.

4 hours of content! All Python!
Check it out if you're building AI Agents or RAG systems -- prompting tips, emerging use cases, advanced prompting techniques, enhancing LLM reliability, and much more.

All code examples use pure Python and the OpenAI SDKs. That's it!
This course is for devs and AI engineers looking for a proper overview of LLM design patterns and prompting best practices.

We offer support, a forum, and live office hours too.

DM me for discount options. Students & teams also get special discounts.

dair-ai.thinkific.com/courses/prompt…
Mar 11
NEW: OpenAI announces new tools for building agents.

Here is everything you need to know:
OpenAI has already launched two big agent solutions, Deep Research and Operator.

The tools are now coming to the APIs for developers to build their own agents.
The first built-in tool is called the web search tool.

This allows the models to access information from the internet for up-to-date and factual responses. It's the same tool that powers ChatGPT search.

Powered by a fine-tuned model under the hood...
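A sketch of what enabling the built-in web search tool looks like from the developer side. The field names (`web_search_preview`, `input`) follow OpenAI's launch announcement for the Responses API; treat them as illustrative and check the current docs before relying on them:

```python
# Sketch of a Responses API request body that enables built-in web search.
# Field names follow OpenAI's announcement at launch and may have changed.
import json

def build_request(question: str) -> dict:
    return {
        "model": "gpt-4o",
        "tools": [{"type": "web_search_preview"}],
        "input": question,
    }

payload = build_request("What happened in AI this week?")
print(json.dumps(payload, indent=2))
# With the official SDK this maps to roughly:
#   client.responses.create(**payload)
```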
Mar 5
A Few Tokens Are All You Need

Can you cut the fine-tuning costs of an LLM by 75% and keep strong reasoning performance?

A new paper from the Tencent AI Lab claims that it might just be possible.

Let's find out how:
The First Few Tokens

It shows that all you need is a tiny prefix to improve your model’s reasoning—no labels or massive datasets are required!

The paper proposes an unsupervised prefix fine-tuning method (UPFT) that only requires prefix substrings (as few as 8 tokens) of generated solutions.
Task template for Prefix Tuning

They use a simple task template for prefix tuning. By using a few leading tokens of the solution, the model learns a consistent starting approach without requiring complete, correct final answers. Other approaches require entire reasoning traces.
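The core data operation in UPFT is simple to sketch: keep only the first k tokens of each sampled solution as the fine-tuning target. The whitespace "tokenizer" below is a stand-in for the model's real tokenizer, and k=8 mirrors the "as few as 8 tokens" figure:

```python
# UPFT-style data prep sketch: truncate each generated solution to its
# first k tokens and fine-tune on (question, prefix) pairs.
# Whitespace-split "tokens" stand in for real model tokens.
def make_prefix_example(question: str, solution: str, k: int = 8) -> dict:
    prefix = " ".join(solution.split()[:k])
    return {"prompt": question, "target": prefix}

ex = make_prefix_example(
    "What is 12 * 7?",
    "First, break 12 * 7 into 10 * 7 plus 2 * 7, which gives 70 + 14 = 84.",
)
print(ex["target"])  # "First, break 12 * 7 into 10 *"
```

The point of the truncation is that the model never needs the (possibly wrong) final answer, only a consistent way to start a solution.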
Feb 27
Say goodbye to Chain-of-Thought.

Say hello to Chain-of-Draft.

To address the issue of latency in reasoning LLMs, this work introduces Chain-of-Draft (CoD).

Read on for more:
What is it about?

CoD is a new prompting strategy that drastically cuts down verbose intermediate reasoning while preserving strong performance.
Minimalist intermediate drafts

Instead of long step-by-step CoT outputs, CoD asks the model to generate concise, dense-information tokens for each reasoning step.

This yields up to 80% fewer tokens per response yet maintains accuracy on math, commonsense, and other benchmarks.
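The difference between CoT and CoD is essentially the instruction given to the model. A hedged sketch: the CoD wording below paraphrases the idea of a tight per-step word budget; the exact prompts are in the paper:

```python
# CoT vs CoD prompting sketch: same question, different reasoning instruction.
# Instruction wording is a paraphrase, not the paper's exact prompt.
COT_INSTRUCTION = "Think step by step, explaining each step in full sentences."
COD_INSTRUCTION = (
    "Think step by step, but write only a minimal draft for each step, "
    "at most five words per step."
)

def build_prompt(question: str, style: str) -> str:
    instruction = COD_INSTRUCTION if style == "cod" else COT_INSTRUCTION
    return f"{instruction}\nQ: {question}\nA:"

print(build_prompt("A store had 23 apples and sold 9. How many are left?", "cod"))
```

The token savings come entirely from the model obeying the budget, which is why the paper has to verify that accuracy survives the compression.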
Feb 20
NEW: Sakana AI introduces The AI CUDA Engineer.

It's an end-to-end agentic system that can produce highly optimized CUDA kernels.

This is wild! They used AI to discover ways to make AI run faster!

Let's break it down:
The Backstory

Sakana AI's mission is to build more advanced and efficient AI using AI.

Their previous work includes The AI Scientist, LLMs that produce more efficient methods to train LLMs, and automation of new AI foundation models.

And now they just launched The AI CUDA Engineer.
Why is this research a big deal?

Writing efficient CUDA kernels is challenging for humans.

The AI CUDA Engineer is an end-to-end agent built to produce and optimize CUDA kernels automatically.
Feb 19
NEW: Google introduces AI co-scientist.

It's a multi-agent AI system built with Gemini 2.0 to help accelerate scientific breakthroughs.

2025 is truly the year of multi-agents!

Let's break it down:
What's the goal of this AI co-scientist?

It can serve as a "virtual scientific collaborator to help scientists generate novel hypotheses and research proposals, and to accelerate the clock speed of scientific and biomedical discoveries."
How is it built?

It uses a coalition of specialized agents inspired by the scientific method.

It can generate, evaluate, and refine hypotheses.

It also has self-improving capabilities.
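That generate → evaluate → refine loop can be sketched abstractly. The "agents" below are trivial stand-in functions (the real system uses Gemini 2.0-backed agents); only the control flow is the point:

```python
# Sketch of a generate/evaluate/refine multi-agent loop.
# Each function is a placeholder for an LLM-backed agent in the real system.
def generate(n: int) -> list[str]:
    return [f"hypothesis-{i}" for i in range(n)]

def evaluate(hypothesis: str) -> float:
    # Stand-in scoring: hypotheses refined more often score higher.
    return hypothesis.count("+refined")

def refine(hypothesis: str) -> str:
    return hypothesis + "+refined"

def co_scientist_loop(n: int = 3, rounds: int = 2) -> str:
    pool = generate(n)
    for _ in range(rounds):
        # Keep the best-scoring hypothesis and refine it further.
        best = max(pool, key=evaluate)
        pool = [refine(best)] + [h for h in pool if h != best]
    return max(pool, key=evaluate)

print(co_scientist_loop())  # "hypothesis-0+refined+refined"
```

The self-improving behavior the thread mentions corresponds to the refinement step feeding back into the candidate pool across rounds.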