Takeaways/Observations/Advice from my #NeurIPS2018 experience (thread):
❄️(1): deep learning seems stagnant in terms of impactful and new ideas
❄️(2): on the flip side, deep learning is providing tremendous opportunities for building powerful applications (could be seen from the amount of creativity and value of works presented in workshops such as ML for Health and Creativity)
❄️(3): the rise of deep learning applications is all thanks to the continued integration of software tools (open source) and hardware (GPUs and TPUs)
❄️(4): Conversational AI is important because it encompasses most subfields in NLP... also, embedding social capabilities into these types of AI systems is a challenging but very important task going forward
❄️(5): it's important to start to think about how to transition from supervised learning to problems involving semi-supervised learning and beyond. Reinforcement learning seems to be the next frontier. BTW, Bayesian deep learning is a thing!?
❄️(6): we should not shy away from drawing inspiration for our AI algorithms from biological systems just because some people say it's a bad idea... there is still a whole lot to learn from neuroscience
❄️(7): when we use the word "algorithms" to refer to AI systems it seems to be used in negative ways by the media... what if we use the term "models" instead? (rephrased from Hanna Wallach)
❄️(8): we can embrace the gains of deep learning and revise our traditional learning systems based on what we have learned from modern deep learning techniques (this was my favorite piece of advice)
❄️(9): the ease of applying machine learning to different problems has sparked leaderboard chasing... let's all be careful of those short-term rewards
❄️(10): there is a ton of noise in the field of AI... when you read about AI papers, systems and technologies just be aware of that
❄️(11): causal reasoning deserves close attention... especially as we begin to heavily rely on AI systems to make important decisions in our lives
❄️(12): efforts in diversification seem to have amplified healthy interactions between young and prominent members of the AI community
❄️(13): we can expect to see more multimodal systems and environments being used and leveraged to help with learning in various settings (e.g., conversation, simulations, etc.)
❄️(14): let's get serious about reproducibility... this goes for all sub-disciplines in the field of AI
❄️(15): more effort needs to be invested in finding ways to properly evaluate different types of machine learning systems... this was a resonant theme at the conference... from the NLP people to the statisticians to the reinforcement learning people... it's a serious problem
I will formalize and expound on all of these observations, takeaways, and advice learned from my NeurIPS experience in a future post (will be posted directly at @dair_ai)... at the moment, I am still trying to put together the resources (links, slides, papers, etc.)
The multi-stage training might not make sense initially, but it provides clues on optimizations that we can continue to tap into.
Data quality is still very important for enhancing the usability of the LLM.
Unlike other reasoning LLMs, DeepSeek-R1's training recipe and weights are open so we can build on top of it. This opens up exciting research opportunities.
About the attached clip: the previous preview model wasn't able to solve this task. DeepSeek-R1 can solve this and many other tasks that o1 can solve. It's a very good model for coding and math.
When DeepSeek said "on par with OpenAI-o1" I thought they were just hyping. But based on my tests, it's clearly not so.
Wanted to add that DeepSeek-R1 got all of the hard tasks from the OpenAI LLM reasoning blog post correct for me. This is wild and totally unexpected! The only task where it failed (i.e., crossword puzzle) o1 also fails.
Multi-stage training means a couple of rounds of RL and fine-tuning. This leads to a model that is not only good at complex reasoning but is also aligned and usable in a real-world setting.
If you used their preview model, it definitely felt like it lacked the human preference alignment part, which they somehow figured out in this release through the "RL for all scenarios" step explained in the paper.
An AI agent is made up of both the environment it operates in (e.g., a game, the internet, or computer system) and the set of actions it can perform through its available tools. This dual definition is fundamental to understanding how agents work.
👨‍💻 Agent Example
The figure shows an example of an agent built on top of GPT-4. The environment is the computer, which has access to a terminal and filesystem. The set of actions includes navigating, searching files, viewing files, etc.
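The environment-plus-actions definition above can be sketched as a tiny tool registry. This is a minimal, hypothetical illustration, not the actual figure's implementation; all function and tool names here are made up for the example.

```python
# Hypothetical sketch of an agent's action set over a filesystem
# environment: each action is a callable the model can pick from.
import os

def navigate(path):
    """Change the working directory (an action on the environment)."""
    os.chdir(path)
    return f"now in {os.getcwd()}"

def search_files(keyword, root="."):
    """Return file names under root whose name contains the keyword."""
    return [f for f in os.listdir(root) if keyword in f]

def view_file(path):
    """Read a file's contents (an observation from the environment)."""
    with open(path) as fh:
        return fh.read()

# The agent's action set: a mapping from tool names to callables.
TOOLS = {"navigate": navigate, "search_files": search_files, "view_file": view_file}

def run_agent(plan):
    """Execute a sequence of (tool_name, args) steps chosen by the model."""
    observations = []
    for tool_name, args in plan:
        observations.append(TOOLS[tool_name](*args))
    return observations
```

In a real agent, the `plan` would come from the model itself, which picks the next tool based on prior observations rather than executing a fixed list.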
Google recently published this great whitepaper on Agents.
2025 is going to be a huge year for AI Agents.
Here's what's included:
- Introduction to AI Agents
- The role of tools in Agents
- Enhancing model performance with targeted learning
- Quick start to Agents with LangChain
- Production applications with Vertex AI Agents
- o1 is launching out of preview in the API
- support for function calling, structured output, and developer messages
- reasoning_effort parameter to tell the model how much effort to spend on thinking
- vision inputs are supported in the API too
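Putting the listed features together, a request payload might look like the sketch below. This only constructs the payload (sending it requires the `openai` client and an API key), and the tool name `lookup_formula` is a hypothetical example; check OpenAI's API reference for the current parameter names and values.

```python
# Sketch of a Chat Completions-style payload combining the new o1
# features: a developer message, the reasoning_effort parameter,
# and a function-calling tool definition.
payload = {
    "model": "o1",
    "reasoning_effort": "high",  # how much effort to spend on thinking
    "messages": [
        # developer messages are the new spin on system messages
        {"role": "developer", "content": "You are a careful math tutor."},
        {"role": "user", "content": "Factor x^2 - 5x + 6."},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "lookup_formula",  # hypothetical tool
                "description": "Look up a math formula by name.",
                "parameters": {
                    "type": "object",
                    "properties": {"name": {"type": "string"}},
                    "required": ["name"],
                },
            },
        }
    ],
}
```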
Visual inputs with a developer message (a new spin on the system message for better steering the model) inside the OpenAI Playground
Cool to see support for function calling and response format for o1