JUST IN: Google DeepMind releases Gemma, a series of open models inspired by the same research and tech used for Gemini.
Open models fit a wide range of use cases, so this is a very smart move from Google.
Great to see that Google recognizes the importance of openness in AI science and technology.
There are 2B (trained on 2T tokens) and 7B (trained on 6T tokens) models, each in base and instruction-tuned versions, all trained with a context length of 8192 tokens.
Commercial use is allowed.
These are not multimodal models but based on the reported experimental results they appear to outperform Llama 2 7B and Mistral 7B.
I am excited about the MATH, HumanEval, GSM8K, and AGIEval numbers. These are incredible results for models this size.
Excited to dive deeper into these models. The model prompting guide is dropping soon. Stay tuned!
Blog:
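If you want to poke at Gemma yourself, here's a minimal sketch of loading the instruction-tuned 7B checkpoint with Hugging Face transformers. The google/gemma-7b-it model id and chat-template flow are my assumptions, not official docs:

```python
# Minimal sketch: load the Gemma 7B instruction-tuned checkpoint.
# The "google/gemma-7b-it" model id is an assumption here.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-7b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Format a single-turn chat prompt with the tokenizer's chat template.
messages = [{"role": "user", "content": "Explain attention in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```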
Google DeepMind just announced Gemini, their largest and most capable AI model.
A short summary of all you need to know:
1) What it is - Built with multimodal support from the ground up. Remarkable multimodal reasoning capabilities across text, images, video, audio, and code. Nano, Pro, and Ultra variants cover different scenarios, from efficiency at small scale up to the most complex capabilities.
2) Performance - The results on the standard benchmarks (MMLU, HumanEval, Big-Bench-Hard, etc.) show improvement compared to GPT-4 (though not by a lot). Still very impressive!
3) Outperforming human experts - They claim that Gemini is the first model to outperform human experts on MMLU (Massive Multitask Language Understanding), a popular benchmark to test the knowledge and problem-solving abilities of AI models.
4) Capabilities - Gemini surpasses SOTA performance on a bunch of multimodal tasks like infographic understanding and mathematical reasoning in visual contexts. There was a lot of focus on multimodal reasoning capabilities, with the ability to analyze documents and uncover knowledge that's hard to discern. The reported capabilities span multimodality, multilinguality, factuality, summarization, math/science, long context, reasoning, and more. By the looks of it, this is one of the most capable models out there.
5) Trying it out - Apparently, a fine-tuned Gemini Pro is available to use via Bard. Can't wait to experiment with this soon.
6) Availability - Models will be made available for devs on Google AI Studio and Google Cloud Vertex AI by Dec 13th.
blog:
technical report:
Here is the model verifying a student's solution to a physics problem. Huge implications in education. Will be taking a very close look at applications here.
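Once developer access opens, calling Gemini Pro should look roughly like this. A minimal sketch using the google-generativeai Python SDK; the gemini-pro model name is my assumption based on the announcement:

```python
# Minimal sketch: query Gemini Pro via the google-generativeai SDK.
# The "gemini-pro" model name is an assumption from the announcement.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key from Google AI Studio
model = genai.GenerativeModel("gemini-pro")

response = model.generate_content(
    "Check this solution: if F = ma, m = 2 kg, and a = 3 m/s^2, is F = 5 N?"
)
print(response.text)
```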
Aug 2, 2023 • 5 tweets • 2 min read
You can now connect Jupyter with LLMs!
It provides an AI chat-based assistant within the Jupyter environment that allows you to generate code, summarize content, create comments, fix errors, etc.
You can even generate entire notebooks using text prompts!
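Assuming this is the official jupyter-ai extension, a minimal sketch of its %%ai cell magic looks like this (the model alias and flags are illustrative; check the docs for your provider):

```python
# Minimal sketch of Jupyter AI's cell magics (assumes `pip install jupyter-ai`
# and an API key configured for the provider behind the "chatgpt" alias).

# --- cell 1: load the magics extension ---
%load_ext jupyter_ai_magics

# --- cell 2: ask the assistant to generate code ---
%%ai chatgpt --format code
A function that loads a CSV file and plots a histogram of the "age" column.
```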
How can you build your own custom ChatGPT-like system on your data?
This is not easy, as it can require complex architectures and pipelines.
Given the high demand, I started to explore the ChatLLM feature by @abacusai.
I’m very impressed! Let's take a look at how it works:
Everyone has a knowledge base or data sitting around, like wiki pages, documentation, customer tickets, etc.
With ChatLLM you can quickly create a chat app, like ChatGPT, that helps you discover and answer questions about your data.
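Under the hood, tools like this typically follow a retrieval-augmented pattern: embed your docs, retrieve the most relevant ones, and ground the chat model's answer in them. This is not Abacus.AI's API, just a generic sketch of the idea:

```python
# Generic retrieval-augmented chat sketch (not the ChatLLM API).
# pip install sentence-transformers openai numpy
import numpy as np
from sentence_transformers import SentenceTransformer
from openai import OpenAI

docs = [
    "Our refund policy allows returns within 30 days.",
    "Support hours are 9am-5pm EST, Monday through Friday.",
    "Enterprise plans include a dedicated account manager.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

def answer(question: str) -> str:
    # Embed the question and pick the most similar doc (cosine similarity).
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    best = docs[int(np.argmax(doc_vecs @ q_vec))]
    # Ground the chat model's answer in the retrieved document.
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Answer using this context: {best}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("How long do I have to return a product?"))
```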
Jun 22, 2023 • 7 tweets • 3 min read
MosaicML just released MPT-30B!
The previous model they released was 7B. MPT-30B is an open-source model licensed for commercial use that is more powerful than MPT-7B.
8K context and 2 fine-tuned variants: MPT-30B-Instruct and MPT-30B-Chat.
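A minimal sketch of loading it with transformers; the model id and trust_remote_code requirement follow the pattern of the earlier MPT-7B release, so treat the details as assumptions:

```python
# Minimal sketch: load MPT-30B-Instruct with transformers.
# MPT models ship custom modeling code, hence trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mosaicml/mpt-30b-instruct"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    trust_remote_code=True,  # loads MosaicML's custom MPT architecture
    device_map="auto",       # requires `pip install accelerate`
)

inputs = tokenizer(
    "Summarize the benefits of an 8K context window:", return_tensors="pt"
).to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=128)[0]))
```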
As an ML engineer, I’ve spent a lot of time building forecasting models.
Now, I don’t have to build complex time series models or understand market signals.
Akkio makes forecasting quick, easy, and accurate with predictive AI.
Here’s how:
Last week, I hosted a webinar with @AkkioHQ, and here’s a glimpse at what we covered:
-Challenges working with time series data and forecasting models
- Best practices for cleaning and preparing data to improve forecasting
- Several use cases, like web traffic forecasting and more
May 25, 2023 • 6 tweets • 3 min read
Finetuning LLMs to call APIs
The paper presents Gorilla, a fine-tuned LLaMA-based model that surpasses GPT-4 at writing API calls. This capability can help identify the right API, boosting the ability of LLMs to interact with external tools to complete specific tasks.
The bottom part shows how it performs inference (either using retrieval or zero-shot).
This seems like a really important layer for improving the reliability and effectiveness of LLMs that interact with external tools.
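The basic retrieve-then-generate pattern is easy to sketch, even though the fine-tuning is the hard part. A generic illustration, not Gorilla's actual inference code:

```python
# Generic sketch of retrieve-then-generate API-call writing (not Gorilla's code).
API_DOCS = {
    "image classification": "torchvision.models.resnet50(weights='IMAGENET1K_V2')",
    "english to french translation": "pipeline('translation_en_to_fr', model='t5-base')",
}

def generate(prompt: str) -> str:
    # Placeholder for the fine-tuned model; here we just echo the documented
    # call so the sketch runs end to end.
    return prompt.split("API documentation: ")[1].split("\n")[0]

def write_api_call(task: str) -> str:
    # Retrieval mode: prepend the most relevant API doc to the prompt so the
    # model grounds its generated call in real documentation.
    doc = API_DOCS.get(task, "")
    prompt = f"API documentation: {doc}\nTask: {task}\nWrite the API call:"
    return generate(prompt)

print(write_api_call("image classification"))
```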
Apr 21, 2023 • 13 tweets • 6 min read
Exciting new updates from Bard!
Bard now helps with code generation tasks like debugging and code explanation. Supports over 20 programming languages like C++ and Python.
The Export to Colab option is brilliant for quick experimentation and refinement. I love this!
Started to test it a bit.
I asked, "Can you please help me implement a basic RNN and test it on dummy text data?"
Then I exported the generated code to Google Colab. One section of the code wasn't working...
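For reference, here's roughly what a working version of that request looks like: a minimal character-level RNN on dummy text in PyTorch, not Bard's output:

```python
# Minimal character-level RNN on dummy text (PyTorch), trained to predict
# the next character. A working reference, not Bard's generated code.
import torch
import torch.nn as nn

text = "hello world " * 50
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
data = torch.tensor([stoi[c] for c in text])

class CharRNN(nn.Module):
    def __init__(self, vocab, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.rnn = nn.RNN(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)

    def forward(self, x):
        out, _ = self.rnn(self.embed(x))
        return self.head(out)

model = CharRNN(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):
    # Sample a random 32-char window; target is the window shifted by one.
    i = torch.randint(0, len(data) - 33, (1,)).item()
    x, y = data[i:i+32].unsqueeze(0), data[i+1:i+33].unsqueeze(0)
    loss = loss_fn(model(x).squeeze(0), y.squeeze(0))
    opt.zero_grad(); loss.backward(); opt.step()

print("final loss:", loss.item())
```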
Apr 20, 2023 • 4 tweets • 2 min read
Prompt engineering tools are appearing everywhere!
We're also seeing a set of standardized prompt engineering techniques and tools to build effectively with LLMs.
Just this week:
W&B Prompts (from @weights_biases): tools to support prompt engineering; lets you debug LLM apps interactively, view latency, and use other tracking features
LLM-based agents for performing complex scientific experiments.
Really interesting paper on developing agents based on LLMs for autonomous design, planning, and execution of scientific experiments. If you're looking for good papers on LLMs, you should read this one.
We are also starting to see more use of vector search for improving the results of LLMs on more complex tasks. This is particularly important when the number of tokens you can pass to the LLM is limited. Selecting relevant docs is an effective approach.
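That doc-selection step is simple to sketch. A bare-bones version assuming sentence-transformers for embeddings and FAISS for the index:

```python
# Minimal sketch: FAISS top-k retrieval to pick relevant docs for a
# token-limited LLM prompt. Assumes `pip install faiss-cpu sentence-transformers`.
import faiss
from sentence_transformers import SentenceTransformer

docs = [
    "LLaMA is a family of open foundation models from Meta.",
    "Vector indexes store embeddings for fast similarity search.",
    "RLHF aligns language models with human preferences.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
vecs = model.encode(docs, normalize_embeddings=True)

index = faiss.IndexFlatIP(vecs.shape[1])  # inner product == cosine on unit vectors
index.add(vecs)

q = model.encode(["How do I search embeddings quickly?"], normalize_embeddings=True)
scores, ids = index.search(q, 2)  # top-2 most relevant docs
print([docs[i] for i in ids[0]])
```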
Apr 15, 2023 • 6 tweets • 3 min read
OpenAssistant is officially released!
OpenAssistant is an open-source chat model. The release includes models, datasets, and a chat interface.
The dataset is a human-generated, human-annotated assistant-style conversation corpus of ~161K messages spanning 35 different languages.
The accompanying paper can be found here:
Proposes an approach that teaches LLMs to debug their predicted programs via few-shot demonstrations.
This allows a model to identify its mistakes by explaining generated code in natural language.
Achieves SoTA on several code generation tasks like text-to-SQL generation, code translation, and text-to-Python generation.
The full procedure is illustrated here:
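The core loop is easy to sketch: generate a program, run it, and feed the error plus an explain-then-fix request back to the model. A generic illustration, not the paper's code (the LLM call here assumes an OpenAI-style endpoint):

```python
# Generic sketch of a self-debugging loop (not the paper's implementation).
import subprocess
import sys
import tempfile
from openai import OpenAI

client = OpenAI()  # any LLM endpoint works; OpenAI used as an example

def ask_llm(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo", messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

def run(code: str):
    """Execute code in a subprocess; return stderr on failure, None on success."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
    result = subprocess.run([sys.executable, f.name], capture_output=True, text=True)
    return result.stderr if result.returncode != 0 else None

def self_debug(task: str, max_rounds: int = 3) -> str:
    code = ask_llm(f"Write a Python program (code only): {task}")
    for _ in range(max_rounds):
        error = run(code)
        if error is None:
            return code
        # Core idea: have the model explain the code in natural language,
        # then fix it, using the runtime error as feedback.
        code = ask_llm(
            f"This program failed:\n{code}\n\nError:\n{error}\n\n"
            "Explain the code line by line, then return a corrected version (code only)."
        )
    return code
```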
Apr 4, 2023 • 4 tweets • 3 min read
Prompt Engineering Guide (20K⭐️)
We started with basic prompt examples and have expanded to a comprehensive prompt engineering guide used by thousands of AI developers and researchers working with LLMs.
- Now over 200K learners
- Chinese & Japanese translations are now available
GitHub star history 🤯
Apr 4, 2023 • 5 tweets • 2 min read
Another open-source chat model!
Baize is an open-source chat model fine-tuned with LoRA. Leverages 100K dialogs generated from ChatGPT chatting with itself.
The repo also contains other dialog datasets.
Also releases 7B, 13B, and 30B models. 60B model coming soon!
paper:
Trying it out now. It's nice to see this simple prompt work. The explanation seems a bit weird but the generated code looks good.
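Since Baize is a LoRA fine-tune, here's a minimal sketch of what attaching LoRA adapters to a LLaMA base looks like with the peft library. The hyperparameters and model id are illustrative, not Baize's exact recipe:

```python
# Minimal LoRA setup sketch with peft (illustrative config, not Baize's recipe).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")

config = LoraConfig(
    r=8,                                  # low-rank adapter dimension
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attention projections in LLaMA
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the small adapter weights train
```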
Apr 4, 2023 • 10 tweets • 4 min read
New chatbot just dropped!
Vicuna-13B - an open-source chatbot trained by fine-tuning LLaMA on ~70K user-shared ChatGPT conversations.
Claims to achieve "more than 90%* quality of OpenAI ChatGPT and Google Bard while outperforming other models like LLaMA and Stanford Alpaca in more than 90% of cases."
Just gave it a quick try and the results look impressive compared to previous open chatbots I have tested. Didn't expect explanations that feel like ChatGPT. Interesting!
Mar 31, 2023 • 10 tweets • 5 min read
BloombergGPT is a new LLM for finance.
It's a 50 billion parameter language model trained on financial data.
Claims the largest domain-specific dataset yet with 363 billion tokens, further augmented with 345 billion tokens from general-purpose data. Paper: arxiv.org/abs/2303.17564
Breakdown of the full training set used to train BloombergGPT.
Mar 30, 2023 • 12 tweets • 5 min read
As an ML Engineer, this is one of the most useful applications of GPT-4 I've seen.
Chat Explore is a powerful AI-powered data exploration tool.
Here’s why I am so impressed:
Speeding Up Your Data Exploration Workflow
@AkkioHQ’s Chat Explore is a GPT-4-powered data exploration tool that quickly answers questions, helping you make faster decisions and find actionable insights.
Let’s take a look at a simple data exploration workflow with Chat Explore:
Mar 29, 2023 • 9 tweets • 4 min read
Open-source ML is at it again!
Introducing ColossalChat - an open-source solution for cloning ChatGPT with a complete RLHF pipeline.
Here is what you need to know:
While LLMs like ChatGPT are available as a service, we need a practical open-source solution with a complete RLHF pipeline.
Colossal-AI presents ColossalChat, a new open-source solution built upon the LLaMA model that closely resembles the original ChatGPT technical solution.
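For orientation, the three RLHF stages (SFT, reward model, PPO) look roughly like this. A generic sketch using Hugging Face's trl library, not ColossalChat's own Colossal-AI code:

```python
# Generic RLHF pipeline sketch with trl (not ColossalChat's implementation).
# Stage 1: supervised fine-tuning; Stage 2: reward model; Stage 3: PPO.
import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

model_name = "gpt2"  # small stand-in; ColossalChat builds on LLaMA
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Stages 1 and 2 are standard supervised training loops and are omitted here;
# assume `reward_fn` wraps a trained reward model.
def reward_fn(text: str) -> torch.Tensor:
    return torch.tensor(1.0)  # placeholder reward score

model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
ppo_trainer = PPOTrainer(
    PPOConfig(batch_size=1, mini_batch_size=1), model, ref_model, tokenizer
)

query = tokenizer.encode("How do I brew green tea?", return_tensors="pt")[0]
response = ppo_trainer.generate(query, max_new_tokens=32)[0]
text = tokenizer.decode(response)

# Stage 3: one PPO step pushes the policy toward higher-reward responses.
ppo_trainer.step([query], [response[len(query):]], [reward_fn(text)])
```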
Mar 28, 2023 • 4 tweets • 2 min read
ChatDoctor: A medical chat model fine-tuned on LLaMA using medical domain knowledge.
Collects data on around 700 diseases and generates 5K doctor-patient conversations to fine-tune the LLM.