elvis (@omarsar0) · Dec 5, 2018
A simple method for fair comparison? #NeurIPS2018 [image]
Considerations: [image]
Reproducibility checklist: [image]
There is room for variability, especially when using different distributed systems: [image]
Complexity of the world is discarded... We need to tackle RL in the natural world through more complex simulations. [image]
Embedding natural background? [image]
Set the bar higher for the naturalism of the environment: [image]
You learn a lot by considering this idea of stepping out in the real world: [image]
Reproducibility test: [image]
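One concrete item on any reproducibility checklist is pinning random seeds across every library in the stack. A minimal PyTorch sketch (my own illustration, not from the talk):

```python
import random

import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    """Pin every RNG the training stack touches."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade kernel speed for determinism in cuDNN.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

Even with seeds pinned, distributed execution can reorder floating-point reductions, which is exactly the variability called out above.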

More from @omarsar0

Jul 18, 2024
That's right! It's a huge week for small language models (SLMs).

A few new SLMs on my radar:
Mistral NeMo

Highlights:
- Introduced by Mistral + NVIDIA
- Apache 2.0 license
- outperforms Gemma 2 9B and Llama 3 8B
- multilingual capabilities
- efficient tokenizer (Tekken)

GPT-4o mini

Highlight: "15 cents per million input tokens, 60 cents per million output tokens, MMLU of 82%, and fast."

Feb 21, 2024
JUST IN: Google DeepMind releases Gemma, a series of open models inspired by the same research and tech used for Gemini.

Open models fit various use cases, so this is a very smart move from Google.

Great to see that Google recognizes the importance of openness in AI science and technology.

There are 2B (trained on 2T tokens) and 7B (trained on 6T tokens) models, including base and instruction-tuned versions, both trained with a context length of 8192 tokens.

Commercial use is allowed.

These are not multimodal models, but based on the reported experimental results they appear to outperform Llama 2 7B and Mistral 7B.

I am excited about those MATH, HumanEval, GSM8K, and AGIEval results. These are really incredible results for a model this size.

Excited to dive deeper into these models. The model prompting guide is dropping soon. Stay tuned!
When I said it outperforms other models, I meant it generally outperforms them across all the reported benchmarks. Llama 2 has a lot of catching up to do, but it is interesting to see Mistral 7B trail Gemma very closely. That said, these numbers don't mean much until tested in real-world applications.

If you follow me here on X, you know how excited I get about unlocking unique value and use cases with small language models (SLMs). It will also be fun to run these locally and on other small devices. As I have been saying, SLMs are underexplored; it's a mistake to see them as just research artifacts.
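For anyone who wants to try running one of these locally, here is a minimal sketch using Hugging Face Transformers. It assumes the gated google/gemma-7b-it checkpoint (license accepted on the Hub, `huggingface-cli login` done) and enough GPU memory:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-7b-it"  # gated; requires accepting Google's license
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" needs the accelerate package installed.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Why are small language models underexplored?",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```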
Dec 6, 2023
Gemini is here!

Google DeepMind just announced Gemini, their largest and most capable AI model.

A short summary of all you need to know:

1) What it is - Built with multimodal support from the ground up, with remarkable multimodal reasoning capabilities across text, images, video, audio, and code. Nano, Pro, and Ultra variants target different scenarios, from efficiency and scale to the most complex capabilities.

2) Performance - The results on the standard benchmarks (MMLU, HumanEval, Big-Bench-Hard, etc.) show improvement compared to GPT-4 (though not by a lot). Still very impressive!

3) Outperforming human experts - They claim that Gemini is the first model to outperform human experts on MMLU (Massive Multitask Language Understanding), a popular benchmark to test the knowledge and problem-solving abilities of AI models.

4) Capabilities - Gemini surpasses SOTA performance on a bunch of multimodal tasks like infographic understanding and mathematical reasoning in visual contexts. There was a lot of focus on multimodal reasoning capabilities, with the ability to analyze documents and uncover knowledge that's hard to discern. The reported capabilities span multimodality, multilinguality, factuality, summarization, math/science, long context, reasoning, and more. It's probably one of the most capable models by the looks of it.

5) Trying it out - Apparently, a fine-tuned Gemini Pro is available to use via Bard. Can't wait to experiment with this soon.

6) Availability - Models will be made available for devs on Google AI Studio and Google Cloud Vertex AI by Dec 13th.

blog:

technical report:
Here is the model verifying a student's solution to a physics problem. Huge implications in education. Will be taking a very close look at applications here.
As is becoming common now, there are very few to no details on the architecture, but it's great to see distillation put to good use for the Nano series models.
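Once developer access opens up (see item 6 above), calling Gemini Pro from Python should look roughly like this with the google-generativeai SDK; treat the package, model name, and call shape as assumptions, not confirmed details from the announcement:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # issued via Google AI Studio

model = genai.GenerativeModel("gemini-pro")
response = model.generate_content(
    "A student claims v = u + at^2 for constant acceleration. Verify step by step."
)
print(response.text)
```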
Aug 2, 2023
You can now connect Jupyter with LLMs!

Jupyter AI provides an AI chat-based assistant within the Jupyter environment that lets you generate code, summarize content, create comments, fix errors, and more.

You can even generate entire notebooks using text prompts!

You can also pass it…
Official announcement: blog.jupyter.org/generative-ai-…
I am excited about the %%ai magic commands. Here is an example of how you can use ChatGPT to generate working code within notebook cells:
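Since the screenshot isn't preserved here, a rough sketch of what that looks like (requires `pip install jupyter-ai` and an API key in the environment; the provider/model id below is one example, and `%ai list` shows what's available in your setup):

```python
# Cell 1: load the magics extension.
%load_ext jupyter_ai_magics
```

```python
# Cell 2: ask a chat model for code; -f code renders the reply as a code block.
%%ai openai-chat:gpt-3.5-turbo -f code
A function that computes the nth Fibonacci number iteratively.
```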
Jun 25, 2023
How can you build your own custom ChatGPT-like system on your data?

This is not easy, as it can require complex architectures and pipelines.

Given the high demand, I started to explore the ChatLLM feature by @abacusai.

I’m very impressed! Let's take a look at how it works:
Everyone has a knowledge base or data sitting around, like wiki pages, documentation, customer tickets, etc.

With ChatLLM you can quickly create a chat app, like ChatGPT, that helps you discover and answer questions about your data.
With @abacusai you point your system to a knowledge base or set of documents.

It ingests the data and creates the necessary pipelines, chunks the data, and sets up the essential components like embedding lookup.

You can use LLM APIs or open-source models.
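ChatLLM handles all of this for you, but to make the moving parts concrete, here is a generic sketch of the chunk → embed → lookup flow. This is not Abacus.AI's actual API, and the embedding model is just one illustrative choice:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk(text: str, size: int = 500) -> list[str]:
    """Naive fixed-size chunking; real pipelines split on structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]

docs = ["...your wiki pages, documentation, customer tickets..."]
chunks = [c for doc in docs for c in chunk(doc)]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

def retrieve(question: str, k: int = 3) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embedder.encode([question], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q  # cosine similarity (vectors are normalized)
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

# The retrieved chunks are then stuffed into the LLM prompt as context.
print(retrieve("How do I reset my password?"))
```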
Jun 22, 2023
MosaicML just released MPT-30B!

The previous model they released was 7B. MPT-30B is an open-source model licensed for commercial use that is more powerful than MPT-7B.

8K context and 2 fine-tuned variants: MPT-30B-Instruct and MPT-30B-Chat.

mosaicml.com/blog/mpt-30b
MPT-30B training data. "To build 8k support into MPT-30B efficiently, we first pre-trained on 1T tokens using sequences that were 2k tokens long, and continued training for an additional 50B tokens using sequences that were 8k tokens long."
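For reference, loading it with Hugging Face Transformers looks roughly like this; MPT ships custom modeling code, so trust_remote_code=True is required (and the 30B weights need serious GPU memory, so treat this as a sketch):

```python
import transformers

name = "mosaicml/mpt-30b"
tokenizer = transformers.AutoTokenizer.from_pretrained(name)
model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    trust_remote_code=True,  # MPT uses custom model classes
    device_map="auto",       # requires the accelerate package
)

inputs = tokenizer("MosaicML's MPT-30B is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=50)[0]))
```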
