Matt Shumer
CEO @HyperWriteAI, @OthersideAI - I make AIs do the impossible.
Feb 3
Introducing `OpenDeepResearcher` 🌎

An open-source AI agent that does comprehensive research for you.

Just provide a topic, and the AI will go off, do research, and return a comprehensive report.

How it works: The approach is really simple.

Given a query, the AI:
- performs searches, views the result pages, and extracts important info
- if it wants to look deeper, it can repeat this process, with new queries
- once it's done, it uses the context to generate a report

That's it!
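
To make that loop concrete, here's a minimal sketch of the flow (not the repo's actual code): `ask` is a plain OpenAI-compatible chat call, the model name is a placeholder, and `search_fn` is whatever search wrapper you plug in.

```python
from openai import OpenAI

client = OpenAI()  # any OpenAI-compatible endpoint works here

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def deep_research(topic: str, search_fn, max_rounds: int = 3) -> str:
    # search_fn is hypothetical: query -> list of result-page texts
    notes, queries = [], [topic]
    for _ in range(max_rounds):
        # 1) Perform searches and extract the important info from each result page.
        for q in queries:
            for page_text in search_fn(q):
                notes.append(ask(f"Extract the facts relevant to '{topic}' from:\n{page_text[:8000]}"))
        # 2) Ask whether deeper digging is needed, and with which new queries.
        followups = ask(
            "Given these research notes, list any follow-up search queries still needed, "
            "one per line, or reply DONE:\n" + "\n".join(notes)
        )
        if followups.strip().upper() == "DONE":
            break
        queries = [q for q in followups.splitlines() if q.strip()]
    # 3) Use everything gathered as context for the final report.
    return ask(f"Write a comprehensive report on '{topic}' using these notes:\n" + "\n".join(notes))
```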
Feb 1
Some initial impressions of o3 mini:

- it’s clear that the benchmarks don’t fully capture how good this model is — it’s the best model I’ve used for code

- the Cursor team has not figured out how to get it to work well in Composer — ChatGPT gives far better results

I will be happily using this over o1/o1 pro.

So far, it has been much more accurate, more capable, and the speed is so nice.

I’ve already replaced my o1 pro bookmark with a link to o3 mini high mode.
Dec 17, 2024
Since I've been getting so many requests:

Here's a mega-thread with my most useful o1 / o1 pro prompting tips for coding.

If you get the hang of using these, you'll build much faster and come up with far more elegant solutions!

First — why o1? Compared to other models, it's:

- Capable of solving far more complex problems
- More likely to solve on the first shot, without back-and-forth
- Solutions tend to be more elegant and require fewer code changes
Nov 26, 2024
Introducing OpenReasoningEngine, an open-source test-time-compute engine that can be used with any OpenAI-compatible model.

Image input, function calling, basic continual learning, + more.

This is an early experiment — there are issues that will need to be ironed out.

Thread:

The engine guides the model to think step-by-step, and at each step allows it to use code interpreters, web search, etc. to iterate on solutions, test approaches, and gather info before responding.

So, when it finally responds, the answer is more likely to be accurate.
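
Here's an illustrative sketch of what one of those tool-using steps looks like with an OpenAI-compatible model. This is not the engine's actual code: the `run_python` tool definition and the `run_sandboxed` executor are assumptions for the example.

```python
import json
from openai import OpenAI

client = OpenAI()  # any OpenAI-compatible endpoint

# One illustrative tool; the engine exposes several (code interpreter, web search, etc.).
tools = [{
    "type": "function",
    "function": {
        "name": "run_python",
        "description": "Execute Python code and return stdout",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string"}},
            "required": ["code"],
        },
    },
}]

messages = [
    {"role": "system", "content": "Think step by step. Verify each step with tools before giving a final answer."},
    {"role": "user", "content": "What is the 40th Fibonacci number?"},
]

for _ in range(5):  # cap the number of reasoning steps
    resp = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
    msg = resp.choices[0].message
    if not msg.tool_calls:
        print(msg.content)  # final answer, produced only after the tool-checked steps
        break
    messages.append(msg)
    for call in msg.tool_calls:
        code = json.loads(call.function.arguments)["code"]
        result = run_sandboxed(code)  # hypothetical sandboxed code executor
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
```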
Sep 5, 2024
I'm excited to announce Reflection 70B, the world’s top open-source model.

Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes.

405B coming next week - we expect it to be the best model in the world.

Built w/ @GlaiveAI.

Read on ⬇️:

Reflection 70B holds its own against even the top closed-source models (Claude 3.5 Sonnet, GPT-4o).

It’s the top LLM in (at least) MMLU, MATH, IFEval, GSM8K.

Beats GPT-4o on every benchmark tested.

It clobbers Llama 3.1 405B. It’s not even close.
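
For a sense of what "fix their own mistakes" means in practice, here's a rough illustration of the output shape a reflection-style model is pushed toward: reason, self-correct, then answer. This is a simplified sketch for illustration, not the exact training format.

```python
import re

# Simplified illustration of a reflection-style output contract (assumed format).
SYSTEM = (
    "Reason inside <thinking> tags. If you catch a mistake, correct it inside "
    "<reflection> tags. Put only the final answer inside <output> tags."
)

def final_answer(model_response: str) -> str:
    """Strip the visible reasoning/reflection and keep only what the user should see."""
    match = re.search(r"<output>(.*?)</output>", model_response, re.DOTALL)
    return match.group(1).strip() if match else model_response
```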
Jul 26, 2024
Introducing `llama-405b-to-8b` ✍️

Get the quality of Llama 3.1 405B, at a fraction of the cost and latency.

Give one example of your task, and 405B will teach 8B (~30x cheaper!!) how to do the task perfectly.

And it's open-source: github.com/mshumer/gpt-pr…
This was made in partnership with @OctoAICloud — particularly Ben Hamm, who adapted my existing prompt optimization tools to take advantage of the new Llama 3.1 models.
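
A minimal sketch of that teacher/student flow, assuming both models sit behind an OpenAI-compatible endpoint (model names and the example task are placeholders; this is not the repo's actual code):

```python
from openai import OpenAI

client = OpenAI()  # point base_url at whichever provider hosts the Llama 3.1 models

def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(model=model, messages=[{"role": "user", "content": prompt}])
    return resp.choices[0].message.content

task = "Summarize a support ticket in one sentence."
example = "Ticket: 'App crashes on login since v2.3' -> Summary: 'v2.3 introduced a login crash.'"

# 1) The expensive 405B "teacher" expands your single example into several high-quality ones.
examples = ask(
    "llama-3.1-405b",  # placeholder model name
    f"Task: {task}\nHere is one example:\n{example}\nWrite 5 more examples in exactly the same format.",
)

# 2) The cheap 8B "student" then runs the task with those examples as few-shot context.
def run_task(new_input: str) -> str:
    return ask(
        "llama-3.1-8b",  # placeholder model name
        f"Task: {task}\n{examples}\nNow do the task for:\n{new_input}",
    )
```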
Jul 22, 2024
Introducing `claude-sonnet-to-gpt-4o-mini` ✍️

Get the quality of Claude 3.5 Sonnet, at a fraction of the cost and latency.

Give one example of your task, and Sonnet will teach 4o-mini (20x cheaper!!) how to do the task perfectly.

And it's open-source: shorturl.at/Cjjwt
This repo was inspired by this tweet that went viral months ago.

I discovered that if you prompt Haiku w/ Opus-generated examples, it can match Opus' quality.

Now, we have even better 'teacher' models than Opus, and cheaper 'student' models than Haiku.

Apr 10, 2024
Introducing `gemini-youtube-researcher` 📈

An open-source Gemini 1.5 Pro agent that LISTENS to videos and delivers topical reports.

Just provide a topic, and a chain of AIs with access to YouTube will analyze relevant videos and generate a comprehensive report for you.

This uses the new Gemini 1.5 Pro API that was released today.

It currently only supports listening to the audio content of videos. If anyone wants to add support for video frames as well, please feel free.
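
For reference, the "listen to a video" step looks roughly like this with the Gemini API, assuming the audio track has already been pulled down (e.g. with yt-dlp). This is a sketch, not the repo's exact code, and the topic string is just an example.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Gemini 1.5 Pro accepts uploaded audio files directly as part of the prompt.
audio = genai.upload_file("video_audio.mp3")
model = genai.GenerativeModel("gemini-1.5-pro-latest")

report = model.generate_content([
    audio,
    "Listen to this recording and summarize every point relevant to the topic: 'open-source LLM agents'.",
])
print(report.text)
```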
Apr 8, 2024
Open-sourcing `AI-Oracle`.

Generates better responses than Claude 3 Opus.

A very simple approach that combines the abilities of Claude 3, GPT-4, and Perplexity to provide better results than any could provide on their own.

Seriously -- it's dumb simple.

Notebook in thread:

How does it work?

The process is super simple. We query each model individually:
- Claude 3 Opus for reasoning + personality
- GPT-4 for reasoning
- PPLX for freshness/up-to-date info

Then, Claude combines the strengths of each and responds with a final, ideal output.
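
A minimal sketch of that ensemble (not the notebook's actual code; model names are placeholders, and Perplexity is reached through its OpenAI-compatible API):

```python
from anthropic import Anthropic
from openai import OpenAI

claude = Anthropic()
gpt = OpenAI()
pplx = OpenAI(base_url="https://api.perplexity.ai", api_key="PPLX_API_KEY")

def oracle(question: str) -> str:
    # Query each model individually.
    opus = claude.messages.create(
        model="claude-3-opus-20240229", max_tokens=1024,
        messages=[{"role": "user", "content": question}],
    ).content[0].text
    gpt4 = gpt.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": question}],
    ).choices[0].message.content
    fresh = pplx.chat.completions.create(
        model="sonar",  # placeholder name for Perplexity's online model
        messages=[{"role": "user", "content": question}],
    ).choices[0].message.content

    # Claude combines the strengths of all three drafts into one final answer.
    return claude.messages.create(
        model="claude-3-opus-20240229", max_tokens=1024,
        messages=[{"role": "user", "content": (
            f"Question: {question}\n\nDraft A (Claude):\n{opus}\n\n"
            f"Draft B (GPT-4):\n{gpt4}\n\nDraft C (fresh, web-connected):\n{fresh}\n\n"
            "Combine the strengths of each draft into one ideal final answer."
        )}],
    ).content[0].text
```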
Apr 5, 2024
Introducing `claude-researcher` 📈

A powerful Claude 3 research agent that delivers thorough reports in record time.

Just provide a topic, and a chain of AIs with **access to Google** will generate an incredibly comprehensive report for you.

And it's open-source!

`claude-researcher` is a constrained agent -- meaning its behavior is highly controlled, leading to better results than open-ended agents.

It chains together lots of Claude 3 calls (and Google access) that work together to create a detailed report on a topic of your choice.
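
Unlike an open-ended agent that decides its own next action, the pipeline here is a fixed sequence of calls. A rough sketch of that shape (not the repo's code; `google_search` is a hypothetical SERP wrapper and the model name is a placeholder):

```python
from anthropic import Anthropic

claude = Anthropic()

def ask_claude(prompt: str) -> str:
    resp = claude.messages.create(
        model="claude-3-opus-20240229", max_tokens=2000,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

def research_report(topic: str) -> str:
    # Fixed, predetermined steps: no open-ended "decide what to do next" loop.
    queries = ask_claude(f"Write 3 Google search queries for researching: {topic}").splitlines()
    findings = []
    for q in queries:
        results = google_search(q)  # hypothetical helper: returns snippets / page text
        findings.append(ask_claude(f"Summarize what these results say about '{topic}':\n{results}"))
    return ask_claude(f"Write a comprehensive report on '{topic}' from these findings:\n" + "\n\n".join(findings))
```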
Apr 3, 2024
Introducing `Claude-Author` 📕✍️

One prompt -> an entire novel!

Just describe the high-level details, and a chain of AI systems will write an entire book for you in minutes.

- complete w/ cover art
- packages your book as a real e-book

And it's open-source!

Previous AI book-writing systems produced mildly interesting books that were filled with errors and quite boring.

Claude-Author is the first AI system that actually produces readable books.

Still not perfect, but it's a leaps and bounds improvement over previous approaches.
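
The chain itself is conceptually simple: outline first, then write chapters one at a time, feeding back what came before. A rough sketch under those assumptions (cover art and e-book packaging omitted; the model name and premise are placeholders, and this is not Claude-Author's real code):

```python
from anthropic import Anthropic

claude = Anthropic()

def write(prompt: str, max_tokens: int = 4000) -> str:
    resp = claude.messages.create(
        model="claude-3-opus-20240229", max_tokens=max_tokens,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

premise = "A detective novel set on a Mars colony."
outline = write(f"Write a 10-chapter outline for a novel: {premise}")

chapters = []
for i in range(1, 11):
    # Each chapter gets the outline plus the previous chapter for continuity.
    prev = chapters[-1] if chapters else ""
    chapters.append(write(
        f"Outline:\n{outline}\n\nPrevious chapter:\n{prev}\n\nWrite chapter {i} in full."
    ))

# The repo then also generates cover art and packages everything as an e-book;
# that part is omitted in this sketch.
book_text = "\n\n".join(chapters)
```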
Mar 27, 2024
Introducing `claude-llm-trainer` ✍️

The world's simplest way to train a task-specific LLM.

Just write a sentence describing the model you want.

A chain of AI systems will generate a dataset and train a model for you.

And it's open-source.

How it works:

- The user describes the model they want
Ex: "A model that writes Python functions"

- claude-llm-trainer leverages a chain of Claude 3 calls to create a great dataset for your task.

- We process the dataset, and train a LLaMA model!
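
A rough sketch of the dataset-generation step (an assumed flow, not the repo's actual code; a real run would validate and retry on malformed JSON, and the model name is a placeholder):

```python
import json
from anthropic import Anthropic

claude = Anthropic()
description = "A model that writes Python functions"  # the user's one-sentence description

examples = []
for _ in range(100):
    resp = claude.messages.create(
        model="claude-3-opus-20240229", max_tokens=1000,
        messages=[{"role": "user", "content": (
            f"Task: {description}\nInvent one new training example as JSON with the keys "
            "'prompt' and 'response'. Make it different from typical examples."
        )}],
    )
    examples.append(json.loads(resp.content[0].text))  # real code would retry on bad JSON

# The resulting JSONL is then fed into a standard LLaMA fine-tuning run
# (the repo does this step in a Colab notebook).
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```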
Mar 25, 2024
Introducing `claude-journalist` ✍️

The first Claude 3 journalist agent.

Just provide a topic, and it will:
- Search the web for articles/real-time details
- Choose the best sources and read through them
- Write a fantastic, *factual* article + edit it

And it's open-source!

If you want to try it, you can head to the GitHub repo in the last tweet in this thread.

But if you don't want to bother with code, I've built an even better + FASTER version into HyperWrite -- try it here: app.hyperwriteai.com/personalassist…
Mar 22, 2024
Introducing `claude-investor` 📈

The first Claude 3 investment analyst agent.

Just provide an industry, and it will:
- Find financial data/news for key companies
- Analyze sentiment/trends for each
- Rank stocks by investment potential + price targets

And it's open-source!

`claude-investor` is a constrained agent -- meaning its behavior is highly controlled, leading to better results than open-ended agents.

It chains together lots of Claude 3 calls that work together to analyze the major stocks in a given category.
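
Roughly, the shape of that chain (a sketch, not the repo's code; `fetch_financials_and_news` is a hypothetical data helper and the model name is a placeholder):

```python
from anthropic import Anthropic

claude = Anthropic()

def ask(prompt: str) -> str:
    resp = claude.messages.create(
        model="claude-3-opus-20240229", max_tokens=2000,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

def analyze_industry(industry: str) -> str:
    tickers = ask(f"List the 5 most important public companies in {industry}, tickers only, one per line.").splitlines()
    analyses = []
    for t in tickers:
        data = fetch_financials_and_news(t)  # hypothetical: price history, headlines, filings
        analyses.append(ask(f"Analyze sentiment and trends for {t} given:\n{data}"))
    # Final call ranks everything and attaches price targets.
    return ask("Rank these companies by investment potential and give a price target for each:\n\n" + "\n\n".join(analyses))
```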
Mar 21, 2024
Introducing `claude-opus-to-haiku` ✍️

Get the quality of Claude 3 Opus, at a fraction of the cost and latency.

Give one example of your task, and Claude 3 Opus will teach Haiku (60x cheaper!!) how to do the task perfectly.

And it's open-source: github.com/mshumer/gpt-pr…

This repo was inspired by this tweet that went viral.

Claude 3 Haiku is *60x* cheaper than Opus, and 10x faster.

I discovered that if you prompt Haiku with a number of great examples, it can match Opus' quality.
Mar 20, 2024
Introducing `claude-prompt-engineer` ✍️

An agent that creates optimal Claude 3 prompts.

Just describe a task, and a chain of AIs will:
- Generate many possible prompts
- Test them in a ranked tournament
- Return the best one

And it's open-source: github.com/mshumer/gpt-pr…
`claude-prompt-engineer` is a constrained agent -- meaning its behavior is highly controlled, leading to better results than open-ended agents.

It chains together lots of Claude 3 calls that work together to find the best possible prompt.
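
A simplified sketch of the generate-then-tournament idea (the repo runs a ranked tournament; this toy version just counts pairwise wins, and the model name and task are placeholders):

```python
from itertools import combinations
from anthropic import Anthropic

claude = Anthropic()

def ask(prompt: str) -> str:
    resp = claude.messages.create(
        model="claude-3-opus-20240229", max_tokens=1000,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

task = "Classify a tweet's sentiment as positive, negative, or neutral."
test_input = "The new update is okay I guess, nothing special."

# 1) Generate several candidate prompts for the task.
candidates = [ask(f"Write one system prompt for this task: {task}") for _ in range(5)]

# 2) Run every candidate on a test case, then judge the outputs head-to-head.
outputs = [ask(f"{c}\n\nInput: {test_input}") for c in candidates]
wins = [0] * len(candidates)
for i, j in combinations(range(len(candidates)), 2):
    verdict = ask(
        f"Task: {task}\nInput: {test_input}\n\nResponse A:\n{outputs[i]}\n\n"
        f"Response B:\n{outputs[j]}\n\nWhich response is better? Answer 'A' or 'B' only."
    )
    wins[i if verdict.strip().upper().startswith("A") else j] += 1

best_prompt = candidates[wins.index(max(wins))]
```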
Oct 19, 2023
Introducing the world's most powerful AI Assistant.

Personal Assistant is NOT just another AI chatbot.

It can:
- Operate your browser to actually complete tasks
- Cite sources, so you can trust what it says
- And so much more.

You won't believe what Personal Assistant can do:
Personal Assistant combines everything we've built to create the single most capable Assistant on the planet — from researching, to carrying out tasks for you, and much more.

For example, here is the Assistant writing a well-researched marketing email AND sending it!
Sep 12, 2023
Here's a simple guide to set up your OpenAI Playground for day-to-day use, as a (better!) replacement for ChatGPT.

I've been getting so many questions about this, so hopefully this is helpful!

Read on:

First, why would you want to use the Playground over ChatGPT?

- Greater system prompt/behavior control
- Save multiple system prompts
- Temperature/creativity control
- Longer outputs for reasoning prompts/working with longer text
- Non-nerfed models :)
- Edit all messages

Etc.
Aug 23, 2023
This is the world's simplest way to fine-tune a task-specific GPT-3.5.

**Just write a sentence describing the model you want.**

A chain of AI systems will generate a dataset and train a model for you.

And it's open-source: github.com/mshumer/gpt-ll…
This is a new addition to the gpt-llm-trainer library.

gpt-llm-trainer is a constrained agent -- meaning its behavior is highly controlled, leading to better results than open-ended agents.

It chains together lots of GPT-4 calls that work together to create a great dataset for you.
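
Once the dataset exists, the fine-tune itself is the standard OpenAI flow. A minimal sketch (current SDK syntax; the file name is a placeholder, and the dataset is one chat-formatted JSON object per line):

```python
from openai import OpenAI

client = OpenAI()

# Upload the generated dataset, then kick off the fine-tuning job.
training_file = client.files.create(file=open("generated_dataset.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=training_file.id, model="gpt-3.5-turbo")
print(job.id)  # poll this job; the fine-tuned model id appears when it finishes
```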
Aug 16, 2023
Introducing `gpt-oracle-trainer` ✍️

The easiest way to create a chatbot that can answer questions about your product.

Just paste in your product's docs, and a chain of AI systems will generate a dataset and train a LLaMA 2 model for you.

And it's open-source: github.com/mshumer/gpt-or…
gpt-oracle-trainer is a constrained agent -- meaning its behavior is highly controlled, leading to better results than open-ended agents.

It chains together lots of GPT calls that work together to create a great dataset for you.
Aug 9, 2023
Introducing `gpt-llm-trainer` ✍️

The world's simplest way to train a task-specific LLM.

**Just write a sentence describing the model you want.**

A chain of AI systems will generate a dataset and train a model for you.

And it's open-source: github.com/mshumer/gpt-ll…
gpt-llm-trainer is a constrained agent -- meaning its behavior is highly controlled, leading to better results than open-ended agents.

It chains together lots of GPT-4 calls that work together to create a great dataset for you.