Teknium (e/λ)
Cofounder and Head of Post Training @NousResearch, prev @StabilityAI. GitHub: https://t.co/LZwHTUFwPq HuggingFace: https://t.co/sN2FFU8PVE
Nov 2, 2023 10 tweets 4 min read
Today I am releasing Open Hermes 2.5!

This model used the Hermes 2 dataset, with an added ~100k examples of Code Instructions, created by @GlaiveAI!

This model was originally meant to be OpenHermes-2-Coder, but I discovered during the process that it also improved almost every other benchmark!

Big improvements in HumanEval, and also in AGIEval and TruthfulQA; a small improvement in GPT4All; and a slight decline in BigBench. The result was a net gain across the board.
The HumanEval code benchmark saw the biggest improvement, from 43% to 50.7%. And as you can see, Open Hermes 2 was already a huge improvement over the original Nous-Hermes 13B, and a dramatic improvement over the original Llama-2 7B base model. [Benchmark comparison chart]
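For anyone reproducing the code numbers, scoring on HumanEval roughly follows the flow of OpenAI's human-eval package. A sketch; `generate()` is a hypothetical stand-in for whatever calls your model:

```python
# Sketch of HumanEval scoring with OpenAI's human-eval package.
# `generate()` is a hypothetical stand-in for your model call.
from human_eval.data import read_problems, write_jsonl

problems = read_problems()  # 164 hand-written programming tasks
samples = [
    dict(task_id=task_id, completion=generate(problem["prompt"]))
    for task_id, problem in problems.items()
]
write_jsonl("samples.jsonl", samples)
# Then score with the package's CLI, which reports pass@k:
#   evaluate_functional_correctness samples.jsonl
```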
Oct 16, 2023 9 tweets 5 min read
𝐈𝐧𝐭𝐫𝐨𝐝𝐮𝐜𝐢𝐧𝐠 𝐎𝐩𝐞𝐧 𝐇𝐞𝐫𝐦𝐞𝐬 𝟐, a continuation of the Hermes series of models, now built on Mistral 7B!

The Hermes 2 model was trained on 900,000 instructions; it surpasses all previous Hermes models at 13B and below, and matches 70B models on some benchmarks!

Hermes 2 changes the game with strong multi-turn chat skills, system prompt support, and the ChatML format. Its quality, diversity, and scale are unmatched in the current open-source LM landscape. It does well not only on benchmarks but also in unmeasured capabilities, like roleplaying, tasks, and more.
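For reference, ChatML wraps every turn in <|im_start|> / <|im_end|> markers. Here's a minimal illustrative formatter (a sketch of the format, not the actual training pipeline):

```python
# Minimal ChatML prompt builder (illustrative; the messages are examples).
def to_chatml(messages):
    prompt = ""
    for m in messages:
        # each turn: <|im_start|>{role}\n{content}<|im_end|>\n
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    return prompt + "<|im_start|>assistant\n"  # cue the assistant's reply

print(to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize LoRA in one sentence."},
]))
```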
The model can be downloaded from my HuggingFace, here: huggingface.co/teknium/OpenHe…

Here are some example outputs, showcasing programming, recipes, discussions on consciousness, and roleplaying!

[Four images of example outputs]
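If you want to try it locally, here's a rough loading sketch with transformers. The repo id is inferred from the truncated link above, and I'm assuming the tokenizer ships a ChatML chat template; verify both on the hub:

```python
# Hedged loading sketch; the repo id and chat-template support are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "teknium/OpenHermes-2-Mistral-7B"  # inferred repo id -- check the hub
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.float16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are Hermes 2."},
    {"role": "user", "content": "Write a haiku about open models."},
]
input_ids = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tok.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```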
May 3, 2023 4 tweets 1 min read
Finally getting around to cleaning up the GPT4 / GPTeacher code-instruct dataset to add to the repo, wish me luck xD I'm not liking the progress so far... I may have made such a mess making this dataset that it has to be cleaned manually 🥲
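For context, a first-pass cleanup over an instruct dataset usually looks something like this (a hypothetical sketch with made-up file and field names, not the actual GPTeacher script):

```python
# Hypothetical cleanup pass: drop exact duplicates and empty pairs.
import json

with open("code-instruct.json") as f:   # assumed filename
    rows = json.load(f)

seen, kept = set(), []
for r in rows:
    key = (r.get("instruction", "").strip(), r.get("response", "").strip())
    if not key[0] or not key[1]:
        continue                        # empty instruction or response
    if key in seen:
        continue                        # exact duplicate
    seen.add(key)
    kept.append(r)

with open("code-instruct.clean.json", "w") as f:
    json.dump(kept, f, indent=2)
print(f"kept {len(kept)}/{len(rows)}")
```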
May 2, 2023 4 tweets 1 min read
I'm working on a LoRA right now for 7B LLaMA. Its training set is a combined/shuffled mix of WizardLM + GPT4-LLM (GPT4Alpaca + Unnatural Instructions) + GPTeacher (General-Instruct + Roleplay-Instruct) and my unreleased Roleplay-Instruct v2.
Epoch 1: huggingface.co/teknium/Llama-… I haven't tested it at all yet, FYI. Could be completely terrible. Who knows. Will upload epoch 2 and the final 3rd epoch when done.
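For anyone curious what a setup like that looks like, here's a sketch with peft and datasets; the hyperparameters, file paths, and base checkpoint are assumptions, not my actual config:

```python
# Hedged LoRA sketch; r/alpha/targets and dataset paths are illustrative.
from datasets import concatenate_datasets, load_dataset
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")  # assumed base
lora = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only a tiny fraction of 7B is trainable

# combine and shuffle the instruction sets (paths are placeholders)
files = ["wizardlm.json", "gpt4_llm.json", "gpteacher.json", "roleplay_v2.json"]
train = concatenate_datasets(
    [load_dataset("json", data_files=f, split="train") for f in files]
).shuffle(seed=42)
```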
Apr 19, 2023 5 tweets 2 min read
New LLM: @StabilityAI has released 3B and 7B parameter LLMs trained on (at least a large portion of) The Pile v2. It's unclear if it is 800B tokens or 1.5T tokens; I'm hearing conflicting reports on that. They also released fine-tuned Alpaca versions: huggingface.co/stabilityai/st…

Update: it's confirmed to be only 800B tokens at the moment, but *will be 1.5T* when done.
Mar 14, 2023 5 tweets 2 min read
Can you bring back Sydney with ChatGPT4?
Mar 3, 2023 6 tweets 2 min read
Okay, so, people are having issues with running LLaMA on their home PC. I'm using a 3090; thanks to help from a few people, I've got updated code for LLaMA's example inference script that runs it.
First, download this repo and the models: github.com/facebookresear… Place each model folder in the root of your project, install the requirements, and make sure you have PyTorch with CUDA installed. Replace `example.py` with pastebin.com/fG2J7CHf
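For flavor, the single-GPU tweaks circulating at the time boiled down to something like this. A rough sketch against my recollection of the repo's March-2023 llama package; this is an assumption, not the actual pastebin script:

```python
# Rough single-GPU sketch; the llama-repo API here is an assumption from memory.
import json, os
from pathlib import Path

import torch
import torch.distributed
from fairscale.nn.model_parallel.initialize import initialize_model_parallel
from llama import LLaMA, ModelArgs, Tokenizer, Transformer

# fake a 1-process "distributed" run so fairscale's parallel layers initialize
os.environ.setdefault("MASTER_ADDR", "localhost")
os.environ.setdefault("MASTER_PORT", "29500")
torch.distributed.init_process_group("nccl", rank=0, world_size=1)
initialize_model_parallel(1)

ckpt_dir = "7B"  # model folder placed in the project root
checkpoint = torch.load(sorted(Path(ckpt_dir).glob("*.pth"))[0], map_location="cpu")
params = json.loads((Path(ckpt_dir) / "params.json").read_text())

tokenizer = Tokenizer(model_path="tokenizer.model")
args = ModelArgs(max_seq_len=512, max_batch_size=1, **params)
args.vocab_size = tokenizer.n_words

torch.set_default_tensor_type(torch.cuda.HalfTensor)  # fp16 so 7B fits in 24 GB
model = Transformer(args)
torch.set_default_tensor_type(torch.FloatTensor)
model.load_state_dict(checkpoint, strict=False)

generator = LLaMA(model, tokenizer)
print(generator.generate(["The meaning of life is"], max_gen_len=64)[0])
```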
Mar 2, 2023 4 tweets 1 min read
LLaMA models torrent... If FB or someone requests I take it down, I will. cdn.discordapp.com/attachments/10…