CEO & Co-founder @ Stealth ||| AI Researcher ||| LLM hacker
6 subscribers
Jun 15, 2023 โข 14 tweets โข 4 min read
Everyone Everywhere is talking about the new Function Calling capability of the OpenAI API. ๐ฎ
But what does it mean, and how do you use it?
๐งตA Full Guide (1 / 13)
๐ง 1/ To put it simply, you can now describe functions to GPT4/3.5 and have the model intelligently choose to output a JSON object containing arguments to call those functions. This is a new way to more reliably connect GPT's capabilities with external tools, databases and APIs.
Jun 7, 2023 โข 12 tweets โข 3 min read
Is your LLM-based App protected against Prompt Injection and Hijacking? ๐ฅท๐ฆนโโ๏ธ
I bet it doesn't!
Witness firsthand the potential impact as I demonstrate the Hacking/Exploitation of this Demo App>>>
๐งต 1/12
Let's consider a very simple app called FitGPT.
Given some basic information, it will provide the client with some fitness advice. Below you can watch how it works in the happy flow. Nice and simple.
But now let's see what it looks like under the hood>>>
๐งต 2/8
Jun 1, 2023 โข 18 tweets โข 6 min read
How to Fine-Tune a Large Language Model (LLM) on a Single GPU or Google Colab? ๐
LoRA (Low-Rank Adaptation)! ๐
Recently, LoRA has been a hot topic in the context of fine-tuning LLMs.
But what does LoRA really mean?
Let's dive in ๐
1 / 18
Let's get back first to the fundamentals of fine-tuning a model. ๐
The typical process looks like that-
2 / 18
May 13, 2023 โข 11 tweets โข 2 min read
1/ Holy Moses ๐คฏ
Is Vector Databases (Pinecone, Chroma...) soon to be DEAD? ๐ค
Anthropic just expanded their Claude LLM's context window to 100K tokens. X3 than GPT-4 not-yet-released 32K version. ๐
Here is my full analysis โคต๏ธโคต๏ธโคต๏ธ 2/
Anthropic expanded Claudeโs context window to 100K tokens, around 75,000 words! ๐ญ Businesses can now submit hundreds of pages for Claude to digest and analyze. Conversations can go on for hours or even days! โ
May 5, 2023 โข 13 tweets โข 4 min read
GitHub Copilot RIP? ๐๐ชฆ
Introducing StarCoder๐
All you need to Know (+Demo+Extension+Model+Data)โคต๏ธโคต๏ธโคต๏ธ
2/ ๐ Introduction
StarCoder and StarCoderBase are Large Language Models for Code trained on GitHub data. They outperform existing open Code LLMs on programming benchmarks and match or surpass closed models (like CoPilot).
Apr 24, 2023 โข 9 tweets โข 2 min read
1/๐ Is this the future of Language Model Models (LLMs) with unlimited tokens?
A new paper - Scaling Transformer to 1 MILLION tokens and beyond with RMT,
This is the essence of it>>>
2/โพ RMT, Recurrent Memory Transformer, is able to retain information across up to 2 million tokens!
What???
Apr 13, 2023 โข 7 tweets โข 2 min read
1/๐ Have you ever wondered how to BYPASS the TOKEN LIMITATION in OpenAI GPT requests?
Here are 5 methods to achieve infinite tokens โฌโฌโฌ
2/๐ฆ Stuffing
Stuffing involves passing all related data as context to the language model, making a single call to it.
Pros+ => Easy implementation and access to all data.
Cons- => Limited context length and infeasibility for larger amounts of data.
Apr 12, 2023 โข 4 tweets โข 2 min read
1/ Open-source ML has done it again! ๐จ๐จ๐จ
Databricks just released a new LLM - Dolly 2.0 ๐๐จโ๐ป
Here's what you need to know:
โคต๏ธโคต๏ธโคต๏ธ 2/ - This model is a 12B parameter language model based on EleutherAl Pythia model family.
- It's fine-tuned on 15K high-quality human-generated prompt/response pairs (crowdsourced among Databricks employees) for instruction tuning LLMs.
โคต๏ธโคต๏ธโคต๏ธ
Apr 12, 2023 โข 7 tweets โข 2 min read
Wouldn't you like to train 10B+ ChatGPT-style models on a single GPU and 100B+ on multi-GPUs systems?
โคต๏ธโคต๏ธโคต๏ธ
Microsoft just released DeepSpeed Chat, a game-changing end-to-end RLHF pipeline for training ChatGPT-like models! ๐๐
Apr 10, 2023 โข 4 tweets โข 2 min read
ChatGPT is so last month ๐ด
๐ฅ Stanford/Google researchers just dropped some mindblowing new research on generative agents, and it's like they brought Westworld to life. ๐ค
Here's what you should knowโคต๏ธโคต๏ธโคต๏ธ
Using a simulation video game they created, researchers made 25โฆ twitter.com/i/web/status/1โฆ
I will publish in the following days (tinyurl.com/yxf5xpku) a technical report* on this paper. Please follow to get notified. It will be accessible for everyone (not only patrons). You are welcome to follow.
*Architecture, APIs, Retrieval DBs, LangChain, Logic, etc.
Mar 31, 2023 โข 5 tweets โข 1 min read
Introducing HuggingGPT๐ฅ๐
HuggingGPT is a collaborative system that consists of an LLM as the controller and numerous expert models as collaborative executors (from HuggingFace Hub).
The workflow of HuggingGPT consists of 4 stages
>>> 1/ Task Planning: Using ChatGPT to analyze the requests of users to understand their intention, and disassemble them into possible solvable sub-tasks.
Mar 30, 2023 โข 15 tweets โข 3 min read
Wondering how to create ChatGPT From GPT-3? ๐ค
Reinforcement Learning from Human Feedback (RLHF)!
A complete guide (13 tweets) for RLHF.
Thread>>> 1/ It is unlikely that supervised learning is going to lead us to true artificial intelligence. Before Deep Learning, most reinforcement learning applications were impractical.
Mar 30, 2023 โข 5 tweets โข 2 min read
The LLM tip of the day #1-
I have discovered a small yet crucial tip for producing superior code with GPT-4, which can help you increase productivity, deliver faster results, and enhance accuracy.
>>>Thread>>>
When you are asking an LLM, let's say GPT-4, to code something, it is fair to say (although I am simplifying things a bit) that it is converging to the expectancy level of the training data.
What do I mean?
>>>
Mar 28, 2023 โข 13 tweets โข 4 min read
Curious about how your life will change with ChatGPT's Browsing mode?
Check out this 12-tweet thread for early access insights.
Introducing OpenChatKit ๐ -
The first open-source alternative to ChatGPT!
A team of ex-OpenAI fellows at Together have released a 20B chat-GPT model, fine-tuned for chat using EleutherAI's GPT-NeoX-20B, with over 43 million instructions under the Apache-2.0 license.
>>>
This instruction-tuned large language model has been optimized for chat on 100% carbon-negative compute.
OpenChatKit includes four essential components:
>>>
Mar 2, 2023 โข 9 tweets โข 2 min read
Birthday Paradox Explained
The birthday paradox is a surprising and counterintuitive phenomenon in probability theory that demonstrates the likelihood of two people in a group sharing the same birthday, even when the group is relatively small.
1 / 9
The paradox is often misunderstood as being about the probability of two people in a group having the same birthday, but it's actually about the probability of any two people in the group having the same birthday.
2 / 9
Feb 23, 2023 โข 14 tweets โข 3 min read
*** The History Behind ChatGPT ***
OpenAI's ChatGPT is a remarkable NLP model that has gotten a lot of attention, but it is important to note that the technology behind it has a rich history of research and development spanning several decades.
<1 / 14> THREAD
RNNs, first introduced in 1986 by David Rumelhart, form the foundation of it all. RNNs are specialized artificial neural networks designed to work with time-series or sequence data (paper: lnkd.in/d4jeAZnJ).
<2 / 14> THREAD
Jan 7, 2023 โข 8 tweets โข 2 min read
0/Get the free, open-source LLM that outperforms GPT-3 - download now!
1/Revolutionary new open-source large language model beats GPT-3 and PALM! And the best part? You can run it for free on your own computer-
Read more-
New research which was published recently from the University of California has given a paralyzed man the ability to communicate by converting his brain EEG signals into computer-generated writing.
Jan 6, 2023 โข 7 tweets โข 2 min read
Ready to take your Machine Learning skills to the next level?
Check out these 5 top-rated 100% free ML courses from Ivy League Universities:
- MIT 6.S191 Introduction to Deep Learning introtodeeplearning.com
Dec 29, 2022 โข 10 tweets โข 2 min read
8 Biggest mistakes beginner Data Scientist make.
Here is the breakdown- 1. Not spending enough time understanding the problem
Understand the need before everything else - business needs, available data, metrics, and KPIs for success.