Itamar Golan ๐Ÿค“ Profile picture
CEO & Co-founder @ Stealth ||| AI Researcher ||| LLM hacker
Vicky Joslyn Profile picture Louis Winthorpe III Profile picture roadzhang Profile picture dragondelis ๐Ÿ‡บ๐Ÿ‡ฆ Profile picture Shen Yang Profile picture 6 subscribed
Jun 15, 2023 โ€ข 14 tweets โ€ข 4 min read
Everyone Everywhere is talking about the new Function Calling capability of the OpenAI API. ๐Ÿ˜ฎ

But what does it mean, and how do you use it?

๐ŸงตA Full Guide (1 / 13) ๐Ÿง  1/ To put it simply, you can now describe functions to GPT4/3.5 and have the model intelligently choose to output a JSON object containing arguments to call those functions. This is a new way to more reliably connect GPT's capabilities with external tools, databases and APIs.
Jun 7, 2023 โ€ข 12 tweets โ€ข 3 min read
Is your LLM-based App protected against Prompt Injection and Hijacking? ๐Ÿฅท๐Ÿฆนโ€โ™‚๏ธ

I bet it doesn't!

Witness firsthand the potential impact as I demonstrate the Hacking/Exploitation of this Demo App>>>

๐Ÿงต 1/12 Image Let's consider a very simple app called FitGPT.

Given some basic information, it will provide the client with some fitness advice. Below you can watch how it works in the happy flow. Nice and simple.

But now let's see what it looks like under the hood>>>

๐Ÿงต 2/8 Image
Jun 1, 2023 โ€ข 18 tweets โ€ข 6 min read
How to Fine-Tune a Large Language Model (LLM) on a Single GPU or Google Colab? ๐Ÿ˜Š

LoRA (Low-Rank Adaptation)! ๐Ÿš€

Recently, LoRA has been a hot topic in the context of fine-tuning LLMs.

But what does LoRA really mean?

Let's dive in ๐Ÿ“š

1 / 18 Image Let's get back first to the fundamentals of fine-tuning a model. ๐Ÿ“š

The typical process looks like that-

2 / 18
May 13, 2023 โ€ข 11 tweets โ€ข 2 min read
1/ Holy Moses ๐Ÿคฏ

Is Vector Databases (Pinecone, Chroma...) soon to be DEAD? ๐Ÿค”

Anthropic just expanded their Claude LLM's context window to 100K tokens. X3 than GPT-4 not-yet-released 32K version. ๐Ÿš€

Here is my full analysis โคต๏ธโคต๏ธโคต๏ธ Image 2/

Anthropic expanded Claudeโ€™s context window to 100K tokens, around 75,000 words! ๐Ÿ’ญ Businesses can now submit hundreds of pages for Claude to digest and analyze. Conversations can go on for hours or even days! โŒ›
May 5, 2023 โ€ข 13 tweets โ€ข 4 min read
GitHub Copilot RIP? ๐Ÿ•Š๐Ÿชฆ

Introducing StarCoder๐ŸŒŸ

All you need to Know (+Demo+Extension+Model+Data)โคต๏ธโคต๏ธโคต๏ธ 2/ ๐Ÿ™ˆ Introduction

StarCoder and StarCoderBase are Large Language Models for Code trained on GitHub data. They outperform existing open Code LLMs on programming benchmarks and match or surpass closed models (like CoPilot).
Apr 24, 2023 โ€ข 9 tweets โ€ข 2 min read
1/๐Ÿ”Š Is this the future of Language Model Models (LLMs) with unlimited tokens?

A new paper - Scaling Transformer to 1 MILLION tokens and beyond with RMT,

This is the essence of it>>> ImageImage 2/โ™พ RMT, Recurrent Memory Transformer, is able to retain information across up to 2 million tokens!

What???
Apr 13, 2023 โ€ข 7 tweets โ€ข 2 min read
1/๐Ÿš€ Have you ever wondered how to BYPASS the TOKEN LIMITATION in OpenAI GPT requests?

Here are 5 methods to achieve infinite tokens โฌโฌโฌ Image 2/๐Ÿ“ฆ Stuffing

Stuffing involves passing all related data as context to the language model, making a single call to it.

Pros+ => Easy implementation and access to all data.

Cons- => Limited context length and infeasibility for larger amounts of data.
Apr 12, 2023 โ€ข 4 tweets โ€ข 2 min read
1/ Open-source ML has done it again! ๐Ÿ’จ๐Ÿ’จ๐Ÿ’จ

Databricks just released a new LLM - Dolly 2.0 ๐Ÿ‘๐Ÿ‘จโ€๐Ÿ’ป

Here's what you need to know:
โคต๏ธโคต๏ธโคต๏ธ Image 2/
- This model is a 12B parameter language model based on EleutherAl Pythia model family.

- It's fine-tuned on 15K high-quality human-generated prompt/response pairs (crowdsourced among Databricks employees) for instruction tuning LLMs.

โคต๏ธโคต๏ธโคต๏ธ
Apr 12, 2023 โ€ข 7 tweets โ€ข 2 min read
Wouldn't you like to train 10B+ ChatGPT-style models on a single GPU and 100B+ on multi-GPUs systems?

Introducing DeepSpeed-Chat๐Ÿš€๐Ÿ”ฅ

github.com/microsoft/Deepโ€ฆ

โคต๏ธโคต๏ธโคต๏ธ Image Microsoft just released DeepSpeed Chat, a game-changing end-to-end RLHF pipeline for training ChatGPT-like models! ๐Ÿ˜๐Ÿ‘
Apr 10, 2023 โ€ข 4 tweets โ€ข 2 min read
ChatGPT is so last month ๐Ÿ˜ด

๐Ÿ”ฅ Stanford/Google researchers just dropped some mindblowing new research on generative agents, and it's like they brought Westworld to life. ๐Ÿค–

Here's what you should knowโคต๏ธโคต๏ธโคต๏ธ

Using a simulation video game they created, researchers made 25โ€ฆ twitter.com/i/web/status/1โ€ฆ Image I will publish in the following days (tinyurl.com/yxf5xpku) a technical report* on this paper. Please follow to get notified. It will be accessible for everyone (not only patrons). You are welcome to follow.

*Architecture, APIs, Retrieval DBs, LangChain, Logic, etc.
Mar 31, 2023 โ€ข 5 tweets โ€ข 1 min read
Introducing HuggingGPT๐Ÿ”ฅ๐Ÿš€

HuggingGPT is a collaborative system that consists of an LLM as the controller and numerous expert models as collaborative executors (from HuggingFace Hub).

github.com/microsoft/JARVโ€ฆ

The workflow of HuggingGPT consists of 4 stages
>>> Image 1/
Task Planning: Using ChatGPT to analyze the requests of users to understand their intention, and disassemble them into possible solvable sub-tasks.
Mar 30, 2023 โ€ข 15 tweets โ€ข 3 min read
Wondering how to create ChatGPT From GPT-3? ๐Ÿค“

Reinforcement Learning from Human Feedback (RLHF)!

A complete guide (13 tweets) for RLHF.

Thread>>> Image 1/
It is unlikely that supervised learning is going to lead us to true artificial intelligence. Before Deep Learning, most reinforcement learning applications were impractical.
Mar 30, 2023 โ€ข 5 tweets โ€ข 2 min read
The LLM tip of the day #1-

I have discovered a small yet crucial tip for producing superior code with GPT-4, which can help you increase productivity, deliver faster results, and enhance accuracy.

#LLMTipOfTheDay

>>>Thread>>> Image When you are asking an LLM, let's say GPT-4, to code something, it is fair to say (although I am simplifying things a bit) that it is converging to the expectancy level of the training data.

What do I mean?

>>>
Mar 28, 2023 โ€ข 13 tweets โ€ข 4 min read
Curious about how your life will change with ChatGPT's Browsing mode?

Check out this 12-tweet thread for early access insights.

Absolutely mind-blowing>>๐Ÿคฏ๐Ÿคฏ๐Ÿคฏ Initializing...

1 / 12
Mar 11, 2023 โ€ข 5 tweets โ€ข 2 min read
Introducing OpenChatKit ๐Ÿš€ -
The first open-source alternative to ChatGPT!

A team of ex-OpenAI fellows at Together have released a 20B chat-GPT model, fine-tuned for chat using EleutherAI's GPT-NeoX-20B, with over 43 million instructions under the Apache-2.0 license.

>>> This instruction-tuned large language model has been optimized for chat on 100% carbon-negative compute.

OpenChatKit includes four essential components:

>>>
Mar 2, 2023 โ€ข 9 tweets โ€ข 2 min read
Birthday Paradox Explained

The birthday paradox is a surprising and counterintuitive phenomenon in probability theory that demonstrates the likelihood of two people in a group sharing the same birthday, even when the group is relatively small.

1 / 9 The paradox is often misunderstood as being about the probability of two people in a group having the same birthday, but it's actually about the probability of any two people in the group having the same birthday.

2 / 9
Feb 23, 2023 โ€ข 14 tweets โ€ข 3 min read
*** The History Behind ChatGPT ***

OpenAI's ChatGPT is a remarkable NLP model that has gotten a lot of attention, but it is important to note that the technology behind it has a rich history of research and development spanning several decades.

<1 / 14> THREAD Image RNNs, first introduced in 1986 by David Rumelhart, form the foundation of it all. RNNs are specialized artificial neural networks designed to work with time-series or sequence data (paper: lnkd.in/d4jeAZnJ).

<2 / 14> THREAD
Jan 7, 2023 โ€ข 8 tweets โ€ข 2 min read
0/Get the free, open-source LLM that outperforms GPT-3 - download now! 1/Revolutionary new open-source large language model beats GPT-3 and PALM! And the best part? You can run it for free on your own computer-
Jan 7, 2023 โ€ข 4 tweets โ€ข 2 min read
๐™๐™ง๐™–๐™ฃ๐™จ๐™›๐™ค๐™ง๐™ข๐™ž๐™ฃ๐™œ ๐™€๐™€๐™‚ ๐˜ฝ๐™ง๐™–๐™ž๐™ฃ ๐™’๐™–๐™ซ๐™š๐™จ ๐™ž๐™ฃ๐™ฉ๐™ค ๐™Ž๐™ฅ๐™ค๐™ ๐™š๐™ฃ ๐™’๐™ค๐™ง๐™™๐™จ ๐™ช๐™จ๐™ž๐™ฃ๐™œ ๐˜ฟ๐™š๐™š๐™ฅ ๐™‡๐™š๐™–๐™ง๐™ฃ๐™ž๐™ฃ๐™œ

Machine Learning can change a life.

#artificialintelligence #deeplearning #machinelearning #datascience #eeg #neuroscience

Read more- New research which was published recently from the University of California has given a paralyzed man the ability to communicate by converting his brain EEG signals into computer-generated writing.
Jan 6, 2023 โ€ข 7 tweets โ€ข 2 min read
Ready to take your Machine Learning skills to the next level?

Check out these 5 top-rated 100% free ML courses from Ivy League Universities: - MIT 6.S191 Introduction to Deep Learning
introtodeeplearning.com
Dec 29, 2022 โ€ข 10 tweets โ€ข 2 min read
8 Biggest mistakes beginner Data Scientist make.

Here is the breakdown- 1. Not spending enough time understanding the problem

Understand the need before everything else - business needs, available data, metrics, and KPIs for success.