Latest Twitter Threads by @ItakGol on Thread Reader App

Jun 15, 2023 • 14 tweets • 4 min read

Everyone Everywhere is talking about the new Function Calling capability of the OpenAI API. 😮

But what does it mean, and how do you use it?

🧵A Full Guide (1 / 13) 🧠 1/ To put it simply, you can now describe functions to GPT4/3.5 and have the model intelligently choose to output a JSON object containing arguments to call those functions. This is a new way to more reliably connect GPT's capabilities with external tools, databases and APIs.

Jun 7, 2023 • 12 tweets • 3 min read

Is your LLM-based App protected against Prompt Injection and Hijacking? 🥷🦹‍♂️

I bet it doesn't!

Witness firsthand the potential impact as I demonstrate the Hacking/Exploitation of this Demo App>>>

🧵 1/12

Let's consider a very simple app called FitGPT.

Given some basic information, it will provide the client with some fitness advice. Below you can watch how it works in the happy flow. Nice and simple.

But now let's see what it looks like under the hood>>>

🧵 2/8

Jun 1, 2023 • 18 tweets • 6 min read

How to Fine-Tune a Large Language Model (LLM) on a Single GPU or Google Colab? 😊

LoRA (Low-Rank Adaptation)! 🚀

Recently, LoRA has been a hot topic in the context of fine-tuning LLMs.

But what does LoRA really mean?

Let's dive in 📚

1 / 18

Let's get back first to the fundamentals of fine-tuning a model. 📚

The typical process looks like that-

2 / 18

May 13, 2023 • 11 tweets • 2 min read

1/ Holy Moses 🤯

Is Vector Databases (Pinecone, Chroma...) soon to be DEAD? 🤔

Anthropic just expanded their Claude LLM's context window to 100K tokens. X3 than GPT-4 not-yet-released 32K version. 🚀

Here is my full analysis ⤵️⤵️⤵️

2/

Anthropic expanded Claude’s context window to 100K tokens, around 75,000 words! 💭 Businesses can now submit hundreds of pages for Claude to digest and analyze. Conversations can go on for hours or even days! ⌛

May 5, 2023 • 13 tweets • 4 min read

GitHub Copilot RIP? 🕊🪦

Introducing StarCoder🌟

All you need to Know (+Demo+Extension+Model+Data)⤵️⤵️⤵️

2/ 🙈 Introduction

StarCoder and StarCoderBase are Large Language Models for Code trained on GitHub data. They outperform existing open Code LLMs on programming benchmarks and match or surpass closed models (like CoPilot).

Apr 24, 2023 • 9 tweets • 2 min read

1/🔊 Is this the future of Language Model Models (LLMs) with unlimited tokens?

A new paper - Scaling Transformer to 1 MILLION tokens and beyond with RMT,

This is the essence of it>>>

2/♾ RMT, Recurrent Memory Transformer, is able to retain information across up to 2 million tokens!

What???

Apr 13, 2023 • 7 tweets • 2 min read

1/🚀 Have you ever wondered how to BYPASS the TOKEN LIMITATION in OpenAI GPT requests?

Here are 5 methods to achieve infinite tokens ⏬⏬⏬

2/📦 Stuffing

Stuffing involves passing all related data as context to the language model, making a single call to it.

Pros+ => Easy implementation and access to all data.

Cons- => Limited context length and infeasibility for larger amounts of data.

Apr 12, 2023 • 4 tweets • 2 min read

1/ Open-source ML has done it again! 💨💨💨

Databricks just released a new LLM - Dolly 2.0 🐑👨‍💻

Here's what you need to know:
⤵️⤵️⤵️

2/
- This model is a 12B parameter language model based on EleutherAl Pythia model family.

- It's fine-tuned on 15K high-quality human-generated prompt/response pairs (crowdsourced among Databricks employees) for instruction tuning LLMs.

⤵️⤵️⤵️

Apr 12, 2023 • 7 tweets • 2 min read

Wouldn't you like to train 10B+ ChatGPT-style models on a single GPU and 100B+ on multi-GPUs systems?

Introducing DeepSpeed-Chat🚀🔥

github.com/microsoft/Deep…

⤵️⤵️⤵️

Microsoft just released DeepSpeed Chat, a game-changing end-to-end RLHF pipeline for training ChatGPT-like models! 😍👏

Apr 10, 2023 • 4 tweets • 2 min read

ChatGPT is so last month 😴

🔥 Stanford/Google researchers just dropped some mindblowing new research on generative agents, and it's like they brought Westworld to life. 🤖

Here's what you should know⤵️⤵️⤵️

Using a simulation video game they created, researchers made 25… twitter.com/i/web/status/1…

I will publish in the following days (tinyurl.com/yxf5xpku) a technical report* on this paper. Please follow to get notified. It will be accessible for everyone (not only patrons). You are welcome to follow.

*Architecture, APIs, Retrieval DBs, LangChain, Logic, etc.

Mar 31, 2023 • 5 tweets • 1 min read

Introducing HuggingGPT🔥🚀

HuggingGPT is a collaborative system that consists of an LLM as the controller and numerous expert models as collaborative executors (from HuggingFace Hub).

github.com/microsoft/JARV…

The workflow of HuggingGPT consists of 4 stages
>>>

1/
Task Planning: Using ChatGPT to analyze the requests of users to understand their intention, and disassemble them into possible solvable sub-tasks.

Mar 30, 2023 • 15 tweets • 3 min read

Wondering how to create ChatGPT From GPT-3? 🤓

Reinforcement Learning from Human Feedback (RLHF)!

A complete guide (13 tweets) for RLHF.

Thread>>>

1/
It is unlikely that supervised learning is going to lead us to true artificial intelligence. Before Deep Learning, most reinforcement learning applications were impractical.

Mar 30, 2023 • 5 tweets • 2 min read

The LLM tip of the day #1-

I have discovered a small yet crucial tip for producing superior code with GPT-4, which can help you increase productivity, deliver faster results, and enhance accuracy.

#LLMTipOfTheDay

>>>Thread>>>

When you are asking an LLM, let's say GPT-4, to code something, it is fair to say (although I am simplifying things a bit) that it is converging to the expectancy level of the training data.

What do I mean?

>>>

Mar 28, 2023 • 13 tweets • 4 min read

Curious about how your life will change with ChatGPT's Browsing mode?

Check out this 12-tweet thread for early access insights.

Absolutely mind-blowing>>🤯🤯🤯 Initializing...

1 / 12

Mar 11, 2023 • 5 tweets • 2 min read

Introducing OpenChatKit 🚀 -
The first open-source alternative to ChatGPT!

A team of ex-OpenAI fellows at Together have released a 20B chat-GPT model, fine-tuned for chat using EleutherAI's GPT-NeoX-20B, with over 43 million instructions under the Apache-2.0 license.

>>>

This instruction-tuned large language model has been optimized for chat on 100% carbon-negative compute.

OpenChatKit includes four essential components:

>>>

Mar 2, 2023 • 9 tweets • 2 min read

Birthday Paradox Explained

The birthday paradox is a surprising and counterintuitive phenomenon in probability theory that demonstrates the likelihood of two people in a group sharing the same birthday, even when the group is relatively small.

1 / 9

The paradox is often misunderstood as being about the probability of two people in a group having the same birthday, but it's actually about the probability of any two people in the group having the same birthday.

2 / 9

Feb 23, 2023 • 14 tweets • 3 min read

*** The History Behind ChatGPT ***

OpenAI's ChatGPT is a remarkable NLP model that has gotten a lot of attention, but it is important to note that the technology behind it has a rich history of research and development spanning several decades.

<1 / 14> THREAD

RNNs, first introduced in 1986 by David Rumelhart, form the foundation of it all. RNNs are specialized artificial neural networks designed to work with time-series or sequence data (paper: lnkd.in/d4jeAZnJ).

<2 / 14> THREAD

Jan 7, 2023 • 8 tweets • 2 min read

0/Get the free, open-source LLM that outperforms GPT-3 - download now!

1/Revolutionary new open-source large language model beats GPT-3 and PALM! And the best part? You can run it for free on your own computer-

Jan 7, 2023 • 4 tweets • 2 min read

𝙏𝙧𝙖𝙣𝙨𝙛𝙤𝙧𝙢𝙞𝙣𝙜 𝙀𝙀𝙂 𝘽𝙧𝙖𝙞𝙣 𝙒𝙖𝙫𝙚𝙨 𝙞𝙣𝙩𝙤 𝙎𝙥𝙤𝙠𝙚𝙣 𝙒𝙤𝙧𝙙𝙨 𝙪𝙨𝙞𝙣𝙜 𝘿𝙚𝙚𝙥 𝙇𝙚𝙖𝙧𝙣𝙞𝙣𝙜

Machine Learning can change a life.

#artificialintelligence #deeplearning #machinelearning #datascience #eeg #neuroscience

Read more-

New research which was published recently from the University of California has given a paralyzed man the ability to communicate by converting his brain EEG signals into computer-generated writing.

Jan 6, 2023 • 7 tweets • 2 min read

Ready to take your Machine Learning skills to the next level?

Check out these 5 top-rated 100% free ML courses from Ivy League Universities: - MIT 6.S191 Introduction to Deep Learning
introtodeeplearning.com

Dec 29, 2022 • 10 tweets • 2 min read

8 Biggest mistakes beginner Data Scientist make.

Here is the breakdown-

1. Not spending enough time understanding the problem

Understand the need before everything else - business needs, available data, metrics, and KPIs for success.

Share this page!

Enter URL or ID to Unroll