Introducing LeMUR, short for Leveraging Large Language Models to Understand Recognized Speech.
It's our new framework for applying LLMs to audio transcriptions and spoken conversations.
A key challenge with applying LLMs to audio files today is that LLMs are limited by their context windows.
Before an audio file can be sent to an LLM, it needs to be converted into text. And the longer the audio file, the longer its transcript ...
May 8, 2023 • 10 tweets • 4 min read
You might have heard of vector databases and how they can be used to give LLMs long-term memory.
Here's a very beginner-friendly introduction to vector databases:
Let’s start with the motivation.
Over 80% of the data out there is unstructured, such as social media posts, images, videos, or audio data.
We cannot easily fit them into a relational database.
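The core idea can be sketched in a few lines: store each item as an embedding vector and rank items by cosine similarity to a query vector. The item names and vectors below are made up for illustration; in a real system, an embedding model produces them:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the vector norms
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "database" of embedding vectors (a real one comes from an embedding model)
database = {
    "cat photo": np.array([0.9, 0.1, 0.0]),
    "dog photo": np.array([0.7, 0.5, 0.1]),
    "tax report": np.array([0.0, 0.2, 0.95]),
}

query = np.array([0.85, 0.2, 0.05])  # embedding of the search query

# Rank stored items by similarity to the query (most similar first)
ranked = sorted(database.items(),
                key=lambda kv: cosine_similarity(query, kv[1]),
                reverse=True)
print(ranked[0][0])  # the closest match: "cat photo"
```

A vector database does exactly this ranking, but with approximate nearest-neighbor indexes so it scales to millions of vectors.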
Built on top of the Conformer architecture and trained on 650K hours of audio data, it achieves near-human-level performance, making up to 43% fewer errors on noisy data than other ASR models.
Conformer architecture
We use a modified version of the Conformer neural net published by Google Brain.
It's built on top of the Efficient Conformer (Orange Labs, 2021), which introduces the following technical modifications:
Feb 27, 2023 • 8 tweets • 3 min read
How to make your ML code 86 times faster💨:
By using JAX! This simple example speeds up a function from 331 ms to 3.81 ms!
Learn about JAX here.🧵

What is JAX?
JAX is Autograd and XLA, brought together for high-performance numerical computing and machine learning research.
In simple words: JAX is NumPy on the CPU, GPU, and TPU, with great automatic differentiation for high-performance machine learning research.
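The speedup comes from `jax.jit`, which traces a function once and compiles it with XLA so later calls run as one fused kernel. A minimal sketch of the pattern; the SELU activation here is an illustrative function, not the thread's exact benchmark, and actual timings depend on hardware:

```python
import jax
import jax.numpy as jnp

def selu(x, alpha=1.67, lam=1.05):
    # SELU activation: an element-wise function that benefits from fusion
    return lam * jnp.where(x > 0, x, alpha * (jnp.exp(x) - 1))

# Compile with XLA; the first call traces and compiles, later calls reuse the kernel
selu_jit = jax.jit(selu)

x = jnp.arange(6.0) - 3.0   # [-3, -2, -1, 0, 1, 2]
y = selu_jit(x)
```

Timing the compiled version against the plain one (e.g. with `%timeit` in a notebook) is how speedups like 331 ms → 3.81 ms are measured.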
Feb 17, 2023 • 5 tweets • 3 min read
Reinforcement Learning is taking over the AI world.
Here are 4 RL courses from Stanford, UC Berkeley, DeepMind, and Hugging Face you can take for free to keep up with the trend:
1. Stanford CS234: Reinforcement Learning | Winter 2019
How to create an AI app with a free GPU using Flask, ngrok, and Google Colab.
In this example, we build our own Stable Diffusion app.
Let's look at it step-by-step (The link to the code is at the end).
Prerequisites:
Get a free ngrok token at ngrok.com. We need it to expose the app running in Google Colab to the public and obtain a public URL.
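A minimal sketch of the app skeleton: a Flask route standing in for the Stable Diffusion call, with the ngrok tunnel shown as comments. The `pyngrok` calls, route name, and port are assumptions for illustration, not the thread's exact code:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/generate", methods=["POST"])
def generate():
    # Placeholder: in the real app, a Stable Diffusion pipeline runs here
    prompt = request.json.get("prompt", "")
    return jsonify({"prompt": prompt, "image": "<generated image data would go here>"})

# In Colab, open the tunnel and start the server (shown as comments so the
# sketch stays runnable without a network connection):
#   from pyngrok import ngrok                 # assumption: the thread may use flask-ngrok instead
#   ngrok.set_auth_token("YOUR_NGROK_TOKEN")  # token from ngrok.com
#   public_url = ngrok.connect(5000)
#   app.run(port=5000)
```

Once the tunnel is open, the printed public URL serves the app to anyone, while the heavy model runs on Colab's free GPU.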
Feb 8, 2023 • 4 tweets • 2 min read
Are you tired of googling how to work with pandas DataFrames?
Meet the open-source package "sketch". It's an AI code-writing assistant that understands data content.
It helps you analyze your data and write code:
Let's see how to use it:
Install it with pip, then load a DataFrame:
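A short sketch of the workflow. The toy DataFrame is made up; the `.sketch.ask` / `.sketch.howto` accessor follows the package's documented usage, and those calls query a remote model, so they are shown as comments here:

```python
# pip install sketch pandas
import pandas as pd

df = pd.DataFrame({
    "product": ["apple", "banana", "apple", "cherry"],
    "units_sold": [10, 5, 7, 3],
})

# After `import sketch`, the package registers a .sketch accessor on DataFrames:
#   import sketch
#   df.sketch.ask("Which product sold the most units in total?")
#   df.sketch.howto("Plot total units sold per product as a bar chart")

# The kind of pandas code sketch would help you write:
totals = df.groupby("product")["units_sold"].sum()
```

`.ask` answers questions about the data in plain English; `.howto` returns code suggestions you can paste back into your notebook.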
Jan 13, 2023 • 7 tweets • 2 min read
6 Stanford Machine Learning, Deep Learning, and NLP courses you can watch for free to start your AI journey🧵
1. Stanford CS229: Machine Learning Course
As part of our end-of-the-year countdown collaboration with many amazing creators, we asked them to recommend must-read books for 2023.
Here is what they recommended:

AI-related books:
- @DanKornas and @sophiamyang: Designing Machine Learning Systems by Chip Huyen
- @t_redactyl from the @jetbrains team: Introduction to Linear Algebra by Gilbert Strang
- @samzee_codes: Applied Artificial Intelligence by Mariya Yao et al.
Jan 6, 2023 • 5 tweets • 1 min read
Reinforcement Learning is taking over the AI world!🧠
It's one of the most flexible tools for future AI systems.
A mini thread🧵
Until recently, RL was often viewed as unsuitable for real-world problems and only useful for playing Atari games. This perception was due in part to the fact that RL methods require enormous amounts of data to train, as well as extensive fine-tuning to perform well.
Dec 19, 2022 • 7 tweets • 4 min read
Do you want to get started with Machine Learning in 2023?
Here's a 5 Step Study Guide for you with free resources🧵🚀
1. Math
Generative Models are a subset of AI models that can be used to generate data similar to a set of training data. For example, if a well-performing generative model is trained on a dataset of human faces, it can generate new, realistic faces that do not appear in the training data.
Nov 8, 2022 • 20 tweets • 3 min read
Here are 10 common AI terms explained in an easily understandable way.
1. Classification
2. Regression
3. Underfitting
4. Overfitting
5. Cost function
6. Loss function
7. Validation data
8. Neural Network
9. Parameter
10. Hyperparameter
AI Thread🧵👇
1. Classification
A Machine Learning task that seeks to classify data points into groups (called targets or class labels) that are pre-determined by the training data. For example, if we have a medical dataset consisting of biological measurements...
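The definition can be made concrete with a tiny from-scratch classifier. The dataset, labels, and the nearest-centroid method below are all illustrative (not from the thread):

```python
# A minimal classification example: a nearest-centroid classifier on a toy
# two-feature "medical" dataset with made-up numbers.

def centroid(points):
    # Mean of a list of 2D points
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def classify(x, centroids):
    # Assign x to the class label whose centroid is closest (squared Euclidean distance)
    return min(centroids,
               key=lambda label: (x[0] - centroids[label][0]) ** 2
                               + (x[1] - centroids[label][1]) ** 2)

# Training data: (measurement1, measurement2), pre-labeled with class labels
train = {
    "healthy": [(1.0, 2.0), (1.2, 1.8), (0.9, 2.2)],
    "sick":    [(3.0, 4.0), (3.2, 3.8), (2.9, 4.1)],
}

centroids = {label: centroid(points) for label, points in train.items()}
print(classify((1.1, 2.0), centroids))  # a new patient near the "healthy" cluster
```

The key property of classification shows up here: the set of possible outputs ("healthy", "sick") is fixed by the training labels, and the model only ever picks one of them.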
Oct 28, 2022 • 18 tweets • 8 min read
The field of AI is moving incredibly fast.
Here’s a curated list of 15 AI newsletters to stay ahead of the field.🧵
1. The Batch | @DeepLearningAI_
Stay updated with weekly AI News and Insights delivered to your inbox.