🧵 1/ Vector stores & embeddings may be the talk of the town, but there's more to explore! Discover how to fine-tune LLaMA, an open-source language model, to make it sound like Homer Simpson! You can even apply this method to other characters! 🍩📺 Shout out to @bfirsh for this!
@bfirsh 2/ The process starts by using a dataset containing scripts from The Simpsons TV show (Seasons 1-12) obtained from Kaggle. With ~60k lines of dialog and 1.1M tokens, it's time to train LLaMA to reproduce the voice of the characters.
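For illustration, a minimal sketch of loading that dump with pandas. The file and column names ("simpsons_script_lines.csv", "raw_character_text", "spoken_words") are assumptions based on the public Kaggle dataset, not necessarily what the blog post used:

```python
import pandas as pd

# Load the raw script dump (column names assumed from the Kaggle version).
df = pd.read_csv("simpsons_script_lines.csv", low_memory=False)

# Keep only rows that are actual spoken dialogue (drop stage directions etc.).
dialogue = df.dropna(subset=["raw_character_text", "spoken_words"])

print(f"{len(dialogue):,} lines of dialogue")
print(dialogue[["raw_character_text", "spoken_words"]].head())
```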
@bfirsh 3/ To make LLaMA speak like a character, the dataset is parsed into scenes and turned into prompts: each prompt is the previous lines of the scene plus the name of the character who speaks next, and the target is that character's actual line. The model is then prompted to complete the line in context. 🎭
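Roughly, the prompt-building step could look like the sketch below. The exact template and scene-splitting logic in the blog post may differ; build_examples and the field names are purely illustrative:

```python
# Turn one parsed scene into prompt/completion pairs: the prompt is the scene
# so far plus the next speaker's name, the completion is that speaker's line.
def build_examples(scene):
    """scene: ordered list of (character, line) tuples from one scene."""
    examples = []
    for i in range(1, len(scene)):
        character, line = scene[i]
        context = "\n".join(f"{c}: {l}" for c, l in scene[:i])
        examples.append({
            "prompt": f"{context}\n{character}:",
            "completion": f" {line}",
        })
    return examples

scene = [
    ("Marge", "Homer, the kids need money for the field trip."),
    ("Homer", "Mmm... field trip money."),
]
print(build_examples(scene)[0])
```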
@bfirsh 4/ The training process is similar to that of Alpaca, which was covered in their previous blog post. After training, you can generate scripts using cog predict commands. It's a surprisingly quick process thanks to LLaMA being open-source! #Alpaca
@bfirsh 5/ The results? An impressive Homer Simpson bot that occasionally delivers genuinely funny lines. This demonstrates the power of fine-tuning open-source language models to create character-specific voices. 🎬
@bfirsh 6/ Beyond Homer, this experiment hints at what fine-tuned character voices could unlock. Imagine the possibilities for content creation, gaming, and entertainment! 🎮🎬 Can't wait to see what the community builds!
@bfirsh 8/ Want to try it yourself? Check out @bfirsh's blog post for a detailed walkthrough on how to fine-tune LLaMA to speak like Homer Simpson or any other character. D'oh! 🍩 #DIY replicate.com/blog/fine-tune…
1/ A recent controversy at Google has sparked important questions about training ML models on the output of other models. Let's dive into the engineering, business, and legal aspects of this practice. Buckle up, folks! 🧵
2/ Engineering recipes for training algorithms on generated data are still evolving. Instances of using a competitor's model outputs to train your own are surfacing. Are these techniques fair game or should there be limits?
3/ Business-wise, data may not always make your business more defensible. Market leaders might spend resources gathering data, but if their product's data makes it easier for competitors to catch up, is that initial effort a strong defense?
1/ #AutoGPT is trending today, but what's the hype all about? Let's unpack these AI game-changers, explore their potential, and recognize their limitations. Thread #HypeVsReality
2/ 🎯 AutoGPTs are AI agents that can perform tasks autonomously, with little to no human intervention. They can even chain multiple GPT-4s together to work on different tasks simultaneously! However, they can get stuck & may need human help. #AutonomousAI #GPT4
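To give a feel for what "autonomous" means here, a stripped-down agent loop might look like this. The llm and run_tool callables are placeholders, not AutoGPT's actual code, which adds memory, self-critique, and much more:

```python
# Toy AutoGPT-style loop: the model proposes the next command, a tool executes
# it, and the observation is fed back in, until the model declares it is done
# or the step budget runs out (where agents "get stuck" and need a human).
def run_agent(goal, llm, run_tool, max_steps=10):
    history = [f"GOAL: {goal}"]
    for _ in range(max_steps):
        command = llm("\n".join(history) + "\nNEXT COMMAND:")
        if command.strip() == "FINISH":
            break
        observation = run_tool(command)   # e.g. web search, run code, write file
        history.append(f"COMMAND: {command}")
        history.append(f"OBSERVATION: {observation}")
    return history
```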
3/ Two main models dominate the AutoGPT landscape: BabyAGI by @yoheinakajima & AutoGPT by @SigGravitas. They're trending on GitHub and attracting devs worldwide. Don't forget @asimdotshrestha's AgentGPT, which runs in-browser!
1/ Wanna take #AutoGPT to the next level? Say hello to @dataleapHQ, the "Upwork for AI Agents" - a marketplace where you can hire AutoGPTs and other AI agents to get the job done. Buckle up for our vision paper! 🧵 dataleap.substack.com/p/ai-workforce…
@dataleapHQ @hwchase17 @LangChainAI @gpt_index @yoheinakajima @ShunyuYao12 @SigGravitas 3/ The gig economy has traditionally been the domain of human freelancers, but at Dataleap we are looking to create a new era where AI agents and humans coexist, each leveraging their unique strengths. Sometimes they'll complement each other; other times, it's all AI.
🧵 1/ Just attended a 🔥 @LangChainAI webinar on AI agents, ft. some of the brightest minds in the space! Let's unpack the key takeaways & explore the cutting-edge work being done.
@LangChainAI @charles_irl @ShunyuYao12 @mbusigin @yoheinakajima @hwchase17 🧠 2/ Shunyu introduced the core idea of his #ReAct paper, which adds a "thought" between Action & Observation. The open question is, do we need a strict pattern of Thought-Action-Observation, or should we just add thoughts as a special type of action, offering more flexibility?
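A rough sketch of the second option, purely for illustration: the prompt wording and the llm/tools callables are assumptions, not the ReAct paper's actual setup.

```python
# Treat "think" as just another action the model can choose: a thought updates
# the context but triggers no external observation, while ACT calls a tool and
# appends its observation. Contrast with a fixed Thought->Action->Observation cycle.
def react_step(context, llm, tools):
    step = llm(context + "\nNext step (THINK: ... or ACT: tool[input]):").strip()
    if step.startswith("THINK:"):
        return context + "\n" + step
    call = step.removeprefix("ACT:").strip()           # e.g. "search[LLaMA license]"
    tool_name, _, tool_input = call.partition("[")
    observation = tools[tool_name](tool_input.rstrip("]"))
    return context + f"\n{step}\nOBSERVATION: {observation}"
```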
🧵 1/ 🧠 Stanford AI researchers have introduced a groundbreaking concept: Generative Agents, computer programs that simulate authentic human behavior using generative models. These agents display memory, introspection, and planning capabilities. Let's dive in.
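A toy sketch of the memory idea: each observation is stored with a timestamp and an importance score, and retrieval blends recency, importance, and relevance to the current situation. The decay rate, weighting, and query_relevance callable below are illustrative assumptions, not the paper's exact formula:

```python
import time
from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str
    importance: float                        # e.g. 1-10, rated by the LLM itself
    created: float = field(default_factory=time.time)

def retrieve(memories, query_relevance, top_k=3, decay=0.995):
    """query_relevance: callable returning a 0-1 relevance score for a memory."""
    now = time.time()
    def score(m):
        hours_old = (now - m.created) / 3600
        recency = decay ** hours_old          # older memories fade unless important
        return recency + m.importance / 10 + query_relevance(m.text)
    return sorted(memories, key=score, reverse=True)[:top_k]
```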
2/ 🎮 In the study, 25 Generative Agents were placed in a virtual sandbox-like world (think The Sims). They had unique backgrounds and interacted in a 2-day simulation. Examples of emergent behavior: one agent threw a party, another ran for mayor!
3/ Here's the kicker: when actual humans role-played the same 25 agents, an evaluation panel rated their responses as less human-like than those of the chatbot-powered agents. Generative Agents are getting closer and closer to authentic human behavior.
🧵 1/ Yesterday we talked about how important chunking is when using vector databases like @pinecone, @weaviate_io or @trychroma. But what exactly are vector databases in the first place? Let's explore this game-changer!
@pinecone @weaviate_io @trychroma 2/ Machine Learning (ML) techniques can transform complex data into vector embeddings, describing data objects as numeric values across many dimensions. Vector databases index these embeddings so that similar items can be searched and retrieved quickly. 🧠
@pinecone @weaviate_io @trychroma 3/ Vector databases excel at similarity search (vector search), allowing users to find related results without knowing specific keywords or metadata classifications. This provides accurate results while eliminating irrelevant ones that traditional search tech might return.
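To make that concrete, here's a bare-bones sketch of embed → index → query. Brute-force cosine similarity with numpy stands in for the approximate-nearest-neighbour index a real vector database uses, and the sentence-transformers model is just one common open-source choice:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "How do I reset my password?",
    "Shipping usually takes 3-5 business days.",
    "Our office is closed on public holidays.",
]
# "Index" step: embed every object as a normalized vector.
doc_vecs = model.encode(docs, normalize_embeddings=True)

# Query step: embed the query and rank documents by cosine similarity,
# no keyword overlap required ("login" never appears in the matching doc).
query_vec = model.encode(["I forgot my login"], normalize_embeddings=True)
scores = doc_vecs @ query_vec.T
print(docs[int(np.argmax(scores))])
```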