The blistering pace continues and this week we got a preview of what comes after ChatGPT. From multimodal ImageBind to Google I/O to text-to-3D worlds, here's everything that happened in AI this week. 👇🧵
𝟮/ 𝗧𝗵𝗲 𝗙𝘂𝘁𝘂𝗿𝗲 𝗶𝘀 𝗠𝘂𝗹𝘁𝗶𝗺𝗼𝗱𝗮𝗹 𝗕𝗮𝗯𝘆
Meta open sourced ImageBind, a model that connects six different modalities - images, text, audio, depth, thermal, and IMU (inertial measurement unit) data - in a single shared embedding space.
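To make "connects six modalities" concrete, here's a minimal sketch of embedding text, an image, and an audio clip into that shared space, assuming the Python API from the facebookresearch/ImageBind repo (imagebind_huge plus the load_and_transform_* helpers) and placeholder file paths:

```python
# Minimal sketch: embed text, an image, and an audio clip into ImageBind's joint space.
# Assumes the facebookresearch/ImageBind package is installed; file paths are placeholders.
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

device = "cuda" if torch.cuda.is_available() else "cpu"
model = imagebind_model.imagebind_huge(pretrained=True).eval().to(device)

inputs = {
    ModalityType.TEXT: data.load_and_transform_text(["a dog barking"], device),
    ModalityType.VISION: data.load_and_transform_vision_data(["dog.jpg"], device),
    ModalityType.AUDIO: data.load_and_transform_audio_data(["bark.wav"], device),
}

with torch.no_grad():
    embeddings = model(inputs)  # dict of embeddings keyed by modality

# Every modality lands in the same space, so you can compare them directly.
print(embeddings[ModalityType.VISION] @ embeddings[ModalityType.TEXT].T)
print(embeddings[ModalityType.AUDIO] @ embeddings[ModalityType.TEXT].T)
```

Because every modality lands in the same space, you can score audio against text or images against audio directly - that's the "binding" in the name.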
Quietly, this was one of the biggest weeks in recent AI history
Xi Jinping, the White House, and Warren Buffett all discussed AI, while new research read minds, new tools stunned users, and AI safety conversations went mainstream
2/ 𝗣𝗿𝗼𝗱𝘂𝗰𝘁 𝗟𝗮𝘂𝗻𝗰𝗵𝗲𝘀 & 𝗨𝗽𝗱𝗮𝘁𝗲𝘀:
• Midjourney v5.1 releases - a more “opinionated” version
• @StabilityAI's @DeepFloydIF can actually put readable text in text-to-image creations
• Inflection releases Pi - a more personal AI
3/ 𝗣𝗿𝗼𝗱𝘂𝗰𝘁 𝗟𝗮𝘂𝗻𝗰𝗵𝗲𝘀 & 𝗨𝗽𝗱𝗮𝘁𝗲𝘀 (cont.)
• @iBabyAGI puts AutoGPT in your pocket
• Ashton Kutcher spins up an oversubscribed $243m AI fund in 5 weeks
• Everyone is stoked over ChatGPT Code Interpreter
TL;DR
Google, OpenAI & ANY closed models are going to be outcompeted by open source models
Why? The tinkerers with open-source tools are advancing in ways the big companies didn't think were possible, and at a far faster speed
3/ What Happened?
According to the author, the last one to two months have brought a sea change in the LLM space; in short, individual tinkerers have driven massive innovation after getting access to "their first really capable foundation model - Meta's LLaMa"
If you've ever been frustrated that you couldn't get real words into a generative AI image, a new open source competitor to Midjourney is going to blow you away.
Let me explain...
2/ The images above share the same prompt: "a film camera photo of a 1960s southern california beachside burger restaurant, hazy afternoon, sign that says 'burger'"
Midjourney's image [left] is an unbelievably cool shot...but with gibberish.
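If you want to try the same prompt yourself, here's a minimal sketch of DeepFloyd IF's first stage through Hugging Face diffusers; it assumes the gated DeepFloyd/IF-I-XL-v1.0 checkpoint (license accepted, logged in to the Hub) and skips the stage-II and upscaler passes that produce the full-resolution result.

```python
# Minimal sketch: stage I of DeepFloyd IF via diffusers.
# Stage I outputs a low-res 64x64 base image; the full pipeline adds a
# stage-II super-resolution model and an upscaler.
# Assumes `pip install diffusers transformers accelerate` and an accepted model license.
import torch
from diffusers import DiffusionPipeline

stage_1 = DiffusionPipeline.from_pretrained(
    "DeepFloyd/IF-I-XL-v1.0", variant="fp16", torch_dtype=torch.float16
)
stage_1.enable_model_cpu_offload()

prompt = ('a film camera photo of a 1960s southern california beachside burger '
          'restaurant, hazy afternoon, sign that says "burger"')
prompt_embeds, negative_embeds = stage_1.encode_prompt(prompt)

image = stage_1(
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_embeds,
    output_type="pil",
).images[0]
image.save("burger_stage1.png")
```

IF generates in pixel space with a T5 text encoder, which is generally credited for its ability to actually spell out words like "burger" where latent-space models tend to produce gibberish.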