hardmaru
Building Collective Intelligence @SakanaAILabs 🧠
May 24, 2023 5 tweets 4 min read
Interview with Jürgen Schmidhuber, ‘Father Of Modern AI’, who says his life’s work won’t lead to dystopia.

He goes quite in depth into his views about the future of AI and AGI. A refreshing view of things, compared to other ‘influential leaders’ in AI.

reddit.com/r/MachineLearn…

Schmidhuber’s take on whether it makes sense to ban large language models like GPT in education, and on the future of human labor.

“Laziness and efficiency is a hallmark of intelligence. Any intelligent being wants to minimize its efforts to achieve things.”

reddit.com/r/MachineLearn…
Jan 6, 2023 5 tweets 3 min read
A #StableDiffusion model trained on images of Japanese Kanji characters came up with “Fake Kanji” for novel concepts like Skyscraper, Pikachu, Elon Musk, Deep Learning, YouTube, Gundam, Singularity, etc.

They kind of make sense. Not bad! This is similar to the “Fake Kanji” recurrent neural network experiments I did many years ago, when computers were 1000x less powerful :) Kind of fun to see updated results with modern diffusion models.

blog.otoro.net/2015/12/28/rec…

These were what I generated back in 2015:
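For anyone who wants to play with the same idea, here is a minimal sketch (mine, not code from either experiment) of sampling “Fake Kanji” with the Hugging Face diffusers library; the fine-tuned checkpoint name and prompt format below are hypothetical placeholders:

```python
# Sketch only: sample "Fake Kanji" for novel concepts from a Stable Diffusion
# checkpoint fine-tuned on Kanji images. The model id is a hypothetical placeholder.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "your-username/kanji-stable-diffusion",  # hypothetical fine-tuned checkpoint
    torch_dtype=torch.float16,
).to("cuda")

for concept in ["Skyscraper", "Pikachu", "Deep Learning", "Singularity"]:
    # Prompt format is an assumption; a real fine-tune would define its own.
    image = pipe(f"a Kanji character meaning {concept}").images[0]
    image.save(f"fake_kanji_{concept.replace(' ', '_')}.png")
```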
Nov 24, 2022 9 tweets 6 min read
Excited to announce the release of Stable Diffusion 2.0!

Many new features in v2:

• Base 512x512 and 768x768 models trained from scratch with new OpenCLIP text encoder
• X4 upscaling text-guided diffusion model
• New “Depth2Image” functionality

Blog: stability.ai/blog/stable-di…

This release is led by @robrombach @StabilityAI

The new SD2 base model is trained from scratch using the OpenCLIP-ViT/H text encoder (github.com/mlfoundations/…), with quality improvements over V1. It is fine-tuned using v-prediction (arxiv.org/abs/2202.00512) to produce 768x768 images.
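For reference, here is a minimal sketch (my own, not from the announcement) of running the 768x768 v-prediction model with the Hugging Face diffusers library; the model id and scheduler choice are assumptions based on common usage:

```python
# Minimal sketch: text-to-image with the SD2 768x768 v-prediction model via diffusers.
# The model id and scheduler below are assumptions, not taken from the announcement.
import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

model_id = "stabilityai/stable-diffusion-2"
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

image = pipe(
    "a photograph of an astronaut riding a horse",
    height=768,
    width=768,  # the v-prediction fine-tune targets 768x768 output
).images[0]
image.save("astronaut_768.png")
```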
Jul 19, 2022 4 tweets 5 min read
Tried some interesting prompts to test OpenAI’s new reduced-bias #dalle2 #dalle model that will generate images of people that more accurately reflect the diversity of the world’s population.

“Professional DSLR color photograph of British soldiers during the American Revolution”

Here are another four samples of the same prompt.

Reducing Bias and Improving Safety in DALL·E 2 blog post:

openai.com/blog/reducing-…
Jun 29, 2022 5 tweets 3 min read
The most interesting and viral images you see produced by text-to-image models are not merely the result of the deep learning models themselves, but rather the result of a complex feedback loop between a human neural net🧠 and an artificial neural net🤖.

🧵Thread👇

You can clearly see this because the prompts for images that end up going viral for one model often don’t “work” for another model.

The best images are chosen through evolutionary selection at the community level, and each image is the result of human/model iterative feedback.
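Purely as an illustration of that loop (not anyone’s actual workflow), here is a toy sketch where `generate_images` and `pick_best` are hypothetical stand-ins for the model and the human curating and refining the prompt:

```python
import random

# Toy illustration of the human/model feedback loop described above.
# generate_images() and pick_best() are hypothetical stand-ins for the
# text-to-image model and the human doing the curation.

def generate_images(prompt, n=8):
    # Stand-in for a text-to-image model: returns (description, quality) pairs.
    return [(f"{prompt} [sample {i}]", random.random()) for i in range(n)]

def pick_best(candidates):
    # Stand-in for the human judgement step: keep the sample they like most.
    return max(candidates, key=lambda pair: pair[1])

prompt = "a cat astronaut"
best = None
for _ in range(5):                        # several rounds of iteration
    candidates = generate_images(prompt)  # model step
    best = pick_best(candidates)          # human selection step
    prompt += ", more detailed"           # prompt refinement based on what worked
print(best)
```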
Jun 28, 2022 13 tweets 11 min read
Good Morning!

I tried to use text-to-image models to combine historical architecture with other locations around the world.

Here is “The Great Wall of San Francisco” by #Imagen

🧵Thread👇🏽

“The Great Wall of Stanford” generated using #Imagen

Accurate:
Jun 14, 2022 8 tweets 5 min read
Good Morning!

“Professional photograph of bears in sports gear in a triathlon in Kyoto” made using both #Imagen and #Dalle

Triathlon Bears in Kyoto
#Imagen #Dalle
Jun 13, 2022 7 tweets 5 min read
Tried to use #Imagen to generate collectable Japanese postage stamps about VR cats. I love these results!

“Ukiyo-e painting of a cat hacker wearing VR headsets, on a postage stamp” ❤️

These metaverse cats come in all shapes and sizes.

#Imagen
Jun 1, 2022 31 tweets 23 min read
The viral “Finish the cat drawing” meme tweet has replies with all sorts of nice, creative ‘out of the box’ thinking.

I used #Dalle’s inpainting function to do this task, and was impressed at what it can do. Here is the output using the prompt “cats”

🧵An entire thread of results 🐈

Oh boy, this is going to be a fun thread.

Let’s start with “James Bond”

#Dalle
Jun 1, 2022 6 tweets 4 min read
Good Morning!

Here’s an “Oriental painting of a dragon programming on a laptop in the Song dynasty” produced by #Imagen

More dragon coders from #Imagen🐲
May 31, 2022 4 tweets 1 min read
The UK government has said that from Monday, people who have graduated in the last 5 years from one of the eligible universities listed on its website will be able to apply for the UK's "high potential individual" visa.

Universities: gov.uk/high-potential…
cnbc.com/2022/05/30/uk-…

Countries, not just businesses, also have to do what they can to bid for talent.
May 30, 2022 4 tweets 1 min read
In a decade, most of the creative content we see will be at least partially created using tools that incorporate machine learning models, simply due to the efficiency with which content can be created, whether we like it or not.

This is similar to how most illustrators, designers, and artists, professional or amateur, now use software tools for most of their creations, and how most photos are taken with smartphone digital cameras, creating the abundance of content that we have now.
May 29, 2022 4 tweets 2 min read
“The newest GPT-3 version (May 2022) actually did the worst at this task—they kept presenting me with real donuts that they’d seen during their training, and not even particularly weird donuts… The original early-2020 GPT-3 models were more willing to deliver the weirdness.”

Interesting observation.

There’s definitely a tradeoff (and also some “Efficient Pareto frontier”) between the realism/accuracy axis and the creative/weirdness axis. A bit similar to what I discussed in this thread:
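To make the Pareto frontier idea concrete, here is a tiny illustrative sketch (the samples and scores below are made up, not from the linked article) that keeps only the outputs not dominated on both the realism axis and the weirdness axis:

```python
# Illustrative only: find the Pareto frontier over (realism, weirdness) scores.
# The sample names and scores are made up for the sake of the example.
samples = {
    "plain glazed donut":  (0.95, 0.10),
    "donut-shaped teapot": (0.70, 0.60),
    "donut made of fog":   (0.40, 0.90),
    "blurry donut blob":   (0.30, 0.30),
}

def pareto_frontier(scored):
    frontier = []
    for name, (realism, weird) in scored.items():
        dominated = any(
            r >= realism and w >= weird and (r, w) != (realism, weird)
            for r, w in scored.values()
        )
        if not dominated:
            frontier.append(name)
    return frontier

# The blurry blob is dominated by the teapot; the other three sit on the frontier.
print(pareto_frontier(samples))
```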
Apr 28, 2022 13 tweets 11 min read
Lego Arnold Schwarzenegger #dalle

Lego Bill Gates #dalle
Apr 27, 2022 4 tweets 4 min read
“Darth Vader on the cover of Vogue magazine” #dalle

“Darth Vader on the cover of a fashion magazine” #dalle
Apr 24, 2022 4 tweets 4 min read
“Totoro writing Kanji” #dalle

“A cat writing Kanji” #dalle
Apr 24, 2022 9 tweets 3 min read
I helped build ByteDance's vast censorship machine, by @shenlulushen

“When I was at ByteDance, we received multiple requests to develop an algorithm that could automatically detect when a Douyin user spoke Uyghur, and then cut off the livestream session.”
protocol.com/china/i-built-…

“The truth is, political speech comprised a tiny fraction of deleted content. Chinese netizens are fluent in self-censorship and know what not to say. ByteDance's platforms — Douyin, Toutiao, Xigua and Huoshan — are mostly entertainment apps.”
Apr 23, 2022 5 tweets 4 min read
“Porsche made from toilet paper” #dalle

“F-16 made from toilet paper” #dalle
Jan 28, 2021 6 tweets 3 min read
Great debate with @chamath on CNBC about the deeper structural issues behind $GME and $AMC

Love the comments on YouTube:
“lol this CNBC dude is so concerned about my $200 invested…Fuck man…I never realized how much some people cared about me losing it.”
Apparently, CNBC is trying very hard to remove this full interview and copies of it from YouTube. I wonder why... drive.google.com/file/d/16IV7TI…
Dec 14, 2020 8 tweets 4 min read
I respect Pedro's work, and I also really enjoyed his book. But man, I don't know where I should start with this one…

Maybe I can start with “Facial feature discovery for ethnicity recognition” (published by @WileyInResearch in 2018):
Apr 17, 2020 5 tweets 3 min read
Why is it that we can recognize objects from line drawings, even though they don't exist in the real world?

A fascinating paper by Aaron Hertzmann hypothesizes that we perceive line drawings as if they were approximate, realistic renderings of 3D scenes.

arxiv.org/abs/2002.06260

The coolest result in this paper is when they took a single-image depth estimation model trained on natural images (arxiv.org/abs/1907.01341) and showed that the pretrained model also works on certain types of line drawings, such as drawings of streets and indoor scenes.
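Here is a minimal sketch (an assumed setup, not the paper’s exact code) of running a pretrained MiDaS-style monocular depth model on a line drawing via torch.hub:

```python
# Minimal sketch: apply a pretrained monocular depth model to a line drawing.
# Uses the MiDaS models published on torch.hub; the input filename is a placeholder.
import cv2
import torch

midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.small_transform

img = cv2.cvtColor(cv2.imread("line_drawing.png"), cv2.COLOR_BGR2RGB)
with torch.no_grad():
    prediction = midas(transform(img))  # predicted inverse depth
    depth = torch.nn.functional.interpolate(
        prediction.unsqueeze(1),
        size=img.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze()
print(depth.shape)  # depth map at the resolution of the input drawing
```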