aifunhouse

Sep 30 • 19 tweets • 6 min read

@aifunhouse

It's Friday and that means it's time for the @aifunhouse Week in Review!

As always, it's been a wild week in #AI!

DreamBooth, Instant NeRF, Make-A-Video, and more ... let's dive in!

🤖🧵👇🏽

1. First up, DreamBooth: a technique from Google Research, originally applied to their Imagen model but generalizable to others, that fine-tunes text-to-image networks so they can generate a consistent character across contexts and styles. dreambooth.github.io

2. The results are wild - take a look:

https://twitter.com/kevinbparry/status/1574335434146078723

3. Here's our thread from earlier this week:

https://twitter.com/aifunhouse/status/1575506574986166273

4. Want to dig into DreamBooth for your own (likely questionable) purposes? GitHub and Colabs yonder:

https://twitter.com/psuraj28/status/1575123562435956740

5. Next - in a rare departure from Large Language Models and text-to-image, NVIDIA's Instant NGP with Instant NeRF dramatically reduces the time required to infer a 3D scene from 2D images. Think of this as uber-photogrammetry. github.com/NVlabs/instant…

6. Great how-to on getting Instant NGP installed, compiled, and running here:
developer.nvidia.com/blog/getting-s…

7. Additional tools in the repo allow for mesh generation, SDF, gigapixel image approximation, volume rendering, camera moves, interactive rendering with multisample DoF, slicing, and rad visualizations of what's happening under the hood in the neural net.

8. Next up, coming in hot from Meta AI is Make-A-Video: a paper and, perhaps, a set of hosted tools (sign up if you're interested - shocker, thanks Zuck 🙄) capable of text-to-video, image tweening, and video variation, with pretty decent results.
makeavideo.studio

9. Subjectively, the quality of Make-A-Video's output is reminiscent of GAN image output from ~2 years ago, which is no small feat. The images are stable, with decent detail and resolution, and plausible lighting and subjects.

10. They do have some undesirable GAN-like artifacts as well, including harsh edges, lack of definition in detailed areas, and a crushed color palette. Lots of room for improvement, but an impressive set of early results in what is sure to be the next frontier for image generation.

11. Bonus sample from Make-A-Video: "A golden retriever eating ice cream on a beautiful tropical beach at sunset, high resolution"

12. Next up - in a similar vein to DreamBooth we have Re-Imagen, a retrieval-augmented text-to-image generator that achieves new SoTA results "even for rare or unseen entities", including on the challenging COCO and WikiImages datasets:

https://twitter.com/_akhaliq/status/1575544270811004928

13. According to the paper, human raters found that Re-Imagen outperforms Stable Diffusion and DALL-E 2 in faithfulness and photorealism, mostly on low-frequency entities.

14. You can think of this as a one-shot learning implementation of the sort of thing DreamBooth and Textual Inversion are capable of, and the results are indeed impressive.

15. That's it for the @aifunhouse Week in Review!

What were your favorite announcements, demos, or papers this week? What did we miss?

16. Did you love this thread? Of course you did! You're no dummy and you have a nice smile!

Follow @aifunhouse for more tips, tutorials, tricks, and roundups from this Cambrian Explosion of AI crazy!

17. RT this thread to let your friends know where the best roundups can be found (it's here BTW).

https://twitter.com/aifunhouse/status/1575925150121431042?s=20&t=VaJqWjW4DAZyQT21j9gzZg
