Jim Fan
Apr 3 · 12 tweets · 8 min read
Just got access to Adobe Firefly! How does the world's leading creative tool maker fare against MidJourney, a self-funded 11-person team?

Let's check it out. Left is Firefly and right is MidJourney V5. The prompt is in the "ALT" button at the lower-left corner of each image.

Deadpool posing on a car. 1/🧵
Alt text (truncated): Deadpool wide angle pose on top of a car outside an apartmen…
MidJourney V5 image credited to @LinusEkenstam
Super Mario in a dim lit street with a big reflection in a puddle. Firefly's interpretation of "Super Mario" is ... exotic (?) 😅

Prompt and image credits to @LinusEkenstam @vitomotiv.

2/
Alt text (truncated): A photograph capturing Super Mario in a pose in a dim lit st…
Same prompt as above but for Pikachu. Again, somehow Firefly does not fully get these famous characters. Maybe a training data copyright issue?

Prompt and MJ image credits to @LinusEkenstam @vitomotiv.

3/
Alt text (truncated): A photograph capturing Pikachu in a dim lit street and a big…
Next, who is the better portrait photographer?

Photo of a large crowd of commuters in Tokyo, sharply focused faces, but it's the woman in red that commands your attention. Warm glow, elegance.

Prompt & MJ image credit: @nickfloats

4/
Alt text (truncated): Modern street style photo from above shot on Fujifilm captur…
How about some sci-fi?

Abstract fractal circular mosaic city architecture.

Prompt & MJ image credit: @chetbff @BambuuArt

5/
Alt text (truncated): Abstract Fractal circular mosaic city architecture made of m…
Now let's do some mobile app icon design. Does Firefly even know what an app icon is?

iOS app icon, Sci-fi planet landscape with skeuomorphic style.

Prompt & MJ image credit: @followmarcos

6/
Alt text (truncated): App Icon Design: iOS, Sci-fi planet landscape with skeuomorp…
The "human finger" test is becoming the new visual Turing Test. It's the final moat that Diffusion needs to conquer to become truly sentient 🤣.

A stunning young Jamaican woman wearing white retrofuturistic sequin Gucci gown, standing in the desert.

Credit: @nickfloats

7/
Alt text (truncated): editorial style photo, medium-full shot, afga vista film sti…
Finally, a landscape photo. It turns out to be an easy task that both Firefly and MJ excel at.

Red Ferrari F40 in Dandelions at the Lake Seealpsee.

Prompt & MJ image credit: @heyBarsee

8/
Alt text (truncated): Red Ferrari F40 in Dandelions at the Lake Seealpsee, shot wi…
Note: these prompts are heavily optimized for MidJourney, which may give it an unfair advantage. That said, I tried a few variations and still couldn't get better results from Firefly. I'm not a prompt ninja, so your mileage may vary.

Still, I'm grateful for Adobe's early beta access! /🧵
Note 2: Firefly is trained only on Adobe Stock and fully licensed images. The data curation is very conservative, which may cripple its performance.

I also included examples without copyrighted characters in the thread.
Note 3: Adobe research scientist @vdeschaintre has a good point: Firefly's conservative data curation may be a significant plus for companies that must ensure the output image is clear of IP and copyright issues. They may be more than willing to sacrifice quality for legality, which makes MJ a less appealing option.
Thanks for all your feedback. I wrote a summary note to give Firefly's approach fair and proper credit.
