Discover and read the best of Twitter Threads about #instructblip

Most recent (2)

Introducing 🔥InstructBLIP🔥 - our new Multimodal Foundation Models with instruction tuning on BLIP2, achieving new SOTA results on various VL benchmarks and enjoying various advantages over GPT-4.

Paper: arxiv.org/abs/2305.06500
Code: github.com/salesforce/LAV…
(1/n)
InstructBLIP unlocks a range of diverse multimodal capabilities for building next-generation AI agents, including complex visual scene understanding and reasoning, knowledge-grounded image description, multi-turn visual conversation, etc.

(2/n)
Built on the success of #BLIP2, InstructBLIP proposes a general instruction-tuning framework, where the Q-Former extracts instruction-aware visual features from the output embeddings of a frozen image encoder and feeds those visual features as soft-prompt input to the frozen LLM.
(3/n)
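
As a rough illustration of the data flow described in (3/n), here is a toy sketch that uses plain Python lists in place of tensors. Everything in it is an assumption made for illustration: the function names (`attend`, `q_former`, `build_llm_input`), the tiny dimensions, the random weights, and the single attention step standing in for a stack of transformer layers. It is not the LAVIS implementation.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention: each query row becomes a mix of value rows."""
    scale = math.sqrt(len(keys[0]))
    keys_t = [list(col) for col in zip(*keys)]
    scores = matmul(queries, keys_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, values)


def add(a, b):
    """Element-wise sum of two equally shaped matrices (residual connection)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def q_former(query_tokens, instruction_emb, image_feats):
    """Hypothetical one-step Q-Former: returns one feature row per query token."""
    # Self-attention over [queries; instruction] lets the queries see the instruction.
    joint = query_tokens + instruction_emb
    conditioned = add(query_tokens, attend(query_tokens, joint, joint))
    # Cross-attention: the instruction-conditioned queries read the frozen image features.
    return add(conditioned, attend(conditioned, image_feats, image_feats))


def build_llm_input(visual_feats, projection, instruction_emb_llm):
    """Project visual features to the LLM width and prepend them as a soft prompt."""
    soft_prompt = matmul(visual_feats, projection)
    return soft_prompt + instruction_emb_llm


def rand_matrix(rng, rows, cols):
    """Random stand-in for learned or precomputed weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
image_feats = rand_matrix(rng, 5, 4)    # stand-in for frozen image encoder output (5 patches)
query_tokens = rand_matrix(rng, 2, 4)   # stand-in for learned query embeddings
projection = rand_matrix(rng, 4, 6)     # maps Q-Former width (4) to LLM width (6)
instr_a_q = rand_matrix(rng, 3, 4)      # instruction A, embedded for the Q-Former
instr_b_q = rand_matrix(rng, 3, 4)      # instruction B, embedded for the Q-Former
instr_a_llm = rand_matrix(rng, 3, 6)    # instruction A, embedded for the LLM

feats_a = q_former(query_tokens, instr_a_q, image_feats)
feats_b = q_former(query_tokens, instr_b_q, image_feats)
llm_input = build_llm_input(feats_a, projection, instr_a_llm)
```

Running the same image features through `q_former` with two different instructions gives different visual features, which is the sense in which the extraction is instruction-aware. Nothing in the sketch updates `image_feats` or anything on the LLM side, mirroring the frozen components named in the tweet.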
A new member in the BLIP family: 🔥InstructBLIP🔥, a vision-language instruction tuning framework. InstructBLIP achieves SoTA zero-shot performance with various advantages over other multimodal models such as GPT-4!
Github: github.com/salesforce/LAV…
Paper: arxiv.org/abs/2305.06500
Our paper conducts a systematic study on vision-language instruction tuning. InstructBLIP substantially outperforms both BLIP-2 and the largest Flamingo on zero-shot evaluation. It also has SOTA finetuning performance when used as the model initialization on downstream tasks.
In addition, we introduce instruction-aware visual feature extraction, a new method that enables the model to extract informative features tailored to the given instruction, leading to enhanced generalization performance.