Steven Hoi
Founder & CEO @HyperGAI. Ex-MD of Salesforce Research Asia, Ex-VP @Salesforce AI; Researcher in #AI; Professor at SMU & Ex-AP at NTU; IEEE Fellow.
Jun 2, 2023
📢Introducing 🔥#CodeTF🔥, a one-stop Transformer library for Code Large Language Models (CodeLLM), with a unified interface for training & inference on code tasks (code generation, summarization, translation, etc.)

Paper: arxiv.org/abs/2306.00029
Code: github.com/salesforce/Cod…

(1/n) The CodeTF library supports both the development and deployment of Code LLMs for code intelligence tasks. It covers training and serving of Code LLM models, provides code utilities for processing code data, and includes popular research benchmarks for evaluating model performance.
(2/n)
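
To make the kind of workflow CodeTF targets concrete, here is a minimal, hedged sketch of one task it covers (code summarization) using the Hugging Face transformers API with a CodeT5 checkpoint. It does not use CodeTF's own interface; the checkpoint name, input snippet, and generation length are illustrative assumptions.

```python
# Hedged sketch: code summarization with a CodeT5 checkpoint via Hugging Face
# transformers. This illustrates one task CodeTF wraps; it is NOT the CodeTF
# API, and the checkpoint name below is an assumption.
from transformers import AutoTokenizer, T5ForConditionalGeneration

checkpoint = "Salesforce/codet5-base-multi-sum"  # assumed summarization checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)

code = "def add(a, b):\n    return a + b"
inputs = tokenizer(code, return_tensors="pt")
summary_ids = model.generate(**inputs, max_length=32)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```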
May 25, 2023
Introducing 🔥BLIP-Diffusion🔥, a novel method that equips text-to-image diffusion models with multimodal controllable generation and editing, powered by a BLIP-2 pre-trained, text-aligned subject representation.

Paper: arxiv.org/abs/2305.14720
Project: dxli94.github.io/BLIP-Diffusion…

(1/n) BLIP-Diffusion learns a pre-trained subject representation that unlocks a range of zero-shot and few-step-tuned image generation and editing capabilities, e.g., subject-driven generation, zero-shot subject-driven image manipulation, and controllable subject-driven image editing.

(2/n)
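
As a rough illustration of zero-shot subject-driven generation, here is a hedged sketch that assumes the BlipDiffusionPipeline shipped in recent Hugging Face diffusers releases and the Salesforce/blipdiffusion weights; the argument ordering follows my recollection of that pipeline, and the reference-image path and prompt are placeholder assumptions to be checked against the project page.

```python
# Hedged sketch: zero-shot subject-driven generation with BLIP-Diffusion,
# assuming the BlipDiffusionPipeline in recent Hugging Face diffusers releases.
# The reference image path is a local placeholder, not a real asset.
import torch
from PIL import Image
from diffusers import BlipDiffusionPipeline

pipe = BlipDiffusionPipeline.from_pretrained(
    "Salesforce/blipdiffusion", torch_dtype=torch.float16
).to("cuda")

reference_image = Image.open("my_dog.jpg").convert("RGB")  # placeholder subject image

images = pipe(
    "swimming underwater",   # text prompt describing the target scene
    reference_image,         # image of the subject to preserve
    "dog",                   # source subject category
    "dog",                   # target subject category
    guidance_scale=7.5,
    num_inference_steps=25,
    height=512,
    width=512,
).images
images[0].save("dog_swimming.png")
```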
May 16, 2023
Introducing 🔥CodeT5+🔥, a new family of open-source code LLMs for both code understanding and generation, achieving new SoTA code generation performance on HumanEval and surpassing all other open-source code LLMs.

Paper: arxiv.org/pdf/2305.07922…
Code: github.com/salesforce/Cod…

(1/n) CodeT5+ proposes a flexible encoder-decoder architecture trained with a mixture of varied pretraining objectives, so the model can operate in different modes (i.e., encoder-only, decoder-only, or encoder-decoder) for a wide range of code understanding and generation tasks.

(2/n)
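
For a concrete feel of the generation path, here is a minimal sketch that runs a small CodeT5+ checkpoint for code completion through Hugging Face transformers; the checkpoint name, prompt, and generation length are illustrative assumptions rather than a prescribed recipe.

```python
# Hedged sketch: code completion with a small CodeT5+ checkpoint via Hugging
# Face transformers. Checkpoint name, prompt, and max_length are assumptions.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

checkpoint = "Salesforce/codet5p-220m"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint).to(device)

inputs = tokenizer("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_length=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```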
May 12, 2023
Introducing 🔥InstructBLIP🔥 - our new multimodal foundation models built by instruction tuning BLIP-2, achieving new SOTA results on various vision-language benchmarks and offering several advantages over GPT-4.

Paper: arxiv.org/abs/2305.06500
Code: github.com/salesforce/LAV…
(1/n) InstructBLIP unlocks a range of diverse multimodal capabilities for building next-generation AI agents, including complex visual scene understanding and reasoning, knowledge-grounded image description, multi-turn visual conversation, etc.

(2/n)
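
To show what instruction-following multimodal querying looks like in practice, here is a hedged sketch using the InstructBLIP classes in Hugging Face transformers; the checkpoint name, image path, and prompt are illustrative assumptions.

```python
# Hedged sketch: instruction-following image description with InstructBLIP via
# Hugging Face transformers. The checkpoint, image path, and prompt are
# illustrative assumptions.
import torch
from PIL import Image
from transformers import InstructBlipProcessor, InstructBlipForConditionalGeneration

checkpoint = "Salesforce/instructblip-vicuna-7b"
device = "cuda" if torch.cuda.is_available() else "cpu"

processor = InstructBlipProcessor.from_pretrained(checkpoint)
model = InstructBlipForConditionalGeneration.from_pretrained(checkpoint).to(device)

image = Image.open("scene.jpg").convert("RGB")  # placeholder local image
prompt = "Describe this image in detail."

inputs = processor(images=image, text=prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(processor.batch_decode(outputs, skip_special_tokens=True)[0].strip())
```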