Adept
Useful general intelligence.
Oct 18, 2023 5 tweets 2 min read
We’re open-sourcing a multimodal model: Fuyu-8B! Building useful AI agents requires fast foundation models that can see the visual world.

Fuyu-8B performs well on standard image understanding benchmarks, but it can also do a bunch of new stuff (below).

adept.ai/blog/fuyu-8b

We think MM models are especially useful for handling unstructured knowledge worker data, so we’ve given Fuyu-8B capabilities in:

- Understanding diagrams, charts, and graphs
- Doing OCR on screens
- Outputting bounding boxes for the locations of objects on screens
- Answering UI-based questions
Sep 14, 2022 7 tweets 3 min read
1/7 We built a new model! It’s called Action Transformer (ACT-1) and we taught it to use a bunch of software tools. In this first video, the user simply types a high-level request and ACT-1 does the rest. Read on to see more examples ⬇️

2/7 This can be especially powerful for manual tasks and complex tools. In this example, what might ordinarily take 10+ clicks in Salesforce can now be done with just a sentence.