We just released Transformers' boldest feature: Transformers Agents.
It lowers the barrier to entry for machine learning:
Control 100,000+ HF models by talking to Transformers and Diffusers
Fully multimodal agent: text, images, video, audio, docs...🌎
huggingface.co/docs/transform…
Create an agent backed by an LLM (OpenAssistant, StarCoder, OpenAI, ...) and start talking to Transformers and Diffusers.
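A minimal sketch of setting up an agent, assuming transformers >= 4.29 with the Agents API. The endpoint URL is illustrative; any hosted LLM endpoint works, and `OpenAiAgent` is the OpenAI-backed alternative.

```python
# Illustrative StarCoder inference endpoint (swap in any hosted LLM).
STARCODER_ENDPOINT = "https://api-inference.huggingface.co/models/bigcode/starcoder"

def build_agent(endpoint: str = STARCODER_ENDPOINT):
    # Imported lazily so this sketch stays importable without transformers installed.
    from transformers import HfAgent
    return HfAgent(endpoint)
```

Once built, `agent.run("Draw me a picture of rivers and lakes")` executes a single task, while `agent.chat(...)` keeps state across turns for chat mode.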
It handles complex queries and offers a chat mode. Generate images from your words, have the agent read a website summary out loud, or ask questions about a PDF.
How does it work in practice?
It's straightforward prompt-building:
• Tell the agent what it should do
• Give it tools
• Show examples
• Give it a task
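The steps above can be sketched as plain string assembly. The structure below is illustrative; the exact wording lives in the library's prompt templates, and `build_prompt`, `image_generator`, and the example text are assumptions for the sketch.

```python
# Sketch: how the agent's prompt is assembled from tools, examples, and a task.
def build_prompt(tools, examples, task):
    tool_lines = "\n".join(f"- {name}: {desc}" for name, desc in tools)
    example_lines = "\n\n".join(examples)
    return (
        "You are an assistant that solves tasks by writing Python code "
        "using the tools below.\n"
        f"Tools:\n{tool_lines}\n\n"
        f"Examples:\n{example_lines}\n\n"
        f"Task: {task}\nAnswer:"
    )

prompt = build_prompt(
    tools=[("image_generator", "generates an image from a text prompt")],
    examples=["Task: draw a cat\nAnswer: image = image_generator('a cat')"],
    task="Draw me a picture of rivers and lakes",
)
```

The LLM completes this prompt with Python code, which the agent then executes.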
The agent uses chain-of-thought reasoning to break the task down, then outputs Python code that calls the tools.
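For a query like "Caption this image, then read the caption out loud", the emitted code might look like the following. The stub functions stand in for the real captioning and text-to-speech tools so the sketch runs anywhere; the tool names are assumptions.

```python
# Stubs standing in for the agent's real tools.
def image_captioner(image):
    return f"a caption for {image}"

def text_reader(text):
    return f"<audio of: {text}>"

# The kind of two-step program the agent might emit.
caption = image_captioner("photo.png")
audio = text_reader(caption)
```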
It comes with built-in tools:
• Document QA
• Speech-to-text and Text-to-speech
• Text {classification, summarization, translation, download, QA}
• Image {generation, transforms, captioning, segmentation, upscaling, QA}
• Text to video
It is EXTENSIBLE by design.
Tools are elementary: a name, a description, a function.
Designing a tool and pushing it to the Hub takes only a few lines of code.
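A sketch of such a tool, assuming the three-part interface the thread describes (a name, a description, a function). The class name, tool name, and endpoint are hypothetical, and the `transformers.Tool` base class is omitted so the sketch runs standalone; in practice you would subclass it and call the tool's `push_to_hub` method to share it.

```python
# Hypothetical custom tool: name + description + callable.
class CatImageTool:
    name = "cat_image_fetcher"  # hypothetical tool name
    description = "Returns the URL of a random cat image."

    def __call__(self):
        return "https://cataas.com/cat"  # illustrative endpoint

tool = CatImageTool()
url = tool()
```

The description matters: it is what the LLM reads when deciding which tool to call for a task.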
The agent's toolkit serves as a base: extend it with your own tools, or with community-contributed ones:
huggingface.co/docs/transform…
Please play with it, add your tools, and let's create *super-powerful agents* together.
Here's a notebook to get started: colab.research.google.com/drive/1c7MHD-T…