Activeloop Profile picture
Jun 19, 2023 14 tweets 7 min read Read on X
1. Generate Picture Books with AI for free (code open-source👇) with @OpenAI Function Calling, @LangChainAI, #DeepLake, & @StabilityAI.

- Prompt -> a PDF storybook with illustrations.
- Stores image & text pairs in the multimodal #DeepLake VectorDB for model training/finetuning!
2. Read the 🧵 to learn how @OpenAI Function Calling & @LangChainAI helped.

FableForge is built by @ethanjdev & handles:

1. Prompt -> text & images
2. PDF creation
3. Deep Lake DB to view the multimodal images + text dataset or stream it in real-time to train/fine-tune an LLM. FableForge diagram
3. But first... What's @OpenAI's function calling update?

In essence, it's bridging the gap between unstructured language input and structured, actionable output that other systems, tools, or services can use.
4. The chat models can now detect if a function needs to be called based on the user's input and respond with JSON that conforms to the described function's signature.
5. Effectively, you now can:

- Create QA chatbots with external tools (e.g., Plugins)
- Extract structured data from text
- Convert natural language into API calls or database queries (this is what we've used).
6. The first approach of instructing the language model to generate prompts didn't work, since Stable Diffusion was released in 2022, and teaching #GPT4 how to properly prompt was difficult. stable diffusion descriptio...
7. For our prompts, we need structured data to adhere to specific rules. Here's one of the functions we've used.

We'll send the chat model a page from our book, the function, & instructions to infer the details from the page. In return, we get structured data to form a prompt! writing the functions - pro...
8. @LangChainAI has recently added even better support for using functions (attend this webinar to learn more!)

9. Once we obtain the prompts, we then generate the texts and images with #StableDiffusion and #GPT4. This part of the code takes care of the following steps: api utils for image and tex...
10. Then we put the PDF together.

1. Text Addition and Image Conversion
2. Cover Generation
3. PDF Assembly generating the pdfs
11. Now that we have finalized our picture book, we want to store the images and prompts in Deep Lake. Deep Lake is multimodal, which means we can store embeddings, images, text, etc - all within one 'database'. This allows for some cool stuff - deep lake dataset creation
12. For instance - visualizing the image and text pairs, as well as streaming the entire dataset for further fine-tuning based on user feedback! Image
13. (13/14) Give the open-source repo a star! github.com/e-johnstonn/Fa…. Also, accepting PRs for local models, user feedback buttons in the UI, etc.
14. (14/14) Full writeup here:

activeloop.ai/resources/ai-s…

• • •

Missing some Tweet in this thread? You can try to force a refresh
 

Keep Current with Activeloop

Activeloop Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!

Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Don't want to be a Premium member but still want to support us?

Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal

Or Donate anonymously using crypto!

Ethereum

0xfe58350B80634f60Fa6Dc149a72b4DFbc17D341E copy

Bitcoin

3ATGMxNzCUFzxpMCHL5sWSt4DVtS8UqXpi copy

Thank you for your support!

Follow Us!

:(