Latest Twitter Threads by @MaartenGr on Thread Reader App

Feb 11 • 5 tweets • 2 min read

Did you know we continue to develop new content for the "Hands-On Large Language Models" book?

There's now even a free course available with @DeepLearningAI!

@JayAlammar and I are incredibly proud to bring you this highly animated (and free 😉) course:

Feb 3 • 4 tweets • 1 min read

A Visual Guide to Reasoning LLMs 💭

With over 40 custom visuals, explore DeepSeek-R1, the train-time compute paradigm shift, test-time compute techniques, verifiers, STaR, and much more!

Link below

From exploring verifiers for distilling reasoning:

May 31, 2023 • 7 tweets • 4 min read

Multimodal, multi-aspect, Hugging Face Hub, safetensors, and more in BERTopic v0.15 🔥

Working together with @huggingface on this was a blast!

🤗Blog: huggingface.co/blog/bertopic
🤗Hub Example: huggingface.co/MaartenGr/BERT…
Changelog: maartengr.github.io/BERTopic/chang…

An update thread🧵

Apply textual topic modeling on images with the new update (🖼️+ 🖹 or 🖼️ only)!

Introducing a multi-modal CLIP backend that embeds both text and images.

Even when you have only images, you can caption the most representative images of each topic and extract textual representations!

Feb 14, 2023 • 8 tweets • 4 min read

The v0.14 release of BERTopic is here 🥳 Fine-tune your topic keywords and labels with models from @OpenAI, @huggingface, @CohereAI, @spacy_io, and @LangChainAI.

Use models for part-of-speech tagging, text generation, zero-shot classification, and more!

An overview thread👇🧵

Use OpenAI's or Cohere's GPT models to suggest topic labels. For each topic, only a single API call is needed, which significantly reduces costs because the prompt contains only the topic's representative documents and keywords. You can even perform prompt engineering by customizing the prompts.
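
As a rough illustration of why one call per topic is enough, here is a minimal sketch that builds a single prompt per topic from its keywords and representative documents. The function name `build_topic_prompts`, the template text, and the `[KEYWORDS]` / `[DOCUMENTS]` placeholders are assumptions made for this example, not BERTopic's actual implementation, and no API is called.

```python
from typing import Dict, List

# Hypothetical template; the placeholder names are illustrative only.
PROMPT_TEMPLATE = (
    "I have a topic described by these keywords: [KEYWORDS]\n"
    "Representative documents for the topic:\n[DOCUMENTS]\n"
    "Suggest a short label for this topic."
)


def build_topic_prompts(
    keywords: Dict[int, List[str]],
    representative_docs: Dict[int, List[str]],
    template: str = PROMPT_TEMPLATE,
) -> Dict[int, str]:
    """Build exactly one prompt per topic (so one API call per topic)."""
    prompts = {}
    for topic_id, words in keywords.items():
        # Only a handful of representative documents go into the prompt,
        # not every document assigned to the topic.
        docs = "\n".join(f"- {doc}" for doc in representative_docs.get(topic_id, []))
        prompt = template.replace("[KEYWORDS]", ", ".join(words))
        prompts[topic_id] = prompt.replace("[DOCUMENTS]", docs)
    return prompts


# Toy data, invented for the example.
keywords = {0: ["cat", "dog"], 1: ["stock", "market"]}
representative_docs = {
    0: ["My cat sleeps all day.", "Dogs love long walks."],
    1: ["The market closed higher today."],
}
prompts = build_topic_prompts(keywords, representative_docs)
```

Each value in `prompts` could then be sent to the text-generation model of your choice. Customizing `template` is where prompt engineering would happen in this sketch.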

Dec 28, 2022 • 7 tweets • 3 min read

Final Preview: Outlier Reduction!

In the upcoming release of BERTopic, it will be possible to perform outlier reduction! Easily explore several strategies for outlier reduction after training your topic model. A flexible and modular approach!

A preview thread👇🧵

Strategy #1
The first strategy reduces outliers by using the soft-clustering capabilities of HDBSCAN. For each outlier document, we find the best-matching topic by looking at the topic-document probabilities generated by HDBSCAN.
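
The idea behind this strategy can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not BERTopic's code: the function name `reduce_outliers_by_probability` and the `threshold` parameter are hypothetical, outliers are assumed to be labeled `-1`, and `probabilities[i][j]` is assumed to hold the probability that document `i` belongs to topic `j`.

```python
from typing import List, Sequence


def reduce_outliers_by_probability(
    topics: Sequence[int],
    probabilities: Sequence[Sequence[float]],
    threshold: float = 0.0,
) -> List[int]:
    """Reassign outliers (-1) to their most probable topic.

    An outlier stays an outlier if its best probability is below `threshold`.
    """
    new_topics = []
    for topic, probs in zip(topics, probabilities):
        if topic != -1:
            # Documents that already have a topic are left untouched.
            new_topics.append(topic)
            continue
        # Index of the topic with the highest probability for this document.
        best_topic = max(range(len(probs)), key=lambda i: probs[i])
        new_topics.append(best_topic if probs[best_topic] >= threshold else -1)
    return new_topics


# Toy data, invented for the example: four documents, two topics.
topics = [0, -1, 1, -1]
probabilities = [
    [0.90, 0.05],
    [0.10, 0.60],
    [0.20, 0.70],
    [0.02, 0.03],
]
reduced = reduce_outliers_by_probability(topics, probabilities, threshold=0.05)
print(reduced)  # [0, 1, 1, -1]
```

The second document moves to topic 1, while the last one stays an outlier because its best probability (0.03) falls below the threshold. Raising or lowering the threshold trades off how many outliers get reassigned against how confident each reassignment is.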
