One of the first things I wanted when I got into DSPy was to combine it with vLLM's offline batch inference.
But the whole DSPy stack is built around single calls, with retries, asynchronicity, and so on.
Still, I wanted an easy way to use DSPy with performant, locally hosted models.
After much fiddling and tinkering, I found the special incantation to make vLLM and DSPy work together. It was a bit too long to share as a snippet, so I wrapped it up into a library. It's a single-file __init__.py of about 500 LOC, so it should not be too hard for me (us :D) to maintain. And it is quite powerful!
In my performance test, raw vLLM ran my task in 65 seconds; going through DSPy, it was 68 seconds.
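For context, "raw vLLM" here means the plain offline batch API: load the model once, then push a whole list of prompts through a single generate() call. A minimal sketch (the model name and prompts are placeholders, not my actual benchmark task):

from vllm import LLM, SamplingParams

# Plain offline batch inference with vLLM: one model load, one batched generate() call.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # placeholder model
params = SamplingParams(temperature=0.0, max_tokens=256)

prompts = [f"Summarize in one line: {doc}" for doc in ["doc one ...", "doc two ..."]]
outputs = llm.generate(prompts, params)

for out in outputs:
    print(out.outputs[0].text)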
So here it is:
uv venv
source .venv/bin/activate
uv python install 3.12 --default
uv pip install ovllm
PS: vLLM 0.10.0 has a dependency that does not work with Python 3.13 for now, hence pinning to 3.12 above.
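Once installed, the DSPy side is the usual configure-and-predict flow. Here is a sketch of the wiring; note that the import and constructor are my assumption about ovllm's entry point, not its documented API, so check the project README for the real one:

import dspy
from ovllm import OVLLM  # hypothetical name: see ovllm's README for the actual entry point

# Assumption: ovllm hands DSPy an LM backed by vLLM's offline batch engine.
lm = OVLLM("Qwen/Qwen2.5-7B-Instruct")  # placeholder model
dspy.configure(lm=lm)

summarize = dspy.Predict("document -> summary")
print(summarize(document="vLLM pushes whole batches of prompts through the GPU at once.").summary)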
Wouldn't it be nice if I could use a full-size LLM for autocomplete instead of the small, dumb Copilot/Cursor-type models?
The speed + quality of Kimi-K2 on Groq makes it possible!
So, in one hour, just before starting my day at work, I vibe-coded a VS Code extension. Here is how 🧵
PS: it only took about 5 prompts.
1. For the first prompt, I opened a fresh conversation with all four: o3-pro, gemini 2.5-pro, opus 4 (extended thinking), and grok 4-heavy.
The prompt was:
I want to be able to press a keyboarb shortcut in vscode and the whole content of my current code (plus the other opened tab), plus instructions to a open ai end point would be sent to them and the ai would automcomplete and add code for the 'section'. the ai will decide what section its confident to predict how should we do it?
2. For the second prompt, I took all four outputs from the first prompt, put them together in one text file, and copy-pasted all of that into a fresh chat with all four models, with this prompt:
I asked 4 llm: I want to be able to press a keyboarb shortcut in vscode and the whole content of my current code (plus the other opened tab), plus instructions to a open ai end point would be sent to them and the ai would automcomplete and add code for the 'section'. the ai will decide what section its confident to predict how should we do it?

below are the responses, considering plus your own judgement. Help me make that work. I want to use groq this is a snippet they recommend:

from groq import Groq

import { Groq } from 'groq-sdk';

const groq = new Groq();
const chatCompletion = await groq.chat.completions.create({
  "messages": [
    {
      "role": "user",
      "content": ""
    }
  ],
  "model": "moonshotai/kimi-k2-instruct",
  "temperature": 0.6,
  "max_completion_tokens": 4096,
  "top_p": 1,
  "stream": true,
  "stop": null
});
for await (const chunk of chatCompletion) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

I will be fiddling with the right prompts, just focus on the mechanics. My main goal is to use llms (because they are Now fast enough as automcomplete instead of cursor tab or github copilot type of things).
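For reference, the Python equivalent of that Groq streaming call looks roughly like this (a sketch using the groq Python SDK; the prompt content is just an illustration, and the extension itself uses the TypeScript SDK as in the snippet above):

from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

# Stream a completion from Kimi-K2, mirroring the JS snippet above.
stream = client.chat.completions.create(
    model="moonshotai/kimi-k2-instruct",
    messages=[{"role": "user", "content": "Complete this function:\ndef fib(n):"}],
    temperature=0.6,
    max_tokens=4096,  # the JS snippet uses max_completion_tokens; this is the older equivalent
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")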
I needed to understand exactly what DSPy was sending to the LLMs on my behalf before I could trust it.
If you are like me, just run adapter.format yourself (the default adapter is ChatAdapter) and you will see exactly what is being sent. If you do not like the result, implement your own adapter; DSPy is modular and fully supports that.
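A minimal sketch of that inspection, assuming the DSPy 2.x adapter interface where format takes the signature, demos, and inputs (the exact method signature may differ between versions, so check yours):

import dspy

class QA(dspy.Signature):
    """Answer the question briefly."""
    question: str = dspy.InputField()
    answer: str = dspy.OutputField()

# Build the messages exactly as DSPy would send them to the LM.
adapter = dspy.ChatAdapter()
messages = adapter.format(QA, demos=[], inputs={"question": "What is 2 + 2?"})

for message in messages:
    print(f"--- {message['role']} ---")
    print(message["content"])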