Wouldn't it be nice if I could use an LLM for autocomplete instead of small dumb Copilot/Cursor-type models?
The speed + quality of Kimi-K2 on Groq make it possible!
So, in 1 hour, just before starting my day at work, I vibe coded a VS Code extension. Here is how 🧵
PS: it only took about 5 prompts.
1. For the first prompt, I opened a fresh conversation with all 4: o3-pro, Gemini 2.5 Pro, Opus 4 (extended thinking), and Grok 4 Heavy.
The prompt was:
I want to be able to press a keyboard shortcut in VS Code, and the whole content of my current code (plus the other opened tabs), plus instructions, would be sent to an OpenAI-style endpoint, and the AI would autocomplete and add code for the 'section'. The AI will decide which section it's confident to predict. How should we do it?
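Mechanically, what that prompt asks for boils down to packing the active file and the other open tabs into an OpenAI-style messages array. A minimal sketch of that step (the `buildMessages` helper and the system-prompt wording are my own illustration, not from the thread):

```typescript
// Hypothetical helper: packs the active file plus the other open tabs
// into an OpenAI-style messages array for the completion request.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

function buildMessages(activeFile: string, otherTabs: string[]): ChatMessage[] {
  // Label each open tab so the model can tell the context files apart.
  const context = otherTabs
    .map((text, i) => `--- open tab ${i + 1} ---\n${text}`)
    .join("\n");
  return [
    {
      role: "system",
      content:
        "You are an autocomplete engine. Continue only the section of the " +
        "code you are confident about. Output code only, no explanations.",
    },
    {
      role: "user",
      content: `${context}\n--- current file ---\n${activeFile}`,
    },
  ];
}
```

In the real extension, this would be wired to a keybinding via `vscode.commands.registerCommand`, reading the text from `vscode.window.activeTextEditor` and the other visible editors.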
2. For the second prompt, I took all 4 outputs from the first prompt, put them together in 1 text file, and copy-pasted all of that into a fresh chat with all four, with this prompt:
I asked 4 LLMs: "I want to be able to press a keyboard shortcut in VS Code, and the whole content of my current code (plus the other opened tabs), plus instructions, would be sent to an OpenAI-style endpoint, and the AI would autocomplete and add code for the 'section'. The AI will decide which section it's confident to predict. How should we do it?" Below are their responses; considering them plus your own judgement, help me make that work. I want to use Groq. This is a snippet they recommend:

```javascript
import { Groq } from 'groq-sdk';

const groq = new Groq();
const chatCompletion = await groq.chat.completions.create({
  "messages": [
    {
      "role": "user",
      "content": ""
    }
  ],
  "model": "moonshotai/kimi-k2-instruct",
  "temperature": 0.6,
  "max_completion_tokens": 4096,
  "top_p": 1,
  "stream": true,
  "stop": null
});

for await (const chunk of chatCompletion) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```

I will be fiddling with the right prompts; just focus on the mechanics. My main goal is to use LLMs as autocomplete (because they are now fast enough) instead of Cursor Tab or GitHub Copilot type of things.
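The streaming loop is the only mechanical subtlety in that snippet: each chunk carries an optional delta. Factoring it into a helper makes the mechanics testable without hitting Groq. A sketch (the `StreamChunk` shape mirrors the fields the groq-sdk example reads; `collectStream` is my own name):

```typescript
// Minimal shape of a streamed chat-completion chunk, keeping only the
// fields the loop above actually reads.
interface StreamChunk {
  choices: { delta?: { content?: string } }[];
}

// Accumulate the streamed deltas into the final completion string.
// In the extension, each delta would also be inserted at the cursor
// as it arrives, instead of (or as well as) being buffered here.
async function collectStream(
  stream: AsyncIterable<StreamChunk>
): Promise<string> {
  let out = "";
  for await (const chunk of stream) {
    out += chunk.choices[0]?.delta?.content ?? "";
  }
  return out;
}
```

With the real client, the `chatCompletion` object returned by `groq.chat.completions.create({ ..., stream: true })` would be passed straight into `collectStream`.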
I needed to understand exactly what DSPy was sending to the LLMs on my behalf before I could trust it.
If you are like me, just run `adapter.format` yourself (the default is `ChatAdapter`) and you will see exactly what is happening. If you do not like the result, implement your own adapter. DSPy is modular and fully supports that.