Latest Twitter Threads by @myainotez on Thread Reader App

Dec 19, 2025 • 8 tweets • 3 min read

Text-only trolley problems are toy experiments, and LLMs know it and act accordingly.
But give them a world that responds to queries, where time passes between tool calls and physics holds across observations? The scenario crosses into hyperreality. Suddenly the lever matters.

I have been spending a lot of time recently on giving LLMs access to more realistic world sims. I am presenting the results of a trolley thought experiment to showcase what can be done with the upcoming carla-env release. Read the details on my blog here:
blog.sinatras.dev/Investigating+…

Oct 18, 2025 • 5 tweets • 3 min read

PMPP-Eval is live, together with the pmpp env and dataset.

I am releasing the "Programming Massively Parallel Processors" book turned into an environment that lets your LLM practice on QA and coding exercises. I touched on the whole process of going from a book to an optimized CUDA env over on the blog.

Blog link:

The results make it clear that pre/post-training data in general started including CUDA pretty late. LLMs pick up some of the concepts from C++ but fail to meet CUDA-specific needs, even with code skeletons and long prompts explaining the requirements.
blog.sinatras.dev/PMPP-Eval+Jour…

Oct 31, 2024 • 4 tweets • 1 min read

We call it SmolLM2 x Entropix now.
I have been testing and tuning Entropix for SmolLM2 for quite some time. According to eval results, sampling with Entropix provides an average improvement of 22.18% in GSM8K exact-match accuracy within the same inference time, across 3 different runs.

With the right entropy and varentropy thresholds, the model combines different entropy-based sampling strategies to produce confident results. This increases reasoning capabilities while keeping inference time nearly unchanged. Here’s a sample result from a 1.7B model with Entropix.
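
To illustrate the threshold idea described above (this is not the sample result, and not the actual Entropix code), here is a minimal sketch of how entropy and varentropy of the next-token distribution could pick a sampling strategy. The function names, the threshold values, and the four strategy labels are all assumptions made up for this example.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_varentropy(logits):
    # Entropy: expected surprisal (-log p) of the next-token distribution.
    # Varentropy: variance of the surprisal around that expectation.
    probs = softmax(logits)
    surprisals = [-math.log(p) if p > 0 else 0.0 for p in probs]
    entropy = sum(p * s for p, s in zip(probs, surprisals))
    varentropy = sum(p * (s - entropy) ** 2 for p, s in zip(probs, surprisals))
    return entropy, varentropy

def choose_strategy(logits, ent_thresh=1.0, varent_thresh=1.0):
    # Hypothetical thresholds and labels; real values need tuning per model.
    ent, varent = entropy_varentropy(logits)
    if ent < ent_thresh and varent < varent_thresh:
        return "argmax"    # confident: take the top token
    if ent >= ent_thresh and varent < varent_thresh:
        return "explore"   # evenly uncertain: sample more broadly
    if ent < ent_thresh and varent >= varent_thresh:
        return "branch"    # mostly sure, but with a few sharp alternatives
    return "resample"      # uncertain and uneven: sample at higher temperature

if __name__ == "__main__":
    print(choose_strategy([10.0, 0.0, 0.0, 0.0]))  # sharply peaked logits
    print(choose_strategy([0.0, 0.0, 0.0, 0.0]))   # uniform logits
```

The point of splitting on both quantities is that entropy alone cannot tell a flat distribution apart from one with a strong favorite and a long tail; varentropy separates those two cases.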
