Sinatras
Entropy Preservation Officer, BS CS&EE, AI/ML Engineer in Automotive
Dec 19 8 tweets 3 min read
Text-only trolley problems are toy experiments and LLMs know it and act accordingly.
But give them a world that responds to queries, where time passes between tool calls and physics holds across observations? The scenario crosses into hyperreality. Suddenly the lever matters.
I have been spending a lot of time recently on giving LLMs access to more realistic world sims. I am presenting the results of a trolley thought experiment to showcase what can be done with the upcoming carla-env release. Read the details on my blog here:
blog.sinatras.dev/Investigating+…
Oct 18 5 tweets 3 min read
PMPP-Eval is live, together with pmpp env and dataset,

Releasing the "Programming Massively Parallel Processors" book turned into an environment that lets your LLM practice on QA and coding exercises. I touch on the whole process of going from a book to an optimized CUDA env over on the blog.
Blog link:

The results make it clear that pre- and post-training data in general started including CUDA pretty late. LLMs catch some of the concepts from C++ but fail to catch the specific needs, even with code skeletons and long prompts explaining the requirements.
blog.sinatras.dev/PMPP-Eval+Jour…
Oct 31, 2024 4 tweets 1 min read
We call it SmolLM2 x Entropix now,
I have been testing and tuning Entropix for SmolLM2 for quite some time. According to eval results, sampling with Entropix provides an average improvement of 22.18% in GSM8K exact-match accuracy within the same inference time across 3 different runs.
With correct entropy and varentropy thresholds, the model combines different entropy-based sampling strategies to produce confident results. This increases reasoning capabilities while keeping inference time nearly unchanged. Here’s a sample result from a 1.7B model with Entropix.
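For readers new to the idea, here is a minimal sketch of threshold-based strategy selection from entropy and varentropy. It is not the Entropix or SmolLM2 implementation: the function names, the strategy labels, and the threshold values of 1.0 are all hypothetical, chosen only to illustrate how two statistics of the next-token distribution can route between sampling strategies.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_and_varentropy(logits):
    """Return (entropy, varentropy) of the next-token distribution, in nats.

    entropy    = E[-log p]
    varentropy = Var[-log p], the variance of the surprisal
    """
    # Skip zero-probability tokens: they contribute nothing to either sum.
    pairs = [(p, -math.log(p)) for p in softmax(logits) if p > 0.0]
    entropy = sum(p * s for p, s in pairs)
    varentropy = sum(p * (s - entropy) ** 2 for p, s in pairs)
    return entropy, varentropy

def choose_strategy(entropy, varentropy, ent_threshold=1.0, varent_threshold=1.0):
    """Pick a sampling strategy label from the two statistics.

    Thresholds and labels are illustrative assumptions, not tuned values.
    """
    high_ent = entropy > ent_threshold
    high_var = varentropy > varent_threshold
    if not high_ent and not high_var:
        return "greedy"              # confident: take the argmax token
    if high_ent and not high_var:
        return "insert_pause_token"  # uniformly unsure: give the model room to think
    if not high_ent and high_var:
        return "explore"             # a few competing options: sample among them
    return "resample_high_temp"      # unsure and scattered: resample more broadly

# Usage: a sharply peaked distribution versus a flat one.
for logits in ([10.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]):
    ent, varent = entropy_and_varentropy(logits)
    print(logits, choose_strategy(ent, varent))
```

Since each decision only needs two cheap statistics over logits that are already computed, a scheme like this adds little work per token, which fits the observation that inference time stays nearly unchanged.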