Nomos 1 achieved 87/120 with 8 perfect scores, while Qwen3-30B-A3B-Thinking-2507 scored 24/120 when run in the same harness under the same conditions, indicating that the performance is largely due to post-training and data quality rather than the harness.
You can try Hermes 4 in the new, revamped Nous Chat UI.
DeepHermes 24B Preview performs extremely well on reasoning tasks with reasoning mode ON, jumping over 4x in accuracy on hard math problems, and 43% on GPQA, a STEM-based QA benchmark.
This is our first work on reasoning models, and we hope our unique approach to a user-controlled, toggleable reasoning mode furthers our mission of giving those who use DeepHermes more steerability for whatever need they have.
You can watch the run LIVE here: distro.nousresearch.com
The API is built upon three architectures developed at Nous: