Language models with world-class performance per parameter, from IoT to the data center.
https://t.co/6gcSwU4Puv
Jan 27 • 12 tweets • 3 min read
Today, we’re releasing the first weights from Trinity Large, our first frontier-scale model in the Trinity MoE family.
Trinity Large uses a highly sparse MoE architecture: