Kye Gomez (swarms)
May 21 · 8 tweets · 7 min read
AGI has an assembly index.

In assembly theory (Sharma, Czégel, Lachmann, Kempes, Walker & Cronin, Nature 2023), the assembly index a of an object is the minimum number of recursive joining operations required to construct it from a basis set of elementary parts, where each intermediate is reusable once formed.
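To make the definition concrete, here is a minimal sketch that computes the index for short strings instead of molecules. The string setting, the function name `assembly_index`, and the brute-force search are assumptions of this example, not the procedure used in the Nature paper. The basis set is the distinct characters of the target, a joining operation concatenates two objects already available, and every intermediate stays reusable.

```python
def assembly_index(target: str) -> int:
    """Minimum number of pairwise joins needed to build `target` from its characters."""
    if len(target) <= 1:
        return 0

    # On a minimal pathway every intermediate is used in the target,
    # so only substrings of the target need to be considered.
    substrings = {
        target[i:j]
        for i in range(len(target))
        for j in range(i + 1, len(target) + 1)
    }
    basis = frozenset(target)  # elementary parts: the distinct characters

    def search(pool: frozenset, steps_left: int) -> bool:
        """Depth-limited search: can `target` be built in at most `steps_left` joins?"""
        if target in pool:
            return True
        if steps_left == 0:
            return False
        # A join can at most double the longest available piece.
        longest = max(len(piece) for piece in pool)
        if longest * (2 ** steps_left) < len(target):
            return False
        for left in pool:
            for right in pool:
                joined = left + right
                if joined in substrings and joined not in pool:
                    if search(pool | {joined}, steps_left - 1):
                        return True
        return False

    # Iterative deepening: the first depth that succeeds is the minimum.
    depth = 1
    while not search(basis, depth):
        depth += 1
    return depth
```

Under this toy definition `assembly_index("BANANA")` evaluates to 4, via the pathway NA, NANA, BA, BANANA, because the reusable intermediate NA is built once and used twice. The search is exponential and only practical for short strings.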

The framework was developed to distinguish biotic from abiotic matter: empirically, molecules with a ≳ 15 are not produced by undirected chemistry at detectable abundance, and their occurrence is treated as evidence of an underlying selection process, that is, a causal history capable of preserving and recombining intermediates.

The index is thus not a measure of static complexity but of contingent depth: the length of the shortest causal chain compatible with the object's existence.

A thread 🧵⬇️
2 /

The construction extends naturally to algorithm-space.

Treat the space of learning systems as an assembly space whose elementary operations are formal primitives (differentiable composition, attention, value iteration, policy gradients, in-context conditioning) and whose objects are trainable architectures.

Under this mapping, contemporary frontier systems occupy a regime of high a, reached through an ordered trajectory: backpropagation (Rumelhart et al., 1986) → distributed representations → convolutional and recurrent inductive biases → the attention mechanism (Bahdanau et al., 2014) → the transformer (Vaswani et al., 2017) → neural scaling laws (Kaplan et al., 2020; Hoffmann et al., 2022) → RLHF (Christiano et al., 2017; Ouyang et al., 2022) → tool use and extended-context reasoning.

Absent its predecessors, each transition has near-zero probability. The trajectory is therefore a constraint on reachability.
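One way to read "constraint on reachability" is as a dependency graph in which a construction can only appear after everything it requires. The sketch below encodes the trajectory as a linear chain, which is a simplifying assumption: the thread gives an ordering, not a full dependency structure. The names `PREREQUISITES` and `respects_prerequisites` are hypothetical.

```python
from graphlib import TopologicalSorter

# Illustrative encoding: each construction maps to the set of constructions it
# is assumed to require. The strict chain is an assumption of this example.
PREREQUISITES = {
    "backpropagation": set(),
    "distributed representations": {"backpropagation"},
    "convolutional/recurrent biases": {"distributed representations"},
    "attention": {"convolutional/recurrent biases"},
    "transformer": {"attention"},
    "scaling laws": {"transformer"},
    "RLHF": {"scaling laws"},
    "tool use and extended context": {"RLHF"},
}


def respects_prerequisites(order, prerequisites):
    """Return True if every construction appears after all of its prerequisites."""
    realized = set()
    for item in order:
        if not prerequisites[item] <= realized:
            return False
        realized.add(item)
    return True


# Any topological order of the graph is a history that skips no step.
valid_order = list(TopologicalSorter(PREREQUISITES).static_order())
```

In this model an order such as `["transformer", "attention"]` is rejected by `respects_prerequisites`, which is the graph-level meaning of a transition being unavailable without its predecessors.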
3 /

A parallel assembly path governs the physical substrate. Programmable shading hardware was repurposed for general-purpose matrix arithmetic (CUDA, 2007), specialized into tensor cores, and embedded in high-bandwidth interconnect fabrics (NVLink, InfiniBand) capable of maintaining gradient synchrony across 10^4–10^5 accelerators.

Algorithmic and hardware paths are mutually gating: the transformer is computationally inert without dense matmul throughput, and dense matmul throughput is economically unjustified without an architecture that consumes it.

The joint assembly index of the algorithm-hardware pair is therefore strictly greater than that of either component considered in isolation, and capability gains are bounded by the slower of the two trajectories.
4 /

This reframes the scaling debate.

The relevant question is not whether AGI requires a single missing insight or additional compute applied to existing methods, but which prerequisite constructions on the joint trajectory remain unrealized.

Candidate gaps include online continual learning without catastrophic interference, a memory architecture supporting selective consolidation, and a credit-assignment mechanism over horizons exceeding current context windows.

Each is plausibly gated by primitives not yet isolated, and the gating structure implies that compute applied to existing primitives yields diminishing returns once the local subtree of the assembly graph is exhausted.
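The gating claim can be illustrated with a toy closure computation: starting from the constructions already realized, repeatedly add anything whose inputs are all available, and stop when nothing new can be built. Everything in this sketch is hypothetical, including the recipe graph and the placeholder names `primitive_X`, `primitive_Y`, and `primitive_Z` for primitives not yet isolated. Compute is deliberately left out of the recipes, so the example mirrors the argument and is not evidence for it.

```python
def reachable(realized, recipes):
    """Close `realized` under `recipes`: add any product whose inputs are all available."""
    available = set(realized)
    changed = True
    while changed:
        changed = False
        for product, inputs in recipes.items():
            if product not in available and inputs <= available:
                available.add(product)
                changed = True
    return available


# Hypothetical recipes: product -> set of required inputs.
RECIPES = {
    "transformer": {"attention", "backpropagation"},
    "frontier LLM": {"transformer", "scaling laws"},
    "continual learning": {"frontier LLM", "primitive_X"},
    "selective memory consolidation": {"frontier LLM", "primitive_Y"},
    "long-horizon credit assignment": {"continual learning", "primitive_Z"},
}

today = {"attention", "backpropagation", "scaling laws"}

# The local subtree is exhausted once the closure stops growing.
closure = reachable(today, RECIPES)
blocked = set(RECIPES) - closure

# Isolating one missing primitive unlocks only what it directly gates.
closure_with_x = reachable(today | {"primitive_X"}, RECIPES)
```

Here `blocked` contains the three candidate gaps, and adding `primitive_X` unlocks continual learning while long-horizon credit assignment stays blocked behind `primitive_Z`.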

Step-skipping is not available; the order is a property of the space, not of the researcher.
5 /

A caveat on the underlying theory.

The status of the assembly index relative to algorithmic information theory remains disputed.

Abrahão, Hernández-Orozco, Kiani, Zenil and colleagues (PLOS Complex Systems 2024) argue that the index is approximated by LZ-class compression and reducible to Shannon entropy under appropriate normalization.

Kempes et al. (npj Complexity 2025) reply that the index quantifies causation under selection rather than minimum description length, and note that exact computation of a is NP-complete, placing it in a distinct complexity class from polynomial-time compression schemes.
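For readers unfamiliar with what "LZ-class compression" measures, the sketch below counts the phrases in an LZ78-style parse of a string. It is a generic textbook parse chosen for illustration, not the analysis carried out by either group, and the function name `lz78_phrase_count` is hypothetical. Like the assembly index it rewards reuse of earlier pieces, but it runs in a single greedy pass and does not search for a minimum.

```python
def lz78_phrase_count(text: str) -> int:
    """Number of phrases in a greedy LZ78-style parse of `text`."""
    seen = set()
    phrase = ""
    count = 0
    for ch in text:
        phrase += ch
        if phrase not in seen:
            # New phrase: a previously seen phrase extended by one character.
            seen.add(phrase)
            count += 1
            phrase = ""
    if phrase:
        # Leftover input that matches an existing phrase still counts as one.
        count += 1
    return count
```

Repetitive strings parse into fewer phrases than strings of the same length with no repeated structure, which is the sense in which such a count tracks reuse.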

For the present argument the analogy is robust to this dispute: under either interpretation, capability sits behind an ordered sequence of constructions whose order is not optional.

The methodological implication is to model AGI not as a threshold crossed along a single scaling axis, but as an object with a construction history, and to direct research effort toward identifying the rate-limiting prerequisites on the joint algorithm-substrate path.
6 /

Conclusion

The framing recasts AGI forecasting as a problem in identifying unrealized prerequisites on a joint algorithm–substrate assembly graph, rather than as extrapolation along a compute axis.

The order of constructions is a property of the space, not a research preference, and step-skipping is not available.

Whether one accepts the assembly index as causally distinct from algorithmic complexity or treats it as a useful re-parameterization, the methodological conclusion is invariant: capability is gated by ordered prerequisites, and the rate-limiting question is which primitives remain to be isolated.

References below ⬇️
7 /

References

Sharma, Czégel, Lachmann, Kempes, Walker & Cronin (2023). Assembly theory explains and quantifies selection and evolution. Nature 622, 321–328. doi.org/10.1038/s41586…

Kempes, Lachmann, Iannaccone, Fricke, Chowdhury, Walker & Cronin (2025). Assembly theory and its relationship with computational complexity. npj Complexity 2, 27. doi.org/10.1038/s44260…

Abrahão, Hernández-Orozco, Kiani, Tegnér & Zenil (2024). Assembly Theory is an approximation to algorithmic complexity based on LZ compression that does not explain selection or evolution. PLOS Complex Systems 1(1), e0000014. doi.org/10.1371/journa…

Rumelhart, Hinton & Williams (1986). Learning representations by back-propagating errors. Nature 323, 533–536. doi.org/10.1038/323533…

Bahdanau, Cho & Bengio (2014). Neural machine translation by jointly learning to align and translate. arXiv:1409.0473. arxiv.org/abs/1409.0473

Vaswani et al. (2017). Attention is all you need. NeurIPS 30. arxiv.org/abs/1706.03762

Kaplan et al. (2020). Scaling laws for neural language models. arXiv:2001.08361. arxiv.org/abs/2001.08361

Hoffmann et al. (2022). Training compute-optimal large language models. arXiv:2203.15556. arxiv.org/abs/2203.15556

Christiano, Leike, Brown, Martic, Legg & Amodei (2017). Deep reinforcement learning from human preferences. NeurIPS 30. arxiv.org/abs/1706.03741

Ouyang et al. (2022). Training language models to follow instructions with human feedback. NeurIPS 35. arxiv.org/abs/2203.02155
