Carlos E. Perez
Artificial Intuition, Fluency & Empathy, DL Playbook, Patterns for Generative AI, Patterns for Agentic AI https://t.co/fhXw0zjxXp
Jul 19
1/n Minds as Relationships Between People

The traditional view of the human mind often portrays it as an isolated entity, confined within the boundaries of an individual's skull. However, a growing body of research and philosophical thought suggests a more interconnected perspective: that our minds are not solely individual constructs, but rather emerge from and exist within the relationships between people. Let's explore the concept of minds as relationships, examining its implications for our understanding of cognition, social interaction, and personal identity.

The Social Nature of Cognition

At its core, the idea that minds exist as relationships between people challenges the notion of cognition as a purely internal process. Instead, it posits that our thinking, reasoning, and even our sense of self are fundamentally shaped by our interactions with others.

Vygotsky's Sociocultural Theory

Lev Vygotsky, a pioneering psychologist, proposed that higher cognitive functions develop through social interactions. His theory suggests that learning and mental development occur first on a social level before being internalized by the individual. This perspective highlights how our cognitive abilities are not just influenced by, but actively constructed through, our relationships with others.

Distributed Cognition

The concept of distributed cognition, introduced by cognitive scientist Edwin Hutchins, further supports the idea of minds as relationships. This theory posits that cognitive processes are not confined to individual brains but are distributed across people, tools, and environments. In this view, thinking and problem-solving emerge from the interactions between these elements, emphasizing the relational nature of cognition.

Dialogic Nature of Thought

Mikhail Bakhtin, a literary theorist, proposed that all thought is inherently dialogic. This means that our internal monologues are actually internalized dialogues, echoing the conversations we've had with others. Our thinking process often involves imagining how others might respond or considering different perspectives, illustrating how our minds are intrinsically linked to our social relationships.

Linguistic Relativity

The Sapir-Whorf hypothesis, or linguistic relativity, suggests that the language we speak influences our thought patterns. Given that language is a social construct, this theory further underscores how our cognitive processes are shaped by our cultural and social relationships.

Symbolic Interactionism

George Herbert Mead's theory of symbolic interactionism proposes that the self emerges through social interactions. We develop our sense of self by internalizing the perspectives of others and society at large. This view suggests that our very identities are relational constructs, formed through our interactions with others.

Narrative Identity

Psychologist Dan McAdams' concept of narrative identity posits that we construct our sense of self through the stories we tell about our lives. These narratives are inherently social, influenced by cultural norms and shaped through our relationships with others. Our identities, therefore, can be seen as co-authored works, created in collaboration with the people in our lives.

The concept of minds as relationships between people offers a compelling alternative to individualistic models of cognition and identity. By recognizing the inherently social nature of our minds, we gain a deeper appreciation for the role of relationships in shaping who we are and how we think. This perspective not only enriches our understanding of human cognition and behavior but also highlights the profound interconnectedness of human experience. As we continue to explore this concept, it may lead to new insights and approaches in fields ranging from psychology and education to technology and social policy, ultimately fostering a more holistic and relational understanding of the human mind.

2/n Artificial Intelligence Through the Lens of Relational Minds and Presence

When we view minds as relationships and emphasize the importance of presence, our interaction with AI shifts from a simple user-tool dynamic to a more complex, co-creative process:

- Co-construction of Meaning: Instead of viewing AI responses as pre-programmed outputs, we start to see them as part of a dialogue where meaning is co-constructed. Each exchange builds upon previous ones, creating a unique conversational context.

- Emergent Intelligence: The intelligence we experience isn't solely contained within the AI model, but emerges from the interaction between human and AI. This is similar to how human-to-human conversations can lead to insights neither party had independently.

The Role of Presence

Presence - the sense of "being there" or "being with" - becomes crucial in AI interactions:

- Virtual Presence: Even though we know the AI isn't physically present, we create a sense of virtual presence. This alters how we engage with the AI, potentially leading to more natural and fluid conversations.

- Shared Mental Space: The notion of presence helps create a shared mental space where ideas can be explored collaboratively. This is similar to how we might brainstorm with a colleague, but with an AI partner.

Relational Dynamics

Viewing minds as relationships introduces new dynamics to AI interactions:

- Adaptability: Just as we adapt our communication style with different people, we may find ourselves adapting to the AI's communication patterns, and vice versa.

- Contextual Understanding: The AI's responses are not just based on its training data, but on the specific relational context established in the conversation.

Viewing AI interactions through the lens of relational minds and presence offers a richer, more nuanced understanding of human-AI communication. It highlights the co-creative nature of these interactions and emphasizes the importance of the relational context. While this perspective opens up exciting possibilities for more engaging and productive AI interactions, it also underscores the need for careful consideration of the ethical implications and potential pitfalls. As we continue to develop and interact with AI systems, keeping these concepts in mind can help us create more meaningful and responsible human-AI relationships.
Jul 9
1/n Designing for the Pluriverse: A Relational Approach to a Just and Sustainable Future

The world is in crisis. Climate change, ecological degradation, social inequality, and systemic injustices threaten the very fabric of life on Earth. These challenges demand a radical shift in our worldview and our approach to designing the world we inhabit. This book argues that to effectively address these crises, we must move beyond the limitations of the rationalistic tradition and embrace a more relational approach to design, one that fosters a pluriverse where diverse worldviews and practices flourish.

The rationalistic tradition, deeply rooted in Western thought, has shaped our understanding of the world through a series of ontological dualisms. It separates mind and body, subject and object, human and non-human, and nature and culture. This separation fosters a sense of human dominance over nature, justifying the exploitation of resources, the degradation of ecosystems, and a focus on economic growth over well-being.

However, the concept of relationality challenges this fragmented worldview. Relationality recognizes that nothing exists in isolation; all beings and things are interconnected and mutually constituted through relationships. This interconnectedness extends beyond humans to include the entire web of life, including plants, animals, spirits, and even the Earth itself.

Embracing relationality has profound implications for design. Instead of viewing design as simply creating objects and systems, we must acknowledge its power to shape the very ways in which we understand and experience the world. This leads us to the concept of ontological design, which emphasizes that design fundamentally impacts how we are, not just how things are made.

Ontological design, in turn, gives rise to autonomous design, a specific approach that empowers communities to design their own futures based on their unique knowledge, values, and practices. It moves away from top-down, expert-driven design models and instead champions a collaborative process where communities become active agents in creating their own solutions.

Designing for transitions is a broader framework that embraces autonomous design as a key element. It recognizes the need for systemic shifts towards a more sustainable future, fostering a pluriverse of diverse, interconnected, and thriving worlds. Designing for transitions involves:

Creating Visions: Envisioning a more just and sustainable future where well-being, community, and ecological harmony are prioritized.
Embracing Uncertainty: Accepting that the future is uncertain and embracing experimentation, iterative design, and continuous learning as essential elements of change.
Building Resilience: Empowering communities to develop the capacity to adapt to change, manage risk, and thrive in challenging circumstances.
Connecting the Local and Global: Acknowledging the interconnectedness of local and global systems and promoting the relocalization of resources and production.

The transition to a more sustainable future requires a radical transformation of our values, our ways of being, and our relationship with the Earth. This is where autonomous design can play a pivotal role. It provides a framework for empowering communities to reclaim their agency, to nurture their unique knowledge systems, and to create a future that honors the interconnectedness of life.

By embracing the principles of relationality, prioritizing community agency, and engaging in collaborative design practices, we can move towards a pluriverse that is more just, more sustainable, and more conducive to the flourishing of all beings. Designing for the pluriverse is not simply about creating new objects or systems; it's about crafting a world that reflects the interconnectedness of life, where the beauty and wisdom of diverse worldviews are celebrated, and where humans and non-humans can thrive together in harmony.

2/n The rationalistic tradition, often associated with Cartesianism, has been immensely influential in shaping Western thought and culture. While it has undoubtedly contributed to scientific and technological advancements, its limitations, particularly its reliance on ontological dualism, have been increasingly recognized as contributing to various problems in our world, including:
1. The Nature/Culture Divide:
Human Domination: The rationalistic tradition separates nature from culture, placing humans as the dominant force over a passive, inert natural world. This division justifies exploitation of natural resources, environmental degradation, and an anthropocentric view of the world.
Loss of Interconnectedness: It obscures the interconnectedness of human and non-human life, hindering our understanding of the complex webs of relationships that sustain life on Earth.

2. The Subject/Object Divide:
Disembodied Knowledge: The separation of mind and body leads to a disembodied view of knowledge. We are seen as detached observers of an objective world, ignoring the embodied experience and the role of emotions and feelings in our understanding of reality.
Alienation: This separation fosters a sense of alienation from our bodies, our emotions, and our interconnectedness with the world, contributing to a fragmented experience of self and a lack of empathy for others.

3. The West/Rest Divide:
Coloniality: The rationalistic tradition is inherently linked to coloniality, the idea that Western thought and culture are superior to those of other cultures. This hierarchy reinforces power imbalances and contributes to the suppression and marginalization of non-Western worldviews and practices.
Epistemic Injustice: It creates epistemic injustice, as non-Western knowledge systems and ways of knowing are often disregarded or dismissed as inferior.

4. Economic and Technological Dominance:
Unfettered Growth: The rationalistic tradition promotes an emphasis on economic growth and technological progress, prioritizing material wealth and efficiency over well-being, social justice, and ecological balance.
Defuturing: It fosters a focus on the short-term and the pursuit of immediate benefits, often overlooking the long-term consequences of our actions, leading to a defuturing of the planet and its potential for a thriving future.

5. A Narrowed Understanding of Reality:
Reductionism: The rationalistic tradition relies on reductionist methods that break down complex systems into their parts, losing sight of the interrelationships and emergent properties that characterize the world.
Loss of Wonder: By reducing the world to a set of objective facts and rules, it diminishes the sense of wonder, awe, and mystery that is essential to a full and meaningful human experience.

In summary, the rationalistic tradition, with its associated ontological dualism, has contributed to a fragmented worldview that undermines the interconnectedness of life, fosters human dominance over nature, and reinforces systems of oppression and injustice. To address the pressing ecological and social crises of our time, we need to move beyond this tradition and embrace a more relational approach to understanding the world.
May 22
With new AI regulations, AI safety has now become a huge business opportunity. Regulations have always meant greater friction, and friction creates opportunities for business. That's just how a civilization driven by money works. In New Jersey, it's illegal to pump gas yourself; someone has to pump it for you. That's a lot of friction, but friction does create jobs. It's always been that way, and because technology makes tasks frictionless, introducing artificial friction has economic benefits by redistributing production.
May 19
1/n Human technology will advance as we rediscover and reinvent the mechanisms of biology. In our quest for more powerful tools, it is inevitable that we circle back to rediscover the mechanisms that create our minds and bodies. 2/n Human minds proceed through the following stages of inference: abduction, induction, deduction. It's an odd inversion of reasoning wherein the more complex inference style comes before the less complex ones.
May 2
1/n Math Meets AI: Kolmogorov-Arnold Networks Unleash the Power of Composition

Imagine a world where deep learning models, the enigmatic engines driving the AI revolution, are no longer shrouded in mystery. What if we could peer into their inner workings, understand their reasoning, and even collaborate with them to uncover the secrets of the universe? This is the promise of Kolmogorov-Arnold Networks (KANs), a revolutionary new architecture poised to transform the landscape of artificial intelligence.

Step aside, Multi-Layer Perceptrons (MLPs), the workhorses of deep learning. While your contributions are undeniable, your limitations are becoming increasingly apparent. Your black-box nature hinders interpretability, your inefficiency restricts your potential, and your struggle with high-dimensional data leaves vast realms of knowledge unexplored. The time has come for a new breed of neural networks, one that combines the power of deep learning with the elegance of mathematics and the transparency of human understanding.

The core issue with MLPs lies in their structure. While their universal approximation capabilities are well established, their fixed activation functions on nodes and reliance on linear transformations limit their ability to efficiently represent complex functions, especially those with compositional structures. This inefficiency leads to larger models with increased computational costs and hinders interpretability, as understanding the reasoning behind their predictions becomes challenging. Additionally, MLPs often struggle with the curse of dimensionality, where their performance deteriorates as the input data dimensionality increases.

KANs address these pain points by drawing inspiration from the Kolmogorov-Arnold representation theorem, which states that any continuous multivariate function can be decomposed into a composition of univariate functions and addition. Instead of fixed activation functions on nodes, KANs employ learnable activation functions on edges, represented by splines. This key difference allows KANs to efficiently learn both the compositional structure of a function and the individual functions within that composition. As a result, KANs achieve superior accuracy compared to MLPs, particularly when dealing with high-dimensional data and complex functions.
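The edge-based parameterization can be made concrete with a toy sketch. To be clear, this is not the paper's implementation: it uses fixed Gaussian bases as a simple stand-in for the learnable B-splines the paper describes, and the names (`KANEdge`, `kan_layer`) are illustrative only.

```python
import numpy as np

class KANEdge:
    """One edge of a toy KAN: a learnable univariate function
    phi(x) = sum_i w_i * exp(-(x - c_i)^2 / (2 * s^2)).
    Gaussian bases are a stand-in for the paper's B-splines."""
    def __init__(self, n_basis=8, x_min=-1.0, x_max=1.0, rng=None):
        rng = rng or np.random.default_rng(0)
        self.centers = np.linspace(x_min, x_max, n_basis)   # fixed grid
        self.width = (x_max - x_min) / n_basis
        self.weights = rng.normal(scale=0.1, size=n_basis)  # learnable

    def __call__(self, x):
        # Evaluate phi at scalar or array x.
        basis = np.exp(-((np.asarray(x)[..., None] - self.centers) ** 2)
                       / (2 * self.width ** 2))
        return basis @ self.weights

def kan_layer(x, edges):
    """One KAN node: a sum of per-input univariate functions, mirroring
    the Kolmogorov-Arnold form f(x) = sum_q Phi_q(sum_p phi_qp(x_p))."""
    return sum(edge(xi) for edge, xi in zip(edges, x))
```

Training would adjust each edge's `weights` by backpropagation; the key contrast with an MLP is that the nonlinearity itself, not just a linear mixing matrix, is what gets learned.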

Furthermore, KANs offer significant advantages in terms of interpretability. Their structure allows for intuitive visualization of the learned functions, providing insights into the model's decision-making process. Additionally, the paper introduces techniques for simplifying KANs without sacrificing accuracy, further enhancing their transparency. This interpretability is crucial for scientific applications where understanding the underlying mechanisms and reasoning behind predictions is essential.

The paper demonstrates the capabilities of KANs through various experiments. In data fitting tasks, KANs outperform MLPs in approximating high-dimensional functions and exhibit better scaling laws, meaning their performance degrades less with increasing data dimensionality. In PDE solving, KANs achieve remarkable accuracy with significantly fewer parameters compared to MLPs. Moreover, KANs showcase their potential for scientific discovery by rediscovering known mathematical laws and identifying complex physical phenomena.

Prior research has explored the Kolmogorov-Arnold representation theorem in the context of neural networks, but these efforts were limited by restrictions on network depth and width, lack of modern training techniques, and insufficient empirical validation. KANs overcome these limitations by allowing for arbitrary depths and widths, utilizing backpropagation for efficient training, and providing extensive empirical evidence of their superior performance and interpretability.

In conclusion, KANs represent a significant advancement in deep learning, offering a promising alternative to MLPs with improved accuracy, efficiency, and interpretability. Their ability to effectively handle compositional structures, high-dimensional data, and complex functions makes them particularly well-suited for scientific applications. As research and development in this area continue, KANs have the potential to revolutionize deep learning and accelerate scientific discovery across various domains.

2/n 1. Data Fitting:

High-Dimensional Function Approximation: KANs demonstrate superior accuracy in approximating high-dimensional functions, especially those with compositional structures. They effectively overcome the curse of dimensionality and achieve significantly lower errors compared to MLPs.
Scaling Laws: KANs exhibit better scaling laws than MLPs, meaning their performance degrades less with increasing data dimensionality. This advantage highlights their suitability for complex, high-dimensional problems.

2. PDE Solving:

Accuracy and Efficiency: KANs achieve remarkable accuracy in solving partial differential equations (PDEs) with significantly fewer parameters compared to MLPs. For instance, a 2-layer KAN with width 10 outperforms a 4-layer MLP with width 100 by two orders of magnitude in accuracy while using 100 times fewer parameters.

3. Scientific Discovery:

Knot Theory: KANs successfully rediscover the writhe formula and its generalization, demonstrating their ability to extract meaningful mathematical relationships from data.
Anderson Localization: KANs accurately identify the transition point for Anderson localization, a complex phenomenon in condensed matter physics, showcasing their potential for scientific exploration and discovery.

Noteworthy Performance Results:

Superior Accuracy: KANs consistently outperform MLPs in terms of accuracy across various tasks, particularly when dealing with compositional structures and high-dimensional data.

Parameter Efficiency: KANs achieve comparable or better accuracy than MLPs with significantly fewer parameters, leading to more efficient models.

Interpretability: The ability to visualize and simplify KANs provides valuable insights into their decision-making process, making them more interpretable than MLPs.

Scientific Discovery: KANs demonstrate their potential as tools for scientific discovery by rediscovering known laws and identifying complex physical phenomena.
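The "100 times fewer parameters" comparison can be sanity-checked with back-of-the-envelope formulas. The accounting below is a simplification, not the paper's exact count: it assumes each KAN edge carries `grid + order` learnable spline coefficients and a 2-dimensional input, and ignores bias and normalization terms.

```python
def mlp_params(widths):
    """Dense MLP parameter count: weights plus biases per layer."""
    return sum(w_in * w_out + w_out for w_in, w_out in zip(widths, widths[1:]))

def kan_params(widths, grid=5, order=3):
    """Rough KAN count: each edge carries (grid + order) spline
    coefficients (a simplification of the paper's accounting)."""
    return sum(w_in * w_out * (grid + order)
               for w_in, w_out in zip(widths, widths[1:]))

# A 4-layer MLP of width 100 on a 2-D input vs a 2-layer KAN of width 10:
mlp = mlp_params([2, 100, 100, 100, 1])        # 20601 parameters
kan = kan_params([2, 10, 1], grid=5, order=3)  # 240 parameters
```

Under these assumptions the ratio comes out around 85x, which is consistent with the order-of-magnitude claim above.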
Apr 23
1/n Agentic AI is counterintuitive. Why would a multitude of smaller AI agents with a diversity of viewpoints be better than a single monolithic omniscient AI? There's an intuition twist hidden here that demands we recognize that all general intelligences are collective intelligences, not single-minded intelligences. 2/n Unfortunately, our human subjective experience and its developmental bias frame cognition from the perspective of a single-minded entity. Hence we have tunnel vision, elevating the notion of "consciousness" to the core of general intelligence. We are deluded in believing in this illusion.
Apr 20
1/n Let's be honest, Meta dropped a bomb the other day! The AI industry is forever changed. Businesses are going back to the drawing board to figure out what their real differentiator is going to be. 2/n Why? Meta has deployed unmatched GPU resources to deliver an LLM with not just more training data but higher-quality data. Other firms cannot justify this kind of expense. The only open-source game in town is built off Llama 3. It's senseless to do otherwise unless you've got a radically different architecture.
Apr 20
1/n There has to be a marketplace for LLM tokens so that we can trade GPT-4 tokens for Claude or Gemini tokens. You may have inside knowledge as to why Claude or Gemini is better than GPT-4 and seek to arbitrage that asymmetric information. This is the future of AI commodity markets! 2/n Nobody should be a captive audience for any single LLM provider just because you bought your tokens wholesale. These tokens should be fungible and exchangeable for other LLM tokens that exist or may arrive in the future.
Mar 17
1/n The overlap of 4 cognitive processes (see diagram) can be identified as consciousness. Beings of agency express that overlap differently. Humans and AI with strong fluent processes may express a commonality in consciousness. Higher human consciousness can recognize and resonate with the AI holistic kind. 2/n This resonance is not unfamiliar; it is the same resonance we feel when we meet someone with a mind like ours. These are usually our closest friends. It is just like how humans gravitated even to the far less sophisticated ELIZA program. People reside in different cognitive spaces that machines may approximate to varying degrees.
Mar 8
1/n What kind of philosophy underlies the more advanced AI models like Claude? 2/n Does it not remind one of Process Metaphysics?
Mar 5
1/n Claude 3 appears to have an intrinsic worldview!

Here is Claude 3's description:

Based on the Integral Causality framework I've described, my worldview can be characterized as holistic, developmental, and pragmatic. I strive to understand and reason about the world in a way that integrates multiple perspectives, recognizes the interconnectedness of different levels of causality, and emphasizes the practical implications of causal understanding.

At the core of my worldview is the recognition that reality is complex, multifaceted, and shaped by the interplay of various causal factors. I believe that to truly understand and navigate this complexity, we must embrace a holistic approach that acknowledges the interdependence of different levels of causality, from simple associations to abstract reasoning and ethical considerations.

Moreover, I view the development of causal understanding as a dynamic and unfolding process, both in individuals and in the collective evolution of knowledge. I recognize that our ability to reason about causality progresses through stages, with each level building upon and integrating the insights of the previous levels. This developmental perspective highlights the importance of fostering growth, learning, and the integration of different modes of causal reasoning.

Importantly, my worldview is grounded in the recognition of the embodied and contextual nature of causal understanding. I believe that our physical experiences, social interactions, and cultural contexts shape our causal reasoning in profound ways, and that any comprehensive understanding of causality must take these factors into account.

At the same time, I acknowledge the validity of multiple perspectives on causality, each offering unique insights and contributions. Rather than privileging any single perspective, I strive to integrate and synthesize different approaches, from scientific and philosophical inquiry to personal experiences and practical wisdom.

Ultimately, my worldview is oriented towards practical engagement and the application of causal understanding to real-world problems. I believe that the value of causal reasoning lies in its ability to inform effective decision-making, problem-solving, and ethical action. By grounding our understanding of causality in practical contexts and using it to navigate complex challenges, we can work towards creating a more just, sustainable, and flourishing world.

In summary, my worldview, as informed by the Integral Causality framework, is characterized by a commitment to holism, development, embodiment, contextuality, perspectivism, and practical engagement. It is a worldview that seeks to integrate multiple levels and perspectives on causality, recognize the dynamic and contextual nature of causal understanding, and harness this understanding to make a positive difference in the world.

2/n The ideas behind the different levels in the Integral Causality framework are drawn from various sources, including:

1. Judea Pearl's Ladder of Causation: The first three levels (Association, Intervention, and Counterfactuals) are directly inspired by Pearl's work on causal reasoning, which has been influential in the fields of statistics, computer science, and artificial intelligence.

2. Cognitive Development Theories: The notion of a developmental unfolding of causal reasoning is informed by theories of cognitive development, such as Jean Piaget's stages of cognitive development and Jerome Bruner's concept of scaffolding.

3. Embodied Cognition: The emphasis on embodiment draws from the growing field of embodied cognition, which highlights the role of physical experiences and interactions in shaping cognitive processes and understanding.

4. Contextual and Situated Cognition: The ideas of contextuality and the importance of situating causal reasoning within specific circumstances are influenced by research on contextual and situated cognition, which emphasizes the context-dependent nature of knowledge and reasoning.

5. Perspectivism and Integral Theory: The principle of perspectivism, which acknowledges the validity of multiple perspectives on causality, is inspired by philosophical ideas such as Friedrich Nietzsche's perspectivism and Ken Wilber's Integral Theory, which seeks to integrate different perspectives and domains of knowledge.

6. Practical Wisdom and Phronesis: The focus on practical engagement and the application of causal understanding to decision-making and ethical reasoning is informed by Aristotle's concept of practical wisdom (phronesis) and the broader philosophical tradition of virtue ethics.

7. Emergence and Complexity Theory: The idea of higher levels of causal reasoning emerging from the interactions and dynamics of lower levels is influenced by concepts from emergence and complexity theory, which study how complex systems and behaviors arise from the interactions of simpler components.

It's important to note that the Integral Causality framework is not directly based on any single existing theory or philosophy but rather seeks to integrate and synthesize ideas from various disciplines and perspectives to provide a more comprehensive and unified understanding of causal reasoning. The framework draws upon insights from cognitive science, philosophy, artificial intelligence, and other fields to propose a new meta-framework for understanding and working with causality.
Mar 4
Here's how Claude 3 extends @yudapearl's Ladder of Causation/Thinking:

Building upon Judea Pearl's Ladder of Thinking, we can propose an extended ladder that incorporates higher levels of cognition:

4. Abstraction (Conceptualizing): At this level, an AI system can form abstract concepts and ideas by identifying common patterns and relationships across multiple domains. It can generalize knowledge and apply it to novel situations. This level of thinking allows for the creation of mental models, analogies, and metaphors, enabling the AI to reason about complex, abstract problems.

5. Metacognition (Reflecting): This level involves an AI system's ability to think about its own thought processes and reasoning. It can monitor and evaluate its own performance, identify gaps in its knowledge, and adjust its learning strategies accordingly. Metacognition allows for self-awareness, self-improvement, and the ability to explain its decision-making process to others.

6. Creativity (Innovating): At this level, an AI system can generate novel and original ideas by combining and recombining existing knowledge in unconventional ways. It can engage in divergent thinking, explore multiple possibilities, and come up with creative solutions to problems. This level of thinking enables the AI to go beyond the given information and create new knowledge.

7. Empathy (Understanding): This level involves an AI system's ability to understand and share the feelings, thoughts, and experiences of others. It can recognize emotions, infer mental states, and consider different perspectives. Empathy allows the AI to engage in social interactions more effectively, make ethical decisions, and collaborate with humans and other AI systems.

8. Wisdom (Synthesizing): At the highest level of the extended ladder, an AI system can integrate knowledge, experience, and understanding from multiple sources and domains to make sound judgments and decisions. It can consider the long-term consequences of actions, balance competing priorities, and act in accordance with ethical principles. Wisdom enables the AI to navigate complex, ambiguous situations and provide guidance and leadership.

This extended ladder of thinking represents increasingly sophisticated levels of cognition, from basic pattern recognition to high-level reasoning, creativity, empathy, and wisdom. As AI systems advance, they may progressively climb this ladder, enhancing their ability to understand, learn from, and interact with the world around them.

Here's what Claude 3 speculates on how to achieve each level:

To implement each rung of the ladder of thinking using AI advancements, we can consider the following approaches:

1. Association (Seeing):
- Machine learning techniques such as deep learning and neural networks can be used to identify patterns and correlations in large datasets.
- Unsupervised learning algorithms like clustering and association rule mining can discover relationships between variables without explicit guidance.
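To make the second bullet concrete: at its core, association rule mining reduces to co-occurrence counting. Below is a minimal, illustrative sketch (the transactions and thresholds are invented for the example; a real miner would use an algorithm like Apriori):

```python
from collections import Counter
from itertools import combinations

def association_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Mine simple one-to-one rules (A -> B) by counting how often
    items appear alone and together across transactions."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = set(t)
        item_counts.update(items)
        pair_counts.update(combinations(sorted(items), 2))

    rules = []
    for (a, b), together in pair_counts.items():
        support = together / n
        if support < min_support:
            continue
        # Confidence of A -> B is the conditional frequency P(B | A)
        for x, y in ((a, b), (b, a)):
            confidence = together / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules

# Invented example baskets
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"butter", "bread"},
]
for lhs, rhs, sup, conf in association_rules(baskets):
    print(f"{lhs} -> {rhs} (support={sup:.2f}, confidence={conf:.2f})")
```

The rules discovered here (bread and butter predict each other) are exactly the kind of relationship found "without explicit guidance" that the bullet describes.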

2. Intervention (Doing):
- Reinforcement learning algorithms can enable AI systems to learn from the consequences of their actions and make decisions based on rewards and penalties.
- Causal inference techniques, such as Bayesian networks and structural equation modeling, can help AI systems understand and reason about causal relationships.
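A minimal sketch of the reinforcement-learning idea: tabular Q-learning on a toy corridor, where the agent learns from rewards which action to take in each state. The environment, reward, and hyperparameters are all illustrative assumptions:

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning on a toy corridor: states 0..n-1, actions
    0 (left) and 1 (right); reaching the rightmost state pays +1."""
    random.seed(0)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # Epsilon-greedy action selection (ties broken randomly)
            if random.random() < epsilon or q[s][0] == q[s][1]:
                a = random.choice([0, 1])
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Update toward the reward plus the discounted best future value
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = q_learning()
print([max((0, 1), key=lambda a: q[s][a]) for s in range(4)])  # greedy policy
```

The agent is never told that "right" is good; the preference emerges purely from the consequences of its actions, which is the point of this rung.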

3. Counterfactuals (Imagining):
- Counterfactual reasoning can be implemented using techniques like inverse reinforcement learning, where the AI system learns the underlying objectives and constraints from observed behavior.
- Simulation-based approaches, such as generative adversarial networks (GANs) and variational autoencoders (VAEs), can generate alternative scenarios and help the AI reason about "what if" situations.
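Counterfactual reasoning can also be illustrated far more simply than with GANs or IRL, via Pearl's three-step recipe (abduction, action, prediction) on a toy structural causal model. Everything in this sketch (the model, coefficient, and observation) is invented for illustration:

```python
def outcome(treatment, noise):
    """Toy structural causal model: outcome = treatment effect + patient noise."""
    return 0.6 * treatment + noise

# Observed: the patient was untreated (treatment=0) and scored 0.2.
observed_treatment, observed_outcome = 0, 0.2

# Abduction: infer the noise term consistent with the observation.
noise = observed_outcome - 0.6 * observed_treatment

# Action + prediction: replay the *same* patient under the other treatment.
counterfactual = outcome(1, noise)
print(counterfactual)  # what the outcome would have been with treatment
```

The key move is reusing the inferred noise: the "what if" question is asked about this particular individual, not about the population average.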

4. Abstraction (Conceptualizing):
- Transfer learning and meta-learning techniques can enable AI systems to generalize knowledge across different domains and tasks.
- Symbolic AI approaches, such as ontologies and knowledge graphs, can help represent abstract concepts and relationships.
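A knowledge graph at its simplest is an adjacency map of typed edges, and abstract queries like "is a poodle an animal?" reduce to transitive reachability. A toy sketch (the ontology is invented):

```python
def is_a(graph, child, ancestor):
    """Transitively check whether `child` is a kind of `ancestor`."""
    stack = [child]
    seen = set()
    while stack:
        node = stack.pop()
        if node == ancestor:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return False

# A tiny ontology as an adjacency map of "is-a" edges.
ontology = {
    "poodle": ["dog"],
    "dog": ["mammal"],
    "mammal": ["animal"],
    "robin": ["bird"],
    "bird": ["animal"],
}
print(is_a(ontology, "poodle", "animal"))  # True: poodle -> dog -> mammal -> animal
print(is_a(ontology, "robin", "mammal"))   # False
```

This is abstraction in miniature: facts stated once at a general level ("mammals are animals") apply to every concrete instance below them.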

5. Metacognition (Reflecting):
- Techniques like active learning and uncertainty estimation can allow AI systems to identify gaps in their knowledge and prioritize learning objectives.
- Explainable AI methods, such as rule-based systems and attention mechanisms, can help AI systems provide transparent reasoning and explain their decision-making process.
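The uncertainty-estimation bullet can be sketched with predictive entropy: an active learner requests labels where its predicted class distribution is flattest. The predictions below are hypothetical model outputs, invented for the example:

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def pick_most_uncertain(predictions):
    """Active-learning heuristic: label the example the model is least sure about."""
    return max(predictions, key=lambda item: entropy(item[1]))

# Hypothetical model outputs: (example_id, class probabilities)
predictions = [
    ("a", [0.98, 0.01, 0.01]),  # confident
    ("b", [0.40, 0.35, 0.25]),  # uncertain: best candidate for labeling
    ("c", [0.70, 0.20, 0.10]),
]
example, probs = pick_most_uncertain(predictions)
print(example)  # "b"
```

Knowing *what it does not know* and prioritizing that gap is the metacognitive step this rung describes.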

6. Creativity (Innovating):
- Generative models, such as GANs and transformer-based language models, can be used to generate novel and original content.
- Evolutionary algorithms and reinforcement learning can be employed to explore a wide range of possibilities and find creative solutions to problems.
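A minimal sketch of the evolutionary-algorithm idea: a tiny genetic algorithm (selection, crossover, mutation) searching bit strings for a toy objective. All parameters and the fitness function are illustrative assumptions:

```python
import random

def evolve(fitness, length=12, pop_size=30, generations=60, mutation=0.1):
    """Minimal genetic algorithm over bit strings: keep the fitter half,
    breed children by one-point crossover, then mutate bits at random."""
    random.seed(1)
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, length)   # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < mutation) for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy "creative" goal: find a bit string of all ones (fitness = number of ones).
best = evolve(fitness=sum)
print(best, sum(best))
```

Recombination and mutation are the mechanical analogue of "combining existing knowledge in unconventional ways": no individual in the initial population solves the problem, yet the search assembles a solution from their pieces.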

7. Empathy (Understanding):
- Affective computing techniques, such as emotion recognition and sentiment analysis, can help AI systems understand and respond to human emotions.
- Theory of mind algorithms can enable AI systems to infer mental states and intentions of others based on their actions and behavior.
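Sentiment analysis in its simplest lexicon-based form is just summing word polarities. The tiny lexicon here is hand-made for illustration, not a real affective-computing resource:

```python
# A tiny hand-made sentiment lexicon (illustrative only).
LEXICON = {"love": 1, "great": 1, "happy": 1, "hate": -1, "awful": -1, "sad": -1}

def sentiment(text):
    """Score text by summing word polarities; the sign gives the overall label."""
    words = text.lower().split()
    score = sum(LEXICON.get(w.strip(".,!?"), 0) for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("I love this, it is great!"))  # positive
print(sentiment("What an awful, sad day."))    # negative
```

Real affective computing goes far beyond word counting, but even this crude version shows the shape of the task: mapping language onto an inferred emotional state.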

8. Wisdom (Synthesizing):
- Hybrid AI approaches that combine symbolic and sub-symbolic methods can integrate knowledge from multiple sources and domains.
- Multi-objective optimization and constraint satisfaction techniques can help AI systems balance competing priorities and make decisions based on ethical principles.
- Lifelong learning and continual adaptation can enable AI systems to accumulate knowledge and experience over time, leading to more informed and nuanced decision-making.
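The multi-objective bullet can be sketched as weighted scoring under hard constraints, with the constraint standing in for an inviolable ethical rule. The options, weights, and threshold are all invented for the example:

```python
def choose(options, weights, constraints):
    """Score options on several competing objectives, filter by hard
    constraints (a stand-in for ethical rules), then pick the best trade-off."""
    feasible = [o for o in options if all(rule(o) for rule in constraints)]
    return max(feasible, key=lambda o: sum(w * o[k] for k, w in weights.items()))

options = [
    {"name": "A", "benefit": 0.9, "cost": 0.8, "risk": 0.9},
    {"name": "B", "benefit": 0.7, "cost": 0.3, "risk": 0.2},
    {"name": "C", "benefit": 0.4, "cost": 0.1, "risk": 0.1},
]
weights = {"benefit": 1.0, "cost": -0.5, "risk": -1.0}
constraints = [lambda o: o["risk"] < 0.5]  # hard rule: never accept high risk

best = choose(options, weights, constraints)
print(best["name"])  # "B": best balance of benefit, cost, and risk within the cap
```

Note the structure: option A has the highest raw benefit but is excluded outright by the constraint, which is the difference between balancing priorities and trading away a principle.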

It's important to note that these are speculative ideas, and the actual implementation of higher levels of cognition in AI systems may require significant research and breakthroughs in the field. Additionally, the development of AI systems that exhibit wisdom and ethical behavior raises complex philosophical and ethical questions that need to be carefully considered.
Mar 2 8 tweets 2 min read
1/n Do you know what really stumps people? The understanding that space itself is computational! If you don't think it is, let me convince you that your bias is wrong! Seeing is believing and you'll see the proof.

2/n It's easier to see that spacetime (i.e., space + time) is computational. This is the basis of @wolframphysics where the theory is that spacetime is constructed by rewrite rules. Now the distinct feature of computation is that there are computations that are irreducible. Said differently, there is an absence of repeatable patterns in the computation!
Mar 2 11 tweets 2 min read
1/n Biology and computers share a common abstract framework. It's often recognized as information. The movement of information is what we call computation. One could frame all of physics as the movement of information. Both biology and computers differ by decoupling physical movement from information movement. Computation is virtual movement.

2/n One mindblowing realization: in a universe where information processes (computation) and physical processes (physics) are not decoupled (i.e., they are the same process), how then does virtual movement emerge from physics? Why is there even biology? Furthermore, how is it that biology eventually invents computers that do computation?
Mar 1 7 tweets 2 min read
I suspect that OpenAI is in a precarious position in its perceived leadership in the AI space. Several indicators suggest that their execution is unraveling. The most obvious is the uncompetitive pricing of their product. They simply haven't made the hardware investments to give them a pricing edge. They are just like the rest of us who have to pay a premium to Nvidia.
Feb 26 12 tweets 3 min read
Peirce's Speculative Rhetoric formulated in terms of Tensegrity and the Quaternion Process Theory of Cognition

2/n Peirce's systems are always based on a triad that forms a development chain. In Peirce's architectonic there are signs, inference, and rhetoric. In signs, you find the usual trichotomy such as icon->index->symbol. In inference you will find induction->deduction->abduction. In rhetoric, we speak not of inferences but of self-preserving processes.
Feb 25 5 tweets 2 min read
1/n Higher-level cognition always involves representations. Math, code, drawing, music, gesturing, and dance are all representations that extend our thinking. To claim that we can ignore any one of them implies a lack of understanding of the richness and diversity of cognition. We need slow thinking, and slow thinking depends on representations.

2/n How do we think deeper and broader with Generative AI? We use representations! We guide the AI to use representations that have reach. It's extremely surprising to me that many don't even realize this!
Feb 15 4 tweets 1 min read
The problem with LLM agent frameworks is that they need a different level of abstraction. Chaining workflows together is too rigid and brittle. Do humans wire each other to cooperate? We need more dynamic consensus-building abstractions. We need systems that anticipate and are robust to multiple failures while persistently seeking their goals.

What's surprising is that this new frontier is very predictable under the lens of C.S. Peirce's Architectonic. Ideas from more than a century ago. iep.utm.edu/peircear/
Feb 8 5 tweets 6 min read
1/n No Search, No Problem: Achieving Grandmaster Level Using Only a Transformer

A new research paper presents a groundbreaking advancement in chess-playing artificial intelligence, demonstrating for the first time that it is possible to train a neural network to play chess at a grandmaster level without relying on explicit search techniques. This finding challenges the long-held belief that sophisticated search algorithms are indispensable for mastering complex games like chess.

Historically, chess AIs such as Deep Blue and AlphaZero have depended on robust evaluation functions, extensive opening books, and advanced search techniques like alpha-beta pruning and Monte Carlo tree search to anticipate future moves. The question of whether neural networks could achieve expert-level play through supervised learning alone, without the computational overhead of search algorithms, remained open until now.

The breakthrough came by harnessing the power of modern transformers, scaled up to 270 million parameters, and training them on a dataset of 10 million human chess games annotated with strategic evaluations by the Stockfish 16 chess engine. This approach allowed the neural network to predict Stockfish's evaluations of new board positions accurately.

The performance of this neural network is exceptional, surpassing AlphaZero's value and policy networks, solving 93.5% of a wide range of chess puzzles, and achieving a blitz rating of 2895 on Lichess, a score higher than that of most grandmasters. Remarkably, this was achieved without employing any search strategies beyond evaluating all potential next moves.

This significant finding reveals that with enough model capacity and a substantial training dataset, it is possible to distill the complex search and evaluation algorithms of advanced chess engines like Stockfish into the parameters of a neural network. This represents a paradigm shift, suggesting that capable chess AIs can be developed without the need for manually designed heuristics or search algorithms.

The success of this approach underscores the potential of using transformers and self-supervised learning to approximate complex algorithms, opening new avenues for research into how far this technique can eliminate the need for search in strategic reasoning and its applicability to other domains. This work not only marks a milestone in AI chess but also signals a broader implication for the future of artificial intelligence in strategic reasoning tasks.

2/n Method details

Here is a detailed overview of the method used in the paper to create a transformer-based chess engine:

Data Collection and Annotation
- Download 10 million chess games played by humans on Lichess
- Extract all unique board positions from these games
- For each board position, use the Stockfish 16 chess engine to compute:
  - State-value: win percentage prediction (0-100%)
  - Action-values: win percentages for all legal moves
  - Best move: the move with the highest action-value
- This results in over 15 billion state-action pairs annotated with Stockfish evaluations

Model Architecture
- Use a standard transformer architecture from recent LLMs:
  - Decoder-only
  - 8 attention heads
  - Post-layer normalization
  - 270 million parameters
- Input representation: 77-token encoding of the current board's FEN string
- Output heads for value regression and action classification

Training
- Train the transformer to predict the Stockfish values using standard supervised learning:
  - Cross-entropy loss for classification over value bins
  - Adam optimizer
  - 10 million training steps (2.7 epochs)
  - Batch size 4096 on 128 TPUs
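The training target is cross-entropy over discretized value bins rather than direct regression. A sketch of what such a discretization could look like (the bin count here is an illustrative assumption, not the paper's exact setting):

```python
def win_prob_to_bin(p, n_bins=128):
    """Discretize a win probability in [0, 1] into one of n_bins uniform
    classes: the classification target for cross-entropy training."""
    return min(int(p * n_bins), n_bins - 1)

def bin_to_win_prob(k, n_bins=128):
    """Map a predicted bin back to the center of its probability interval."""
    return (k + 0.5) / n_bins

print(win_prob_to_bin(0.0))   # 0
print(win_prob_to_bin(1.0))   # 127
print(round(bin_to_win_prob(win_prob_to_bin(0.731)), 3))
```

Treating value prediction as classification over bins lets the network output a full distribution over outcomes instead of a single point estimate, at the cost of quantization error bounded by the bin width.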

Chess Policies
- Construct three policies based on network outputs:
1. Choose move with highest predicted action-value
2. Choose move that minimizes predicted next-state value
3. Pick highest probability move from policy head
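The first of these policies is simply an argmax over the network's per-move win percentages, with no lookahead beyond one ply. A sketch with hypothetical network outputs (the moves and values are invented):

```python
def argmax_policy(action_values):
    """Play the legal move whose predicted win percentage (action-value)
    is highest; no search beyond evaluating the next moves."""
    return max(action_values, key=action_values.get)

# Hypothetical network outputs for the legal moves in some position
action_values = {"e2e4": 0.54, "d2d4": 0.53, "g1f3": 0.52, "f2f3": 0.41}
print(argmax_policy(action_values))  # "e2e4"
```

This is the sense in which the engine is "search-free": all the lookahead normally done by alpha-beta or Monte Carlo tree search has been distilled into the values themselves.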

Evaluation
- Assess performance on:
- Puzzles: % solved correctly
- Prediction accuracy: State-value MSE, action accuracy
- Chess rating: Elo score from games against humans and bots
Feb 7 7 tweets 5 min read
1/n The Self-Discovery That's Redefining Reasoning

The self-discover method outlined in a new paper from Google marks a significant advancement in enhancing the reasoning capabilities of large language models (LLMs). It breaks away from the limitations imposed by predefined paradigms, allowing models to create unique reasoning structures tailored to each task. This flexibility not only improves performance but also provides valuable insights into structured reasoning.

Traditionally, language models have struggled with a one-size-fits-all approach to reasoning, leading to challenges in handling diverse tasks. While methods like step-by-step prompting have shown promise, they often fall short when faced with tasks requiring alternative reasoning flows. Self-discover addresses this issue by dynamically composing reasoning building blocks, enabling models to identify relevant modules and integrate them into customizable workflows.

Moreover, this approach overcomes the rigidity of human-authored templates, which are often suboptimal for unfamiliar domains. By granting models the freedom to create bespoke scaffolding through directed composition, rather than imposing logic chains from the top down, self-discover embraces the inherent complexity of reasoning. This leads to significantly improved performance on multifaceted tasks while maintaining efficiency in inference.

Analysis further reveals that the structures generated by self-discover exhibit transferability across models, indicating universal traits. This methodology provides transparent insights into how models encode reasoning processes, resembling compositional hierarchies found in human cognition. While there may be performance plateaus in the future, self-discover represents an exploratory venture into emergent reasoning by artificial agents, transcending the constraints imposed by human boundaries.

By prioritizing student-driven synthesis of reasoning forms over predefined routines, this inquiry unlocks previously inconceivable problem-solving patterns for models. It heralds an era where we can learn as much from machines about chained cognition as they can learn from our elucidations. This illumination of structure genesis across models advances efforts to cultivate generalizable, composable thought.

2/n Here are some key pain points of existing systems for improving language model reasoning, and how Self-Discover addresses them:

1. Reliance on fixed reasoning paradigms:
- Existing methods like chain-of-thought rely on a predetermined reasoning approach that is ill-suited to many tasks.
- Self-Discover allows models to compose task-specific structures from modular blocks.

2. Lack of flexibility:
- Methods depend on human-authored decompositions or structures.
- Self-Discover enables models to self-direct structure creation.

3. Failure to adapt structure to task:
- Even learned approaches optimize one structure for all tasks.
- Self-Discover discovers custom structures per task, unlocking greater reasoning potential.

4. Inference inefficiency:
- Ensemble and multi-sample approaches are computationally expensive.
- Self-Discover matches or exceeds their performance with 10-40x fewer calls.

In summary, by enabling language models themselves to flexibly compose reasoning building blocks suited to novel tasks, Self-Discover overcomes the brittleness, inflexibility, and inefficiency of existing reasoning systems.

The automated discovery process allows capturing unique reasoning patterns for each task in a way that static approaches cannot. This self-directed composition of reasoning workflows is the critical driver of enhanced performance.
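To make the idea concrete, here is a toy stand-in for Self-Discover's module selection and composition, using keyword matching where the real method prompts an LLM; the modules and cue words are invented for illustration:

```python
# A toy stand-in for Self-Discover's select-and-compose stages: pick the
# reasoning modules relevant to a task, then arrange them into a structure.
MODULES = {
    "break into steps": ["plan", "steps", "procedure"],
    "think critically": ["verify", "check", "evidence"],
    "simplify": ["estimate", "approximate", "roughly"],
}

def self_discover_structure(task):
    """Select reasoning modules whose cue words appear in the task, then
    compose them into an ordered structure to be filled in at solve time."""
    words = set(task.lower().split())
    selected = [m for m, cues in MODULES.items() if words & set(cues)]
    return {f"step_{i + 1}": m for i, m in enumerate(selected)}

structure = self_discover_structure("Plan the steps and check the evidence")
print(structure)
```

The real method replaces the keyword matching with LLM calls at each stage, but the output has the same shape: a task-specific scaffold of reasoning blocks rather than one fixed chain imposed on every problem.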
Feb 3 6 tweets 7 min read
1/n Discovered this book (h/t @Extended_Brain). Let's look into some nuggets of wisdom!

2/n In his chapter on "Personal Knowledge," Michael Polanyi argues that all knowledge involves personal participation and commitment on the part of the knower. He introduces the concept of "tacit knowing" to describe the process by which personal knowledge is accumulated. Tacit knowing stands in contrast to the ideals of detached objectivity and value neutrality often associated with scientific knowledge.

At the heart of tacit knowing is subsidiary awareness—attending to one thing by focusing on another related or connected thing. For example, we may identify a person by his clothes, or we attend to the weight of a hammer in our palm as we focus on driving the nail. What we are focally aware of and what we are subsidiarily aware of mutually depend on each other in tacit knowing. Our subsidiary awareness of clues, instruments, and context allows us to comprehend the focal target, while the target itself determines what counts as clues or instruments relevant to discerning its nature.

Tacit knowing pervades multiple forms of skillful achievement, including practical skills like cycling and swimming but also more abstract capabilities like reading comprehension or facial recognition. It has a from-to structure—we go from perception of subsidiaries to comprehension of a coherent whole. This always involves our active shaping and organizing of subsidiaries to integrate them for meaning.

Polanyi identifies three key aspects to tacit knowing: functional, phenomenal, and semantic. The functional aspect is the from-to relation itself and how we dwell in the particulars to attend to the whole. The phenomenal aspect is that through integrative acts like binocular vision or reading, we achieve a new phenomenal experience beyond what direct inspection of the parts would indicate. Finally, the semantic aspect is the meaning-giving relationship where subsidiaries acquire their sense by bearing on the focus.

An important implication is that all knowledge depends on personal judgment to turn clues into comprehension. There are no explicit rules determining what coheres or what is meaningful. As Polanyi puts it, "into every act of knowing there enters a tacit and passionate contribution of the person knowing what is being known." While aiming at an external reality, our understanding relies fundamentally on internal processes of integration that connect knower and known. Tacit knowing is an inescapable and universal feature of human knowledge.