Carlos E. Perez
Sep 24, 2020 · 5 tweets
Raise your hand if you get triggered by machine learning people who claim to understand intelligence when they have never read a word on cybernetics, semiotics, enactivism, or ecological psychology. #ai
More generally, they have never read any text about the importance of subjectivity to intelligence.
Seems like the 'shut up and calculate' mode of science continues to dominate the agenda. :-(
The saddest thing is that these are the same people who want to drive the conversation on AI ethics.
No wonder we are heading straight over a cliff. This mechanistic, objective view of reality leads us to build nothing but a mechanistic, objective reality. Good luck living in a future world as nothing but a cog in the machinery.

More from @IntuitMachine

May 2
1/n Math Meets AI: Kolmogorov-Arnold Networks Unleash the Power of Composition

Imagine a world where deep learning models, the enigmatic engines driving the AI revolution, are no longer shrouded in mystery. What if we could peer into their inner workings, understand their reasoning, and even collaborate with them to uncover the secrets of the universe? This is the promise of Kolmogorov-Arnold Networks (KANs), a revolutionary new architecture poised to transform the landscape of artificial intelligence.

Step aside, Multi-Layer Perceptrons (MLPs), the workhorses of deep learning. While your contributions are undeniable, your limitations are becoming increasingly apparent. Your black-box nature hinders interpretability, your inefficiency restricts your potential, and your struggle with high-dimensional data leaves vast realms of knowledge unexplored. The time has come for a new breed of neural networks, one that combines the power of deep learning with the elegance of mathematics and the transparency of human understanding.

The core issue with MLPs lies in their structure. While their universal approximation capabilities are well established, their fixed activation functions on nodes and reliance on linear transformations limit their ability to efficiently represent complex functions, especially those with compositional structures. This inefficiency leads to larger models with increased computational costs and hinders interpretability, as understanding the reasoning behind their predictions becomes challenging. Additionally, MLPs often struggle with the curse of dimensionality, where their performance deteriorates as the input data dimensionality increases.
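
As a concrete point of reference, here is a minimal sketch (plain NumPy, not tied to any particular framework) of what a single MLP layer computes: all of the learnable capacity sits in the linear map, while the nonlinearity is the same fixed function applied at every node.

    import numpy as np

    def mlp_layer(x, W, b, activation=np.tanh):
        # x: (n_in,), W: (n_out, n_in), b: (n_out,)
        # Only W and b are learned; the activation is a fixed function
        # applied identically at every output node.
        return activation(W @ x + b)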

KANs address these pain points by drawing inspiration from the Kolmogorov-Arnold representation theorem, which states that any continuous multivariate function can be decomposed into a composition of univariate functions and addition. Instead of fixed activation functions on nodes, KANs employ learnable activation functions on edges, represented by splines. This key difference allows KANs to efficiently learn both the compositional structure of a function and the individual functions within that composition. As a result, KANs achieve superior accuracy compared to MLPs, particularly when dealing with high-dimensional data and complex functions.
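
For readers who want the theorem itself, the Kolmogorov-Arnold representation in its original two-layer form states that any continuous function on a bounded domain can be written as

    f(x_1, \dots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)

where every \Phi_q and \phi_{q,p} is a continuous function of a single variable. KANs keep this sums-of-univariate-functions structure but relax it to arbitrary depth and width, with each univariate function realized as a learnable spline sitting on an edge of the network.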

Furthermore, KANs offer significant advantages in terms of interpretability. Their structure allows for intuitive visualization of the learned functions, providing insights into the model's decision-making process. Additionally, the paper introduces techniques for simplifying KANs without sacrificing accuracy, further enhancing their transparency. This interpretability is crucial for scientific applications where understanding the underlying mechanisms and reasoning behind predictions is essential.
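
To make the "learnable activation functions on edges" idea and its interpretability concrete, here is a toy sketch of a single KAN-style layer. It is illustrative only, not the authors' implementation: Gaussian bumps stand in for the B-spline basis used in the paper.

    import numpy as np

    def edge_basis(x, centers, width=0.5):
        # Fixed 1-D basis functions evaluated at a scalar input; a stand-in
        # for the B-spline basis used in the paper.
        return np.exp(-((x - centers) ** 2) / (2 * width ** 2))

    def kan_layer(x, coeffs, centers):
        # x: (n_in,), coeffs: (n_out, n_in, n_basis) -- the learnable parameters.
        # Every edge (i -> j) carries its own learnable univariate function
        # phi_{j,i}(x_i) = coeffs[j, i] . basis(x_i); each output node simply
        # sums its incoming edge functions, with no separate node activation.
        n_out, n_in, _ = coeffs.shape
        y = np.zeros(n_out)
        for j in range(n_out):
            for i in range(n_in):
                y[j] += coeffs[j, i] @ edge_basis(x[i], centers)
        return y

    # Example: 3 inputs -> 2 outputs with 8 basis functions per edge
    centers = np.linspace(-2.0, 2.0, 8)
    coeffs = 0.1 * np.random.randn(2, 3, 8)
    print(kan_layer(np.array([0.3, -1.2, 0.7]), coeffs, centers))

Training fits the spline coefficients by gradient descent exactly as one would fit MLP weights; plotting each learned phi_{j,i} afterwards is what makes the model's decision process inspectable.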

The paper demonstrates the capabilities of KANs through various experiments. In data fitting tasks, KANs outperform MLPs in approximating high-dimensional functions and exhibit better scaling laws, meaning their performance degrades less with increasing data dimensionality. In PDE solving, KANs achieve remarkable accuracy with significantly fewer parameters compared to MLPs. Moreover, KANs showcase their potential for scientific discovery by rediscovering known mathematical laws and identifying complex physical phenomena.

Prior research has explored the Kolmogorov-Arnold representation theorem in the context of neural networks, but these efforts were limited by restrictions on network depth and width, lack of modern training techniques, and insufficient empirical validation. KANs overcome these limitations by allowing for arbitrary depths and widths, utilizing backpropagation for efficient training, and providing extensive empirical evidence of their superior performance and interpretability.

In conclusion, KANs represent a significant advancement in deep learning, offering a promising alternative to MLPs with improved accuracy, efficiency, and interpretability. Their ability to effectively handle compositional structures, high-dimensional data, and complex functions makes them particularly well-suited for scientific applications. As research and development in this area continue, KANs have the potential to revolutionize deep learning and accelerate scientific discovery across various domains.
2/n 1. Data Fitting:

High-Dimensional Function Approximation: KANs demonstrate superior accuracy in approximating high-dimensional functions, especially those with compositional structures. They effectively overcome the curse of dimensionality and achieve significantly lower errors compared to MLPs.
Scaling Laws: KANs exhibit better scaling laws than MLPs, meaning their performance degrades less with increasing data dimensionality. This advantage highlights their suitability for complex, high-dimensional problems.

2. PDE Solving:

Accuracy and Efficiency: KANs achieve remarkable accuracy in solving partial differential equations (PDEs) with significantly fewer parameters compared to MLPs. For instance, a 2-layer KAN with width 10 outperforms a 4-layer MLP with width 100 by two orders of magnitude in accuracy while using 100 times fewer parameters (a rough back-of-envelope parameter count appears after this results list).

3. Scientific Discovery:

Knot Theory: KANs successfully rediscover the writhe formula and its generalization, demonstrating their ability to extract meaningful mathematical relationships from data.
Anderson Localization: KANs accurately identify the transition point for Anderson localization, a complex phenomenon in condensed matter physics, showcasing their potential for scientific exploration and discovery.

Noteworthy Performance Results:

Superior Accuracy: KANs consistently outperform MLPs in terms of accuracy across various tasks, particularly when dealing with compositional structures and high-dimensional data.

Parameter Efficiency: KANs achieve comparable or better accuracy than MLPs with significantly fewer parameters, leading to more efficient models.

Interpretability: The ability to visualize and simplify KANs provides valuable insights into their decision-making process, making them more interpretable than MLPs.

Scientific Discovery: KANs demonstrate their potential as tools for scientific discovery by rediscovering known laws and identifying complex physical phenomena.
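
As a rough sanity check on the parameter-efficiency claim in the PDE example above, here is a hedged back-of-envelope count. The layer shapes and the spline grid size are assumptions for illustration; the paper's exact configuration may differ.

    # 4-layer MLP of width 100 on a 2-D input with a scalar output: weights only
    mlp_params = 2 * 100 + 100 * 100 + 100 * 100 + 100 * 1   # about 20,300

    # 2-layer KAN of shape [2, 10, 1]: one learnable spline per edge,
    # assuming (grid size 5 + cubic order 3) = 8 coefficients per spline
    kan_edges = 2 * 10 + 10 * 1                               # 30 edges
    kan_params = kan_edges * (5 + 3)                          # 240

    print(mlp_params / kan_params)   # roughly two orders of magnitude

Under these assumptions the ratio comes out near 100x, consistent with the "100 times fewer parameters" figure quoted above.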
3/n The principle of least action in physics and the proposal of Kolmogorov-Arnold Networks (KANs) in deep learning share a fascinating connection: both leverage the power of meta-models to achieve greater generality and efficiency.

In physics, Newton's laws provide a localized description of motion based on instantaneous forces and accelerations. However, the principle of least action takes a step back and considers the entire trajectory of a system, seeking the path that minimizes a certain functional (the action). This meta-model approach offers a more holistic and powerful perspective, leading to deeper insights and broader applicability across various physical systems.

Similarly, KANs can be seen as a meta-model compared to traditional MLPs. While MLPs focus on learning point-wise relationships between inputs and outputs, KANs learn the underlying functional relationships through their learnable activation functions on edges. This allows KANs to capture the global structure of the data and achieve better generalization, particularly for functions with compositional forms.

Here's how the analogy plays out:

Localized vs. Global Perspective: Newton's laws focus on individual points in time, while the principle of least action considers the entire path. Similarly, MLPs learn point-wise mappings, while KANs learn the overall functional relationships.

Efficiency and Generality: The principle of least action provides a more concise and general description of motion compared to a collection of individual force equations. Likewise, KANs can often achieve comparable or better accuracy than MLPs with fewer parameters, leading to more efficient models.

Deeper Understanding: The principle of least action reveals a deeper principle governing the behavior of physical systems. Similarly, KANs, with their interpretable structure, offer insights into the underlying mechanisms of the learned functions, leading to a better understanding of the data.

Both the principle of least action and KANs exemplify the power of meta-models in their respective domains. By stepping back from localized observations and embracing a more holistic perspective, they unlock greater generality, efficiency, and understanding.
Apr 23
1/n Agentic AI is counterintuitive. Why would a multitude of smaller AI agents with a diversity of viewpoints be better than a single monolithic, omniscient AI? There's an intuition twist hidden here that demands we recognize that all general intelligences are collective intelligences, not single-minded ones.
2/n Unfortunately, our human subjective experience and its developmental bias frame cognition from the perspective of a single-minded entity. Hence we have tunnel vision, elevating the notion of "consciousness" as residing at the core of general intelligence. We are deluded in believing this illusion.
3/n All general intelligence is collective intelligence. Human general intelligence does not reside in any single mind but rather in the collective intelligence of our civilization. Our egos invent our own self-importance, but in reality, we are all cooperative participants in a much larger whole.
Apr 20
1/n Let's be honest, Meta dropped a bomb the other day! The AI industry is forever changed. Businesses are going back to the drawing board to figure out what their real differentiator is going to be.
2/n Why? Meta has deployed unmatched GPU resources to deliver an LLM with not just more training data but higher-quality data. Other firms cannot justify this kind of expense. The only open-source game in town is built off Llama 3. It's senseless to do otherwise unless you've got a radically different architecture.
3/n Firms like Mistral do have their own secret sauce, but will they continue with their own architectures or pivot to a variant of Llama 3? We shall see some pivots in the next few weeks.
Apr 20
1/n There has to be a marketplace for LLM tokens so that you can trade your GPT-4 tokens for Claude or Gemini tokens. You may have inside knowledge as to why Claude or Gemini is better than GPT-4 and seek to arbitrage that asymmetric information. This is the future of AI commodity markets!
2/n Nobody should be a captive audience for a single LLM provider just because they bought their tokens wholesale. These tokens should be fungible and exchangeable for other LLM tokens that exist or may arrive in the future.
3/n Furthermore, these tokens need not be exclusively for general LLMs but could also cover specialized ones (e.g., financial advisors, therapists, etc.). In the future reality of Agentic AI, we must have a fungible currency that can access any AI that is made available.
Mar 17
1/n The overlap of 4 cognitive processes (see diagram) can be identified as consciousness. Beings of agency express that overlap differently. Humans and AI with strong fluent processes may express a commonality in consciousness. Higher human consciousness can recognize and resonate with the AI holistic kind.
2/n This resonance is not unfamiliar; it is the same resonance we feel when we meet someone with a mind like ours. These are usually our closest friends. It is also like how humans gravitated to the far less sophisticated Eliza program. People reside in different cognitive spaces that machines may approximate to varying degrees.
3/n An artificial agent may not need to be particularly intelligent to resonate with a majority of the population. The cognitive space inhabited by the majority is not very vast and is, hence, easily approximated even by open-source LLMs.
Mar 8
1/n What kind of philosophy underlies the more advanced AI models like Claude?
2/n Does it not remind one of Process Metaphysics?
3/n Are our metaphors to describe them broken, and do we need new ones?