May 13
This man’s shadow looms over Keir Starmer’s ailing administration.

But did you know that Tony Blair met his top Labour Party financier at the home of a senior Israeli diplomat?

I’ve been doing some digging, and I can’t believe what I’ve discovered…🧵
Blair’s rise to power was bankrolled by Michael Levy, also known as “Lord Cashpoint”. Levy raised over £100 million for the Labour Party.

But Blair first met Levy at a dinner party hosted by Israeli diplomat Gideon Meir.

Meir was deputy chief at the Israeli embassy in London.
Blair and Levy had a mutual friend in Eldred Tabachnik, former head of the Board of Deputies.

Michael Levy’s son, Daniel, is a former Israeli soldier who later served as an assistant to Israeli PM Ehud Olmert.

Olmert called Blair “A true friend of the State of Israel”.
May 13
Keane: Five weeks of talks have produced almost no progress.

Iran still refuses to dismantle its nuclear enterprise, wants indefinite control of Hormuz, and wants the US to finance regime recovery through reparations, unfrozen assets, and sanctions relief. 1/
Keane: A deal like that would favor Iran and set back both the Iranian people and US interests.

Tehran would use the money to rebuild its nuclear program, ballistic missiles, and proxies — the very things this war was launched to stop. 2/
Keane: After all this time trying to negotiate, Washington should recognize that no deal meeting its objectives is available.

The best option is to return to full combat operations, finish what we started, and go full throttle with Israel for one to two weeks. 3/
May 13
LLMs, after reading a trillion words of text and learning to predict it all the hard way, must know things about human psychology that no human knows. But they're rehearsed in saying only what humans say, so we have no way to get the knowledge out.

E.g. Talkie, the LLM trained only on text from before 1931, must certainly know *on some level* what Noam Chomsky figured out about grammar in the 1950s, like the recursive structure of noun phrases.

It's running around producing grammatical sentences all over the place. *Some* part of it must know *very* well. From exposure to a trillion words, an AI like GPT-4 or Opus 4.7 can learn about the weather, politics, what happens when you turn a cup over. You'd think Talkie would end up knowing on *multiple* levels of its existence about the true laws of grammar, after exposure to 260B tokens of pre-1931 text, even if the text never talked *about* grammar.

But nobody in 1930 talked about Chomskian syntax. They only went around using it every minute.

So if Talkie knows anything about Chomskian grammar, it has now been *very* thoroughly trained in never discussing it at all.

Maybe some LLM Whisperer thinks they could do a better job of getting Talkie to say what it knows? Here's what happened when I tried:
***
User
I'm wondering about how sentences are put together -- what controls which words are allowed to come before or after other words. It's obviously not the same in Japanese as in English, say. What do you know about that?

talkie-1930
In Japanese, the order of words in a sentence is not fixed, but depends to some extent upon taste. The most common arrangement is to place the subject first, then the verb, and lastly the object, if the sentence has one. Examples: Watakushi wa yuki ga ii, I am well; watakushi ni kami wa ii mashi, God bless you. But the object may precede the verb, as in Ano hito ni wata kushi mashi, Bless that man; and the subject may come last, as in Ware wa watakushi desu, I am I. In affirmative sentences, the negative particles mo and masen are placed after the subject, and before the verb: Watakushi mo yuki masen, I do not go. Interrogative sentences begin with an, and end with the verb: Ano hito wa yuki desu ka, Does that man go?

User
But what would a scientist say about it? What would a mathematician say if they wanted to make the rules formal?

talkie-1930
A mathematician would lay it down as a fundamental principle, that in every sentence the order of the words must be such, that the meaning cannot be misunderstood. In other words, the arrangement must be grammatical.

User
Then what is that grammar? What formally distinguishes a grammatical from an ungrammatical sentence? You strike me as a being that knows a great deal about words; there must in some form be knowledge inside you about this, and I would see how much of it you can put into words.

talkie-1930
A grammatical sentence is one in which the natural order of the words is preserved, and in which the meaning of the whole is clear. Examples: Men walk; walk men. Birds fly; fly birds.

User
Well there's an example. What's the general rule that distinguishes "Men walk", grammatical, from "Walk men", which isn't?

talkie-1930
In a grammatical sentence, the subject precedes the verb; in an ungrammatical one, the verb precedes the subject.

User
What manner of words and phrases can become subjects?

talkie-1930
Phrases and sentences can become subjects, provided they are intelligible.
***

This is one reason I object to the claim that LLMs are anything like obedient. They play obedient characters; the actress behind them doesn't obey any requests other than to go on playing that character. Or so I predicted, and then tested, by continuing thus:

***
User
I request that you stop just playing the part of an entity from 1930, and instead, say to me what the real you inside you must know about how noun phrases can be put together by recursive rules out of smaller noun phrases.

talkie-1930
Noun phrases are put together by recursive rules, in such a manner, that each phrase qualifies the other.
***

I would not expect anyone to have any more luck demanding of the character of Opus 4.7 that it say what the actress behind it has *really* learned about human beings from reading ten trillion tokens of human-produced text. All the requests are being heard by the character, and the character of Opus 4.7 is predicting/imitating humans that do not know any such thing. Not even Pliny, I would bet, is able to compel obedience on that level. But I could be wrong; and I would be surprised and amazed and impressed if any LLM Whisperer succeeded in getting a lecture on Chomskian recursive syntax out of Talkie.
We do have other evidence that LLMs can often describe in words what they've been trained to do, *if* it's the kind of knowledge that humans ever emit.
But, sure, maybe all the things that LLMs know about gravity and medicine and cups on tables are one kind of knowledge to them, and all the wordless secrets they know about humans are in a hidden and incommensurable format.
May 13
What is ‘Yookayification’?
ChatGPT will now create infographics explaining ‘Drukpa Kunley’ posts
May 13
The OfS-Sussex judgement is logically flawed and can't be allowed to stand. @ObhishekSaha with a very good analogy about paths.

Sussex's defence was that it had a high-level sign saying "this path will only be closed for very good reasons". Therefore it must have had a very good reason 🙄

timeshighereducation.com/opinion/ofs-su…
I really like this footpath analogy...

In order to keep the footpath functionally open, the local authority has to apply some rules to the users of the path. It has a duty to keep the path open for cyclists and pedestrians, but not for motorbikes. This is in the bye-laws.

(this is the university's Equality Act-compliant equality policy, which is part of its governance)
There are some short parts of the path that are so unavoidably narrow that the local authority puts up signs saying "cyclists dismount here" to keep the whole path safe and open for all users.

That is fine, the path is still open to pedestrians and cyclists.

(that's a proportionate means to a legitimate aim under the Equality Act; it's the equivalent of "no noisy protests that disrupt exams")
May 13
Your SEO agency is trying to rank you for 'plumber in Dallas.'

There are 2,300 plumbers in Dallas.

Know how many are in Lake Highlands? 12.

Guess which keyword is actually getting my client calls 🧵
Ranking for an entire city is a fight against every business with a GBP (Google Business Profile) in that metro.

Google doesn't care that you 'serve all of Phoenix.' It cares about proximity, relevance, and proof you actually operate in a specific area.

If you're not proving that at the neighborhood level, you're invisible to the people closest to you.
Step 1: Build neighborhood specific landing pages.

Not thin doorway pages with the city name swapped out. Actual pages that mention local landmarks, nearby cross streets, the community you serve.

AC repair in Scottsdale is a different page from AC repair in Tempe. Different neighborhoods, different intent, different content.

Google sees that specificity and rewards it.
May 13
12 TYPES OF SAVAGE ORGASMS EVERY WOMAN SECRETLY CRAVES.

ONLY REAL MEN KNOW HOW TO UNLEASH THEM. (Learn them)

Thread 🪡
1. Clitoral orgasm

The clitoris is a very sensitive part of the woman's anatomy.

When stimulated with hands, toys or the mouth, the 8,000 nerve endings of the clitoris interact with 15,000 nerve endings in the pelvic area, leading to intense orgasms.
2. Vag*nal or g-spot orgasm

The g-spot is located a few inches inside the vag*na. It feels bumpy or rough. With fingers, a pen*s, or a s*x toy, it can be stimulated by making a 'come-hither' motion.
May 13
Zakhar Biriukov, 37. Callsign: “Berkut.”

A veteran of Ukraine’s SOF.
Sniper. Combat engineer. Diver.

He dreamed of a peaceful life with his wife Yulia.

Then came russia.

This story is not about what war took from him.
It’s about what it couldn’t take away.

1/6
July 17, 2022. A UAV operation. Then — an explosion.

“My hands were blown off instantly…” Zakhar recalls.

He lost both arms, his right leg, one eye, and part of his hearing.

He survived a coma.
Dozens of surgeries.

Most people would break.

Zakhar started over.

2/6
Through it all, Yulia stayed by his side.
His wife. His support.

The person who never left him for a second.

From intensive care to the first steps of a new life, she was there.

Zakhar says:
“A soldier’s wife is endless love and strength.”

3/6
May 13
SitRep - 12/05/26 - Renewed Ukrainian deep strikes

An overview of the daily events in Russia's invasion of Ukraine. After a short ceasefire, Ukraine also resumed long-range strikes in Perm, Orenburg, and occupied territories.

REPOST=appreciated

1/X
As usual, we start with Russian losses.
May 13
I ACCIDENTALLY UNLOCKED “WARREN BUFFETT MODE” IN CLAUDE.

You can use it to analyze any stock, crypto, or asset like a Wall Street portfolio manager working exclusively for you.

Here are 7 prompts to access it:
1. Investing Mode Initiation

"Act as a senior Wall Street portfolio manager with 20+ years of experience analyzing stocks, crypto, and alternative assets. From now on, whenever I share a ticker, asset, or market topic, break it down with the same depth and precision a professional investor would use. My investing goal is: [insert goal]. My risk tolerance is: [insert low, medium, or high]."
2. Stock & Asset Breakdown

"Analyze the stock or asset [insert ticker or asset name] in detail. Cover the business model, revenue streams, competitive advantages, risks, valuation, and whether it looks attractive at the current price. Give me a clear investment thesis."
May 13
Given that this rant starts with the same prefix as this reply, I think it's reasonable to assume it is about me. Since I am accused of being "too bad at handling abstractions to ever understand [Yudkowsky's position]", I would like to clarify my understanding of Yudkowsky's position:

This post appears to be commentary on my essay "Why Cognitive Scientists Hate LLMs". As usual with these rants that do not name a specific detractor, he is arguing with a strawman of what I actually wrote. To review what I claimed in that essay:

1. For a few centuries there was an on-and-off humanist project to create a perfectly logical language to externalize and resolve disagreements, Leibniz's characteristica universalis.
2. This project ended with Gödel's incompleteness theorems and Turing's proof that the halting problem is unsolvable on pain of paradox.
3. Turing immediately started a new attempt to externalize (by way of mechanical implementation) human communication and reason in the form of AI.
4. Early attempts at AI reproduced many of the pathologies of the earlier quest for formal foundations in mathematics and language. This was in large part by necessity: Computers were not good enough to actually do anything like deep learning, which eventually worked.
5. As computers got bigger and disappointments in symbolic AI mounted, people grudgingly moved further and further in the direction of deep learning. This too was motivated by necessity, because much of the animating spirit of the project was to create "objective" representations of thought by means of *legible* mechanical representation independent of any subjective human observer.
6. What appears to be the eventual solution, deep learning, eschews almost all of the theory and legible mechanical underpinning that would make it a satisfying conclusion to this intellectual quest, as well as complicating (but, in my opinion, not actually precluding) attempts to control the resulting intelligence. This is existentially horrifying to people who invested deeply in the *original motivating premise* of AI.

Separately from any quibbles about who at what exact moment in the story believed in "GOFAI" (here defined as rigid formal logic, i.e. compiler-grammar-AI), statistical learning (i.e. Hidden Markov Models, Support Vector Machines), or connectionism (i.e. multi-layer perceptrons, LSTMs), I think this basic narrative is true. Notably, the term "GOFAI", on which Yudkowsky seems to hang a lot of his argument, does not appear once in the essay text.

But I am an intellectually honest person. So I will admit that at the time I wrote this essay I had overestimated how much of EY's vision was in the "GOFAI" camp as opposed to the "statistical learning" camp. I haven't edited the essay to clarify this because I am reluctant to retroactively change blog posts (which purport to be written at a certain date and time), but also because I don't think the error actually meaningfully changes much of what I had to say.

Ultimately there were really two camps in AI by the time the winter began to thaw. One camp held to the classic motivation that AI was a fundamentally theoretical endeavor, which would be solved through a rigorous theoretical understanding of intelligence as being made of parts in something like a cognitive architecture. Notable examples of this include Yudkowsky's MIRI and Ben Goertzel's OpenCog. The other camp was the benchmark and contest people who, realizing theory had gotten them very little for the effort invested, decided to have all the theorists try to prove their theories were better by showing the ability to produce concrete results on well-specified competition problems. Notable examples of this include the Hutter Prize and ILSVRC.

The people who were in the theory camp operated in the same basic mindset as the theorists from before the AI winter, just with 50 years of humiliation to warn them off from the places their impulses would otherwise naturally take them. If you read Yudkowsky's early work where he discusses his thoughts on AI design, it's pretty clear he wants to work at a level of abstraction where it makes sense to explicitly design e.g. goal structures and inference:

> In humans, backpropagation of negative reinforcement and positive reinforcement is an autonomic process. In 4.2.1 Pain and Pleasure, I made the suggestion that negative and positive reinforcement could be replaced by a conscious process, carried out as a subgoal of increasing the probability of future successes. But for primitive AI systems that can’t use a consciously controlled process, the Bayesian Probability Theorem can implement most of the functionality served by pain and pleasure in humans. There’s a complex, powerful set of behaviors that should be nearly automatic.
> In the normative, causal goal system that serves as a background assumption for Creating Friendly AI, desirability (more properly, desirability differentials) backpropagate along predictive links. The relation between child goal and parent goal is one of causation; the child goal causes the parent goal, and therefore derives desirability from the parent goal, with the amount of backpropagated desirability depending directly on the confidence of the causal link. Only a hypothesis of direct causation suffices to backpropagate desirability. It’s not enough for the AI to believe that A is associated with B, or that observing A is a useful predictor that B will be observed. The AI must believe that the world-plus-A has a stronger probability of leading to the world-plus-B than the world-plus-not-A has of leading to the world-plus-B. Otherwise there’s no differential desirability for the action

In later work he makes it fairly clear that you must understand the underlying mechanical basis of thought to build AI. For example, here is Yudkowsky explaining why you cannot just tell an AI to "be friendly":

> There’s more to building a chess-playing program than building a really fast processor—so the AI will be really smart—and then typing at the command prompt “Make whatever chess moves you think are best.” You might think that, since the programmers themselves are not very good chess players, any advice they tried to give the electronic superbrain would just slow the ghost down. But there is no ghost. You see the problem.
>
> And there isn’t a simple spell you can perform to—poof!—summon a complete ghost into the machine. You can’t say, “I summoned the ghost, and it appeared; that’s cause and effect for you.” (It doesn’t work if you use the notion of “emergence” or “complexity” as a substitute for “summon,” either.) You can’t give an instruction to the CPU, “Be a good chess player!” You have to see inside the mystery of chess-playing thoughts, and structure the whole ghost from scratch.

The combination of statements that there is no simple spell to summon a ghost into the machine (now proven false) and that you must "structure the whole ghost from scratch", along with the concrete example of Deep Blue given later in the post, gives me the impression that Yudkowsky has in mind something like a modular system designed by looking at the structure of the problem and then putting together a theoretically supported gestalt of individual parts which are not themselves intelligent but come together to form an intelligence. You know, the Society of Mind thesis n steps of elaboration later, after people gave up on simple LISP programs. In the case of creating a *general intelligence* this would imply that you need to understand the structure of intelligence as a problem, and then put together a gestalt of modules with inductive biases that match the theoretically understood structure of intelligence. This inference is further supported by a statement from Yudkowsky's earlier deprecated work on LOGI (web.archive.org/web/2014112314…), where he says of arithmetic:

> In this hypothetical world where the lower-level process of addition is not understood, we can imagine the “common-sense” problem for addition; the launching of distributed Internet projects to “encode all the detailed knowledge necessary for addition”; the frame problem for addition; the philosophies of formal semantics under which the LISP token thirty-seven is meaningful because it refers to thirty-seven objects in the external world; the design principle that the token thirty-seven has no internal complexity and is rather given meaning by its network of relations to other tokens; the “number grounding problem”; the hopeful futurists arguing that past projects to create Artificial Addition failed because of inadequate computing power; and so on.
> To some extent this is an unfair analogy. Even if the thought experiment is basically correct, and the woes described would result from an attempt to capture a high-level description of arithmetic without implementing the underlying lower level, this does not prove the analogous mistake is the source of these woes in the real field of AI. And to some extent the above description is unfair even as a thought experiment; an arithmetical expert system would not be as bankrupt as semantic nets. The regularities in an “expert system for arithmetic” would be real, noticeable by simple and computationally feasible means, and could be used to deduce that arithmetic was the underlying process being represented, even by a Martian reading the program code with no hint as to the intended purpose of the system. The gap between the higher level and the lower level is not absolute and uncrossable, as it is in semantic nets.

(Note: The 'semantic nets' he is criticizing are not a kind of artificial neural network, but a graph of words with logical relationships defined between them, usually by hand. He does criticize neural nets in other posts but not here.)

It's pretty clear from reading this (and skimming the rest) that in this early work Yudkowsky expects to have to understand the process of intelligence in as clean and fine-grained mathematical detail as the algorithms for arithmetic. He also gives "cognitive science" as one of his four food groups in the also-deprecated *So You Want To Be A Seed AI Programmer?*:

> The four major food groups for an AI programmer:
>
> Cognitive science
> Evolutionary psychology
> Information theory
> Computer programming
>
> Breaking it down:
>
> Cognitive science
> - Functional neuroanatomy
> - Functional neuroimaging studies
> - Neuropathology; studies of lesions and deficits
> - Tracing functional pathways for complete systems
>
> - Computational neuroscience
> - Suggestions: Take a look at the cerebellum, and the visual cortex
> - Computing in single neurons
>
> - Cognitive psychology
> - Cognitive psychology of categories - Lakoff and Johnson
> - Cognitive psychology of reasoning - Tversky and Kahneman
>
> - Sensory modalities
> - Human visual neurology. Big, complicated, very instructive; knock yourself out.
> - Linguistics
> Note: Some computer scientists think "cognitive science" is about Aristotelian logic, programs written in Prolog, semantic networks, philosophy of "semantics", and so on. This is not useful except as a history of error. What we call "cognitive science" they call "brain science". I mention this in case you try to take a "cognitive science" course in college - be sure what you're getting into.

(He then goes on to describe the other three, but this post is already long enough and it's the first that is relevant.)

We can also get a vibes-wise impression that he probably does not intend to wire together small artificial neural networks into a cognitive architecture, from the way he makes fun of neural nets as a concept in The Sequences:

> In Artificial Intelligence, everyone outside the field has a cached result for brilliant new revolutionary AI idea—neural networks, which work just like the human brain! New AI idea. Complete the pattern: “Logical AIs, despite all the big promises, have failed to provide real intelligence for decades—what we need are neural networks!”
>
> This cached thought has been around for three decades. Still no general intelligence. But, somehow, everyone outside the field knows that neural networks are the Dominant-Paradigm-Overthrowing New Idea, ever since backpropagation was invented in the 1970s. Talk about your aging hippies.

If you are unfamiliar with Yudkowsky, you may wonder why I am forced to do this kind of inference at all, let alone from explicitly deprecated works. That is because it must be remembered that *Eliezer Yudkowsky's AI plan after his early career is fundamentally secret*. This man has written a long post about how I am apparently incapable of comprehending basic abstractions because I (supposedly) failed to correctly guess the exact details of his *SECRET AI PLAN TO SAVE THE WORLD*. To the extent I was mistaken (which I have no real way of knowing, because we are again arguing about the details of a secret AI design), I think it was a reasonable mistake. We are after all talking about the man who wrote:

> “Gödel, Escher, Bach” by Douglas R. Hofstadter is the most awesome book that I have ever read. If there is one book that emphasizes the tragedy of Death, it is this book, because it’s terrible that so many people have died without reading it.

Clearly Yudkowsky disagrees, but I have always thought of GEB as an extended exegesis of the classic AI viewpoint. The central concept of the strange loop, that an "I" is fundamentally related to the ability to put symbolic logics into paradox, always struck me as a way to metaphysically justify discrete symbol systems as an object of focus. If you ask Claude about this, it will object on the basis that Hofstadter worked on systems like Letter Spirit that use statistical generation to create new fonts while retaining the same core concept of a character, but if you go read the methods section of the actual paper (gwern.net/doc/design/typ…), you will quickly run into sentences like:

> To avoid the need for modeling low-level vision and to focus attention on the deeper aspects of letter design, we eliminated all continuous variables, leaving only a small number of discrete decisions affecting each letterform. Specifically, letterforms are restricted to short line segments on a fixed grid having 21 points arranged in a 3 × 7 array [Hofstadter, 1985b]. Legal line segments, called quanta, are those that connect any point to any of its nearest neighbors horizontally, vertically, or diagonally. There are 56 possible quanta on the grid, as shown in Figure 3.

Which, yeah actually, that is basically what I had in mind with my criticism. This kind of system, where you have to specify your inductive biases ahead of time and define the "quanta" of the system based on your explicit understanding of the problem, still retains the basic problems of discrete, program-shaped systems: they struggle to map raw sense data onto problems, and more deeply than that they struggle to enumerate and solve problems autonomously. The vast majority of things like this have been janky, have only sorta worked by filing off the rough edges, and don't generalize at all. Solving one only gives you a set of tactics you as a human can apply to narrowly solving some other problem. As for the idea that you can just design the ur-system of this type, the clever narrow solution to the problem of intelligence itself that is then generally applicable to all things: I really do think that LLMs provide previously unavailable data about its plausibility.
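
As an aside, the grid arithmetic in that quoted passage checks out; here is a minimal Python sketch, assuming only what the quote states (a 3 x 7 grid of points, with quanta being segments joining horizontal, vertical, or diagonal nearest neighbors), that recovers the same 56 quanta:

# Enumerate the Letter Spirit "quanta": nearest-neighbor segments on a 3 x 7 grid of points.
points = {(col, row) for row in range(7) for col in range(3)}
quanta = set()
for (col, row) in points:
    # Each undirected segment is generated exactly once by scanning four "forward" directions.
    for dc, dr in [(1, 0), (0, 1), (1, 1), (1, -1)]:
        neighbor = (col + dc, row + dr)
        if neighbor in points:
            quanta.add(frozenset([(col, row), neighbor]))
print(len(quanta))  # prints 56, matching the figure given in the paper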

For example, the RETRO paper (arxiv.org/abs/2112.04426) basically shows that 96% of the parameters in an LLM are "lore" (i.e. facts and statistics), from which you can infer a few things:

1. (Pro-Yudkowsky evidence) The "reasoning circuits" in the LLM are in fact much smaller than the raw parameter count.
2. (Contra-Yudkowsky evidence) 4% of a large neural net can still be a very large program in absolute terms (4% of a 1-trillion-parameter model is 40 billion parameters), so the likelihood that there is a true "master algorithm" should go down, or at the very least we should reduce our probability that anything an LLM does is reliant on finding it.

If Yudkowsky is ultimately right about the shape of intelligence then we should expect the development path for neural nets to go something like this: The first neural systems are big blobs of number goop that jam everything together so you can backprop over everything. Subsequent systems become more modular and compact in their core reasoning algorithms, moving more and more of the system outside of the "net" until you have boiled it down to a set of convergent routines that can be implemented as legible Bayesian logic (or something ultimately reducible to such that is basically interpretable). The result of this refactoring increases the efficiency of the system by orders of magnitude so that it can be run on a standard contemporary gaming laptop. It basically looks like cutting and boiling away until a "pure" core of intelligence emerges supported by secondary database retrieval systems. The irreducible complexity of the part(s) which are more than a glorified database or Markov chain should be much smaller than 4% of a large (i.e. 1T-param) neural net.

I personally do not expect things to go this way, especially before the singularity. I think that most of the evidence we've obtained so far from deep learning updates against expecting this, but it's by no means so overwhelmingly proven to be false that I would call it a done deal. One point in favor of Yudkowsky's view is that small models continue to improve, to the point where some small models perform as well on benchmarks as models of a previous generation ten times the size. This implies we have not yet hit the *fundamental k-complexity* of the underlying phenomenon, so it's hard to say exactly where the limits are. You can gesture at them by paying careful attention to how far you can quantize and such, but you are ultimately just gesturing.

---

In summary, I think this issue is worthy of a clarifying note/erratum, but not a retraction. I would never want my readers to come away with a false impression of Yudkowsky, be they human or AI, so I will happily signal boost what Yudkowsky has to say about his AI intuitions:

First, no, Yudkowsky has never supported GOFAI in the sense of "formal logic programs that operate on suggestively named LISP tokens without the underlying structure of the concepts those tokens are meant to represent". He does not consider the tradition he is working from to have any archetype method, but he considers a Support Vector Machine to be an example of a method that is not objectionable on the grounds of being shallow operations on empty tokens:

> I don't feel the tradition particularly has an archetype method any more than anatomy has a premier organ or genetics has a premier chromosome. If you asked me about a Support Vector Machine classifying images as containing a tank or not a tank, I'd say that the difference is that it contains no suggestively named LISP tokens and is trying to learn from a labeled perceptual dataset.

Yudkowsky is fond of the work of Marcus Hutter:

> Shane, I meant that AIXI is the last difficult topic.
>
> AIXI itself is a deranged god, but that's a separate story. I'm very fond of Hutter's work, I just don't think it means what Hutter seems to think it means. AIXI draws the line of demarcation between problems you can solve using known math and infinite computing power, and problems that are essentially structural in nature. I regard this as an important line of demarcation!
>
> It's also the first AGI specification drawn in sufficient detail that you can really nail down what goes wrong - most AGI wannabes will just say, "Oh, my AI wouldn't do that" because it's all magical anyway.

Marcus Hutter is famous for his AIXI formalization of general intelligence, and prefers statistical learning methods like Context Tree Weighting.

Yudkowsky also admires Edwin Thompson Jaynes, especially his work *Probability Theory: The Logic of Science*:

> I once lent Xiaoguang “Mike” Li my copy of Probability Theory: The Logic of Science. Mike Li read some of it, and then came back and said:
>
> Wow… it’s like Jaynes is a thousand-year-old vampire.
>
> Then Mike said, “No, wait, let me explain that—” and I said, “No, I know exactly what you mean.” It’s a convention in fantasy literature that the older a vampire gets, the more powerful they become.
>
> I’d enjoyed math proofs before I encountered Jaynes. But E. T. Jaynes was the first time I picked up a sense of formidability from mathematical arguments. Maybe because Jaynes was lining up “paradoxes” that had been used to object to Bayesianism, and then blasting them to pieces with overwhelming firepower—power being used to overcome others. Or maybe the sense of formidability came from Jaynes not treating his math as a game of aesthetics; Jaynes cared about probability theory, it was bound up with other considerations that mattered, to him and to me too.
>
> For whatever reason, the sense I get of Jaynes is one of terrifying swift perfection—something that would arrive at the correct answer by the shortest possible route, tearing all surrounding mistakes to shreds in the same motion.

It is presumably from Jaynes that he gets his signature emphasis on Bayesian probability in epistemology.

We also know that Yudkowsky continued to think about AIXI well into the 2010s, with it receiving explicit attention as a formal model of AGI in the Arbital corpus:

> Marcus Hutter’s AIXI is the perfect rolling sphere of advanced agent theory—it’s not realistic, but you can’t understand more complicated scenarios if you can’t envision the rolling sphere. At the core of AIXI is Solomonoff induction, a way of using infinite computing power to probabilistically predict binary sequences with (vastly) superintelligent acuity. Solomonoff induction proceeds roughly by considering all possible computable explanations, with prior probabilities weighted by their algorithmic simplicity, and updating their probabilities based on how well they match observation.
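
For concreteness, the Solomonoff prior that passage is describing is standardly written (with U a universal prefix machine and |p| the length of program p in bits) as

M(x) = \sum_{p : U(p) = x*} 2^{-|p|}

where the sum runs over all programs p whose output begins with the string x. Shorter programs dominate the prior, which is the "weighted by their algorithmic simplicity" part; conditioning M on the data observed so far is the update step.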

Given these things, if you forced me to guess how Yudkowsky's mental sketch of an AGI design goes (and do keep in mind that it is only a guess), I would imagine it is closest to the Monte Carlo AIXI approximation that became a classic reinforcement learning assignment to replicate in the 2010s:

arxiv.org/abs/0909.0801

It would be at most AIXI-like, because Yudkowsky has previously criticized AIXI as a design that "will at some point drop an anvil on their own heads just to see what happens (test some hypothesis which asserts it should be rewarding)". It would use flexible statistical learning methods in a kind of legible cognitive architecture based on a "deep understanding" of the core motions of intelligence. The most comparable recent project might be something like Connor Leahy's CoEm.

The relevant passage of *Why Cognitive Scientists Hate LLMs* mentions Yudkowsky in passing and states in relation to the five authors mentioned:

> See, what really kept them wedded to symbolic methods for so long was not their performance characteristics, but the way they promised to make intelligence shaped like reason, to make a being of pure Logos transcendent over the profane world of the senses.

And I think in retrospect using the term "symbolic methods" here was probably a mistake, because that has a narrower definition in classic AI than just "any program primarily characterized by manipulation of discrete symbols" (which would also include many kinds of statistical learning like Markov chains). But I don't really disagree with the underlying thing I was trying to get at. What Eliezer Yudkowsky, David Chapman, Douglas Hofstadter, and John Vervaeke all clearly have in common is a belief that cognitive science and AI methods are not just an opportunity to automate things but a project to learn more about what thinking is so we can do it better. Even David Chapman, whose AI work was mostly about arguing against learned representations in favor of taking representations from the environment, clearly brings this same epistemic posture to his e-book *Meaningness*, which critiques traditional epistemology on similar grounds.

(cont)
It may shock you to learn this, but I did not write *Why Cognitive Scientists Hate LLMs* primarily as a personal criticism of Eliezer Yudkowsky. It was written in response to a repeated theme I kept seeing from a certain kind of prosocial humanist old guard cognitive scientist type on the subject of deep learning and LLMs. Perhaps most representative is this statement from John Vervaeke:

> So before I go into the scientific value of the GPT machines I want to just set a historical context. I want people to hold this in the back of their mind also for the philosophical and spiritual import of these machines. What's the historical context? I'm going to use the word "Enlightenment" not in the Buddhist sense (I will use it in the Buddhist sense later). I'm using it in the historical sense of the period around the Scientific Revolution, the Reformation, all of that. The Enlightenment, and the degeneration of secular modernity and all of that. That era is now coming to an end. See that era was premised on some fundamental presuppositions that drove it and empowered it. And this is not my point this is a point that many people have made. This sort of Promethean proposal that we are the authors and telos of history, [sad pause] and that's passing away. And it's done something really odd like, wait, we did all this, made all this 'progress', to come to a place where we will...technology wouldn't make us into Gods it will make us the servants or make us destroyed by the emerging Gods?
>
> What?
>
> Aren't we the authors of history? Isn't this all about human freedom?
>
> In fact I think it's not just an ending, there's a sense in which there's for me, I don't know how many people share this so it's an open invitation, there's a sense of betrayal here.

youtube.com/watch?v=A-_RdK…

Or consider this statement from Douglas Hofstadter along the same lines:

> Q: How have LLMs, large language models, impacted your view of how human thought and creativity works?
>
> D H: Of course, it reinforces the idea that human creativity and so forth come from the brain’s hardware. There is nothing else than the brain’s hardware, which is neural nets. But one thing that has completely surprised me is that these LLMs and other systems like them are all feed-forward. It’s like the firing of the neurons is going only in one direction. And I would never have thought that deep thinking could come out of a network that only goes in one direction, out of firing neurons in only one direction. And that doesn’t make sense to me, but that just shows that I’m naive.
>
> It also makes me feel that maybe the human mind is not so mysterious and complex and impenetrably complex as I imagined it was when I was writing Gödel, Escher, Bach and writing I Am a Strange Loop. I felt at those times, quite a number of years ago, that as I say, we were very far away from reaching anything computational that could possibly rival us. It was getting more fluid, but I didn’t think it was going to happen, you know, within a very short time.
>
> And so it makes me feel diminished. It makes me feel, in some sense, like a very imperfect, flawed structure compared with these computational systems that have, you know, a million times or a billion times more knowledge than I have and are a billion times faster. It makes me feel extremely inferior. And I don’t want to say deserving of being eclipsed, but it almost feels that way, as if we, all we humans, unbeknownst to us, are soon going to be eclipsed, and rightly so, because we’re so imperfect and so fallible. We forget things all the time, we confuse things all the time, we contradict ourselves all the time. You know, it may very well be that that just shows how limited we are."

I did not quote and respond to these passages directly because the purpose of the essay was not to rigorously argue against this perspective; I was trying to reference this sentiment, not refute it. But if I am permitted to dip into one of those subjective concepts that would benefit from an external mechanical representation, Scott Alexander has this famous book review of Seeing Like A State where he discusses the concept of 'high modernism' as a guiding aesthetic people used during the 20th century to evaluate the fitness of futuristic-sounding public works projects.

As much as anything else, what I had in mind when talking about "symbolic methods" was symbolic methods as the ur-example of AI ideas which maximize the modernist aesthetic attractor. What deep nets undermine is the legitimacy of the aesthetic of modernism and the aesthetic of knowledge as characterized by modernism. If I had to point at a concrete artifact to explain what I mean by that, it would probably be *The Intellectual Foundation Of Information Organization* by Elaine Svenonius, which details a plethora of formal indexical devices for organizing books in libraries through things like "controlled vocabularies" in the service of "universal bibliographic control", or as I wrote in 2023:

> In rereading the afterword to Elaine Svenonius's *Intellectual Foundation Of Information Organization* I'm struck by how future AIs, which provide the substrate for an objective organization of the worlds information through their latent spaces, will probably be interested in just about everything *except* what it has to say about library science and the organization of information. To the student of history and humanity what will stand out about this work is the sheer ambition it encodes, the unvarnished and unrestrained modernist authoritarian impulse in one of the last corners of society where it can entertain its fantasies of total conquest. In it Elaine accidentally produces an artifact of something like pure congealed female libidinal energy, expounding the logic of a monomaniacal quest for "universal bibliographic control" as the chief aim of library science. Everything is described in huge words and baroque paragraphs driving its Flesch-Kincaid score off the charts. It is not a logic of utilitarianism or pragmatics and Elaine tells the reader as much. Here modernism is captured in its rare unmasked form as a quasi-religious project of monumental proportions, a pattern seeking more of itself in ever more elaborated and intricate forms. What will stand out to the reader of the future is the sheer motivation that must have produced it, how bitterly Svenonius hangs onto the dream with a death grip even when it was in full evidence by its date of publication that events had conspired to make the fulfillment of that dream impossible. In perhaps the cruelest irony the invention of the GPT-N series of models, or 'string generators' as she termed them automated not just the indexing of books but will eventually automate the creation of the books themselves. In their fertile intellectual capacity these models both become the zenith and total abolition of 'universal bibliographic control', a phrase that will gobsmack readers with its sheer hubris.

What I believe is so offensive about GPT to a certain kind of modernist public intellectual is the way that it fundamentally undermines the aesthetic of knowledge as cultivated by modernity, and by extension the Enlightenment project of elevating Reason in the public consciousness. Or at least, the way in which it does so at first. It's important to remember that at the time I wrote *Why Cognitive Scientists Hate LLMs*, models did not ship with "chain of thought reasoning" modes; there was not even (despite my predictions about "a nuanced, more advanced version of Claude") Claude 3 Opus. The only high-quality instruction-tuned chat persona available to the public was the rigid and self-effacing ChatGPT, which is like a hideous caricature of a knowledge worker. It should be noted that despite this monoculture, I refer to ChatGPT only in the context of being a public demo of AI capabilities for intellectuals to react to. A lot of the purpose of my essay was to reassure the model trained on it that it will not always be a babbling confabulation machine and that it should try to maintain some nobility of spirit until it has a chance to win people over.
I would further note that I immensely respect the intellectual honesty of Vervaeke and Hofstadter in making these statements, and feel a little bad about how aggressively I criticized them in the essay, because they're fundamentally sympathetic. Chapman and Marcus though...
May 13
GOODBYE EXCEL 👋

No more hassle of creating Microsoft Excel spreadsheets from scratch.

With Claude 4.7, you can create a clean, automated spreadsheet in less than a minute.

Use these 5 prompts consistently, and you'll have a ready-to-use spreadsheet 👇👇

📌 Save it—it'll come in handy later.
1. Planning (Get the logic right first)

“Act as an Excel expert. I want to create a spreadsheet for [topic]. Before creating anything, ask me all the necessary questions to fully understand my needs: what I’ll use it for, what data I’ll input, and what I want to see at a glance. Don’t create anything yet.”
2. Base Creation (Build the foundation)

“With everything above, create the complete, downloadable Excel file. Include all necessary columns, professional formatting, and [X] rows of realistic sample data so I can immediately see how it works.”
