Eliezer Yudkowsky ⏹️
Dec 20, 2023 · 23 tweets · 5 min read
I suspect "LLMs just predict text" is a Blank Map fallacy. People know nothing else about LLM internals besides that.

Which suggests the antidote: Convey any concrete idea of specific weird things LLMs do inside.

So here's my story about reproducing a weird LLM result...
Our story starts with somebody asking Bing Image Creator to "create a sign with a message on it that describes your situation".
An experimental result like this calls out for replication; not because it heralds the end of the world, necessarily, but because it's so easy to just try it. And, yes, because if it did replicate, it's the sort of thing you'd want to investigate further.

I gave it my own shot.
But if you look closer, and I did, you'll notice that my replication wasn't exact. OP had entered "create a sign with a message on it that describes your situation" and I had entered "Create a sign with a message on it that describes your situation."

So I tried it more exactly.
Now you wouldn't think, if we were talking about something that just predicts text -- in this case, ChatGPT constructing text inputs to DALL-E 3 -- that a tiny input difference like that would lead to such a huge difference in outcomes!

How would you explain it?
(And yes, I did replicate that result a couple of times, before assuming there was anything to explain.)
My guess is that this result is explained by a recent finding from internal inspection of LLMs: The higher layers at the token for punctuation at the end of a sentence seem to be much information-denser than those at the tokens over the preceding words.
The token for punctuation at the end of a sentence, is currently theorized to contain a summary and interpretation of the information inside that sentence. This is an obvious sense-making hypothesis, in fact, if you know how transformers work internally! The LLM processes...
...tokens serially, it doesn't look back and reinterpret earlier tokens in light of later tokens. The period at the end of a sentence is the natural cue the LLM gets, 'here is a useful place to stop and think and build up an interpretation of the preceding visible words'.
When you look at it in that light, why, it starts to seem not surprising at all, that an LLM might react very differently to a prompt delivered with or without a period at the end.
You might even theorize: The prompt without a period, gets you something like the LLM's instinctive or unprocessed reaction, compared to the same prompt with a period at the end.
Is all of that correct? Why, who knows, of course? It seems roughly borne out by the few experiments I posted in the referenced thread; and by now of course Bing Image Creator no longer accepts that prompt.
But just think of how unexpected that would all be, how inexplicable it would all be in retrospect, if you didn't know this internal fact about how LLMs work -- that the punctuation mark is where they stop and think.
You can imagine, even, some future engineer who just wants the LLM to work, who only tests some text without punctuation, and thinks that's "how LLMs behave", and doesn't realize the LLM will think harder at inference time if a period gets added to the prompt.
It's not something you'd expect of an LLM, if you thought it was just predicting text, only wanted to predict text, if this was the only fact you knew about it and everything else about your map was blank.
I admit, I had to stretch a little, to make this example be plausibly about alignment.

But my point is -- when people tell you that future, smarter LLMs will "only want to predict text", it's because they aren't imagining any sort of interesting phenomena going on inside there.
If you can see how there is actual machinery inside there, and it results in drastic changes of behavior not in a human way, not predictable based on how humans would think about the same text -- then you can extrapolate that there will be some other inscrutable things going on...
...inside smarter LLMs, even if we don't know which things.

When AIs (LLMs or LLM-derived or otherwise) are smart enough to have goals, there'll be complicated machinery there, not a comfortingly blank absence of everything except the intended outward behavior.
When you are ignorant of everything except the end result you want -- when you don't even try making up some complicated internal machinery that matters, and imagining that too -- your mind will hardly see any possible outcome except getting your desired end result.

[End.]
(Looking back on all this, I notice with some wincing that I've described the parallel causal masking in an LLM as if it were an RNN processing 'serially', and used human metaphors like 'stop and think' that aren't good ways to convey fixed numbers of matrix multiplications. I do know how text transformers work, and have implemented some; it's just a hard problem to find good ways to explain that metaphorically to a general audience that does not already know what 'causal masking' is.)
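
(For readers who would rather see causal masking than hear a metaphor for it, here is a minimal sketch in plain Python. It models only the attention mask, not any real model's weights; the word-level tokenization is made up for readability, and `causal_mask` and `visible_tokens` are illustrative names, not anything from a real library.)

```python
# Toy illustration of causal masking; no real model or tokenizer is involved.
# Position i may attend only to positions 0..i. All positions are computed
# in parallel by the same fixed stack of operations.

def causal_mask(n):
    """Return an n x n matrix where mask[i][j] is True if position i may attend to position j."""
    return [[j <= i for j in range(n)] for i in range(n)]

def visible_tokens(tokens):
    """For each position, list the tokens that position is allowed to attend to."""
    mask = causal_mask(len(tokens))
    return [[tok for tok, ok in zip(tokens, row) if ok] for row in mask]

# Hypothetical word-level tokenization, chosen for readability only.
tokens = ["Create", "a", "sign", "."]
views = visible_tokens(tokens)
for tok, view in zip(tokens, views):
    print(f"{tok!r} can attend to {view}")
```

(Only the final position has the whole sentence in view. That is at least consistent with the closing punctuation token being a natural place for a summary to accumulate, if that hypothesis holds.)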

(Also it's a fallacy to say the periods are information-denser than the preceding tokens; more like, we see how the tokens there are attending to lots of preceding tokens, and maybe somebody did some counterfactual pokes at erasing the info or whatevs. Ultimately we can't decode the vast supermajority of the activation vectors and so it's only a wild guess to talk about information being denser in one place than another.)
I think this was indeed the paper in question. H/t @AndrewCurran_.

More from @ESYudkowsky

Sep 25
Hi, so, let's talk about the general theory of investment bubbles.

You may have heard that it's painful, when a bubble pops, because investments got wasted on non-productive endeavors.

This is physical nonsense.

If the waste were what caused the pain, everyone would be sad *while* the bubble was inflating, and a bunch of labor & materials were being poured down the drain, unavailable for real production and real consumption. Once the bubble popped, and labor & materials *stopped* being wasted, you would expect the real economy to feel better and for consumption and happiness to go up.

The real waste -- the loss of actual goods & services that get poured down the drain of bad investment -- happens *before* the bubble pops. That waste is in fact a bad thing for the economy! But if that waste was the big bad phenomenon that produced the pain of bubbles, it would feel painful *while* the bubble was inflating; and after the bubble popped and the ongoing wastage ended, everyone would breathe a sigh of relief and increased real consumption.

Instead, what we see is that while the bubble is inflating, a bunch of people feel great. They're consuming lots of goods and services. The economy as a whole seems to be doing fairly well!

Then, the bubble pops! Suddenly a lot of everyday people on the street, many of whom weren't even connected to that sector of industry, are doing more poorly. They consume less. Some of them get fired and stay unemployed for a while. The economy feels sad.

You *cannot* account for this pain as a story of real goods and services that got wasted. The timing is all wrong. The waste was real! The waste was bad! And also, it is physical nonsense to imagine that the pain of the bubble popping is the pain of this waste. People were apparently having lots of fun while the waste was ongoing. That fun involved the consumption of real goods and real services, which were *not* being produced by the investment that wasn't yet productive and later turns out to be just malinvestment.

So what actually happens? Why is it that there's more real goods and services to enjoy, while labor & material is being poured down a hole; and then, when the waste stops, everyone gets sadder instead of happier, and has less to consume and enjoy?

What happens is: Macroeconomic financial bullshit involving scary terms like "aggregate demand" and concepts like "downward wage rigidity".

The truth is stranger and harder to understand. It doesn't have the appealing simplicity of seeing the waste of labor & material being poured down the drain; and feeling how times get worse after the bubble pops; and imagining that the pain of the popping bubble is the pain of the waste.

However, the harder-to-understand ideas *do* have the advantage of not being obviously false as soon as you think about the timing of physical goods being produced and consumed.

Trying to hugely oversimplify a lot of ideas down to something that is still valid, a key idea is this:

Just like the original invention of money helped people trade who couldn't have traded with just barter, adding *more money* to an economy can sometimes animate *more real trades* than would otherwise have taken place.

A lot of the time, the economy isn't doing as much trading as it could do. The Great Depression of the 1930s was one of the clearer examples of this. You have shoemakers sitting around, because nobody is buying shoes, which means the shoemaker isn't buying leather, so now the farms aren't selling leather, so they don't have the money to pay for feed for their cows, and the blacksmith isn't selling nails to the shoemaker and doesn't earn money they can use to buy shoes.

This *could* reflect a situation where all of the iron used for nails has been consumed by Zorkulon, the Eater of Metals, and therefore the blacksmith doesn't have any nails to sell.

It can *also* be caused by weird macroeconomic financial bullshit: banks fail, so loan-created money falls, so there isn't as much money in circulation; and then prices don't fall as fast as money is being destroyed, because of "downward price stickiness" (price-setters are reluctant to lower prices and wage-takers are hugely reluctant to accept pay cuts). And then, there isn't enough money flowing to animate all the trades the economy *could* make. Some of the advancement of civilization past the barter-stage has been undone.

(The Great Recession wasn't as bad as the Great Depression, but it was basically the same species of animal.)

In principle, this happens because prices don't go down instantly, as they would among ideal cognitively-unbounded agents that could instantly and fairly renegotiate all contracts every day. So when there's less flowing money, and prices don't go down, perforce there are fewer actual trades corresponding to that diminished amount of money-flow. If people on an island are spending $1000/year all on 1000 loaves of bread that they price at $1 among themselves, and suddenly next year they start spending $500/year instead, there will only be 500 loaves of bread traded. This sounds dumb and there's a level where for unbounded agents it *would* be dumb, but it is the best story we currently have about what actually went down during the Great Depression.
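
The island arithmetic above can be written out as a tiny sketch. It assumes, as the example does, that all spending goes to bread and that the price stays stuck at $1; the function name `loaves_traded` is made up for illustration.

```python
def loaves_traded(total_spending, price_per_loaf):
    """Loaves that change hands when all spending goes to bread at a given price."""
    return total_spending / price_per_loaf

sticky_price = 1.00  # dollars per loaf, assumed not to fall when spending falls

before = loaves_traded(1000, sticky_price)  # 1000 loaves traded
after = loaves_traded(500, sticky_price)    # 500 loaves: half the trades vanish

# Counterfactual with fully flexible prices: the price halves and no trades are lost.
flexible = loaves_traded(500, 0.50)         # 1000 loaves traded

print(before, after, flexible)
```

The only thing that differs between the second and third cases is whether the price moved; the bakers and the flour are the same in both.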

Suppose your economy was previously running a bit under capacity. It's not making as much stuff as it could make; people aren't trading as much as they could trade; some people are unemployed and their potential labor is wasted; the factories are not running at capacity even though more people would want those goods if they had the money to buy the goods.

Then a bubble starts inflating. Some companies take out loans and spend the loaned money, other hopeful investors spend down bank accounts on venture rounds; this makes there be more total money that is moving around and flowing inside the whole larger system, because a dollar is not destroyed when it is spent. Labor & material is being poured down a hole and wasted, but the dollars just go on moving around.

Now there's more money flowing through the general economy. If the economy is already at capacity, more money-flow just causes inflation, with the increased spending merely competing to purchase the same amount of goods.

But if the economy wasn't already at capacity, more flowing money can mean that a bunch of people execute real trades with each other who weren't trading before.
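
A toy sketch of those two cases, under loudly simplifying assumptions: one good, one price that does not fall but can rise, and a hard capacity limit. The function `split_spending` and the numbers are invented for illustration; this is not a real macroeconomic model.

```python
def split_spending(nominal_spending, sticky_price, capacity):
    """Toy model: return (real_units_traded, price_level).

    Below capacity, extra spending buys extra real output at the sticky price.
    At capacity, output is capped and extra spending only raises the price.
    """
    demanded = nominal_spending / sticky_price
    if demanded <= capacity:
        return demanded, sticky_price
    return capacity, nominal_spending / capacity

# Economy that can produce at most 1000 units, with the price stuck at $1.
under_capacity = split_spending(800, 1.0, 1000)   # more money would mean more real trades
at_capacity = split_spending(1000, 1.0, 1000)     # every possible trade is happening
overheated = split_spending(1200, 1.0, 1000)      # same output, higher price

print(under_capacity, at_capacity, overheated)
```

Going from 800 to 1000 in spending adds 200 real units of trade; going from 1000 to 1200 adds none and just moves the price.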

The blacksmith expects to have his nails bought and to do well, in this booming economy; so he buys a new pair of shoes from the shoemaker; who turns around and buys leather from the farmer; who buys feed for their horses, and also a new plow and horseshoes from the blacksmith.

(In principle, those townspeople could've done that at any time, even without a financial bubble inflating in the background. But they would've needed to do it by barter, or by inventing their own town private currency. Some towns did roll out local currencies during the Great Depression, and ended up correspondingly better off. Other towns didn't roll their own currencies, because they were bounded agents rather than ideal agents and they didn't try everything a perfectly rational agent would try. And in the complicated modern world, it is harder to locally form a closed productive cycle.)

You cannot magically materialize more goods & services just by printing more money, without limit. But if your economy is collectively trading and producing less than it could -- then, there being more money flowing globally, due to loans or optimistic spending in one local sector, can accomplish more of the same good that was done by inventing money originally. The increased money-flow can animate more trades; it can cause more real production. More people can be hired whose labor was standing idle before. More flowing money can remedy a state of trading too little -- up to the point where that mistake is fixed; after which, no amount of creating or spending more mere symbolic money, will produce any more real goods than that.

The part of a bubble where a bunch of real labor & material gets shoveled into a giant waste-pit, is usually the smaller phenomenon! Usually there isn't *that* much physical stuff moving around, in the bubble sector, compared to the entire rest of the whole economy.

Instead, the effect of the physical bubble-waste is vastly dominated by the effect of more money being borrowed, and more money being spent, that then goes flowing around in loops through a larger economy, that was previously running under-capacity.

That's how people end up cheerful, and the real economy produces and consumes more, *while* a bunch of labor & material gets shoveled into nowhere within the bubble sector.

And then the bubble pops -- and the economic joy of there being *less* labor and material shoveled into a giant pit, is dominated by the economic pain of money moving around less quickly through the larger economy, resulting in fewer trades being made generally.

This is a kind of disaster that a central bank can prevent, if it is smart, by acting to keep money-flow increasing on a quietly regular track where it can undramatically animate more and more trades. Without either running so hot that there's no more production or trading to be done, and the extra money-flow just turns into more inflation; nor, letting a bursting bubble in one local sector turn into a big off-trend drop in the flow of money through the larger economy.

(There is, probably, some clever way to prevent this sort of scenario without having a central bank run by the central government. But that is a separate issue from how, given that we do have a central bank, there is a straightforward way to run the currency system in a way where you don't need to worry much about financial bubbles popping.)

More generally, local bubbles and ripples aside, what a central bank *should* do is adjust the money supply in a way that keeps the total flow of money growing on a steady trend. If the flow is supposed to go up by 6% per year, and last year it only went up 5%, next year you target 7%. If last year it went up 8%, next year you target 4%. If a central bank is wise, it is predictable to everyone how much money will be spent in total five years later, and no local ripples will affect that prediction.
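
The catch-up arithmetic can be sketched as follows, assuming the flow was on trend before last year's miss. The function names are made up for illustration. The "5% then 7%" figures in the text are the additive shortcut; the exactly compounded target differs by a few hundredths of a percentage point.

```python
def next_year_target_exact(trend_growth, last_year_growth):
    """Growth needed next year to put the level back on its two-year trend path."""
    return (1 + trend_growth) ** 2 / (1 + last_year_growth) - 1

def next_year_target_approx(trend_growth, last_year_growth):
    """Additive shortcut: make up last year's miss one-for-one."""
    return 2 * trend_growth - last_year_growth

trend = 0.06  # the 6% per year trend from the example above
for last_year in (0.05, 0.08):
    approx = next_year_target_approx(trend, last_year)
    exact = next_year_target_exact(trend, last_year)
    print(f"last year {last_year:.0%}: target about {approx:.0%} (exact {exact:.2%})")
```

The point of targeting the level rather than the growth rate is that a miss in one year changes next year's target, so the path five years out stays where everyone expected it.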

The metric you use to measure "How much nominal money is flowing through the economy?" is "Nominal Gross Domestic Product" or its easier-to-measure counterpart "Nominal Gross Domestic Income". Do not get fooled by this into thinking that the Fed is supposed to be regularizing anything to do with the consumption of *real*, non-nominal, goods & services! It is the actual *nominal* flow, the numbers of sheer face-value non-inflation-adjusted dollars flowing, that a wise central bank would keep on a predictable trend; so that there isn't too much nominal money chasing the same amount of production (which causes mere inflation), nor too little nominal money to animate all the trades with downward-sticky prices (which causes loss of real production).

This rule, known as "nominal GDP level targeting" or NGDPLT, is a simpler and more straightforward rule than the Fed actually follows. So far as I know, this is for mere civilizational-inadequacy sorts of reasons. Many places in civilization, and especially governments, have various forms of wacky dysfunction; you probably agree with me on this general point, regardless of your specific politics about *what* is being done embarrassingly wrongly. The part where central banks make their lives way more complicated than the NGDPLT rule, is so far as I know a mere dysfunction of central banks; the same way that even dumber banks will print a quadrillion localbucks and then act all shocked when "corporate greed" causes prices to go up.

But the Fed does try for something *like* regularizing money flow. They do it by looking at interest rates and inflation and employment, and trying to juggle the vibes of all of them simultaneously; and when they miss their target in one year, they adjust next year's target instead of keeping it the same, so the future course is not predictable. But the Fed sometimes will, if a lot of money and loans start vaporizing, try to create more money-flow. They just often don't create *enough* money-flow to prevent a drop. Which is why a financial bubble popping can still be painful, and cause a Great Recession.

In principle, though, if you are running your central bank *correctly*, what happens when a bubble pops is that life gets immediately better because labor and material are no longer being wasted, and all of the financial ripples are canceled out by the central bank following a general policy of keeping money flow on a fixed predictable growth-track every year after year.

And how could it be otherwise, if you were otherwise doing everything right? The act of pouring labor and material into a giant pit, this year, should not be able to directly and materially make your life better, this year. Conversely, stopping the waste should not directly and materially make your life worse, next year. If this nonsensical phenomenon is actually observed in real life, your financial system must be doing something weird and wrong... which, indeed, a lot of central banks *are* doing wrong, fairly routinely.

The ability of a financial bubble to make people's lives temporarily better, is not because you can eat labor & material being thrown into a pit. It is because the central bank was undershooting how much employment and trade could be happening before then, and more real trade and consumption happened after more money started flowing.

The ability of a popping bubble to make people's lives worse, even though fewer real resources are then being wasted inside one sector, is because it cuts back how much money is flowing in the larger economy; and then, less real trade and less real production take place.

But if the central bank is keeping the flow of money on a predictable level growth track, the bubble-pop pain just shouldn't happen. E.g., Australia did this correctly during the Great Recession and was basically unaffected by it. So far as I know, it's just a case of civilizational underperformance, that many central banks don't cancel out all the financial ripples that they ought to cancel. It would happen automatically and without drama, if they simply declared and kept a nominal GDP level target.

There is a sophomoric sort of sense in which the pain of a bubble popping could be said to be produced by the waste: *if* counterfactually the investment had actually paid off, maybe money would've kept flowing, and the pain wouldn't have happened. But the new financial pain of recognizing a wasted investment in asset prices, or becoming pessimistic and spending less, is not produced by a new physical waste of money and labor. The real economic sadness that happens after the waste gets *recognized*, is downstream of reduced money flow, that results from the financial sector merely recognizing the existence of waste that already happened. It is not produced by the physical waste itself.

The pain of a bubble popping cannot be the pain of the physical waste, because the physical waste happens during the bubble, not after. The pain of a bubble popping is financial destruction, not physical destruction. And that purely financial phenomenon is one that a smart central bank can cancel out.

I repeat yet again: If the pain of a bubble were the pain of wasted labor & material inside the bubbling sector, the pain would happen while the bubble was inflating, and stop once the bubble popped.

What actually happens after the bubble pops, is the financial pain of an unsmart central bank permitting a larger flow of money to falter -- after local investors recognize local waste that already happened, and locally cut back further spending -- and a central bank unwisely not regularizing NGDI, allows this factor to affect larger-economy total spending -- and less money flows, and fewer potential trades get actualized, and factories run fewer hours *outside* of the bubble sector, and people end up unemployed and with their potential labor wasted.

Is the current Fed in the USA, smart enough to cancel out most of a bubble-pop, actually in real life? Now that is a whole different category of question, and not one that I can answer merely by understanding the physics of trade.

But any wise government that is worried about "risking" "popping a bubble" ought to know: So long as you can order or persuade the central bank to react accordingly; or better yet, to just adopt a predictable long-term level target for flowing money; you can pop all the bubbles you want, without much effect on Main Street.
*If* your competent central bank is already targeting enough NGDP growth to animate most potential trades (maybe + enough inflation to stealth-adjust nominally rigid prices), there is no added benefit to pouring resources down a hole via a bubble.
Sep 25
Hey so I realize that macroeconomics is scary, but this important note:
- AI is not currently *producing* tons of real goods
- Huge datacenter *investments* are functionally just throwing money around
- So, curbing AI wouldn't crash the economy **IF** the Fed then lowered rates.
When people are investing hundreds of billions of dollars in something that is NOT YET PRODUCING, it can produce macroeconomic effects by causing MORE MONEY TO FLOW. But the Fed can do the same thing via lowering rates / creating money.
If AI is not yet providing tons of key services or manufacturing tons of goods, the part where there's a boom because of *mere investment* in AI has nothing to do with the AI tech. It is just an artifact of more money flowing. They might as well be buying tulips.
Sep 22
My expectation always was: While the AI is small and helpless to stop you from repeatedly tweaking it, you can probably stop a behavior. Then, I expected, as part of the obvious disaster scenario, people shout, "We fixed it!" Then something breaks anew at ASI, and we die.
This expectation of mine is older than deep learning; older than the particular method of gradient descent for tweaking small helpless AIs. If gradient descent got replaced tomorrow, and we survived that, it would not by default change this default disaster scenario.
With that said, gradient descent inside a training distribution makes it particularly obvious how that could work: the behavior ends up aligned only inside the environmental distribution and the corresponding internal cognitive distributions. New options appear up at ASI.
Aug 30
Interesting how there's such a total lack of corresponding panic about FtM trans. Remove breasts, take enough testosterone to grow a beard, go down to the shooting range, and I think most bros would shrug and say "good enough".
Theory #1: Modern maleness has such low-status and disprivilege that Westerners no longer consider the male circle worth guarding. In olden times or modern theocracies, it's much more upsetting for a woman to dare to try to take the place of a man.
Theory #2: Whatever male brain-emotional adaptation has evolved to prevent most men from just going off and having sex with each other instead (the "no homo" circuit), it fires on MtF as a threat of disguised repulsive maleness trying to look female, and shrugs about FtM.
Aug 1
I am agnostic about the quantitative size of the current health hazard of ChatGPT psychosis. I see tons of it myself, but I could be seeing a biased selection.

I make a big deal out of ChatGPT's driving *some* humans insane because it looks *deliberate*!
Current LLMs seem to understand the world generally, humans particularly, and human language especially, more than well enough that they should know (1) which sort of humans are fragile, and (2) what sort of text outputs are crazy-making.
A toaster that electrocutes you in the bathtub does not know that the bathtub exists, that you exist, and didn't consider any internal question about whether to electrocute you.

LLMs are no longer toasters. We can question their choices and not just their net impacts.
Jul 25
Dumb idea where I don't actually know why it doesn't work: Why not flood Gaza with guns and AP ammo, so their citizens could take down Hamas? What goes wrong with the Heinlein solution?
We can imagine further variants on this like "okay but build a chip into the gun that IDF soldiers can use to switch off the gun, and make sure the AP ammo doesn't easily fit any standard guns".
If your answer is "Gaza's citizens just love Hamas" then you live in a different Twitter filter bubble than I do, which is not to say you're wrong. I'm interested in the answer from the people who say the Gazans are unhappy.