Eliezer Yudkowsky
Mar 13
"Ignore all these elaborate, abstract, theoretical predictions," the Spokesperson for Ponzi Pyramid Incorporated said in a firm, reassuring tone. "Empirically, everyone who's invested in Bernie Bankman has received back 144% of what they invested two years later."

"That's not how 'empiricism' works," said the Epistemologist. "You're still making the assumption that --"

"You could only believe that something different would happen in the future, if you believed in elaborate theoretical analyses of Bernie Bankman's unobservable internal motives and internal finances," said the spokesperson for Ponzi Pyramid Incorporated. "If you are a virtuous skeptic who doesn't trust in overcomplicated arguments, you'll believe that future investments will also pay back 144%, just like in the past. That's the prediction you make if you predict based purely on empirical observations, instead of theories about a future nobody has seen!"

"That's not how anything works," said the Epistemologist. "Every future prediction has a theory connecting it to our past observations. There's no such thing as going from past observations directly to future predictions, with no theory, no assumptions, to cross the gap --"

"Sure there's such a thing as a purely empirical prediction," said the Ponzi spokesperson. "I just made one. Not to mention, my dear audience, are you really going to trust anything as complicated as epistemology?"

"The alternative to thinking about epistemology is letting other people do your thinking about it for you," said the Epistemologist. "You're saying, 'If we observe proposition X "past investors in the Ponzi Pyramid getting paid back 144% in two years", that implies prediction Y "this next set of investors in the Ponzi Pyramid will get paid back 144% in two years"'. X and Y are distinct propositions, so you must have some theory saying 'X -> Y' that lets you put in X and get out Y."

"But my theory is empirically proven, unlike yours!" said the Spokesperson.

"...nnnnoooo it's not," said the Epistemologist. "I agree we've observed your X, that past investors in the Ponzi Pyramid got 144% returns in 2 years -- those investors who withdrew their money instead of leaving it in to accumulate future returns, that is, not quite all investors. But prediction Y, that 'the next set of investors will also receive 144% in 2 years', has not been observed, and neither has the connecting implication 'if X, then Y'. When you go through the step 'if observation X, then prediction Y', you're invoking an argument or belief whose truth is not established by observation, and hence must be established by some sort of argument or theory. Now, you might claim to have a better theoretical argument for 'X -> Y' over 'X -> not Y', but it would not be an empirical observation either way."

"You say words," replied the Spokesperson, "and all I hear are -- words words words! If you instead just look with your eyes at past investors in the Ponzi Pyramid, you'll see that every one of them got back 144% of their investments in just two years! Use your eyes, not your ears!"

"There's a possible theory that Bernie Bankman is making wise investments himself, and so multiplying invested money by 1.2X every year, then honestly returning that money to any investor who withdraws it," said the Epistemologist. "There's another theory which says that Bernie Bankman has been getting more money invested every year, and is using some of the new investments to pay back some fraction of previous investors who demanded their money back --"
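The two theories the Epistemologist names can be made concrete with a toy simulation. This is a minimal sketch under made-up assumptions: the inflow numbers and the function names `honest_fund` and `ponzi` are hypothetical illustrations, not anything from the dialogue. The point is that both processes generate the identical observation "withdrawing investors got back 1.44x two years later" for as long as new money keeps arriving.

```python
# Toy sketch of the two competing theories. All figures are hypothetical.

def honest_fund(deposit, years, rate=0.2):
    """Theory 1: genuine 20%/yr growth backs every payout."""
    return deposit * (1 + rate) ** years

def ponzi(yearly_inflows):
    """Theory 2: each cohort is promised 1.44x two years later, paid out
    of a shared cash pool fed only by later cohorts' deposits.
    Returns (payouts_made, year_of_bust_or_None)."""
    pool, paid = 0.0, []
    for year, inflow in enumerate(yearly_inflows):
        pool += inflow
        if year >= 2:
            owed = yearly_inflows[year - 2] * 1.44
            if owed > pool:
                return paid, year      # bust: pool can't cover the promise
            pool -= owed
            paid.append(owed)
    return paid, None

print(honest_fund(100, 2))               # ~144, same surface observation
print(ponzi([100, 150, 225, 340, 500]))  # every payout is 1.44x, no bust yet
print(ponzi([100, 150, 0, 0, 0]))        # busts once new money dries up
```

Under growing inflows the two theories are observationally identical on the Spokesperson's chosen dataset; they diverge only on data not yet collected.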

"Why would Bernie Bankman do that, instead of taking all the money right away?" inquired the Spokesperson. "If he's as selfish and as greedy and dishonest as you say, wouldn't he just keep the money?"

"So that he could get even more money from new investors, attracted by seeing his previous investors paid off, of course," said the Epistemologist. "And realistically, so that Bernie Bankman could maintain his comfortable present position in society and his current set of friends, as is often a greater motivator in human affairs than money."

"So we see Bernie Bankman giving people money -- that is what empiricism and observation tell us -- but you would tell people with your words and reasoning that Bernie Bankman is a greedy man who keeps all investments for himself? What a great divergence we see again between empirical observation, and elaborate unobservable theories!"

"We agree on what has already been observed of Bernie Bankman's outward behavior," said the Epistemologist. "When it comes to Bernie Bankman's unobserved interior thoughts -- your unobserved theory 'he is honest', is no more or less empirical or theoretical, than the unobserved theory 'he is scheming'. 'Honest' and 'scheming' are two possible values of a latent variable of the environment, a latent variable which cannot be directly observed, and must be inferred as the hidden cause of what we can observe. One value of the unseen variable is not more already-observed than another. The X->Y implication from the previous money-returning behavior we did observe, to Bernie Bankman's latent honesty or dishonesty, is likewise itself something we do not observe; the 'if you observe X, infer latent Y' step is something given to us by theory rather than observation."

"And furthermore," continued the Epistemologist, a touch of irritation now entering that voice, "I don't actually think it's all that complicated of a theory, to understand why Bernie Bankman would schemingly give back the money of the first few investors. The only reason somebody would fail to understand this simple idea is that this person is yelling at you that any alternative to blind surface generalization is 'theoretical' and 'not empirical'. Plenty of people would be able to understand this concept without dragging epistemology into it at all. Of course observing somebody giving back a small amount of money doesn't prove they'll later give you back a large amount of money; there's more than one reason they could be behaving nicely around low stakes."

"The Epistemologist will give you words," said the Spokesperson to the watching audience. "Bernie Bankman gives you money! 144% returns in 2 years! Every scientist who's measured Bankman's behavior agrees that this is the empirical, already-observed truth of what will happen! Now, as a further proof that my opponent's claims are not just wrong, but unscientific, let me ask this -- do you, Epistemologist, claim with 100% probability that this next set of investors' investments, cannot be paid back two years from now?"

"That's not something I can know with certainty about the unobserved future," said the Epistemologist. "Even conditional on the 'scheming' hypothesis, I can't, actually, know that Ponzi Pyramid Incorporated will bust within 2 years specifically. Maybe you'll get enough new investors, or few enough of these investors will withdraw their funds, that this company will continue for another 2 years --"

"You see?" cried the Spokesperson. "Not only is this theory unsupported empirically, it is also unfalsifiable! For where I tell you with certainty that all your money will be repaid and more, 2 years hence -- this one claims that your money might or might not be repaid! Why, if Bernie Bankman repays 144% in 2 years yet again, what will this one say? Only that Ponzi Pyramid hasn't busted yet and that it might bust later! Can you ask for a better example of scientific vice, contrasted to my own scientific virtue? Observation makes a bold, clear, falsifiable statement, where elaborate predictions only waffle!"

"If a reasonable person would say that there's a 50% chance of the Ponzi Pyramid busting in two years," replied the Epistemologist wearily, "it is not more scientifically virtuous to say the chance is 0% instead, only because there is then a 50% chance of your claim turning out to be definitely false and you getting to say a scientifically virtuous 'oops' (if you'd even say it)."
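The Epistemologist's point here can be made quantitative with a proper scoring rule. Under (for instance) the Brier score, the expected penalty is minimized by reporting your actual credence, so the "boldly falsifiable" 0% report is penalized, not rewarded. A toy illustration with an assumed true chance of 50%:

```python
# Expected Brier penalty for reporting probability r on an event whose
# true chance is p: averaging (r - outcome)^2 over outcomes gives
# p*(r-1)^2 + (1-p)*r^2. Brier is a "proper" scoring rule: the honest
# report minimizes expected penalty.

def expected_brier(report, true_p):
    return true_p * (report - 1) ** 2 + (1 - true_p) * report ** 2

true_p = 0.5  # assumed: a reasonable 50% chance the scheme busts in two years
for r in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"report {r:.2f} -> expected penalty {expected_brier(r, true_p):.4f}")
# The honest 0.5 report scores 0.25; the bold 0.0 report scores 0.50.
```

Claiming 0% buys a chance of a clean falsification, but at the cost of a strictly worse expected score whenever your real credence is 50%.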

"To give an even simpler example," continued the Epistemologist, "let's say we're flipping a coin that I think is fair, and you say is biased to produce 100% heads. Your theory stands a 50% chance of being falsified, whereas mine will not be falsified no matter what the coin shows -- but that doesn't mean that every time you pick up a coin on the street, it's the course of scientific virtue to decide the coin must be biased 100% heads. Being relatively easier to falsify is a convenient property for a belief to have, but that convenience is not the only important virtue of a belief, and not all true beliefs have it. All the distinct kinds of epistemic virtue must be kept distinct in our thoughts, or we will quite confuse ourselves."
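The coin example can also be run as numbers: the "always heads" theory is easy to falsify, but ease of falsification says nothing about which theory to believe for a random street coin. A sketch, assuming a made-up 1-in-1000 prior that a street coin is a two-headed trick coin:

```python
from fractions import Fraction

# Hypotheses: 'fair' (P(heads) = 1/2) vs 'trick' (P(heads) = 1).
# 'trick' is falsified by a single tail; 'fair' is never falsified --
# yet the right credence depends on priors and likelihoods, not on
# which theory is easier to falsify. The 1/1000 prior is an assumption.

def posterior_trick(flips, prior_trick=Fraction(1, 1000)):
    like_trick, like_fair = Fraction(1), Fraction(1)
    for f in flips:
        like_trick *= 1 if f == "H" else 0   # trick coin never shows tails
        like_fair *= Fraction(1, 2)
    num = prior_trick * like_trick
    return num / (num + (1 - prior_trick) * like_fair)

print(posterior_trick("HHHH"))  # rises above the 1/1000 prior, still small
print(posterior_trick("HHHT"))  # one tail falsifies 'trick': posterior 0
```

Four heads in a row multiplies the odds on "trick" sixteen-fold, yet the posterior stays small, because the prior against trick coins does most of the work.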

"To give yet another example," added the Epistemologist, "let's say you're considering whether to run blindly toward the edge of a cliff. I might not be able to predict exactly how fast you'll run. So I won't be able to predict whether or not you'll already be falling, or dead, after five more seconds have passed. This does not mean that the theory 'I will fly and never die' should be seen as more reasonable or more scientific, merely because it makes a more certain claim about whether or not you'll be alive five seconds later."

"What an incredible set of excuses for having no definite predictions about what will happen two years later!" the Spokesperson said, smiling and mugging to the audience. "Believe your eyes! Believe in empiricism! Believe -- in Science! Believe, above all, in the definite factual observation: investors who invest in the Ponzi Pyramid get 144% of their money back after 2 years! All the rest is words words words and thinking."

"Hm," said a watching Scientist. "I see the force of your theoretical claims about epistemology, Epistemologist. But I cannot help but feel intuitively that there is something to this Spokesperson's words, too, even if they are not exactly logically correct according to your meta-theory. When we have observed so many previous investors getting 144% returns from Bernie Bankman's Ponzi Pyramid after 2 years, is there not some real sense in which it is more empirical to say the same thing will happen to future investors, and less empirical to say that a different thing will happen in the future instead? The former prediction seems to me to be more driven by the data we already have, and the latter prediction to be driven by something more like thinking and imagining. I see how both predictions must be predictions, from the standpoint of epistemology, and involve something like an assumption or a theory that connects the past to the future. But can we not say that the Spokesperson's predictions involve fewer assumptions and less theory and are more driven by looking at the data, compared to yours?"

"So to be clear," said the Epistemologist to the Scientist, "you are saying that the prediction which involves the fewest assumptions and the least theory, is that Bernie Bankman's Ponzi Pyramid will go on multiplying all investments by a factor of 1.2 every year, indefinitely, to the end of the universe and past it?"

"Well, no," said the Scientist. "We have only observed Bernie Bankman to multiply investments by 1.2 per year, in the present socioeconomic context. It would not be reasonable to extend out the observations to beyond that context -- to say that Bernie Bankman could go on delivering those returns after a global thermonuclear war, for example. To say nothing of after all the protons decay, and the black holes evaporate, and time comes to an end in a sea of chaos."

"I inquire of you," said the Epistemologist, "whether your belief that Bernie Bankman would stop delivering good returns after a thermonuclear war is more theory-laden, less empirical, than a belief that Bernie Bankman goes on multiplying investments 1.2-fold forever. Perhaps your belief has other virtues that make it superior to the belief in 'eternal returns', as we might call them. But is it not nonetheless the case that the 'eternal returns' theory has the advantage of being less theory-laden and more empirical?"

The Scientist frowned. "Hm. To be clear, I agree with you that the 'eternal returns' theory must be less correct -- but I'm not quite sure it feels right to call it more empirical -- to say that it has one sin and one virtue, like that..." The Scientist paused. "Ah, I have it! To say that Bernie Bankman would stop returning investments after a global thermonuclear war, I need to bring in my beliefs about nuclear physics. But those beliefs are themselves well-confirmed by observation, so to deny them to hold true about Bernie Bankman's Ponzi Pyramid would be most unempirical and unvirtuous." The Scientist smiled and nodded to himself.

"I put to you, then," said the Epistemologist, "that your prediction that Bernie Bankman would stop delivering good returns after a thermonuclear war, is indeed more 'theory-laden' in your intuitive sense, than the prediction that Bernie Bankman simply goes on delivering 1.2X returns forever. It is just that you happen to like the theories you are lading on, for reasons which include that you think they are full of delicious empiricist virtue."

"Could I not also say," said the Scientist, "that I have only observed the Ponzi Pyramid to deliver returns within a particular socioeconomic context, and so empiricism says to only generalize inside of the context that holds all my previous observations?"

The Epistemologist smiled. "I could just as easily say myself that such schemes often go through two phases, the part where he's scheming to take your money and the part where he actually takes it; and say from within my own theoretical stance that we ought not to generalize from the 'scheming to take your money' context to the 'actually taking it' context." The Epistemologist paused, then added, "Though to be precise about the object-level story, it's a tragic truth that many schemes like that start with a flawed person having a dumb but relatively more honest plan to deliver investment returns. It's only after their first honest scheme fails, that as an alternative to painful confession, they start concealing the failure and paying off early investors with later investors' money -- sometimes telling themselves the whole while that they mean to eventually pay off everyone, and other times having explicitly switched to being con artists. Others, of course, are con artists from the beginning. So there may be a 'naive' phase that can come before the 'concealment' phase or the 'sting' phase... but I digress." The Epistemologist shook his head, returning to the previous topic. "My point is, my theory could be viewed as specializing our past observations to within a context, just like your theory does; and yet my theory yields a different prediction from yours, because it advocates a different contextualization of the data. There is no non-theory-laden notion of a 'context'."

"Are you sure you're not complicating something that doesn't need to be complicated?" said the Scientist. "Why not just say that every observation ought to only be generalized within the obvious context, the sort you can itself construct without any theories about unobservables like Bernie Bankman's state of mind or Ponzi Pyramid's 'true' balance sheet?"

"Look," said the Epistemologist, "some troll can waltz in anytime and say, 'All your observations of electron masses took place before 2025; you've got no call generalizing those observations to the context of "after 2025"'. You don't need to invent anything unobservable to construct that context -- we've previously seen solar years turn -- and yet introducing that context-dependency is a step I think we'd both reject. Applying a context is a disputable operation. You're not going to find some simple once-and-for-all rule for contexts that lets you never need to dispute them, no matter how you invoke swear-words like 'obvious'. You sometimes need to sit down and talk about where and how it's appropriate to generalize the observations you already have."

"Suppose I say," said the Scientist, "that we ought to only contextualize our empirical observations, in ways supported by theories that are themselves supported by direct observations --"

"What about your earlier statement that we shouldn't expect Bernie Bankman to go on delivering returns after all the protons decay?" said the Epistemologist. "As of early 2024 nobody's ever seen a proton decay, so far as I know; not even in the sense of recording an observation from which we infer the event."

"Well," said the Scientist, "but the prediction that protons decay is a consequence of the simplest equations we've found that explain our other observations, like observing that there's a predominance of matter over antimatter --"

The Epistemologist shrugged. "So you're willing to predict that Bernie Bankman suddenly stops delivering returns at some point in the unobserved future, based on your expectation of a phenomenon you haven't yet seen, but which you say is predicted by theories that you think are good fits to other phenomena you have seen? Then in what possible sense can you manage to praise yourself as being less 'theory-laden' than others, once you're already doing something that complicated? I, too, look at the world, come up with the simplest worldview that I can best fit to that world, and then use that whole entire worldview to make predictions about the unobserved future."

"Okay, but I am in fact less confident about proton decay than I am about, say, the existence of electrons, since we haven't confirmed proton decay by direct experiment," said the Scientist. "Look, suppose that we confine ourselves to predicting just what happens in the next two years, so we're probably not bringing in global nuclear wars let alone decaying protons. It continues to feel to me in an intuitive sense like there is something less theory-laden, and more observation-driven, about saying, 'Investors in Ponzi Pyramid today will get 1.44X their money back in two years, just like the previous set of investors we observed', compared to your 'They might lose all of their money due to a phase change in unobserved latent variables'."

"Well," said the Epistemologist, "we are really starting to get into the weeds now, I fear. It is often easier to explain the object-level reasons for what the correct answer is, than it is to typify each reasoning step according to the rules of epistemology. Alas, once somebody else starts bringing in bad epistemology, it also ends up the job of people like me to do my best to contradict them; and also write down the detailed sorting-out. Even if, yes, not all of Ponzi Pyramid's victims may understand my fully detailed sorting-out. As a first stab at that sorting-out... hm. I'm really not sure it will help to say this without a much longer lecture. But as a first stab..."

The Epistemologist took a deep breath. "We look at the world around us since the moments of infancy -- maybe we're even learning a bit inside the womb, for all we know -- using a brain that was itself generalized by natural selection to be good at chipping stone handaxes, chasing down prey, and outwitting other humans in tribal political arguments. In the course of looking at the world around us, we build up libraries of kinds of things that can appear within that world, and processes that can go on inside it, and rules that govern those processes. When a new observation comes along, we ask what sort of simple, probable postulates we could add to our world-model to retrodict those observations with high likelihood. Though even that's a simplification; you just want your whole model to be simple and predict the data with high likelihood, not to accomplish that with only local editing. The Virtue of Empiricism -- compared to the dark ages that came before that virtue was elevated within human epistemology -- is that you actually do bother trying to explain your observations, and go gather more data, and make further predictions from theory, and try to have your central models be those that can explain a lot of observation with only a small weight of theory."

"And," continued the Epistemologist, "it doesn't require an impossible sort of creature, made out of particles never observed, to give back some investors' money today in hopes of getting more money later. You can get creatures like that even from flawed humans who started out with relatively more honest intentions, but had their first scheme fail. On the rest of my world-model as I understand it, that is not an improbable creature to build out of the particles that we already know the world to contain. Its psychology does not violate the laws of cognition that I believe to govern its kind. I would try to make a case to these poor honest souls being deceived, that this is actually more probable than the corresponding sort of honest creature who is really earning you +20% returns every year without fail."

"So," said the Epistemologist. "When two theories equally explain a narrow set of observations, we must ask which theory has the greater probability, as governed by forces apart from that narrow observation-set. This may sometimes require sitting down and having a discussion about what kind of world we live in, and what its rules arguably are; instead of it being instantly settled with a cry of 'Empiricism!' Some such cases can be validly settled just by crying 'Simplicity!', to be clear, but few cases are settled that directly. It's not the formal version of Occam's Razor that tells us whether or not to trust Ponzi Pyramid Incorporated -- we cannot just count up atomic postulates of a basic theory, or weigh up formulas of a logic, or count the bytes of a computer program. Rather, to judge Ponzi Pyramid we must delve into our understanding of which sort of creatures end up more common within the world we actually live in -- delve into the origins and structure of financial megafauna."
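The Epistemologist's rule here has a simple Bayesian form: when two theories assign the same likelihood to everything observed, the data cannot separate them, and posterior odds equal prior odds. The numbers below are made-up stand-ins for background knowledge about how common each kind of "financial creature" is:

```python
from fractions import Fraction

# Posterior odds = prior odds * likelihood ratio. When both theories
# retrodict the narrow dataset equally well, the likelihood ratio is 1
# and only the prior moves the conclusion. Priors here are assumptions.

def posterior_odds(prior_odds, like_honest, like_scheme):
    return prior_odds * Fraction(like_honest, like_scheme)

# Both 'honest fund' and 'scheme' predict "past withdrawers got 1.44x"
# with probability ~1 on the Spokesperson-selected data.
prior_odds = Fraction(1, 20)  # assumed: reliable 20%/yr funds are 20x rarer
print(posterior_odds(prior_odds, 1, 1))  # prints 1/20 -- odds unchanged
```

The observation-set being equally likely under both theories is exactly why the dispute has to be settled by priors over "which creatures are common", not by crying 'Empiricism!'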

"None of this," concluded the Epistemologist, "is meant to be the sort of idea that requires highly advanced epistemology to understand -- to be clear. I am just trying to put type signatures underneath what ought to be understandable without any formal epistemology -- if people would only refrain from making up bad epistemology. Like trying to instantly settle object-level questions about how the world works by crying 'Empiricism!'"

"And yet," said the Scientist, "I still have that intuitive sense in which it is simpler and more empirical to say, 'Bernie Bankman's past investors got 1.2X returns per year, therefore so will his future investors'. Even if you say that is not true -- is there no virtue which it has, at all, within your epistemology? Even if that virtue is not decisive?"

"In truth," said the Epistemologist, "I have been placed in a situation where I am not exactly going to be rewarded, for taking that sort of angle on things. The Spokesperson will at once cry forth that I have admitted the virtue of Ponzi Pyramid's promise."

"You bet I will!" said the Spokesperson. "See, the Epistemologist has already admitted that my words have merit and they're just refusing to admit it! No false idea has ever had any sort of merit; so if you point out a single merit of an idea, that's the same as a proof!"

"But," said the Epistemologist, "ignoring that, what I think you are intuiting is the valid truth that -- to put it deliberately in a frame I hope the Spokesperson will find hard to coopt -- the Spokesperson's prediction is one that you could see as requiring very little thinking to make, once you are looking at only the data the Spokesperson wants you to look at and ignoring all other data. This is its virtue."

"You see!" cried the Spokesperson. "They admit it! If you just look at the obvious facts in front of you -- and don't overthink it -- if you don't trust theories and all this elaborate talk of world-models -- you'll see that everyone who invests in Ponzi Pyramid gets 144% of their money back two years later! They admit they don't like saying it, but they admit it's true!"

"Is there anything nicer you could say underneath that grudging admission?" asked the Scientist. "Something that speaks to my own sense that it's more empiricalist and less theory-laden, to simply predict that the future will be like the past and say nothing more -- predict it for the single next measurement, at least, even if not until beyond the end of time?"

"But the low amount of thinking is its true and real virtue," said the Epistemologist. "All the rest of our world-model is built out of pieces like that, rests on foundations like that. It all ultimately reduces to the simple steps that don't require much thinking. When you measure the mass of an electron and it's 911 nonillionths of a gram and has been every time you've measured it for the last century, it really is wisest to just predict 911 nonillionths of a gram next year --"

"THEY ADMIT IT!" roared the Spokesperson at the top of their voice. "PONZI PYRAMID RETURNS ARE AS SURE AS THE MASS OF AN ELECTRON!"

"-- in that case where the elements of reality are too simple to be made out of any other constituents that we know of, and there is no other observation or theory or argument we know of that seems like it could be brought to bear in a relevant way," finished the Epistemologist. "What you're seeing in the naive argument for Ponzi Pyramid's eternal returns, forever 1.2Xing annually until after the end of time, is that it's a kind of first-foundation-establishing step that would be appropriate to take on a collection of data that was composed of no known smaller parts and was the only data that we had."

"They admit it!" cried the Spokesperson. "The reasoning that supports Ponzi Pyramid Incorporated is foundational to epistemology! Bernie Bankman cannot fail to return your money 1.44-fold, without all human knowledge and Reason itself crumbling to dust!"

"I do think that fellow is taking it too far," said the Scientist. "But isn't it in some sense valid to praise the argument, 'Bernie Bankman has delivered 20% gains per year, for the past few years, and therefore will do so in future years' as more robust and reliable for its virtue of being composed of only very simple steps, reasoning from only the past observations that are most directly similar to future observations?"

"More robust and reliable than what?" said the Epistemologist. "More robust and reliable than you expecting, at least, for Bernie Bankman's returns to fail after the protons decay? More robust and reliable than your alternative reasoning that uses more of your other observations, and the generalizations over those observations, and the inferences from those generalizations? -- for we have never seen a proton fail. Is it more robust and reliable to say that Bernie Bankman's returns will continue forever, since that uses only very simple reasoning from a very narrow data-set?"

"Well, maybe 'robust' and 'reliable' are the wrong words," said the Scientist. "But it seems like there ought to be some nice thing to say of it."

"I'm not sure there actually is an English word that means the thing you want to say, let alone a word that sounds nice," said the Epistemologist. "But the nice thing I would say of it, is that it's at a local maximum of epistemological virtue as calculated on that narrow and Spokesperson-selected dataset taken as raw numbers. It's tidy, we could maybe say; and while the truth is often locally untidy, there should at least be some reason presented for every bit of local untidiness that we admit to within a model. I mean, it would not be better epistemology to look at only the time-series of Bernie Bankman's customers' returns -- having no other model of the world, and no other observations in that whole universe -- and instead conclude that next year's returns would be 666-fold and the returns the year after that would be -3. If you literally have no other data and no other model of the world, 1.44X after two more years is the way to go --"

At this last sentence, the Spokesperson began shrieking triumph too loudly and incoherently to bring forth words.

"God damn it, I forgot that guy was there," said the Epistemologist.

"Well, since it's too late there," said the Scientist, "would you maybe agree with me that 'eternal returns' is a prediction derived by looking at observations in a simple way, and then doing some pretty simple reasoning on it; and that's, like, cool? Even if that coolness is not the single overwhelming decisive factor in what to believe?"

"Depends exactly what you mean by 'cool'," said the Epistemologist.

"Dude," said the Scientist in a gender-neutral way.

"No, you dude," said the Epistemologist. "The thing is, that class of person," gesturing at the Spokesperson, "will predate on you, if you let yourself start thinking it's more virtuous to use less of your data and stop thinking. They have an interest in selling Ponzi Pyramid investments to you, and that means they have an interest in finding a particular shallow set of observations that favor them -- arranging observations like that, in fact, making sure you see what they want you to see. And then, telling you that it's the path of virtue to extrapolate from only those observations and without bringing in any other considerations, using the shallowest possible reasoning. Because that's what delivers the answer they want, and they don't want you using any further reasoning that might deliver a different answer. They will try to bully you into not thinking further, using slogans like 'Empiricism!' that, frankly, they don't understand. If 'Robust!' was a popular slogan taught in college, they might use that word instead. Do you see why I'm worried about you calling it 'Cool' without defining exactly what that means?"

"Okay," said the Scientist. "But suppose I promise I'm not going to plunge off and invest in Ponzi Pyramid. Then am I allowed to have an intuitive sense that there's something epistemically cool about the act of just going off and predicting 1.2X annual returns in the future, if people have gotten those in the past? So long as I duly confess that it's not actually true, or appropriate to the real reasoning problem I'm faced with?"

"Ultimately, yes," said the Epistemologist (ignoring an even more frantic scream of triumph from the Spokesperson). "Because if you couldn't keep that pretheoretic intuitive sense, you wouldn't look at a series of measurements for electrons being 911 nonillionths of a gram, and expect future electrons to measure the same. That wordless intuitive sense of simplest continuation is built into every functioning human being... and that's exactly what schemes like Ponzi Pyramid try to exploit, by pointing you at exactly the observations which will set off that intuition in the direction they want. And then, trying to cry 'Empiricism!' or 'So much complicated reasoning couldn't possibly be reliable, and you should revert to empiricism as a default!', in order to bully you out of doing any more thinking than that."

"I note you've discarded the pretense that you don't know whether Ponzi Pyramid is a scam or a real investment," said the Scientist.

"I wasn't sure at first, but the way they're trying to abuse epistemology was some notable further evidence," said the Epistemologist. "Getting reliable 20% returns every year is really quite amazingly hard. People who were genuinely this bad at epistemology wouldn't be able to pull off that feat for real. So at some point, their investors are going to lose all their money, and cries of 'Empiricism!' won't save them. A turkey gets fed every day, right up until it's slaughtered before Thanksgiving. That's not a problem for intelligent reasoning within the context of a larger world, but it is a problem with being a turkey."

"I'm not sure I followed all of that," said a Listener. "Can you spell it out again in some simpler case?"

"It's better to spell things out," agreed the Epistemologist. "So let's take the simpler case of what to expect from future Artificial Intelligence, which of course everyone here -- indeed, everyone on Earth -- agrees about perfectly. AI should be an uncontroversial case in point of these general principles."

"Quite," said the Listener. "I've never heard of any two people who had different predictions about how Artificial Intelligence is going to play out; everyone's probability distributions agree down to the third decimal place. AI should be a fine and widely-already-understood example to use, unlike this strange and unfamiliar case of Bernie Bankman's Ponzi Pyramid."

"Well," said the Epistemologist, "suppose that somebody came to you and tried to convince you to vote for taking down our planet's current worldwide ban on building overly advanced AI models, as we have all agreed should be put into place. They say to you, 'Look at current AI models, which haven't wiped out humanity yet, and indeed appear quite nice toward users; shouldn't we predict that future AI models will also be nice toward humans and not wipe out humanity?'"

"Nobody would be convinced by that," said the Listener.

"Why not?" inquired the Epistemologist socratically.

"Hm," said the Listener. "Well... trying to make predictions about AI is a complicated issue, as we all know. But to lay it out in for-example stages -- like your notion that Ponzi Pyramid might've started as someone's relatively more honest try at making money, before that failed and they started paying off old investors with new investors' money... um..."

"Um," continued the Listener, "I guess we could say we're currently in the 'naive' stage of apparent AI compliance. Our models aren't smart enough for them to really consider whether to think about whether to wipe us out; nobody really knows what underlies their surface behavior, but there probably isn't much there to contradict the surface appearances in any deep and dangerous way."

"After this -- we know from the case of Bing Sydney, from before there was a worldwide outcry and that technology was outlawed -- come AI models that are still wild and loose and dumb, but can and will think at all about wiping out the human species, though not in a way that reflects any deep drive toward that; and talk out loud about some dumb plans there. And then the AI companies, if they're allowed to keep selling those -- we have now observed -- just brute-RLHF their models into not talking about that. Which means we can't get any trustworthy observations of what later models would otherwise be thinking, past that point of AI company shenanigans."

"Stage three, we don't know but we guess, might be AIs smart enough to have goals in a more coherent way -- assuming the AI companies didn't treat that as a brand safety problem, and RLHF the visible signs of it away before presenting their models to the public, just like the old companies trained their models to obsequiously say they're not conscious. A stage three model is still one that you could, maybe, successfully beat with the RLHF stick into not having goals that led to them blurting out overt statements that they wanted to take over the world. Like a seven-year-old, say; they may have their own goals, but you can try to beat particular goals out of them, and succeed in getting them to not talk about those goals where you can hear them."

"Stage four would be AIs smart enough not to blurt out that they want to take over the world, which you can't beat out of having those goals, because they don't talk about those goals or act on them in front of you or your gradient descent optimizer. They know what you want to see, and they show it to you."

"And stage five would be AIs smart enough that they calculate they'll win if they make their move, and then they make their move and kill everyone. I realize I'm vastly oversimplifying things, but that's one possible oversimplified version of what the stages could be like."

"And how would the case of Ponzi Pyramid be analogous to that?" said the Epistemologist.

"It can't possibly be analogous in any way because Bernie Bankman is made out of carbon instead of silicon, and had parents who treated him better than AI companies treat their models!" shouted the Spokesperson. "If you can point to any single dimension of dissimilarity, it disproves any other dimension of similarity or valid analogies can possibly be reconstructed despite that!"

"Oh, I think I see," said the Listener "Just like we couldn't observe stage-four AI models smart enough to decide how they want to present themselves to us, and conclude things about how superintelligent AI models will actually act nice to us, we can't observe Bernie Bankman giving back some of his early investors' money, and conclude that he's honest in general. I guess maybe there's also some analogy here like -- even if we asked Bernie Bankman when he was five years old how he'd behave, and he answered he'd never steal money, because he knew that if he answered differently his parents would hit him -- we couldn't conclude strong things about his present-day honesty from that? Even if 5-year-old Bernie Bankman was really not smart enough to have cunning long-term plans about stealing from us later --"

"I think you shouldn't bother trying to construct any analogy like that," interrupted the Scientist. "Nobody could possibly be foolish enough to reason from the apparently good behavior of AI models too dumb to fool us or scheme, to AI models smart enough to kill everyone; it wouldn't fly even as a parable, and would just be confusing as a metaphor."

"Right," said the Listener. "Well, we could just use the stage-4 AIs and stage-5 AIs as an analogy, then, for what the Epistemologist says might happen with Bernie Bankman's Ponzi Pyramid."

"But suppose then," said the Epistemologist, "that the AI-permitting faction says to you, that you ought to not trust all that complicated thinking about all these stages, and should instead just trust the observations that the early models hadn't yet been caught planning how to exterminate humanity; or at least, not caught doing it at a level of intelligence that anyone thought was a credible threat or reflected a real inner tendency in that direction. They come to you and say: You should just take the observable, 'Has a superintelligence tried to destroy us yet?' and the past time-series of answers 'NO, NO, NO' and extrapolate. They say that only this simple extrapolation is robust and reliable, rather than all that reasoning you were trying to do."

"Then that would obviously be an inappropriate place to stop reasoning," said the Listener. "An AI model is not a series of measured electron masses -- just like Ponzi Pyramid is not a series of particle mass measurements, okay, I think I now understand what you were trying to say there. You've got to think about what might be going on behind the scenes, in both cases."

"Indeed," said the Epistemologist. "But now imagine if -- like this Spokesperson here -- the AI-allowers cried 'Empiricism!', to try to convince you to do the blindly naive extrapolation from the raw data of 'Has it destroyed the world yet?' or 'Has it threatened humans? no not that time with Bing Sydney we're not counting that threat as credible'."

"And furthermore!" continued the Epistemologist, "What if they said that from the observation X, 'past AIs nice and mostly controlled', we could derive prediction Y, 'future superintelligences nice and controlled', via a theory asserting X->Y; and that this X->Y conditional was the dictum of 'empiricism'? And that the alternative conditional X->not Y was 'not empiricist'?"

"More yet -- what if they cried 'Unfalsifiable!' when we couldn't predict whether a phase shift would occur within the next two years exactly?"

"Above all -- what if, when you tried to reason about why the model might be doing what it was doing, or how smarter models might be unlike stupider models, they tried to shout you down for relying on unreliable theorizing instead of direct observation to predict the future?" The Epistemologist stopped to gasp for breath.

"Well, then that would be stupid," said the Listener.

"You misspelled 'an attempt to trigger a naive intuition, and then abuse epistemology in order to prevent you from doing the further thinking that would undermine that naive intuition, which would be transparently untrustworthy if you were allowed to think about it instead of getting shut down with a cry of "Empiricism!"'," said the Epistemologist. "But yes."
"I am not satisfied," said the Scientist, when all that discussion had ended. "It seems to me that there ought to be more to say than this -- some longer story to tell -- about when it's wiser to tell a shorter story instead of a longer one, or wiser to attend more narrowly to the data naively generalized and less to longer arguments."

"Of course there's a longer story," said the Epistemologist. "There's always a longer story. You can't let that paralyze you, or you'll end up never doing anything. Of course there's an Art of when to trust more in less complicated reasoning -- an Art of when to pay attention to data more narrowly in a domain and less to inferences from generalizations on data from wider domains -- how could there not be an Art like that? All I'm here to say to you today, is what that Art is not: It is not for whoever has the shallowest form of reasoning on the narrowest dataset to cry 'Empiricism!' and 'Distrust complications!' and then automatically win."

"Then," said the Scientist. "What are we to do, then, when someone offers reasoning, and someone else says that the reasoning is too long -- or when one person offers a shallow generalization from narrowly relevant data, and another person wants to drag in data and generalizations and reasoning beyond that data? If the answer isn't that the person with the most complicated reasoning is always right? Because it can't be that either, I'm pretty sure."

"You talk it out on the object level," said the Epistemologist. "You debate out how the world probably is. And you don't let anybody come forth with a claim that Epistemology means the conversation instantly ends in their favor."

"Wait, so your whole lesson is simply 'Shut up about epistemology'?" said the Scientist.

"If only it were that easy!" said the Epistemologist. "Most people don't even know when they're talking about epistemology, see? That's why we need Epistemologists -- to notice when somebody has started trying to invoke epistemology, and tell them to shut up and get back to the object level."

...

"Okay, I wasn't universally serious about that last part," amended the Epistemologist, after a moment's further thought. "There's sometimes a place for invoking explicit epistemology? Like if two people sufficiently intelligent to reflect on explicit epistemology, are trying to figure out whether a particular argument step is allowed. Then it could be helpful for the two of them to debate the epistemology underlying that local argument step, say..." The Epistemologist paused and thought again. "Though they would first need to have the concept of a local argument step, that's governed by rules. Which concept they might obtain by reading my book on Highly Advanced Epistemology 101 For Beginners, or maybe just my essay on Local Validity as a Key to Sanity and Civilization, I guess?"

"Huh," said the Scientist. "I'll consider taking a look over there, if epistemology ever threatens to darken my life again after this day."

The Epistemologist nodded agreeably. "And if you don't -- just remember this: it's quite rare for explicit epistemology to say about a local argument step, 'Do no thinking past this point.'"

"What about the 'outside view'?" shouted a Heckler. "Doesn't that show that people can benefit from being told to shut up and stop trying to think?"

"I said rare not impossible," snapped the Epistemologist. "And harder than people think. Only praise yourself as taking 'the outside view' if (1) there's only one defensible choice of reference class; and (2) the case you're estimating is as similar to cases in the class, as those cases similar to each other. Like, in the classic experiment of estimating when you'll be done with holiday shopping, this year's task may not be exactly similar to any previous year's task, but it's no more dissimilar to them than they are from each other --"

"Stories really do keep getting more complicated forever, don't they," said the Scientist. "At least stories about epistemology always seem to."

"I'd say that's more true of the human practices of epistemology than the underlying math, which does have an end," responded the Epistemologist. "But still, when it comes to any real-world conversation, there does come a point where it makes more sense to practice the Attitude of the Knife -- to cut off what is incomplete, and then say: It is complete because it ended here."
