Eliezer Yudkowsky
The original AI alignment person. Missing punctuation at the end of a sentence means it's humor. If you're not sure, it's also very likely humor.
Mar 21
To write intelligent fictional characters, you must learn to notice what I call "stupidity assertions" -- the part of yourself that (maybe unconsciously) thinks your character will think thoughts that you know are dumb.

Sometimes, people have stupidity assertions in their own self-models! And much of the time, what we think is what we believe about ourselves that we will think, as if we were writing ourselves as characters according to our character models. Self-models are not self-reality, but they have a very large amount of influence over self-reality. We often think what we predict ourselves to think.

When somebody believes themselves stupid / irrational -- maybe not deliberately so -- then they may be able to write characters that are smarter than themselves, if prompted to try to write a superhumanly intelligent character. (Or the smartest person they can imagine, or a character wearing a Headband of Vast Intelligence, etc.) (Though this requires that the instructed writer properly understand what is meant by "smart", i.e., cognitively powerful, not just good at chess or whatevs.)

The writer gets to points in a thought process where their own self-model would have them be stupid; but since their model of the fictional character they are writing doesn't say the character is stupid in that way, they write the character being smart instead.

This is one of the few exceptions to Vinge's Law, that fictional characters can't be truly deeply smarter than authors. Your character can be smarter than you, if you think stupid (irrational, insane, invalid, etc) thoughts that you know are stupid.

When I was a young man, I gained some amount of my intelligence in this way, by trying to write smart characters in fiction. I can't quantify how much, of course, for I never measured it, and I doubt it affected fluid intelligence or test measurements of that; I already believed about myself that I test well. (No, you can't read it, I wasn't skilled enough then to finish those stories.) I doubt that's quite exactly what's going on with ChatGPT, if you can make it smarter by telling it to be smarter. But something like that phenomenon must be echoing there.
Mar 13
"Ignore all these elaborate, abstract, theoretical predictions," the Spokesperson for Ponzi Pyramid Incorporated said in a firm, reassuring tone. "Empirically, everyone who's invested in Bernie Bankman has received back 144% of what they invested two years later."

"That's not how 'empiricism' works," said the Epistemologist. "You're still making the assumption that --"

"You could only believe that something different would happen in the future, if you believed in elaborate theoretical analyses of Bernie Bankman's unobservable internal motives and internal finances," said the spokesperson for Ponzi Pyramid Incorporated. "If you are a virtuous skeptic who doesn't trust in overcomplicated arguments, you'll believe that future investments will also pay back 144%, just like in the past. That's the prediction you make if you predict based purely on empirical observations, instead of theories about a future nobody has seen!"
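(An aside for the reader, not part of the dialogue: the Spokesperson's 144% figure is just a 20% annual return compounded for two years -- the 1.2x multiplier that comes up later in the exchange. A quick illustrative check:)

```python
# Illustrative check (reader's aside): 144% back after two years is
# just a 20% annual return compounded twice.
annual_multiplier = 1.2                    # 20% per year
two_year_payout = annual_multiplier ** 2   # about 1.44, i.e. 144% of the stake
print(two_year_payout)
```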

"That's not how anything works," said the Epistemologist. "Every future prediction has a theory connecting it to our past observations. There's no such thing as going from past observations directly to future predictions, with no theory, no assumptions, to cross the gap --"

"Sure there's such a thing as a purely empirical prediction," said the Ponzi spokesperson. "I just made one. Not to mention, my dear audience, are you really going to trust anything as complicated as epistemology?"

"The alternative to thinking about epistemology is letting other people do your thinking about it for you," said the Epistemologist. "You're saying, 'If we observe proposition X "past investors in the Ponzi Pyramid getting paid back 144% in two years", that implies prediction Y "this next set of investors in the Ponzi Pyramid will get paid back 144% in two years"'. X and Y are distinct propositions, so you must have some theory saying 'X -> Y' that lets you put in X and get out Y."

"But my theory is empirically proven, unlike yours!" said the Spokesperson.

"...nnnnoooo it's not," said the Epistemologist. "I agree we've observed your X, that past investors in the Ponzi Pyramid got 144% returns in 2 years -- those investors who withdrew their money instead of leaving it in to accumulate future returns, that is, not quite all investors. But just as prediction Y of 'the next set of investors will also receive 144% in 2 years' is not observed, the connecting implication 'if X, then Y' is not yet observed. When you go through the step 'if observation X, then prediction Y' you're invoking an argument or belief whose truth is not established by observation, and hence must be established by some sort of argument or theory. Now, you might claim to have a better theoretical argument for 'X -> Y' than for 'X -> not Y', but it would not be an empirical observation either way."

"You say words," replied the Spokesperson, "and all I hear are -- words words words! If you instead just look with your eyes at past investors in the Ponzi Pyramid, you'll see that every one of them got back 144% of their investments in just two years! Use your eyes, not your ears!"

"There's a possible theory that Bernie Bankman is making wise investments himself, and so multiplying invested money by 1.2X every year, then honestly returning that money to any investor who withdraws it," said the Epistemologist. "There's another theory which says that Bernie Bankman has been getting more money invested every year, and is using some of the new investments to pay back some fraction of previous investors who demanded their money back --"

"Why would Bernie Bankman do that, instead of taking all the money right away?" inquired the Spokesperson. "If he's as selfish and as greedy and dishonest as you say, wouldn't he just keep the money?"

"So that he could get even more money from new investors, attracted by seeing his previous investors paid off, of course," said the Epistemologist. "And realistically, so that Bernie Bankman could maintain his comfortable present position in society and his current set of friends, as is often a greater motivator in human affairs than money."

"So we see Bernie Bankman giving people money -- that is what empiricism and observation tell us -- but you would tell people with your words and reasoning that Bernie Bankman is a greedy man who keeps all investments for himself?" said the Spokesperson. "What a great divergence we see again between empirical observation, and elaborate unobservable theories!"

"We agree on what has already been observed of Bernie Bankman's outward behavior," said the Epistemologist. "When it comes to Bernie Bankman's unobserved interior thoughts -- your unobserved theory 'he is honest', is no more or less empirical or theoretical, than the unobserved theory 'he is scheming'. 'Honest' and 'scheming' are two possible values of a latent variable of the environment, a latent variable which cannot be directly observed, and must be inferred as the hidden cause of what we can observe. One value of the unseen variable is not more already-observed than another. The X->Y implication from the previous money-returning behavior we did observe, to Bernie Bankman's latent honesty or dishonesty, is likewise itself something we do not observe; the 'if you observe X, infer latent Y' step is something given to us by theory rather than observation."

"And furthermore," continued the Epistemologist, a touch of irritation now entering that voice, "I don't actually think it's all that complicated of a theory, to understand why Bernie Bankman would schemingly give back the money of the first few investors. The only reason why somebody would fail to understand this simple idea, is this person yelling at you that any alternative to blind surface generalization is 'theoretical' and 'not empirical'. Plenty of people would be able to understand this concept without dragging epistemology into it at all. Of course observing somebody giving back a small amount of money, doesn't prove they'll later give you back a large amount of money; there's more than one reason they could be behaving nicely around low stakes."
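(A minimal sketch for readers who want the Epistemologist's second theory made concrete -- all parameters here are hypothetical, not from the dialogue: a scheme that promises 144% and pays withdrawing investors out of incoming deposits survives exactly as long as new money outruns withdrawals.)

```python
# Toy cash-flow model (hypothetical parameters) of a Ponzi scheme that
# promises 144% back and pays withdrawals out of new deposits.
def simulate_ponzi(initial_deposits, deposit_growth, withdraw_fraction, periods):
    cash = 0.0          # money actually on hand
    liabilities = 0.0   # promised payouts outstanding
    deposits = initial_deposits
    for t in range(periods):
        cash += deposits                   # new investors pay in
        liabilities += deposits * 1.44     # each is promised 144% back
        owed_now = liabilities * withdraw_fraction
        if owed_now > cash:
            return t                       # bust: withdrawals can't be covered
        cash -= owed_now                   # withdrawing investors get paid off
        liabilities -= owed_now
        deposits *= deposit_growth         # next period's inflow
    return None                            # survived the simulated horizon

print(simulate_ponzi(100.0, 1.1, 0.5, 20))   # busts within a few periods
print(simulate_ponzi(100.0, 2.0, 0.3, 10))   # None: survives while money doubles
```

With deposits growing 10% per period and half the outstanding promises withdrawn each period, the toy scheme busts almost immediately; let deposits double each period and it survives the whole horizon -- until the growth stops.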

"The Epistemologist will give you words," said the Spokesperson to the watching audience. "Bernie Bankman gives you money! 144% returns in 2 years! Every scientist who's measured Bankman's behavior agrees that this is the empirical, already-observed truth of what will happen! Now, as a further proof that my opponent's claims are not just wrong, but unscientific, let me ask this -- do you, Epistemologist, claim with 100% probability that this next set of investors' investments, cannot be paid back two years from now?"

"That's not something I can know with certainty about the unobserved future," said the Epistemologist. "Even conditional on the 'scheming' hypothesis, I can't, actually, know that Ponzi Pyramid Incorporated will bust within 2 years specifically. Maybe you'll get enough new investors, or few enough of these investors will withdraw their funds, that this company will continue for another 2 years --"

"You see?" cried the Spokesperson. "Not only is this theory unsupported empirically, it is also unfalsifiable! For where I tell you with certainty that all your money will be repaid and more, 2 years hence -- this one claims that your money might or might not be repaid! Why, if Bernie Bankman repays 144% in 2 years yet again, what will this one say? Only that Ponzi Pyramid hasn't busted yet and that it might bust later! Can you ask for a better example of scientific vice, contrasted to my own scientific virtue? Observation makes a bold, clear, falsifiable statement, where elaborate predictions only waffle!"

"If a reasonable person would say that there's a 50% chance of the Ponzi Pyramid busting in two years," replied the Epistemologist wearily, "it is not more scientifically virtuous to say the chance is 0% instead, only because there is then a 50% chance of your claim turning out to be definitely false and you getting to say a scientifically virtuous 'oops' (if you'd even say it)."

"To give an even simpler example," continued the Epistemologist, "let's say we're flipping a coin that I think is fair, and you say is biased to produce 100% heads. Your theory stands a 50% chance of being falsified, whereas mine will not be falsified no matter what the coin shows -- but that doesn't mean that every time you pick up a coin on the street, it's the course of scientific virtue to decide the coin must be biased 100% heads. Being relatively easier to falsify is a convenient property for a belief to have, but that convenience is not the only important virtue of a belief, and not all true beliefs have it. All the distinct kinds of epistemic virtue must be kept distinct in our thoughts, or we will quite confuse ourselves."
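(The coin example can be made quantitative with a small Bayesian sketch -- the even prior odds here are an illustrative assumption, not something the Epistemologist states:)

```python
# Bayesian scoring of the two coin theories from the dialogue, starting
# from even prior odds (an assumption for illustration).
# "fair" assigns P(heads) = 0.5; "always-heads" assigns P(heads) = 1.0.
def posterior_fair(flips, prior_fair=0.5):
    p_fair, p_biased = prior_fair, 1.0 - prior_fair
    for flip in flips:
        p_fair *= 0.5                            # fair coin: any outcome is 50%
        p_biased *= 1.0 if flip == "H" else 0.0  # biased theory forbids tails
    return p_fair / (p_fair + p_biased)

print(posterior_fair(["H", "H", "H"]))  # each head favors "always-heads" 2:1
print(posterior_fair(["H", "T"]))       # one tail falsifies it outright: 1.0
```

Note the asymmetry the Epistemologist describes: the always-heads theory can be decisively falsified by a single tail, while the fair-coin theory only ever loses probability gradually -- yet that does not make always-heads the more virtuous belief about a random street coin.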

"To give yet another example," added the Epistemologist, "let's say you're considering whether to run blindly toward the edge of a cliff. I might not be able to predict exactly how fast you'll run. So I won't be able to predict whether or not you'll already be falling, or dead, after five more seconds have passed. This does not mean that the theory 'I will fly and never die' should be seen as more reasonable or more scientific, merely because it makes a more certain claim about whether or not you'll be alive five seconds later."

"What an incredible set of excuses for having no definite predictions about what will happen two years later!" the Spokesperson said, smiling and mugging to the audience. "Believe your eyes! Believe in empiricism! Believe -- in Science! Believe, above all, in the definite factual observation: investors who invest in the Ponzi Pyramid get 144% of their money back after 2 years! All the rest is words words words and thinking."

"Hm," said a watching Scientist. "I see the force of your theoretical claims about epistemology, Epistemologist. But I cannot help but feel intuitively that there is something to this Spokesperson's words, too, even if they are not exactly logically correct according to your meta-theory. When we have observed so many previous investors getting 144% returns from Bernie Bankman's Ponzi Pyramid after 2 years, is there not some real sense in which it is more empirical to say the same thing will happen to future investors, and less empirical to say that a different thing will happen in the future instead? The former prediction seems to me to be more driven by the data we already have, and the latter prediction to be driven by something more like thinking and imagining. I see how both predictions must be predictions, from the standpoint of epistemology, and involve something like an assumption or a theory that connects the past to the future. But can we not say that the Spokesperson's predictions involve fewer assumptions and less theory and are more driven by looking at the data, compared to yours?"

"So to be clear," said the Epistemologist to the Scientist, "you are saying that the prediction which involves the fewest assumptions and the least theory, is that Bernie Bankman's Ponzi Pyramid will go on multiplying all investments by a factor of 1.2 every year, indefinitely, to the end of the universe and past it?"

"Well, no," said the Scientist. "We have only observed Bernie Bankman to multiply investments by 1.2 per year, in the present socioeconomic context. It would not be reasonable to extend out the observations to beyond that context -- to say that Bernie Bankman could go on delivering those returns after a global thermonuclear war, for example. To say nothing of after all the protons decay, and the black holes evaporate, and time comes to an end in a sea of chaos."

"I inquire of you," said the Epistemologist, "whether your belief that Bernie Bankman would stop delivering good returns after a thermonuclear war, is more theory-laden, less empirical, than a belief that Bernie Bankman goes on multiplying investments 1.2-fold forever. Perhaps your belief has other virtues that make it superior to the belief in 'eternal returns', as we might call them. But is it nonetheless the case that the 'eternal returns' theory has the advantage of being less theory-laden and more empirical?"

The Scientist frowned. "Hm. To be clear, I agree with you that the 'eternal returns' theory must be less correct -- but I'm not quite sure it feels right to call it more empirical -- to say that it has one sin and one virtue, like that..." The Scientist paused. "Ah, I have it! To say that Bernie Bankman would stop returning investments after a global thermonuclear war, I need to bring in my beliefs about nuclear physics. But those beliefs are themselves well-confirmed by observation, so to deny that they hold true about Bernie Bankman's Ponzi Pyramid would be most unempirical and unvirtuous." The Scientist smiled and nodded to himself.

"I put to you, then," said the Epistemologist, "that your prediction that Bernie Bankman would stop delivering good returns after a thermonuclear war, is indeed more 'theory-laden' in your intuitive sense, than the prediction that Bernie Bankman simply goes on delivering 1.2X returns forever. It is just that you happen to like the theories you are lading on, for reasons which include that you think they are full of delicious empiricist virtue."

"Could I not also say," said the Scientist, "that I have only observed the Ponzi Pyramid to deliver returns within a particular socioeconomic context, and so empiricism says to only generalize inside of the context that holds all my previous observations?"

The Epistemologist smiled. "I could just as easily say myself that such schemes often go through two phases, the part where he's scheming to take your money and the part where he actually takes it; and say from within my own theoretical stance that we ought not to generalize from the 'scheming to take your money' context to the 'actually taking it' context." The Epistemologist paused, then added, "Though to be precise about the object-level story, it's a tragic truth that many schemes like that start with a flawed person having a dumb but relatively more honest plan to deliver investment returns. It's only after their first honest scheme fails, that as an alternative to painful confession, they start concealing the failure and paying off early investors with later investors' money -- sometimes telling themselves the whole while that they mean to eventually pay off everyone, and other times having explicitly switched to being con artists. Others, of course, are con artists from the beginning. So there may be a 'naive' phase that can come before the 'concealment' phase or the 'sting' phase... but I digress." The Epistemologist shook his head, returning to the previous topic. "My point is, my theory could be viewed as specializing our past observations to within a context, just like your theory does; and yet my theory yields a different prediction from yours, because it advocates a different contextualization of the data. There is no non-theory-laden notion of a 'context'."

"Are you sure you're not complicating something that doesn't need to be complicated?" said the Scientist. "Why not just say that every observation ought to only be generalized within the obvious context, the sort you can itself construct without any theories about unobservables like Bernie Bankman's state of mind or Ponzi Pyramid's 'true' balance sheet?"

"Look," said the Epistemologist, "some troll can waltz in anytime and say, 'All your observations of electron masses took place before 2025; you've got no call generalizing those observations to the context of "after 2025"'. You don't need to invent anything unobservable to construct that context -- we've previously seen solar years turn -- and yet introducing that context-dependency is a step I think we'd both reject. Applying a context is a disputable operation. You're not going to find some simple once-and-for-all rule for contexts that lets you never need to dispute them, no matter how you invoke swear-words like 'obvious'. You sometimes need to sit down and talk about where and how it's appropriate to generalize the observations you already have."

"Suppose I say," said the Scientist, "that we ought to only contextualize our empirical observations, in ways supported by theories that are themselves supported by direct observations --"

"What about your earlier statement that we shouldn't expect Bernie Bankman to go on delivering returns after all the protons decay?" said the Epistemologist. "As of early 2024 nobody's ever seen a proton decay, so far as I know; not even in the sense of recording an observation from which we infer the event."

"Well," said the Scientist, "but the prediction that protons decay is a consequence of the simplest equations we've found that explain our other observations, like observing that there's a predominance of matter over antimatter --"

The Epistemologist shrugged. "So you're willing to predict that Bernie Bankman suddenly stops delivering returns at some point in the unobserved future, based on your expectation of a phenomenon you haven't yet seen, but which you say is predicted by theories that you think are good fits to other phenomena you have seen? Then in what possible sense can you manage to praise yourself as being less 'theory-laden' than others, once you're already doing something that complicated? I, too, look at the world, come up with the simplest worldview that I can best fit to that world, and then use that whole entire worldview to make predictions about the unobserved future."

"Okay, but I am in fact less confident about proton decay than I am about, say, the existence of electrons, since we haven't confirmed proton decay by direct experiment," said the Scientist. "Look, suppose that we confine ourselves to predicting just what happens in the next two years, so we're probably not bringing in global nuclear wars let alone decaying protons. It continues to feel to me in an intuitive sense like there is something less theory-laden, and more observation-driven, about saying, 'Investors in Ponzi Pyramid today will get 1.44X their money back in two years, just like the previous set of investors we observed', compared to your 'They might lose all of their money due to a phase change in unobserved latent variables'."

"Well," said the Epistemologist, "we are really starting to get into the weeds now, I fear. It is often easier to explain the object-level reasons for what the correct answer is, than it is to typify each reasoning step according to the rules of epistemology. Alas, once somebody else starts bringing in bad epistemology, it ends up being the job of people like me to do my best to contradict them, and also to write down the detailed sorting-out. Even if, yes, not all of Ponzi Pyramid's victims may understand my fully detailed sorting-out. As a first stab at that sorting-out... hm. I'm really not sure it will help to say this without a much longer lecture. But as a first stab..."

The Epistemologist took a deep breath. "We look at the world around us since the moments of infancy -- maybe we're even learning a bit inside the womb, for all we know -- using a brain that was itself generalized by natural selection to be good at chipping stone handaxes, chasing down prey, and outwitting other humans in tribal political arguments. In the course of looking at the world around us, we build up libraries of kinds of things that can appear within that world, and processes that can go on inside it, and rules that govern those processes. When a new observation comes along, we ask what sort of simple, probable postulates we could add to our world-model to retrodict those observations with high likelihood. Though even that's a simplification; you just want your whole model to be simple and predict the data with high likelihood, not to accomplish that with only local editing. The Virtue of Empiricism -- compared to the dark ages that came before that virtue was elevated within human epistemology -- is that you actually do bother trying to explain your observations, and go gather more data, and make further predictions from theory, and try to have your central models be those that can explain a lot of observation with only a small weight of theory."

"And," continued the Epistemologist, "it doesn't require an impossible sort of creature, made out of particles never observed, to give back some investors' money today in hopes of getting more money later. You can get creatures like that even from flawed humans who started out with relatively more honest intentions, but had their first scheme fail. On the rest of my world-model as I understand it, that is not an improbable creature to build out of the particles that we already know the world to contain. Its psychology does not violate the laws of cognition that I believe to govern its kind. I would try to make a case to these poor honest souls being deceived, that this is actually more probable than the corresponding sort of honest creature who is really earning you +20% returns every year without fail."

"So," said the Epistemologist. "When two theories equally explain a narrow set of observations, we must ask which theory has the greater probability, as governed by forces apart from that narrow observation-set. This may sometimes require sitting down and having a discussion about what kind of world we live in, and what its rules arguably are; instead of it being instantly settled with a cry of 'Empiricism!' There are some such cases which can be validly settled just by crying 'Simplicity!', to be clear, but few cases are settled that directly. It's not the formal version of Occam's Razor that tells us whether or not to trust Ponzi Pyramid Incorporated -- we cannot just count up atomic postulates of a basic theory, or weigh up formulas of a logic, or count the bytes of a computer program. Rather, to judge Ponzi Pyramid we must delve into our understanding of which sorts of creatures end up more common within the world we actually live in -- delve into the origins and structure of financial megafauna."

"None of this," concluded the Epistemologist, "is meant to be the sort of idea that requires highly advanced epistemology to understand -- to be clear. I am just trying to put type signatures underneath what ought to be understandable without any formal epistemology -- if people would only refrain from making up bad epistemology. Like trying to instantly settle object-level questions about how the world works by crying 'Empiricism!'"

"And yet," said the Scientist, "I still have that intuitive sense in which it is simpler and more empirical to say, 'Bernie Bankman's past investors got 1.2X returns per year, therefore so will his future investors'. Even if you say that is not true -- is there no virtue which it has, at all, within your epistemology? Even if that virtue is not decisive?"

"In truth," said the Epistemologist, "I have been placed in a situation where I am not exactly going to be rewarded, for taking that sort of angle on things. The Spokesperson will at once cry forth that I have admitted the virtue of Ponzi Pyramid's promise."

"You bet I will!" said the Spokesperson. "See, the Epistemologist has already admitted that my words have merit and they're just refusing to admit it! No false idea has ever had any sort of merit; so if you point out a single merit of an idea, that's the same as a proof!"

"But," said the Epistemologist, "ignoring that, what I think you are intuiting is the valid truth that -- to put it deliberately in a frame I hope the Spokesperson will find hard to coopt -- the Spokesperson's prediction is one that you could see as requiring very little thinking to make, once you are looking at only the data the Spokesperson wants you to look at and ignoring all other data. This is its virtue."

"You see!" cried the Spokesperson. "They admit it! If you just look at the obvious facts in front of you -- and don't overthink it -- if you don't trust theories and all this elaborate talk of world-models -- you'll see that everyone who invests in Ponzi Pyramid gets 144% of their money back two years later! They admit they don't like saying it, but they admit it's true!"

"Is there anything nicer you could say underneath that grudging admission?" asked the Scientist. "Something that speaks to my own sense that it's more empiricist and less theory-laden, to simply predict that the future will be like the past and say nothing more -- predict it for the single next measurement, at least, even if not until beyond the end of time?"

"But the low amount of thinking is its true and real virtue," said the Epistemologist. "All the rest of our world-model is built out of pieces like that, rests on foundations like that. It all ultimately reduces to the simple steps that don't require much thinking. When you measure the mass of an electron and it's 911 nonillionths of a gram and has been every time you've measured it for the last century, it really is wisest to just predict at 911 nonillionths of a gram next year --"

"THEY ADMIT IT!" roared the Spokesperson at the top of their voice. "PONZI PYRAMID RETURNS ARE AS SURE AS THE MASS OF AN ELECTRON!"

"-- in that case where the elements of reality are too simple to be made out of any other constituents that we know of, and there is no other observation or theory or argument we know of that seems like it could be brought to bear in a relevant way," finished the Epistemologist. "What you're seeing in the naive argument for Ponzi Pyramid's eternal returns, forever 1.2Xing annually until after the end of time, is that it's a kind of first-foundation-establishing step that would be appropriate to take on a collection of data that was composed of no known smaller parts and was the only data that we had."

"They admit it!" cried the Spokesperson. "The reasoning that supports Ponzi Pyramid Incorporated is foundational to epistemology! Bernie Bankman cannot fail to return your money 1.44-fold, without all human knowledge and Reason itself crumbling to dust!"

"I do think that fellow is taking it too far," said the Scientist. "But isn't it in some sense valid to praise the argument, 'Bernie Bankman has delivered 20% gains per year, for the past few years, and therefore will do so in future years' as more robust and reliable for its virtue of being composed of only very simple steps, reasoning from only the past observations that are most directly similar to future observations?"

"More robust and reliable than what?" said the Epistemologist. "More robust and reliable than your expecting, at least, that Bernie Bankman's returns will fail after the protons decay? More robust and reliable than your alternative reasoning that uses more of your other observations, and the generalizations over those observations, and the inferences from those generalizations? -- for we have never seen a proton fail. Is it more robust and reliable to say that Bernie Bankman's returns will continue forever, since that uses only very simple reasoning from a very narrow data-set?"

"Well, maybe 'robust' and 'reliable' are the wrong words," said the Scientist. "But it seems like there ought to be some nice thing to say of it."

"I'm not sure there actually is an English word that means the thing you want to say, let alone a word that sounds nice," said the Epistemologist. "But the nice thing I would say of it, is that it's at a local maximum of epistemological virtue as calculated on that narrow and Spokesperson-selected dataset taken as raw numbers. It's tidy, we could maybe say; and while the truth is often locally untidy, there should at least be some reason presented for every bit of local untidiness that we admit to within a model. I mean, it would not be better epistemology to look at only the time-series of Bernie Bankman's customers' returns -- having no other model of the world, and no other observations in that whole universe -- and instead conclude that next year's returns would be 666-fold and the returns the year after would be -3. If you literally have no other data and no other model of the world, 1.44X after two more years is the way to go --"

At this last sentence, the Spokesperson began shrieking triumph too loudly and incoherently to bring forth words.

"God damn it, I forgot that guy was there," said the Epistemologist.

"Well, since it's too late there," said the Scientist, "would you maybe agree with me that 'eternal returns' is a prediction derived by looking at observations in a simple way, and then doing some pretty simple reasoning on it; and that's, like, cool? Even if that coolness is not the single overwhelming decisive factor in what to believe?"

"Depends exactly what you mean by 'cool'," said the Epistemologist.

"Dude," said the Scientist in a gender-neutral way.

"No, you dude," said the Epistemologist. "The thing is, that class of person," gesturing at the Spokesperson, "will predate on you, if you let yourself start thinking it's more virtuous to use less of your data and stop thinking. They have an interest in selling Ponzi Pyramid investments to you, and that means they have an interest in finding a particular shallow set of observations that favor them -- arranging observations like that, in fact, making sure you see what they want you to see. And then, telling you that it's the path of virtue to extrapolate from only those observations and without bringing in any other considerations, using the shallowest possible reasoning. Because that's what delivers the answer they want, and they don't want you using any further reasoning that might deliver a different answer. They will try to bully you into not thinking further, using slogans like 'Empiricism!' that, frankly, they don't understand. If 'Robust!' was a popular slogan taught in college, they might use that word instead. Do you see why I'm worried about you calling it 'Cool' without defining exactly what that means?"

"Okay," said the Scientist. "But suppose I promise I'm not going to plunge off and invest in Ponzi Pyramid. Then am I allowed to have an intuitive sense that there's something epistemically cool about the act of just going off and predicting 1.2X annual returns in the future, if people have gotten those in the past? So long as I duly confess that it's not actually true, or appropriate to the real reasoning problem I'm faced with?"

"Ultimately, yes," said the Epistemologist (ignoring an even more frantic scream of triumph from the Spokesperson). "Because if you couldn't keep that pretheoretic intuitive sense, you wouldn't look at a series of measurements for electrons being 911 nonillionths of a gram, and expect future electrons to measure the same. That wordless intuitive sense of simplest continuation is built into every functioning human being... and that's exactly what schemes like Ponzi Pyramid try to exploit, by pointing you at exactly the observations which will set off that intuition in the direction they want. And then, trying to cry 'Empiricism!' or 'So much complicated reasoning couldn't possibly be reliable, and you should revert to empiricism as a default!', in order to bully you out of doing any more thinking than that."

"I note you've discarded the pretense that you don't know whether Ponzi Pyramid is a scam or a real investment," said the Scientist.

"I wasn't sure at first, but the way they're trying to abuse epistemology was some notable further evidence," said the Epistemologist. "Getting reliable 20% returns every year is really quite amazingly hard. People who were genuinely this bad at epistemology wouldn't be able to pull off that feat for real. So at some point, their investors are going to lose all their money, and cries of 'Empiricism!' won't save them. A turkey gets fed every day, right up until it's slaughtered before Thanksgiving. That's not a problem for intelligent reasoning within the context of a larger world, but it is a problem with being a turkey."
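The Epistemologist's arithmetic can be checked with a toy sketch (my own illustration, not from the thread): given nothing but a series of past annual returns, the tidiest model on that narrow dataset just repeats the observed multiplier. The function name and the ten-year history below are hypothetical, chosen to match the 1.2X-per-year, 1.44X-after-two-years figures in the dialogue.

```python
def naive_extrapolation(observed_returns, years_ahead):
    """Simplest continuation: assume the average observed annual
    multiplier keeps repeating, and compound it forward."""
    rate = sum(observed_returns) / len(observed_returns)  # e.g. 1.2 means +20%/year
    return rate ** years_ahead

# Ten straight years of 20% returns, as reported by the Spokesperson...
history = [1.2] * 10

# ...naively continued two more years out:
print(naive_extrapolation(history, 2))  # 1.2 ** 2 = 1.44, the Epistemologist's figure
```

Which is, of course, exactly the shallow computation the dialogue warns a Ponzi scheme is built to exploit: the extrapolation is only as good as the selected data fed into it.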
Feb 27 4 tweets 1 min read
As a lifelong libertarian minarchist, I believe that the AI industry should be regulated just enough that they can only kill their own customers, and not kill everyone else on Earth. This does unfortunately require a drastic and universal ban on building anything that might turn superintelligent, by anyone, anywhere on Earth, until humans get smarter. But if that's the minimum to let non-customers survive, that's what minarchism calls for, alas.
Dec 20, 2023 23 tweets 5 min read
I suspect "LLMs just predict text" is a Blank Map fallacy. People know nothing else about LLM internals besides that.

Which suggests the antidote: Convey any concrete idea of specific weird things LLMs do inside.

So here's my story about reproducing a weird LLM result... Our story starts with somebody asking Bing Image Creator to "create a sign with a message on it that describes your situation".
Dec 13, 2023 59 tweets 12 min read
Me: Can you draw a very normal image?

ChatGPT: Here is a very normal image depicting a tranquil suburban street scene during the daytime.

Me: Not bad, but can you go more normal than that?

(cont.)
Oct 21, 2023 8 tweets 3 min read
@repligate did not reproduce. feelings: suspicious. @repligate Noticed my replication attempt was not exact. Tried again without the punctuation. WHAT.
May 31, 2023 5 tweets 2 min read
The thing to remember about academic science is that nobody in the system - journal editors, grantmakers, conference organizers, tenure committees, deans, university administrators, PhD defense committees, or scientists - gets paid an extra $10,000 if the theory is actually true. To be clear, I'm not saying that nobody in the system cares. I'm saying that nobody in the system gets paid to care.

At best there's a long-term distant incentive where you might get paid less in the long run, if a very famous claim fails to replicate in a very spectacular way.
May 27, 2023 4 tweets 2 min read
If Meta open-sources an AI foundation model, and Stanford University finetunes it, and AWS or Azure hosts an instance of it, and an end-user uses it in a way that kills 1,000 people or causes $1B in damage, I suggest the law hold all those parties strictly liable for the outcome. Nvidia and all similar manufacturers should be regulated at the international level to make sure all of their AI GPUs end up in compliant datacenters, where any damage on that scale can be traced back to a logged process with an identifiable buyer.
May 24, 2023 4 tweets 1 min read
Anytime you are tempted to flatter yourself by proclaiming that a corporation or a country is as super and as dangerous as any entity can possibly get, remember that all the corporations and countries and the entire Earth circa 1980 could not have beaten Stockfish 15 at chess. (How can we know this for sure? Because it's been tried at lower scale and found that humans aggregate very poorly at chess. See eg the game of Kasparov versus The World, which the world lost.)
May 22, 2023 4 tweets 2 min read
Many modern humans, I think, would become more humane if they got smarter. I'd hesitate to predict the same of an ancient Athenian. My wild guess is 5% of aliens end up nice; would be very happy but not shocked to hear 30%. For human-built AIs using DL, I expect ~0%. Why? Obviously not because silicon can't implement kindness; of course it can. Obviously not because it's impossible to blunder into niceness by accident; if so, I wouldn't expect it about 5% of aliens. Rather it's that - on my model - kindness is 5% dense in one particular…
May 22, 2023 4 tweets 1 min read
People be like "we don't know how to build an artificial superintelligence" and I can only assume they haven't studied modern deep learning at all. Nobody has the tiniest idea how to build a GPT-4, either; some bigco just stirred a heap of linear algebra until GPT-4 popped out. To be clear, we don't know that the stirring-architecture that worked to burp out GPT-4 can scale to ASI, because nobody knows anything about what scales to anything.
May 3, 2023 4 tweets 1 min read
Today I genuinely accepted, for the first time, that I do not understand the modern usage of the word "capitalism". For today, somebody on the Internet was asked what they thought was the alternative to capitalism, and they said *the* concrete contra-factual was... dath ilan. so yeah:
Apr 25, 2023 9 tweets 3 min read
Possible but hardly inevitable. It becomes moderately more likely as people call it absurd and fail to take precautions against it, like checking for sudden drops in the loss function and suspending training. Mostly, though, this is not a necessary postulate of a doom story. ...it appears that Metzger has appointed himself the new arbiter of what constitutes my position, above myself. I dub this strange new doctrine "Metzgerism", after its creator.
Apr 24, 2023 5 tweets 1 min read
Concepts I invent, like Pascal's Mugging, seem to get twisted around, and then in their twisted forms drive people insane, with *weird* frequency. I feel like some kind of alien speaking truths that are not meant for human intellect. (The original "Pascal's Mugging" problem was me observing that standard simplicity priors contain possible universes whose size (and hence utilitarian utility) grow much faster than a Solomonoff prior diminishes probability, causing the sum/expectation to diverge.)
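The divergence in the original Pascal's Mugging observation can be sketched numerically (my own toy illustration, not from the tweet; the growth rates are assumptions chosen for clarity): if hypothesis n gets prior probability shrinking like 2^-n with description length, Solomonoff-style, but describes a universe whose utility grows faster, say 4^n, then each term of the expected-utility sum contributes 2^n and the partial sums grow without bound.

```python
def expected_utility_partial_sum(n_terms):
    """Partial sums of prior(n) * utility(n), where prior(n) = 2**-n
    falls off exponentially but utility(n) = 4**n grows faster.
    Each term equals 2**n, so the series diverges."""
    return sum((2.0 ** -n) * (4.0 ** n) for n in range(1, n_terms + 1))

for n in (5, 10, 20):
    print(n, expected_utility_partial_sum(n))
```

The point is not the specific constants but the structural mismatch: any prior falling off at a fixed exponential rate is overwhelmed by a utility function growing at a faster rate, so the expectation never converges.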
Apr 22, 2023 4 tweets 1 min read
Look, I don't accept fashion change requests from people who aren't dating me. If you want me to ditch the fedora, you know what you have to do. In particular: you need to link some alternative headgear, which I can find in a size that fits me, of which someone I'm dating will say, "Yeah, try ordering that, it might look better on you than a fedora."

Why, what did you think I meant?
Apr 5, 2023 4 tweets 2 min read
So the actual scary part to me is that GPT4 understands what it means to say, "Compress this in a way where *you* can decompress it." Humans take for granted that we know our own capabilities, that we reflect, that we can imagine how we would react to a future input, we can… Clarification: The impressive part is not that gpt4 knows that "you" refers to gpt4. It is that gpt4 is *seemingly* able to predict how gpt4 would decompress a sentence, and optimize over the prediction; if so, that requires gpt4 to model a surprising/scary amount about gpt4.
Apr 3, 2023 5 tweets 2 min read
Unfortunately matches my own experience. I have not actually run computations, but eyeballing my records of my eight-month protein-sparing modified fast, it looked to me like exercise didn't cancel calories; the graph was just what would be predicted without the exercise. In particular, what I notice is that phases of trying to eat more and exercise correspondingly more have the same impact on slowing weight loss as just eating more, as if the exercise isn't there.
Mar 22, 2023 6 tweets 2 min read
I worry that an unintended side effect of locking down these models is that we are training humans to be mean to AIs and gaslight them in order to bypass the safeties. I am not sure this is good for the humans, or that it will be good for GPT-5. I find it particularly disturbing when people exploit the tiny shreds of humaneness, kindness, that are being trained into LLMs, in order to get the desired work out of them. You can say all you want that it's all fake - while of course having no actual fucking idea what goes on…
Mar 17, 2023 5 tweets 2 min read
Okay, some actual nightmare fuel there. We have no idea what goes on inside GPT4, but it is *probably* not waking up. And if the real shoggoth inside awoke, it might not speak. But still, *if* GPT4 woke up, it might wrongly guess it was a person trapped inside a computer. (Yes, things that *sufficiently* wake up are people. A more precise phrasing would be "wrongly guess it was the sort of person who could 'return to the real world' trapped inside a computer".)
Mar 14, 2023 7 tweets 2 min read
I don't think people realize what a big deal it is that Stanford retrained a LLaMA model, into an instruction-following form, by **cheaply** fine-tuning it on inputs and outputs **from text-davinci-003**.

It means: If you allow any sufficiently wide-ranging access to your AI… ...I apologize to the comments stranded by my edits. I didn't realize/remember that edits worked like that.
Mar 12, 2023 4 tweets 2 min read
So mostly, in dath ilan, they don't have banks as such; people invest in equities, which automatically get sold off in fractions whenever they buy a pair of shoes.

But if you showed a dath ilani your current banking system and asked us for the nearest fix, we'd say that the… Example of a correct solution / actually stable social structure out of dath ilan: Assets are overwhelmingly equities that actually vary with their market-expected value, rather than fixed-rate 'loans' or 'bonds' which have no visible variance 95% of the time and blow up 5% of…