WHAT I SAID WAS "Scientific knowledge is socially constructed" BUT SOME PEOPLE HEARD "Scientific truth is a social construct". For me, there is a difference!
When you tweet things out to 45k people, you learn a lot about the difference between what you think you said and what people hear.
I now understand that some percentage of people are hearing me say "Scientific truth is a social construct" whenever I say "Scientific knowledge is socially constructed". I think both are true but that's not the whole story. Let me explain.
Let's talk about language for a minute. I'm sure you will agree that the words we use for things are somewhat arbitrary. However, the things in the world that we use language to describe aren't arbitrary. The word "rose" is a social construct but the rose itself is not.
Languages imperfectly depict reality. They disagree with each other on how to carve up the world. Distinctions in one language often don't match up with distinctions in other languages. These mismatches can help us learn something about objective reality.
Here's what I think. I think it's possible to socially construct something which technically speaking is a social construct (like words in a particular language) but which approximates a non-social construct (like physical reality).
The level of correspondence between the social construct and reality is an objective property of the construct. For instance, "unicorn" and "horse" are both words and both socially constructed but one of these words matches up better with reality than the other.
When we create a social construct with a goal in mind like helping us manipulate the physical world, I think that goal IS constraining on our construct. I would expect some convergence in the form of the construct as it comes to better describe objective reality.
In the language analogy, the process of "socially constructing" scientific knowledge is like translating sentences back and forth between multiple languages as a way of telling us something about the biases each language contains and about reality itself.
When we participate in science as individuals, we create mental constructs. When we participate in science as a group, we exchange our mental constructs with each other and test them out.
This process empirically tests whether our mental constructs, our personal knowledge that appears to us to correspond to reality, are indeed mind-independent knowledge. Only after we have verified that a useful mental construct is mind-independent can we safely call it "science".
To summarize. I believe that:
1. Scientific knowledge is socially constructed
2. Scientific knowledge is a social construct
3. Scientific truth is NOT a social construct
Let me translate this into the language analogy.
1. English is socially constructed
2. The precise wording of any collaboratively written sentence is a social construct
3. The correspondence of that sentence with reality is not a social construct.
Addendum: There is a lot more to say about how one would establish the degree of correspondence between a social construct and reality or how one would come up with good candidate mental constructs to begin with but that's another longer essay.
Forgive me for not addressing every aspect of the scientific method in this tweet thread. I'm only focusing on the "social construction" aspect for now because I find it interesting! Also please don't take this essay as me saying I know all the answers because I don't.
You may have heard hallucinations are a big problem in AI, that they make stuff up that sounds very convincing, but isn't real.
Hallucinations aren't the real issue. The real issue is Exact vs Approximate, and it's a much, much bigger problem.
When you fit a curve to data, you have choices.
You can force it to pass through every point, or you can approximate the overall shape of the points without hitting any single point exactly.
When it comes to AI, there's a similar choice.
These models are built to match the shape of language. In any given context, the model can either produce exactly the text it was trained on, or it can produce text that's close but not identical.
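The curve-fitting version of this choice can be sketched in a few lines. The data here is made up for illustration: a high-degree polynomial passes through every point exactly, while a straight-line fit only approximates the overall shape.

```python
import numpy as np

# Hypothetical 1-D data: x positions and noisy observations.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 2.2, 2.8, 4.1])

# Exact: a degree-4 polynomial through 5 points hits every point.
exact = np.polyfit(x, y, deg=len(x) - 1)

# Approximate: a straight line captures the overall shape
# without passing through any single point exactly.
approx = np.polyfit(x, y, deg=1)

print(np.polyval(exact, 2.0))   # reproduces the observed value at x=2
print(np.polyval(approx, 2.0))  # close to, but not exactly, 2.2
```

The exact fit "memorizes" the data; the approximate fit generalizes its shape. Neither is wrong on its own terms, which is what makes the choice a real trade-off rather than a bug.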
I’m deeply skeptical of the AI hype because I’ve seen this all before. I’ve watched Silicon Valley chase the dream of easy money from data over and over again, and they always hit a wall.
Story time.
First it was big data. The claim was that if you just piled up enough data, the answers would be so obvious that even the dumbest algorithm or biggest idiot could see them.
Models were an afterthought. People laughed at you if you said the details mattered.
Unsurprisingly, it didn't work out.
Next came data scientists. The idea was simple: hire smart science PhDs, point them at your pile of data, wait for the monetizable insights to roll in.
As a statistician, this is extremely alarming. I’ve spent years thinking about the ethical principles that guide data analysis. Here are a few that feel most urgent:
RESPECT AUTONOMY
Collect data only with meaningful consent. People deserve control over how their information is used.
Example: If you're studying mobile app behavior, don’t log GPS location unless users explicitly opt in and understand the implications.
DO NO HARM
Anticipate and prevent harm, including breaches of privacy and stigmatization.
Example: If 100% of a small town tests positive for HIV, reporting that stat would violate privacy. Aggregating to the county level protects individuals while keeping the data useful.
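A minimal sketch of the aggregation idea, with made-up place names and counts: reporting per-town statistics can single out individuals in tiny populations, so we roll the counts up to the county before publishing.

```python
# Hypothetical small-area counts (all names and numbers invented).
town_positives = {
    ("Greene County", "Smallville"): 12,
    ("Greene County", "Riverton"): 3,
    ("Greene County", "Oakdale"): 7,
}

# Aggregate to the county level: the statistic stays useful for
# public health, but no single town's residents are identifiable.
county_totals = {}
for (county, _town), n in town_positives.items():
    county_totals[county] = county_totals.get(county, 0) + n

print(county_totals)  # {'Greene County': 22}
```

Real disclosure-control practice involves more than summing (cell suppression, minimum cell sizes, noise injection), but the core move is the same: report at a level of aggregation where individuals disappear into the group.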
Hot take: Students using chatgpt to cheat are just following the system’s logic to its natural conclusion, a system that treats learning as a series of hoops to jump through, not a path to becoming more fully oneself.
The tragedy is that teachers and students actually want the same thing, for the student to grow in capability and agency, but school pits them against each other, turning learning into compliance and grading into surveillance.
Properly understood, passing up a real chance to learn is like skipping out on great sex or premium ice cream. One could, but why would one want to?
If you think about how statistics works, it's extremely obvious why a model built on purely statistical patterns would "hallucinate". Explanation in next tweet.
Very simply, statistics is about taking two points you know exist and drawing a line between them, basically completing patterns.
Sometimes that middle point is something that exists in the physical world, sometimes it’s something that could potentially exist, but doesn’t.
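The point-between-two-points idea can be sketched directly. This is just linear interpolation; the key observation is that nothing in the arithmetic tells you whether the filled-in point corresponds to anything real.

```python
# Pattern completion in its simplest form: given two known points,
# fill in the middle by linear interpolation.
def midpoint(p1, p2):
    """Return the point halfway between two known points."""
    return tuple((a + b) / 2 for a, b in zip(p1, p2))

# The result is statistically plausible by construction; whether it
# exists in the physical world is a separate, empirical question.
print(midpoint((0.0, 0.0), (4.0, 2.0)))  # (2.0, 1.0)
```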
Imagine an algorithm that could predict what a couple’s kids might look like. How’s the algorithm supposed to know if one of those kids it predicted actually exists or not?
The child’s existence has no logical relationship to the genomics data the algorithm has available.