Remember that time economists used a gravity model to find ancient lost cities from the Bronze Age?
If you do or you don't, check out this thread 🧵
The authors gained access to a collection of almost 12,000 deciphered and edited texts that were excavated primarily at the archaeological site of Kültepe, ancient Kaneš.
The ruins (pictured) are located in central Turkey, in the province of Kayseri.
The texts look like this.
They were inscribed on clay tablets in the Old Assyrian dialect of Akkadian in cuneiform by ancient Assyrian merchants, business partners, and their family members.
This tablet is dated to between 1930 and 1775 B.C.
The tablets were all from between 1930 and 1775 B.C., and 90% of the sample came from just one generation of traders, between 1895 and 1865 B.C.
The reason is that Kaneš experienced a major fire in 1840 B.C. and the commercial archives in the city were sealed off.
Tablets were largely business letters, shipment documents, accounting records, seals, and contracts.
A typical shipment document or expense account in which a merchant would inform partners about their cargo and expenses would read like this:
Some business letters would contain information about market and transport conditions, like this:
The tablets are spread across the world in museums and institutions, but many have been transcribed.
The transcribed ones mentioned 79 cities distributed across modern-day Iraq, Syria, and Turkey, and 2,806 of them mentioned at least two Anatolian city names simultaneously, like so:
That tablet identified three shipments: Durhumit to Kaneš, Kaneš to Wahšušana, and Durhumit to Wahšušana.
So the itinerary is A→B→C, and there were 227 of these, with 391 examples of travel between city pairs.
Specifically, 25 city pairs: 15 known (gray), 10 lost (black).
Using trade among known cities, they estimated the distance elasticity of trade (how sensitive trade between cities is to the distance between them), so they could estimate the probability of shipments from city i to city j given their distance.
The result: probable locations for the 10 lost cities.
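For anyone who wants to see the distance-elasticity idea in code, here is a minimal sketch. It is not the authors' estimator: it assumes a bare-bones gravity relationship where expected shipments fall off as distance^(-zeta), ignores city sizes entirely, and uses made-up numbers. The function names `fit_distance_elasticity` and `shipment_shares` are hypothetical.

```python
import numpy as np

def fit_distance_elasticity(distances_km, shipments):
    """Fit log(shipments) = c - zeta * log(distance) by least squares.

    Returns (scale, zeta), where zeta is the distance elasticity.
    Assumption: a simplified gravity relationship with no city-size terms.
    """
    slope, intercept = np.polyfit(np.log(distances_km), np.log(shipments), 1)
    return float(np.exp(intercept)), float(-slope)

def shipment_shares(distances_km, zeta):
    """Share of one origin's shipments going to each destination,
    using distance alone (city size is ignored in this sketch)."""
    weights = np.asarray(distances_km, dtype=float) ** (-zeta)
    return weights / weights.sum()

# Made-up distances and expected shipment counts for four city pairs,
# generated with a true elasticity of 2.
distances_km = np.array([50.0, 100.0, 200.0, 400.0])
shipments = 1_000_000.0 * distances_km ** -2.0

scale, zeta = fit_distance_elasticity(distances_km, shipments)
shares = shipment_shares(distances_km, zeta)
print(zeta, shares)
```

The closer destination gets the larger share, and a bigger zeta makes that drop-off steeper.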
These estimates largely concurred with those of historians, and since the historians' conjectures weren't used in the model, this suggests people should start pursuing those estimations.
In fact, this modeling exercise might help to decide among the different proposals made by historians.
But the authors weren't done. They supplemented their analysis with data from merchant itineraries. For example, consider this letter:
That letter was submitted to the Assyrian port authorities at Kaneš from emissaries in Wahšušana, and it described how missives would travel through two different routes:
Wahšušana→Ulama→Purušhaddum
W→Šalatuwar→P
But only Wahšušana, Ulama, and Šalatuwar are known cities.
Using every multistop itinerary, a model with just two constraints offers a lot of info. The constraints are simple:
1. When deciding itineraries, merchants like direct routes.
2. Caravans have to make stops to rest, replenish supplies, feed pack animals, and make side trades.
With estimates constrained to regions that are admissible given those constraints (dashed lines), the locations of the newly-identified lost cities are now more certain!
With the exception of Purušhaddum.
But how do we know this method works?
Easy! Just lose known cities and see if the method rediscovers them.
As the picture shows, the average distance between estimated and known city locations wasn't huge. In fact, estimates were a median of 33km away (mean = 40km).
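The "lose a known city and rediscover it" check can be sketched like this. Everything here is an assumption for illustration: flat x/y coordinates in km, a known elasticity, noise-free made-up shipments, and a plain grid search standing in for whatever estimator the authors actually used. `locate_city` is a hypothetical helper.

```python
import numpy as np

def locate_city(known_xy, shipments, elasticity, grid):
    """Grid-search the point that best fits shipments ~ distance^(-elasticity).

    The unknown scale constant drops out because we minimize the variance
    of the residuals rather than their level.
    """
    log_s = np.log(shipments)
    best_xy, best_loss = None, np.inf
    for x in grid:
        for y in grid:
            d = np.hypot(known_xy[:, 0] - x, known_xy[:, 1] - y)
            d = np.maximum(d, 1e-6)  # avoid log(0) if the candidate sits on a known city
            resid = log_s + elasticity * np.log(d)
            loss = np.var(resid)
            if loss < best_loss:
                best_xy, best_loss = (float(x), float(y)), loss
    return best_xy

# Five made-up "known" cities and one city we pretend to have lost.
known_xy = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0],
                     [100.0, 100.0], [50.0, 10.0]])
true_xy = np.array([30.0, 40.0])
elasticity = 2.0

# Synthetic shipments between the hidden city and each known city.
true_d = np.hypot(*(known_xy - true_xy).T)
shipments = 500.0 * true_d ** (-elasticity)

grid = np.arange(0.0, 101.0, 5.0)
est_xy = locate_city(known_xy, shipments, elasticity, grid)
error_km = float(np.hypot(est_xy[0] - true_xy[0], est_xy[1] - true_xy[1]))
print(est_xy, error_km)
```

With noisy real data the error would not be zero; repeating this for each known city and summarizing the errors is how you would get figures like the median and mean quoted above.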
This method also helps to identify the names of sites that people have continued living in, like Kırşehir Kalehöyük, which might have been located under where the Alaaddin Mosque and a high school were later built.
There are other interesting findings here, too.
Consider this: geography has deep and persistent impacts on the economy of the area, and cities tend to show up where there are "natural roads".
Ancient cities were estimated to be larger when the natural roads were better!
And, modern cities are larger when nearby ancient cities were estimated to be larger as well.
The deep geographic reasons for cities to crop up in certain locations are still powerful forces today!
And for the real nerds, Zipf's law looks to basically hold for ancient city populations.
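A quick sketch of what "Zipf's law holds" means in practice: regress log(rank) on log(size) and check that the slope is close to -1. The city sizes below are made up to follow the law exactly, and `zipf_exponent` is a hypothetical helper, not the authors' code.

```python
import numpy as np

def zipf_exponent(sizes):
    """Rank-size regression: log(rank) = a - b * log(size). Returns b.

    Under Zipf's law, b is close to 1.
    """
    sizes = np.sort(np.asarray(sizes, dtype=float))[::-1]  # largest first
    ranks = np.arange(1, len(sizes) + 1)
    slope, _ = np.polyfit(np.log(sizes), np.log(ranks), 1)
    return float(-slope)

# Made-up populations where the k-th largest city is 1/k the size of the largest.
sizes = 20000.0 / np.arange(1, 11)
exponent = zipf_exponent(sizes)
print(exponent)
```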
There you have it: economists might have discovered the locations of ancient lost cities from the Bronze Age, and supported a number of other fun facts while they were at it.
Only time will tell if these discoveries end up being true 🤞
Link:
The model the authors used was the gravity model: the workhorse model of trade.
0. The sample for the population result is *everyone* with ≥1/16 and ≤15/16 African admixture, and the within-family result is for the siblings among them. I didn't plot between-families results, but they're pulled from the same distribution as the population result, so when I have those computed I'll replot, but they shouldn't be any different (will take a few hours, maybe tomorrow, w/e).
1. Yes, datasets will be merged and sharper results will be obtained for the article on this that's coming out ~shortly. Can't be done for all phenotypes in the UKBB, but can be done for IQ at least. For example, we don't have lipoprotein(a) (lp(a)) measurements in some of the samples of young American kids, some of them don't have objective skin color measurements, etc. As a side note, this result holds up with brain size (not shown) in the UKBB, but I'm unsure if it holds up in the young samples that'll be merged with it, as they're still developing in most cases.
Anyway, the IQ p-value is p = 0.008 (two-tailed) within families and extremely low between them. The within-family result is robust, will get sharper with more data, and is also sharpened by using a latent variable instead of a score, and by correcting for measurement error (not needed with the LVM). Score shown for simplicity and ease of others with UKBB access replicating this. Also, accounting for error in the admixture computation, the p-value would drop a bit further, but not by very much since error is very small.
2. The "IQ per unit of admixture" is statistically indistinguishable between the population and within-family results, and yes, it explains most of the Black-White difference in IQ. I just wanted comparably-scaled results for all the traits here, so you're seeing r's. It's pleasant that the within-family variance reductions aren't enormous for siblings, which is what we expect even with quite high heritabilities given their genetic relatedness. It's the same result we've seen with American data, and it's also nice to see that in the case of this trait, the global admixture result *can* be interpreted like the within-family one. Presumably this only holds with measurement invariance, as we see in the U.K. when comparing Whites and Blacks there. Since we see this in the U.S. too, it's likely that the previous, already-published within-family null—which had a sizable effect in the correct direction which also could not be distinguished from the global r—was just a false-negative.
3. This result replicates with other degrees of relatedness, but we might lose the causal interpretation with those because the estimand for, say, a cousin test is different, given that the identity of the "C" variance component shifts for that comparison.
4. What is architectural sparsity and why is it relevant? Consider this table from nature.com/articles/s4159… (cc: @hsu_steve):
Basically, sparsity refers to the number of variants involved in a trait. It also refers to their effect size distribution. So, for Alzheimer's, for example, the trait is highly polygenic, but APOE explains more variance than the entire rest of the PGS, so while being under highly polygenic control, it remains moderately sparse.
If you're still not grokking what I mean here, consider some distributions of cumulative effects across the chromosomes. Here's the result for lp(a), which is already known to be influenced by essentially one gene. Guess which chromosome it's located on:
Consider, as a comparison, creatinine, which is considerably less sparse, and thus has effects distributed across all chromosomes:
Now, consider what this does to within-family admixture assessments. Skin color, for example, is controlled by only a handful of genes. This means that global admixture shouldn't tag it very much between siblings. The same is even more true for lp(a). And in fact, now we have confirmation of this!
But, we know from other methods that lp(a) is effectively entirely genetically-caused between populations. This is just accepted within the medical community because it is an obvious fact that follows from its strong control by a single gene. You can also figure this out using local ancestry estimates. Basically, the correlation between genome-wide ancestry and ancestry at a causal locus is what we want to get at, and if you know the causal locus, you gain power by restricting your analysis to that area. This is what we find with lp(a) (not shown, but use your brain; the obviousness of this fact is why the author used lp(a) as an example).
Also, in some sense, the frequency of that locus between ancestries gives you what you need without doing all this within-family stuff, if you're confident it's causal. The effect estimate might still be biased by population structure though, so that's worth keeping in mind.
There will be a post soon with more details and the expanded set of results with the additional datasets, robustness tests, and plenty of other fun things to look at.
TL;DR: this is a spoiler, and it shows that, yes, you can explain the Black-White IQ difference in Britain mostly genetically, and the global admixture result that I've posted here before is equivalent to the within-family one. Woohoo for things that should hold up, in fact, holding up!
As a sort of replication of the Young paper, you cannot explain the difference in educational attainment (as years of education) in this way. Why? Well, hard to say. Compensatory factors like I found with the GCSEs? (Haven't read that post yet? Go check it out here: cremieux.xyz/p/explaining-a…). Poor phenotype quality? Very plausible, because education really is a huge garbage heap, but why would that be in the general population and not within families? Maybe it has to do with what other traits admixture tags? Maybe it does replicate, but we just can't see it, because the precision is too poor (possibly the lp(a) story too). Who knows!
Q: Will the combination with more datasets allow us to fix this educational attainment result?
A: No, because most of the other datasets involve young people, not people who have almost all completed their educations, as in the UKBB. That plus the generational change and international incomparability in the definition of educational attainment makes it too poor as a phenotype. Sorry!
Any questions?
Link to a fun previous post from the same dataset, showing a result that *does* hold up within families: x.com/cremieuxrecuei…
I want to ping an old post that I've also posted some replications for.
Basically, parents are inequality averse, and they try to compensate for when one sibling is less gifted than another, reducing the ancestry/PGS effect on education within families.
Ever wondered why advertisements heavily feature Black actors when they're just 12-14% of the population?
I might have an explanation:
Black viewers have a strong preference for seeing other Blacks in media, whereas Whites have no racial preferences.
These results are derived from a meta-analysis of 57 pre-2000 and 112 post-2000 effect sizes for Blacks alongside 76 and 87 such effect sizes for Whites.
If you look at them, you'll notice that Whites' initial, slight preference declined and maybe reversed.
It's worth asking if this is explained by publication bias.
It's not!
Neither aggregately (as pictured), nor with results separated by race.
You're on trial, and the jury can't make up their minds. The decision is a coin flip: 50/50, you either get it or you don't.
Your odds of a given verdict depend on the "peers" making up your jury.
If you're Black and they're Black, your odds are good; if you're White, pray.
Though White jurors have, on average, no racial bias, the same can't be said for Black jurors.
Where the White jury gives you approximately the coin flip you deserve, the Black jury's odds for a verdict are like a coin rigged to come up heads 62% of the time.
Once you get to sentencing, things get even worse.
The White jury is still giving you a coin flip on a lighter or a harsher sentence, but the Black jury is giving lenient sentences to Blacks about 70% of the time.
Posts saying Charlie said things he didn't, oftentimes even including videos where he doesn't say what the post says he does, have convinced me.
There will be no organic temperature-lowering coming from the left, because they don't want it.
They believe the things they claim.
They actually believe the people they dislike *are* racists, fascists, homophobes, transphobes, all of it.
They are not reasonable, and they cannot form anything like a reasoned argument for these perceptions.
Even still, it's really what they think.
Accordingly, they will not lower the temperature.
You can't expect people to stop using slurs like "fascist" or "racist" or "sexist"—even if they are completely untrue and it's impossible to provide valid evidence for them!—if they believe they're true.
It shows that the gender wage gap is mostly about married men and their exceptional earnings.
In this thread, I'm going to explain why married men earn so much more than everyone else 🧵
The question is:
Does marriage maketh man?
Or
Are all the good men married?
That is, does marriage lead men to earn more, or do men who earn more get married more often?
To answer this question, we have to work through the predictions of different theories.
For example, one of my favorite papers on the subject looked into three different hypotheses to explain the "marriage premium" to wages, and they laid out a few testable predictions: