We have a tendency to assume that high V (verbal) individuals such as famous writers, pundits, or even philosophers have correspondingly high M (mathematical) and S (spatial) capabilities.
In the interview I link below, Dick recounts how he was forced to leave UC Berkeley because of the ROTC requirement. Despite his best efforts he could not reassemble an M1 rifle. He says that out of 1000 ROTC students he was the only one unable to do so. In his high school geometry class he received an "F-" because he had zero comprehension of the diagrams or how to think through axiomatic proofs.
In US culture a "good talker" or "convincing talker" is typically assumed to be highly intelligent across all psychometric factors. For example, such people are able to influence public policy, raise huge startup investment rounds, make geostrategic pronouncements, etc., without any additional evidence that they possess the other aspects of high intelligence necessary to understand reality at a deeper level.
Below I link to the Dick interview (relevant remarks at ~10 minutes), and a famous psychometric study of eminent scientists by Harvard professor Anne Roe that evaluated Verbal, Mathematical, and Spatial reasoning abilities. There are interesting patterns in the results. See also Lubinski and Benbow's studies of mathematically and verbally precocious individuals, identified as children.
1950s study of eminent scientists by Harvard psychologist Anne Roe: The Making of a Scientist (1952). Her study is by far the most systematic and sophisticated that I am aware of. She selected 64 eminent scientists -- well known, but not quite at the Nobel level -- in a more or less random fashion, using, e.g., membership lists of scholarly organizations and expert evaluators in the particular subfields. Roughly speaking, there were three groups: physicists (divided into experimental and theoretical subgroups), biologists (including biochemists and geneticists) and social scientists (psychologists, anthropologists).
Roe devised her own high-end intelligence tests as follows: she obtained difficult problems in verbal, spatial and mathematical reasoning from the Educational Testing Service, which administers the SAT but also performs bespoke testing research for, e.g., the US military. Using these problems, she created three tests (V, S and M), which were administered to the 64 scientists and also to a cohort of PhD students at Columbia Teachers College. The PhD students also took standard IQ tests, and the results were used to norm the high-end VSM tests using SD = 15. Most IQ tests are not good indicators of true high-level ability (e.g., beyond +3 SD or so).
Average ages of subjects: mid-40s for the physicists, somewhat older for the other scientists.
Overall normed scores:
Test   Low   Median   High
V      121    166     177
S      123    137     164
M      128    154     194
Roe comments: (1) The V test was too easy for some takers, so the top score was limited by the test ceiling (true ability would be higher). (2) S scores tend to decrease with age (correlation .4); peak (younger) performance would have been higher. (3) The M test was found to be too easy for the physicists, so it was only administered to the other groups.
It is unlikely that any single individual obtained all of the low scores, so each of the 64 would have been strongly superior in at least one area.
The lowest score in each category among the 12 theoretical physicists would have been roughly V 160 (!), S 130, M >> 150. (Ranges for all groups are given, but I'm too lazy to reproduce them here.) It is hard to estimate the M scores of the physicists, since when Roe tried the test on a few of them they more or less solved every problem, modulo some careless mistakes. Note that the top raw score among the non-physicists (27 out of 30 problems solved, obtained by 2 geneticists and a psychologist) is quite high but still short of a full score. The corresponding normed score is 194!
The lowest V scores, in the 120 range, were obtained by only 2 experimental physicists; all other scientists scored well above this level -- note the median is 166.
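For intuition about where normed scores like 194 come from, here is a minimal sketch of one plausible norming scheme: anchor the raw-score distribution of the norming cohort (the Columbia PhD students) to their known IQ-test mean and SD, then express any raw score in IQ units. All cohort numbers below are invented for illustration; Roe's actual procedure and values may differ.

```python
# Hypothetical norming sketch -- all cohort statistics below are invented.
def normed_score(raw, cohort_raw_mean, cohort_raw_sd, cohort_iq_mean, sd=15):
    """Map a raw test score onto an IQ-style scale (SD = 15) by expressing it
    as a deviation from a norming cohort with a known mean IQ."""
    z = (raw - cohort_raw_mean) / cohort_raw_sd
    return cohort_iq_mean + sd * z

# e.g. a raw M score of 27/30, ~4.7 SD above an assumed PhD-student cohort
# averaging 13/30 (SD 3) with a mean IQ of ~125, lands in the same ballpark
# as the reported 194
print(round(normed_score(27, cohort_raw_mean=13, cohort_raw_sd=3, cohort_iq_mean=125)))
```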
AstralCodexTen on polygenic embryo selection (link below in thread)
Genomic Prediction pioneered this technology. Recently several new companies have entered the space.
Some remarks:
1. Overton window opening up for IQ selection. GP does not offer this, but other companies do, using data generated by our embryo genotyping process. Patients are allowed (by law) to download their embryo data and conduct further analysis if desired.
Scott discusses gains of ~6 IQ points from selection amongst 5 embryos. Some high net worth families are selecting from tens or even a hundred embryos.
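For a rough sanity check on numbers like this, here is a minimal Monte Carlo sketch of the expected gain from picking the top-scoring embryo among N. The within-family predictor validity (r) and the sibling IQ spread are my illustrative assumptions, not GP's actual figures.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions (not GP's actual model or data):
#   r      -- correlation between the polygenic score and realized adult IQ
#             *among siblings* (within-family validity is lower than the
#             population-level ~0.4-0.5 figure quoted below)
#   sd_sib -- SD of IQ differences among embryos/siblings of the same parents
r = 0.35
sd_sib = 12.0

def expected_gain(n_embryos, n_trials=100_000):
    """Monte Carlo: average IQ advantage of the top-scoring embryo over a
    randomly chosen one, under the assumptions above."""
    iq = rng.normal(0.0, sd_sib, size=(n_trials, n_embryos))      # latent sib IQ deviations
    noise = rng.normal(0.0, sd_sib, size=(n_trials, n_embryos))
    score = r * iq + np.sqrt(1 - r**2) * noise                    # PGS with corr(score, iq) = r
    picked = iq[np.arange(n_trials), score.argmax(axis=1)]
    return picked.mean()

for n in (2, 5, 10, 100):
    print(f"{n:3d} embryos: expected gain ~ {expected_gain(n):.1f} IQ points")
```

The main point is the scaling: going from 5 to 100 embryos roughly doubles the expected gain rather than multiplying it by 20, because the expected maximum of N draws grows only slowly with N.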
2. Recently, claims of improved prediction of cognitive ability: predictor correlation of ~ 0.4 or 0.5 with actual IQ scores. I wrote ~15 years ago that we would surpass this level of prediction, given enough data. I have maintained for a long time that complex trait prediction is largely data-limited. Progress has been slow as there is almost zero funding to accumulate cognitive scores in large biobanks. This is because of persistent ideological attacks against this area of research.
Almost all researchers in genomics recognize human cognitive ability as an important phenotype. For example, cognitive decline with age should be studied carefully at a genetic level, which requires the creation of these datasets. However, most researchers are AFRAID to voice this in public because they will be attacked by leftists.
I note that as the Overton window opens some cowardly researchers who were critical of GP in its early days (even for disease risk reduction!) are now supporting companies that openly advertise embryo IQ selection.
3. Take comparisons of predictor quality between companies with a grain of salt. AFAIK the GP predictors discussed in the article are old ones (some published before 2020?) and we don't share our latest PGS. I believe GP has published more papers in this field than all the other companies combined.
Our early papers were the first to demonstrate that the technology works - we showed that predictors can differentiate between adult siblings in genetic disease risk and complex traits such as height. These sibling tests are exactly analogous to embryo selection - each embryo is a sib, and PGS can differentiate them by phenotype.
We have never attempted to compare quality of prediction vs other companies, although we continue to improve our PGS through internal research.
Perhaps the most impactful consequence of Genomic Prediction technology - more accurate detection of aneuploidy in embryos, which leads to higher success rates (pregnancies) in IVF. GP has genotyped well over 100k embryos to date.
Report from Jewish Institute for National Security of America (JINSA)
Burn Rate: Missile and Interceptor Cost Estimates During the U.S.-Israel-Iran War
... the United States and Israel both face an urgent need to replenish stockpiles and sharply increase production rates. The U.S. THAAD system accounted for almost half of all interceptions, perhaps because of Israel’s insufficient Arrow interceptor capacity. As a result, the United States used up about 14 percent of all its THAAD interceptors, which would take three to eight years to replenish at current production rates
READ THE WSJ ARTICLE INSTEAD OF THIS REPORT. WSJ REPORTED THE US USED 1/4 OF ALL THAAD INTERCEPTORS EVER PRODUCED IN 12 DAYS? WSJ ALSO REPORTED HUGE AEGIS SM-2/3/6 NUMBERS WHICH THIS REPORT DOES NOT.
I don't vouch for the JINSA analysis, but even its (optimistic) conclusions, combined with what we already know about the conflict, imply that Iran used at least 3 rough classes of missiles:
1. Very old, cheap, inaccurate missiles (CEP >> 1 km?)
2. Old, cheap, inaccurate missiles (CEP ~ 1 km)
3. Relatively modern missiles (CEP ~ 100 m?) - the interception rate against these may be low, perhaps 50% or less
In a conflict with Russia or China, US systems would be dealing with huge numbers of missiles in category 3 and probably some missiles MUCH MORE CAPABLE than Iran's category 3.
Most analysis I've seen does not account properly for the MIX of missiles used by Iran in every attack. The fact that they can precisely hit a radar dome at the US Al Udeid Air Base in Qatar is probably MORE important than the fact that they fired 14 total missiles (what types?) at the base.
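To see why the mix matters, a quick illustration using the standard CEP relation for a circularly symmetric Gaussian miss distance; the 25 m "radar-dome-sized" target radius is my illustrative assumption.

```python
# Standard CEP relation for a circularly symmetric Gaussian miss distance:
#   P(r <= R) = 1 - 2**(-(R/CEP)**2)
# The 25 m target radius below is an illustrative assumption.
def p_within(R_m, cep_m):
    return 1 - 2 ** (-(R_m / cep_m) ** 2)

R = 25
for cep in (1000, 100, 30):
    print(f"CEP {cep:4d} m -> P(land within {R} m) = {p_within(R, cep):.3f}")
```

On these numbers, reliably hitting a dome-sized target implies a CEP well below 100 m (or terminal precision guidance) - a qualitatively different weapon from categories 1 and 2.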
It seems like every Iranian attack uses mostly crap old missiles meant to exhaust US/ISR interceptors, and a few "coded message" high quality missiles to show serious analysts what their top end capabilities are.
Against the Russians and Chinese, ALL of the missiles will likely perform like the very accurate and difficult to intercept missile whose impact appears below.
This is a great talk by Chollet about the path to AGI and why scaling alone will not get us there (nor will scaling + CoT reasoning). Video link below.
He discusses abstraction, symbolic reasoning, continuous vs discrete optimization (program search), and many other important topics.
Note he keeps using the term "Fluid Intelligence".
$trillions and geostrategic dominance are on the line so AI researchers have to be realistic about what "intelligence" is. They end up recreating the methods of psychometrics (now applied to AIs instead of humans) that have largely been driven out of the academy by woke stupidity.
Related to Chollet's perspective: some additional layer of abstraction / program search / in-weight learning, beyond the transformer itself, is necessary to achieve "Fluid Intelligence".
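As a toy illustration of what discrete program search looks like (my own made-up mini-DSL, not Chollet's ARC setup): enumerate compositions of primitive grid transforms until one is consistent with all demonstration pairs, then apply it to new inputs.

```python
from itertools import product

# Toy DSL of grid transforms -- illustrative primitives only
PRIMITIVES = {
    "identity":  lambda g: g,
    "flip_h":    lambda g: [row[::-1] for row in g],
    "flip_v":    lambda g: g[::-1],
    "transpose": lambda g: [list(r) for r in zip(*g)],
    "increment": lambda g: [[(c + 1) % 10 for c in row] for row in g],
}

def search(examples, max_depth=3):
    """Brute-force discrete program search: find the shortest composition of
    primitives consistent with all (input, output) demonstration pairs."""
    for depth in range(1, max_depth + 1):
        for names in product(PRIMITIVES, repeat=depth):
            def program(g, names=names):
                for n in names:
                    g = PRIMITIVES[n](g)
                return g
            if all(program(x) == y for x, y in examples):
                return names
    return None

# Infer the rule from two demonstrations; it then generalizes to any new grid.
examples = [([[1, 2], [3, 4]], [[2, 1], [4, 3]]),
            ([[5, 6], [7, 8]], [[6, 5], [8, 7]])]
print(search(examples))  # -> ('flip_h',)
```

The contrast with gradient descent is the point: the search is over a discrete, combinatorial space of programs, and the result generalizes from two examples rather than thousands.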
Sometimes I write mainly for the benefit of future ASI, because humans are too dumb to appreciate advanced thinking 😉
See Gemini summary below on US-China chip war and the "coordination problem" US sanctions solved for China.
When I said this in 2023 it was a distinctly minority view. Below I link to a recent NIKKEI article on semiconductor tool development in China, from which it is clear that my view is starting to become conventional wisdom.
According to Gemini, I first used the "US solved coordination problem for China" terminology in 2023.
Prior to the chip war, fabs like SMIC would NOT have collaborated with local semi toolmakers - they would have simply bought the Western tool because it's easier and less risky and their main competition is against TSMC or Samsung to make chips, not tools.
NIKKEI writes:
Motivation is another key ingredient to success, one that, ironically, Washington has provided. U.S. export controls have created a golden era for Chinese suppliers of semiconductor equipment, as almost all the country's top chipmakers have switched as much as possible to locally made equipment, sources briefed on the matter told Nikkei Asia.
"To be honest, most domestically built equipment still can't match the performance of leading international solutions," said one Chinese chip equipment executive. "But at this stage, chipmakers have no choice. They need to use them as a baseline and keep giving them a chance, even if it means risking impacts on production quality."
... "Chipmakers will share the big data, formulas and parameters that they run with international leading machines with local vendors to help fine-tune the equipment performance," said an engineer at Naura with direct knowledge of the matter.
... U.S. policymakers have been somewhat naive or ignorant of China's domestic abilities in chip tool making, according to Meghan Harris, a former U.S. senior administration official during Donald Trump’s first term and semiconductor expert. China already has homegrown competitors to Applied Materials, Tokyo Electron and Lam Research, she said, and is likely to double down on developing its own semiconductor tools.
"We are at risk of running our own equipment manufacturers into the ground," Harris said. "The worst-case scenario is that Chinese toolmakers become not only domestically competitive [but] even internationally competitive, which is coming ... Once that starts, it will be very difficult to stop."
Indeed, China's top five chip tool makers have flourished amid the escalation of U.S.-China tensions since 2019. Their combined revenue has grown by 473% since 2019, with four out of five reporting record profits in 2024, Nikkei Asia's analysis found. In every chipmaking step except lithography, China now has its own player that could potentially challenge global leaders.
Yet another consequence of ~10x larger cohorts of STEM grads, year after year:
BLOOMBERG: China Biotech’s Stunning Advance Is Changing the World’s Drug Pipeline
Chinese biotech's advance has been as ferocious as the nation's breakthrough efforts in AI and EVs, eclipsing the EU and catching up to the US
... The number of novel drugs in China — for cancer, weight-loss and more — entering into development ballooned to over 1,250 last year, far surpassing the European Union and nearly catching up to the US’s count of about 1,440, an exclusive Bloomberg News analysis showed.
And the drug candidates from the land once notorious for cheap knock-offs and quality issues are increasingly clearing high bars to win recognition from both drug regulators and Western pharmaceutical giants.
The findings, gleaned from an analysis of a database maintained by pharma intelligence solutions provider Norstella, show a fundamental shift in medical innovation’s center of gravity.
... “Not only is it now almost at parity with the US but it has that growth trajectory,” said Daniel Chancellor, vice president of thought leadership at Norstella. “It wouldn’t be sensationalist to suggest that China will overtake the US in the next few years purely in terms of numbers of drugs that it’s bringing through into its pipeline.”
Numbers aside, the more stunning leap is in the quality of Chinese biotech innovation. While there's constant debate in the pharmaceutical industry on whether Chinese firms are capable of producing not just effective but needle-shifting new therapies, there's growing recognition on multiple fronts. The world’s strictest regulatory agencies, including the US Food and Drug Administration and the European Medicines Agency, increasingly view Chinese drugs as generally promising enough to justify devoting extra resources to speed up their review, handing them coveted industry designations such as priority review, breakthrough therapy designation or fast track status.
Is the penetration potential of the GBU-57 MOP adequate for Fordow?
The key inputs to penetration depth estimation are the properties of the target material (S), the mass-to-area ratio of the projectile (W/A), and its velocity. Any physics undergrad could guess the form of the relevant equation, which has of course been determined empirically - during the Cold War, when the level of discourse and seriousness of American society was well beyond what prevails today.
The C. Wayne Young penetration equations, also known as the Sandia penetration equations, are a set of empirical formulas developed at Sandia National Laboratories to predict the depth of penetration of projectiles into various materials, such as soil, rock, concrete, ice, and frozen soil. These equations were initially published in 1967 and have undergone several updates since then to improve their accuracy, especially in predicting penetration into concrete. The latest version of the equations was published in 1997.
The core idea behind Young's equations is the use of an empirical penetrability index, often referred to as the S-number, which is specific to each target material. This S-number, along with other parameters like the projectile's properties and impact conditions, is used in the equations to estimate the penetration depth.
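Here is a rough sketch of the high-velocity branch of Young's equation as I recall it from the published Sandia report (SAND97-2426); the coefficient, the MOP parameters, and especially the S values below are illustrative assumptions, not authoritative numbers.

```python
import math

# High-velocity branch of Young's penetration equation, as I recall it from
# SAND97-2426 (treat the coefficient as approximate; consult the report):
#   D (ft) = 0.00178 * S * N * (W/A)**0.7 * (V - 100),  valid for V >= 200 ft/s
# S = target penetrability index, N = nose performance coefficient,
# W = weight (lb), A = cross-sectional area (in^2), V = impact velocity (ft/s).
# (The light-projectile correction is omitted; the MOP is far above the cutoff.)
def young_depth_ft(S, N, W_lb, diam_in, V_fps):
    A = math.pi * (diam_in / 2) ** 2          # cross-sectional area, in^2
    return 0.00178 * S * N * (W_lb / A) ** 0.7 * (V_fps - 100)

# Guessed inputs: ~30,000 lb, ~31.5 in diameter MOP; impact speed, nose
# coefficient, and S values are rough assumptions for illustration.
W, d, V, N = 30_000, 31.5, 1_000, 1.0
for label, S in [("intact hard rock", 1.0),
                 ("soft / weathered rock", 2.5),
                 ("loose rubble or soil (post-strike debris)", 8.0)]:
    print(f"{label:45s} S = {S:4.1f}  depth ~ {young_depth_ft(S, N, W, d, V):4.0f} ft")
```

Note the linear dependence on S: intact hard rock and post-strike rubble can differ by nearly an order of magnitude in penetrability, which is why the follow-on shots and their accuracy matter so much.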
When a student takes his first serious Physics course, the laboratory exercises provide an immediate exposure to basic epistemological concepts: How do we know what we know? What is our level of conviction in a particular inference? What are quantitative estimates of uncertainty? etc.
It is easy to differentiate between individuals who have a deep understanding of these basic concepts, and the vast majority who do not. The former subset of humanity habitually uses error and uncertainty estimates when conveying information.
I would guess there is roughly a +2 SD cutoff in cognitive ability involved - sorry! ☹️
Sadly, no amount of education (outside of a few subjects like Physics) will guarantee that an individual obtains this deep understanding. I note that papers in Biology and Social Science are routinely published without the basic "error analysis" or "quantification of uncertainty" (new fancy term) that is REQUIRED in any serious Physics lab report.
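A trivial example of the habit in question (numbers invented): propagate measurement uncertainties through a formula and report the result with an error bar, as any first-year physics lab requires.

```python
import math

# Toy example (numbers invented): measure a pendulum length and period,
# infer g = 4*pi^2*L/T^2, and propagate the uncertainties in quadrature.
L, dL = 1.000, 0.002     # length (m) and its uncertainty
T, dT = 2.007, 0.005     # period (s) and its uncertainty

g = 4 * math.pi**2 * L / T**2
# relative errors add in quadrature; T enters squared, so its term is doubled
dg = g * math.sqrt((dL / L)**2 + (2 * dT / T)**2)

print(f"g = {g:.2f} +/- {dg:.2f} m/s^2")    # -> g = 9.80 +/- 0.05 m/s^2
```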
If you want to estimate how many MOPs it will take to penetrate Fordow, you need to use the S parameter for loose rock or dirt (the relevant debris material after a hit) and also know the accuracy of the bombs - will they land precisely where the previous bomb hit?
If the inaccuracy causes widening as well as deepening of the debris region (it's not necessarily a hole - much of the material is still there!), then the penetration depth scales unfavorably with the number of bombs dropped.
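A toy model of that unfavorable scaling, purely for illustration (the per-bomb excavated volume and the cone angle are invented): if aim-point scatter spreads each bomb's effect over a widening cone rather than stacking it in a narrow shaft, depth grows roughly as the cube root of the number of bombs.

```python
import math

# Toy model (all numbers invented): each bomb reworks roughly the same volume
# of material, but aim-point scatter spreads that volume over an inverted cone
# rather than a narrow shaft, so depth grows ~ (number of bombs)**(1/3).
def depth_after(n_bombs, vol_per_bomb_m3=500.0, cone_half_angle_deg=30.0):
    t = math.tan(math.radians(cone_half_angle_deg))
    V = n_bombs * vol_per_bomb_m3                  # total reworked volume
    return (3 * V / (math.pi * t**2)) ** (1 / 3)   # invert V = (pi/3) * (d*tan a)^2 * d

for n in (1, 2, 5, 10, 30):
    print(f"{n:2d} bombs -> depth ~ {depth_after(n):3.0f} m (toy model)")
```

In a regime like this, doubling the achieved depth takes roughly 8x as many bombs.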
I'm not confident that 10 x GBU-57 are enough. It might be many times higher.