Test X predicts achievement Y in school A. Peter scores x on that test, and is therefore denied access to A. How unfair can that be? (Hint: predictive validity is a characteristic of test × population × situation; the test score belongs to one particular human being.)
The case of admission to the Civil Service: Francis Y. Edgeworth (1888). The statistics of examinations. ‘Journal of the Royal Statistical Society, 51’ 599-635. jstor.org/stable/2339898 The exam is a lottery offering better chances to the better prepared, which is not unfair.
On lotteries in education, see benwilbrink.nl/projecten/lott… In the Netherlands, admission to studies with a restricted number of slots was regulated by weighted lottery from 1975 to 2017 (since then selection is by whatever instruments are deemed opportune, lotteries being outlawed!)
Another way to express the random component in testing goes this way: the cut-off (pass-fail) score x is the score where the decision maker (teacher; school; institution) is indifferent between passing and failing students scoring x. Read this again.
Indifferent: the expected utility of a pass equals that of a fail. Whose utility? That of the institution (or of the teacher acting on its behalf). For technicalities see benwilbrink.nl/publicaties/80… [in Dutch]. Expected utility is a group statistic. Is it fair to fail Peter scoring x?
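A minimal sketch of that indifference point, in Python. The four utility values and the function names are illustrative assumptions, not taken from the cited paper; the only thing carried over from the text is the condition that the expected utility of a pass equals that of a fail.

```python
def expected_utility(p_success, u_if_success, u_if_failure):
    """Expected utility of one decision for the GROUP of students with a given score."""
    return p_success * u_if_success + (1 - p_success) * u_if_failure


def indifference_probability(u_pass_success, u_pass_failure,
                             u_fail_success, u_fail_failure):
    """Success probability at which passing and failing have equal expected utility."""
    numerator = u_fail_failure - u_pass_failure
    denominator = (u_pass_success - u_pass_failure) - (u_fail_success - u_fail_failure)
    return numerator / denominator


# Hypothetical utilities for the institution (assumptions, not data):
# passing a student who later succeeds is worth +1, passing one who later
# fails costs -1, and rejecting a student is valued at 0 either way.
p_star = indifference_probability(1.0, -1.0, 0.0, 0.0)
eu_pass = expected_utility(p_star, 1.0, -1.0)
eu_fail = expected_utility(p_star, 0.0, 0.0)
```

The cut-off score x is then the score at which the proportion of later successes among students scoring x equals p_star. That proportion describes the group; it does not say on which side of it Peter falls.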
The law (written as well as unwritten) covers the rights and duties of schools and individual pupils. For the Netherlands see [in Dutch] C. W. Noorlander 2005 ‘Recht doen aan leerlingen en ouders’ wolfpublishers.com/book.php?id=291; M. Job Cohen 1981 ‘Studierechten’ benwilbrink.nl/projecten/toet…
Flunking grades. Dutch schools have detailed regulations (more detailed than just a GPA) on the grades that let pupils pass to the next class. Peter is 0.1 point short. The team decides to flunk him, without giving further reasons. Correct? Noorlander: ben-wilbrink.nl/Noorlander14.5…
The Noorlander quote is in Dutch. The main point of Dutch law here is: schools have a best-efforts obligation, while pupils themselves are responsible for their own achievements or lack thereof. WTF? I'll have to research its historical roots (sometime, not now ;-).
There is a serious inconsistency between attributing responsibility primarily to pupils, and yet flunking them without giving pupils (and their parents) a voice. Whose fault is it that Peter has to repeat the class? The decision was made by an algorithm (the school's regulations), not by Peter.
Dutch schools are free to choose their own regulations. However, retaining pupils must not depend only on the letter of the regulations (= the algorithm). Every individual decision needs a statement of reasons from the school team that is sufficient to justify it. Case: onderwijsgeschillen.nl/uitspraken/030…
Case law on class retention (Dutch): onderwijsgeschillen.nl/thema/bevorder… Search for case law in your own country, e.g. ‘Who Grades Students? Some Legal Cases, Some Best Practices’ umich.edu/~aaupum/Euben.…
It is a general principle of justice that drastic decisions (failing pupils) be justified on substantive grounds (i.e., merely following a rule is not a justification), after both sides have been heard. Even in seemingly evident cases, just applying the rule might be too harsh, given personal circumstances.
Whether or not the team has discussed Peter's case, if he is told that he has to repeat the class just because he is 0.1 point short, then Peter has been done an injustice. [The effectiveness of grade repetition itself is another question (it is not effective).]
A difference of 0.1 point (in whatever metric) cannot justify treating two persons differently: failing the one, passing the other. A. D. de Groot understood that much, but it was a paradox for him. How else can this kind of decision be justified? He had no answer.
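One way to see why, sketched in Python under a classical measurement-error assumption that is not in the thread itself: observed grade = true level + normally distributed error. The pass mark of 6.0 and the standard error of 0.5 grade points are made-up figures for illustration.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_observed_below_cutoff(true_score, cutoff, sem):
    """Chance that a pupil with this true score obtains an observed score below the cut-off."""
    return normal_cdf((cutoff - true_score) / sem)


CUTOFF = 6.0  # hypothetical pass mark
SEM = 0.5     # hypothetical standard error of measurement

p_fail_just_above = prob_observed_below_cutoff(CUTOFF + 0.1, CUTOFF, SEM)
p_fail_just_below = prob_observed_below_cutoff(CUTOFF - 0.1, CUTOFF, SEM)
```

Under these assumptions a pupil whose true level is 0.1 above the cut-off still has roughly a 42% chance of scoring below it, and a pupil 0.1 below has roughly a 42% chance of scoring above it. The observed 0.1 difference barely separates the two.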
Observe that this case is not the same as that of Edgeworth’s Civil Service examination. Here Peter is denied an educational opportunity (if he and his parents think so) on the basis of arbitrariness. The educational/human rights of Peter are in the balance now.
Just in case you think all this is just theory: a case hanging on 1 point in a secondary school examination is now being considered by the Supreme Court of the Netherlands (for Dutch readers, info: benwilbrink.wordpress.com/2017/09/20/exa… ).
The curious thing about this examination: one runs the risk of failing it. Why do we accept the fuss and enormous costs of failing pupils on their exam? After all: they get a diploma and a listing of the grades received. That should be enough! Proposal: telegraaf.nl/watuzegt/64151…
The proposal is nothing new: in the Middle Ages students were admitted to an examination as soon as their master deemed them ready. Our PhD defence (the Dutch ‘promotie’) is an exam one can’t fail. A history of assessment (in English): benwilbrink.nl/publicaties/97… Grading = ranking, really!
A treasury of insights is Cronbach & Gleser's classic 1965 (2nd ed.) ‘Psychological tests and personnel decisions’ (reviewed: journals.uchicago.edu/doi/abs/10.108…). Main distinction: testing that serves the interests of employers (mainstream psychometrics) versus testing that serves those of patients, clients, students (advisory).
Find Cronbach & Gleser on abebooks.com Another treasury is Hanson 1993 ‘Testing testing. Social consequences of the examined life’ about abuse of tests (crushing the individual) by employers, institutions, schools. Open access publishing.cdlib.org/ucpressebooks/… Important case:
The lie detector. It will label more testees as liars than there are true liars. That should be an important lesson for everyone using psychological tests or having to sit them. Representatives of the American Psychological Association explained it in testimony:
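The arithmetic behind that claim can be sketched in Python. The base rate and the two accuracy figures below are invented for illustration; they are not taken from the testimony or from Hanson.

```python
def screening_outcome(n_tested, base_rate, sensitivity, specificity):
    """Expected counts when a fallible test screens a group with few true liars."""
    liars = n_tested * base_rate
    truthful = n_tested - liars
    true_positives = liars * sensitivity             # liars correctly flagged
    false_positives = truthful * (1 - specificity)   # truthful people flagged anyway
    flagged = true_positives + false_positives
    return {
        "liars": liars,
        "flagged": flagged,
        "share_of_flagged_who_lie": true_positives / flagged,
    }


# Hypothetical numbers: 1000 testees, 5% liars, a test that is right 90% of
# the time for liars and for truthful people alike.
result = screening_outcome(1000, 0.05, 0.90, 0.90)
```

With these assumed figures about 140 people are flagged although only 50 are lying, and roughly two out of three flagged testees are truthful. The label attached to any one of them rests on the group rates, not on anything known about that person.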
General case: “Collective statistical illiteracy refers to the widespread inability to understand the meaning of numbers.” E.g. in medical diagnosis (essentially identical to assessment in education!): Gigerenzer et al. 2008 ‘Making sense of health statistics’ stat.berkeley.edu/~aldous/157/Pa…
Tests and assessments are forms of institutional violence, never mind the good intentions expressed by stakeholders. Violence may be justified, like surgery ;-) Or not. Whatever the case, there should not be any secrecy about tests and exams researchgate.net/publication/23…
Documentation on test secrecy in the Netherlands (in Dutch): benwilbrink.nl/projecten/gehe… A special case was the #rekentoets (math test, now discontinued), kept secret ‘in the national interest’ (yes, you read that correctly!). Crushing the individual examinee is no problem, then ...
In the Netherlands most children in primary education get tested a number of times every school year: standardized aptitude tests, reported in terms of the percentile group the pupil scores in. Some schools stream children based on those results. Fair? No way.
Telling a child repeatedly that it belongs to the 10% least clever kids in the country is nothing less than psychological abuse, a serious breach of the child’s right to a quality education. Schools should only test with permission of the parents. justiceinschools.org/opting-out-sta…
The problem is labeling the individual child on the basis of group statistics. Remember the lie detector case?

Nobody has a right to be admitted to the Civil Service: an examination is fair.

Children’s right to a quality education has been agreed upon in international treaties.
Next I’ll try to disentangle the thinking of some behaviour geneticists who hold that the heritability of years of schooling (a population statistic) implies that school curricula should be individualised (given each pupil’s genome), using the debate on Plomin’s ‘Blueprint’. Will take some time.
As a warm-up: today’s blog post in Quillette on correlation ≠ causation quillette.com/2019/07/30/the… (via @RichardPPhelps)
For the behaviour genetics of cognition (IQ), see e.g. Briley & Tucker-Drob 2013 ‘Explaining the Increasing Heritability of Cognitive Ability Across Development. A Meta-Analysis of Longitudinal Twin and Adoption Studies’ jstor.org/stable/2348467… Lots of ‘influences’ in this paper,
yet the only data available are correlations. Stick to correlations. “In early childhood, increasing genetic influences on cognitive ability can be attributed to innovative genetic influences.” Influences on = correlations with. Cognitive ability = DIFFERENCES in cognitive ability.
The data are not experimental, only correlational. Be warned: even highly heritable traits like weight are highly malleable. So is IQ: the Flynn effect! Stop educating youth and the country’s human capital will drop to archaic lows. So, don’t jump to easy conclusions!
Heritability of intelligence, or of length of schooling, is not a characteristic of intelligence (or length of schooling) per se, only of intelligence in our contemporary society. Health care influences IQ heritability; better health care raises IQ heritability. Really!
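The last two points (a highly heritable trait can still be malleable; heritability depends on the society it is measured in) can be sketched in Python with the textbook variance-ratio definition, h² = V_G / (V_G + V_E). This is a deliberately simple model, with no gene-environment correlation or interaction, and all figures are invented for illustration.

```python
def heritability(var_genetic, var_environment):
    """Share of trait variance in a population that goes with genetic differences."""
    return var_genetic / (var_genetic + var_environment)


# Hypothetical society with unequal health care: large environmental variance.
h2_unequal_care = heritability(var_genetic=1.0, var_environment=1.0)

# Same genetic variance, but good health care for everyone removes much of the
# environmental variance, so the heritability ratio goes up.
h2_equal_care = heritability(var_genetic=1.0, var_environment=0.25)


def population_mean(genetic_mean, environment_mean):
    """The average level of the trait; it does not appear in the ratio above."""
    return genetic_mean + environment_mean


# Raising everyone's environment by the same amount (schooling, nutrition)
# lifts the mean while both variances, and hence heritability, stay the same.
mean_before = population_mean(100.0, 0.0)
mean_after = population_mean(100.0, 10.0)
```

In this sketch heritability rises from 0.5 to 0.8 purely because the environment became more equal, and the population mean rises by 10 points without heritability entering the calculation at all. The ratio is about differences within a population, not about levels and not about any one pupil.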
Plomin’s ‘Blueprint’: penguin.co.uk/articles/2018/…
[p. viii] “The power of genetic research comes from its ability to detect the effect of these inherited DNA differences on psychological traits without knowing anything about the intervening processes.” This is extremely problematic:
(1) ‘Without knowing anything about the intervening processes’ [#blackbox] claims about causality (‘effects’) are idle, and therefore dangerous.
(2) Talk about differences without considering absolute levels is misleading. (#contingent on societal institutions: NHS, schools).
Plomin, the Penguin page: “The evidence for the importance of genetics itself calls for a radical rethink about parenting, and education, and society.” This man is dangerous. What’ll happen to the six-year-old in primary school, having had his genome screened in the Plomin way?
Let me look at commentaries on Plomin’s extreme position by some of his colleagues. E.g. this one on causation: Eric Turkheimer (July 6, 2019). Behavior causes genes!

[I once asked Denny Borsboom how a DIFFERENCE can be a cause. Surprise!]
Turkheimer refers in his blog to a paper on causation that’ll serve to defuse Plomin’s simplification that (differences in) years of education, intelligence and depression are ‘caused’ by genes: Craver & Bechtel, ‘Top-down causation without top-down causes’ pages.wustl.edu/files/pages/im…
Turkheimer, on Plomin and his ‘Blueprint’: “But overstating the science of human behavioral genetics comes with the greatest price imaginable: it encroaches on human freedom and justice.” [Turkheimer, 2019, ‘The social science blues’ people.virginia.edu/~ent3c/papers2… ]
In another blog, June 11 2019 geneticshumanagency.org/gha/causation-… Turkheimer formulates an answer to the problem posed at the start of this thread: is it fair to judge and treat an *individual* pupil (given her IQ, or DNA) on the basis of *population* statistics? It isn’t. Look:
