zdnet.com/article/google…
Another idea is a battery of cognitive tests, like an AIQ score, but that's also gameable.
Ideally we want a ratio-scale variable like transistor count. Maybe neural network size?
I'd refresh a regularly updated dashboard with all benchmarks like this. Maybe rise of the machines dot com?
gigazine.net/gsc_news/en/20…
Score each benchmark as 0.1X as good as a human, 1X as good, 10X as good, and so on.
To aggregate you might say "in 2017, AI was >1X human in 42% of benchmarks."
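A rough sketch of how that scoring and aggregation could work, in Python. The benchmark names and numbers are invented placeholders, not real results, and the helper names (`human_ratio`, `bucket`, `share_above_human`) are hypothetical. It assumes every benchmark has a positive score where higher is better and a known human baseline.

```python
import math

# Hypothetical rows: (benchmark name, AI score, human baseline score).
# All values are made up for illustration; higher is assumed to be better.
SCORES = [
    ("benchmark_a", 50.0, 100.0),
    ("benchmark_b", 120.0, 100.0),
    ("benchmark_c", 5.0, 100.0),
    ("benchmark_d", 1500.0, 100.0),
]


def human_ratio(ai_score, human_score):
    """AI performance as a multiple of human performance (1.0 means parity)."""
    return ai_score / human_score


def bucket(ratio):
    """Round a positive ratio down to a power of ten: 0.1X, 1X, 10X, and so on."""
    return 10.0 ** math.floor(math.log10(ratio))


def share_above_human(rows):
    """Fraction of benchmarks where the AI scores strictly above the human baseline."""
    above = sum(1 for _, ai, human in rows if human_ratio(ai, human) > 1.0)
    return above / len(rows)


if __name__ == "__main__":
    for name, ai, human in SCORES:
        print(f"{name}: {bucket(human_ratio(ai, human)):g}X human")
    print(f"AI was >1X human in {share_above_human(SCORES):.0%} of benchmarks")
```

Bucketing by order of magnitude throws away detail on purpose: it keeps the dashboard readable and makes small scoring disputes on any single benchmark matter less.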