“easy for humans, hard for ai” is not a solid design principle for evals imo
it leads you towards “judging a fish by how far it can climb a tree” absurdities
but maybe it’s one orthogonal eval style among many equally important ones
specifically, arc agi visual tasks look like nonsense in JSON format, and multimodality isn’t good enough yet to take them in visually
and the character manipulation tasks don’t work for the same reason models mess up the how many “r”s in strawberry problem (tokenization/BPE)
it’s almost adversarially constructed wrt input modalities. for a model to solve these requires far more intelligence than the equivalent human score would suggest
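the tokenization point above can be sketched with a toy example. this is not a real tokenizer, and the merge table is invented; it just illustrates that a model receives subword tokens, not letters, so counting the “r”s in “strawberry” means reasoning about spelling it never directly sees:

```python
# toy illustration (NOT a real BPE tokenizer): the merge table is invented
# to show how subword tokenization hides character boundaries from a model.
MERGES = {"strawberry": ["straw", "berry"], "banana": ["ban", "ana"]}

def toy_tokenize(word):
    """split a word into subword tokens using the invented merge table."""
    return MERGES.get(word, [word])

tokens = toy_tokenize("strawberry")
print(tokens)  # ['straw', 'berry']

# a human counts r's over visible letters; a model sees opaque token ids
# and must recall that "straw" hides one r and "berry" hides two.
print(sum(t.count("r") for t in tokens))  # 3
```

the same mismatch applies to reversal, anagram, and letter-position tasks: trivial over characters, adversarial over tokens.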
people are mostly wrong about psyops and information warfare. you can bet your bottom dollar that the boomer spooks are not great at manipulating online opinion. they lost control long ago
now it’s not necessarily true that the successor (Rome Is The Mob) is any better but you must let slip your illusions of control
the most skilled person you know at social media is in command like 5% of days. much less professional mossad kgb spooks. creating a Russian botnet or whatever doesn’t matter; you can only say the things that people already want to hear
the entropy of twitter has decreased. the slop to life ratio has gone up. the gini coefficient has increased. there are fewer posts that get lots of attention and many posts that get little attention.
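the gini claim is easy to make concrete. a minimal sketch, with invented attention numbers (likes/views per post), using the standard mean-difference formula for the gini coefficient:

```python
# minimal sketch: gini coefficient of "attention" across posts.
# the sample numbers below are invented for illustration.
def gini(values):
    """gini coefficient via the sorted mean-difference formula."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    # sum over (2i - n - 1) * x_i, i starting at 1, normalized by n * total
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)

even_attention = [100, 100, 100, 100]   # attention spread evenly
skewed_attention = [1, 1, 1, 397]       # one viral post, the rest ignored

print(gini(even_attention))    # 0.0
print(gini(skewed_attention))  # 0.7425
```

higher gini means attention is more concentrated in fewer posts, which is exactly the “fewer posts get lots, many posts get little” shape.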
when you prioritize engagement on any platform, viral memes, self help slop, linkedin "insight" threads, dating content, celebrity pics, porn replies, etc take over
human nature has clearly changed over time:
- people are vastly smarter thanks to better nutrition and abstract language environments
- people are less cruel and violent due to their distance from warfare
- exiting malthusian poverty removes many foundational traumas
even if it’s true that it’s the same animal under all that, what does that matter? aren’t there fundamental differences between feral children and normal ones? the life trajectory of their psychology isn’t a triviality but an essential feature
the average woman, were she to survive to adulthood and marriage, would have seen 2 of her 4 kids die. the average man was probably torturing animals to cope with the brutality of life. how can fixing that not change the mass psychology of your civilization?
one of the least examined most dogmatically accepted things that smart people seem to universally believe is that ad tech is bad and that optimizing for engagement is bad
on the contrary ad tech has been the single greatest way to democratize the most technologically advanced platforms on the internet and optimizing for engagement has been an invaluable tool for improving global utility
it’s trivially true that overoptimizing for engagement will become goodharted and lead to bad dystopian outcomes. this is true of any metric you can pick. this is a problem with metrics not with engagement.
i don’t understand the “enough people would survive that it’s ok” argument. no, i think if industrial civilization as we know it ends there’s no coming back. the whole astronomical waste thesis comes to an incorrect conclusion
the easy oil in the earth is tapped; the cost of new oil is only made feasible by a capital buildup of advanced technology. if you were an early industrializing civilization starting to mine oil, coal, and gas today, the initial ROI would be infeasibly low
it is very possible that this entire experiment ends in the next century if we don’t play our cards right. existential risk is everywhere and everpresent. intelligent life in this galaxy (or even on this planet) is clearly not abundant across the stretch of time