One of the most dangerous afflictions of data science teams is going really big for the sake of going really big. That is why everyone jumped on the Big Data bandwagon and ended up with little ROI to show for it.
Yes, Microsoft and Nvidia have the compute resources to go very big (i.e. 530B parameters), but that doesn't mean everyone else should do the same thing! microsoft.com/en-us/research…
What do you call the cognitive bias where you believe you cannot make good progress without the fastest, most advanced piece of hardware? This affliction affects so many technical endeavors. We all want to play with the F1 cars that everyone raves about.
The best people should have access to the best equipment; anything less is demotivating.
The odd thing about big data and clusters of GPUs is that you can never know when you have enough!
• • •
In Apple's rendition of Asimov's Foundation, the empire is ruled by three clones of the same original ruler. The clones are at different ages: the middle-aged clone rules, the elder serves as advisor, and the youngest is the successor.
In addition, there is a character, Demerzel, who serves the trio of clones. She has been ever-present with the clones and all their predecessors, from singing lullabies to them before their birth to sending them off to incineration at their death.
Demerzel is immortal because she's an android. In Asimov's Foundation universe there are no AIs, with this one exception. Apparently, an AI is always present, serving (or perhaps manipulating) the rulers of the civilization.
The academic community would like one to believe that a single AI training method can lead to a useful system. This belief is not even remotely true. Indeed, it gets you a published paper, but a useful product is very different from an academic paper.
A useful product is one that can be operated economically and addresses a user's needs at the correct price point. There are a multitude of knobs to tune here, and a multitude of methods with varying resource demands, latency, and accuracy.
A one-size-fits-all solution is a fantasy when it comes to products driven by AI methods. To deploy the right product requires a balance of many existing methods. This kind of balancing act is extremely difficult to do if we have tunnel vision of what methods are available.
When modern civilization voted away monarchies, we collectively sought to rid ourselves of leaders who were psychopaths. Yet here we are today.
When optimization is the primary driver of civilization, we structure our lives as if we were cogs in a great machine. As a consequence, our leadership treats people as if they were machines too.
We fear AI because it replaces us as cogs in the machinery. Thus we lose our relevance. Yet we cannot stomach the possibility that AI replaces our psychopathic leaders. Thus we would lose our agency.
If wealth implies having the luxury of time to engage in your passionate interests, then why don't wealthy intellectuals hire tutors so they can understand complex subjects faster?
The strange thing is that consultants are usually hired so the wealthy do not have to think about certain things (for example, financial planning), but rarely are they hired to help someone think better.
Yet it is not unusual for the wealthy to hire personal trainers because it's common sense that you can't outsource your physical exercise. So why aren't there personal 'cognitive' trainers?
Finally a credible mathematical framework for understanding how to build deep learning architectures for different problem domains. @mmbronstein
The 5 G's of Geometric Deep Learning: grids, groups, graphs, geodesics, and gauges.
Now every Deep Learning practitioner needs to include groups, geodesics, and gauge invariances in their working vocabulary. Our brains are about to explode!
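To make "groups" concrete: the core idea of geometric deep learning is that useful layers respect a symmetry group of the data. A minimal sketch (my own illustration, not from the thread or the paper) is translation equivariance of convolution: shifting the input and then convolving gives the same result as convolving and then shifting. Here is a hedged NumPy example using circular convolution, where translations are cyclic shifts:

```python
import numpy as np

def circular_conv1d(signal, kernel):
    """Circular 1-D convolution: output[i] = sum_j signal[(i - j) % n] * kernel[j]."""
    n = len(signal)
    return np.array([
        sum(signal[(i - j) % n] * kernel[j] for j in range(len(kernel)))
        for i in range(n)
    ])

signal = np.array([1.0, 2.0, 3.0, 4.0, 0.0])
kernel = np.array([0.5, 0.25, 0.25])
shift = 2  # a group element: cyclic translation by 2

# Equivariance: conv(shift(x)) == shift(conv(x))
lhs = circular_conv1d(np.roll(signal, shift), kernel)
rhs = np.roll(circular_conv1d(signal, kernel), shift)
assert np.allclose(lhs, rhs)
```

The same template, with the translation group swapped for permutations (graphs) or rotations (spheres), is what the framework systematizes.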
The number of papers is an indication of interest, not of impact. Indeed, more people working on the same problem can generate more ideas. But more ideas do not necessarily generate more impactful ideas when they are constrained by groupthink.
What is driving the interest in deep learning is, of course, its phenomenal success, which leads to more funding and more advanced tools. But there are diminishing returns in every field as the low-hanging fruit is picked.
As in any field, the early adopters are rewarded disproportionately more than the latecomers. Unfortunately, human bias leads us to recognize the pioneers far more than those who follow.