As anyone with even a cursory knowledge of AI history knows, there have been several AI Winters, periods of cooling interest (and a drop in funding) in AI research. All of these came about after the realization that the then-dominant AI paradigms were somehow limited. 1/7
For at least a decade now we have been enjoying an unprecedented AI Springtime. A perfect storm of major advances in algorithms (deep learning), computational architecture (GPUs) and the availability of large, high-quality datasets has enabled the field to grow - exponentially! 2/7
However, in the real world there is no such thing as endless exponential growth. What may seem like an exponential curve inevitably turns out to be the fast-rising part of a logistic curve. It is hard to speculate when the fastest part of the growth will end, though. 3/7
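A quick way to see the point about logistic curves is a small sketch like the one below. The starting value `x0`, growth rate `r` and ceiling `K` are illustrative assumptions, not measurements of anything in AI; the only claim is that the two curves are nearly indistinguishable early on and then part ways.

```python
import math

def exponential(t, x0=1.0, r=1.0):
    """Unbounded exponential growth starting at x0 with rate r."""
    return x0 * math.exp(r * t)

def logistic(t, x0=1.0, r=1.0, K=1000.0):
    """Logistic growth with the same start and initial rate, capped at K.

    K is a hypothetical ceiling chosen only for illustration.
    """
    return K / (1.0 + (K / x0 - 1.0) * math.exp(-r * t))

if __name__ == "__main__":
    # Early on the two columns track each other; later the logistic one flattens.
    for t in (0, 1, 2, 4, 8, 12, 20):
        print(t, round(exponential(t), 1), round(logistic(t), 1))
```

From inside the early part of the curve there is little to tell the two apart, which is why it is hard to say when the fast growth will end.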
Deep Learning architectures seem to be getting close to their optimum, but there still might be some surprises ahead. We are getting near to what is physically possible with the current computational architectures, 4/7
but we are only scratching the surface of what massive parallelization can achieve. Once your datasets contain all of human knowledge, it's hard to imagine where else you can find more data to feed your algorithms with. 5/7
We are still well in the "exploitation" part of the current paradigm, and there are still many gains to be made. But we should keep our eyes open to the alternatives, and devote at least a fraction of our attention, funding and other resources to explore them. 6/7
Because it is very likely that this wave of AI is not the final one, and a new AI Winter may arrive at some point. Maybe even much sooner than most people realize. 7/7
• • •
You gotta have options - that's the line that a jewelry salesman once used on my wife, and has become an inside joke in our family. However, that sales line is a very good consideration to have in all sorts of life situations.
I endured some of the biggest setbacks in my life when I found myself in situations where I had just a few bad options, or even worse, just one terrible one. Over the years I found myself unconsciously working to maximize the number of options that I had. 2/
There are many ways that you can increase your optionality, and most of them don't require you to have access to outsize resources. 3/
A few weeks ago I came across a tweet by a prominent ML/AI developer and researcher that promoted a new post about the use of transformer-based neural networks for tabular data classification.
The post was on Keras’ official site, and it seemed like a good opportunity to learn how to build transformers with Keras, something that I’ve been meaning to do for a while. 2/27
However, one part of the post and the tweet bothered me. It claimed that the model matched “the performance of tree-based ensemble models.” As those who know me well know, I am pretty bullish on the tree-based ensemble models, 3/27
The current issue of @Nature has three articles that show how to make those error-correcting mechanisms achieve over 99% accuracy, which would make silicon-based qubits a viable option for the large-scale quantum computational devices.
I've worked for 4 different tech companies in various Data Science roles. For my day job I have never ever had to deal with text, audio, video, or image data. 1/4
Based on the informal conversations I've had with other data scientists, this seems to be the case for the vast majority of them. 2/4
Almost a year later this remains largely true: for the *core job* related DS/ML work, I have still not used any of the aforementioned data. However, for work-related/affiliated *research* I have worked with lots of text data. 3/4
2/ A year ago I was approached with a unique and exciting opportunity: I was asked to help out with setting up a Kaggle Open Vaccine competition, where the goal would be to come up with a Machine Learning model for the stability of RNA molecules.
3/ This is of pressing importance for the development of mRNA vaccines. The task seemed a bit daunting, since I have had no prior experience with RNA or Biophysics, but I wanted to help out any way I could.
One of the unfortunate consequences of Kaggle's inability to host tabular data competitions any more will be that the fine art of feature engineering will slowly fade away. Feature engineering is rarely, if ever, covered in ML courses and textbooks. 1/
There is very little formal research on it, especially on how to come up with domain-specific nontrivial features. These features are often far more important for all aspects of the modeling pipeline than improved algorithms. 2/
I certainly would have never realized any of this were it not for tabular Kaggle competitions. There, over many years, a community treasure trove of incredible tricks and insights had accumulated. Most of them unique. 3/