Deep Learning and Neural Networks have become the default approaches to Machine Learning in recent years. However, despite their spectacular success in certain domains (vision and NLP in particular),
1/5
their use across the board for all ML problems and with all datasets is problematic, to say the least. Oftentimes better and more robust results can be obtained with simpler, easier to train and deploy, classical ML algorithms.
2/5
One such “traditional” approach was recently used to reevaluate sleep scoring on a few publicly available datasets. The results were published in the journal Biomedical Signal Processing and Control.
3/5
"Results show that competitive performance can be achieved with a conventional ML pipeline consisting of preprocessing, feature extraction, and a simple machine learning model. In particular, we analyze the performance of a linear model and a non-linear [GBT] model."
4/5
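To make the quoted pipeline concrete, here is a minimal sketch of the three stages (preprocessing, feature extraction, simple model) using scikit-learn. Everything specific in it is an assumption for illustration: the synthetic "epochs", the four summary features, and the helper names `extract_features`, `make_synthetic_epochs`, and `evaluate_models` are hypothetical and are not taken from the paper, which uses real sleep recordings and its own feature set.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler


def extract_features(epochs):
    """Map raw epochs (n_epochs, n_samples) to a few summary statistics each.

    These features are illustrative placeholders, not the paper's feature set.
    """
    epochs = np.asarray(epochs, dtype=float)
    return np.column_stack([
        epochs.mean(axis=1),
        epochs.std(axis=1),
        np.abs(np.diff(epochs, axis=1)).mean(axis=1),           # mean absolute first difference
        (np.diff(np.sign(epochs), axis=1) != 0).mean(axis=1),   # zero-crossing rate
    ])


def make_synthetic_epochs(n_epochs=300, n_samples=256, seed=0):
    """Synthetic stand-in for signal epochs (assumption: two classes only).

    Class 1 is a faster, larger oscillation than class 0, plus Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n_epochs)
    t = np.arange(n_samples) / n_samples
    freq = np.where(labels == 1, 12.0, 3.0)[:, None]
    amp = np.where(labels == 1, 2.0, 1.0)[:, None]
    signal = amp * np.sin(2 * np.pi * freq * t[None, :])
    noise = rng.normal(scale=0.5, size=(n_epochs, n_samples))
    return signal + noise, labels


def evaluate_models(seed=0):
    """Fit a linear and a gradient-boosted model on the same features."""
    X, y = make_synthetic_epochs(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )
    models = {
        "linear": LogisticRegression(max_iter=1000),
        "gbt": GradientBoostingClassifier(random_state=seed),
    }
    scores = {}
    for name, clf in models.items():
        # feature extraction -> scaling (preprocessing) -> simple model
        pipe = make_pipeline(FunctionTransformer(extract_features), StandardScaler(), clf)
        pipe.fit(X_train, y_train)
        scores[name] = pipe.score(X_test, y_test)  # held-out accuracy
    return scores
```

Calling `evaluate_models()` returns a dict of held-out accuracies for the two models. The numbers on this toy data say nothing about real sleep scoring; the point is only how little code the conventional pipeline needs.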
Article:
“Do not sleep on traditional machine learning: Simple and interpretable techniques are competitive to deep learning for sleep scoring.”
There was nothing that shocked me more when I entered the industry from academia than this kind of attitude. I came from an environment where teaching and learning were the norm to one where giving help to “underperformers” was viewed with disdain as a liability.
1/5
Fortunately not all organizations and managers are this cutthroat, but this kind of mindset is pervasive, especially at startups. There is a widespread attitude that *it’s someone else’s responsibility to do the educating*: yours, your previous job’s, your college’s etc.
2/5
And in some ways this is a *rational* attitude to have: there are hardly *any* incentives to help others get better, as this is almost never a part of your performance evaluation.
3/5
Last week @DeepMind’s research on AlphaCode - a competitive programming system - was published in Science. AlphaCode was able to beat 54% of humans in competitive coding challenges, putting it on par with many junior-level developers.
1/4
The original announcement from DeepMind came out in February, which in the fast-paced world of AI is already ancient history.
2/4
The explosive rise of generative AI over the past few months will most certainly have a major impact, if it hasn’t already, on future versions of AlphaCode and similar AI-enabled coding resources.
3/4
Last week @OpenAI released ChatGPT - a Large Language AI Model that interacts with users in a natural conversational way. The chatbot is able to answer complex questions, even in highly technically demanding categories.
1/7
It is also able to answer follow-up questions, backtrack on wrong assumptions, and provide other detailed resources, including code fragments.
2/7
Most people in tech consider this to be the greatest technological advancement of the year. Many of us consider it even more epochal, perhaps one of the biggest turning points in history.
3/7
PyTorch 2.0 is out! This major release brings many new features, but the main improvements are under the hood.
1/6
The three main principles behind PyTorch:
1. High-Performance eager execution
2. Pythonic internals
3. Good abstractions for Distributed, Autodiff, Data loading, Accelerators, etc.
PyTorch 2.0 is fully backward compatible with the previous versions of PyTorch.
2/6
The main new feature is torch.compile, "a feature that pushes PyTorch performance to new heights and starts the move for parts of PyTorch from C++ back into Python."
3/6
Decision-tree-based Machine Learning models are some of the best-performing algorithms in terms of predictive capability, especially on small and heterogeneous datasets.
1/4
They also provide an unparalleled level of interpretability compared to all other non-linear algorithms. However, they are very hard to optimize on von Neumann architecture machines due to their non-uniform memory access patterns.
2/4
In groundbreaking work published in Nature Communications, a team of researchers has shown that analog content-addressable memory (CAM) devices with in-memory calculation can dramatically accelerate tree-based model inference, by as much as a factor of 10^3 over conventional approaches.
3/4
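A rough way to see why trees are awkward on conventional hardware and why a content-addressable lookup helps: ordinary inference chases pointers, with each step depending on the previous comparison, whereas a tree can also be flattened into one row of per-feature ranges per root-to-leaf path, and every row can then be checked independently. The sketch below is a software illustration only. The row layout is my assumption about how such a lookup can be organized, not a description of the hardware in the paper, and `Node`, `tree_to_rows`, and the other names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    feature: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None    # taken when x[feature] <= threshold
    right: Optional["Node"] = None   # taken when x[feature] > threshold
    value: Optional[int] = None      # set only on leaves


Row = Tuple[List[Tuple[float, float]], int]  # (per-feature (low, high] ranges, leaf value)


def predict_by_traversal(root: Node, x: List[float]) -> int:
    """Conventional inference: a data-dependent walk from root to leaf."""
    node = root
    while node.value is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value


def tree_to_rows(root: Node, n_features: int) -> List[Row]:
    """Flatten every root-to-leaf path into one row of per-feature ranges."""
    rows: List[Row] = []

    def walk(node: Node, bounds: List[Tuple[float, float]]) -> None:
        if node.value is not None:
            rows.append((list(bounds), node.value))
            return
        lo, hi = bounds[node.feature]
        left_bounds = list(bounds)
        left_bounds[node.feature] = (lo, min(hi, node.threshold))
        walk(node.left, left_bounds)
        right_bounds = list(bounds)
        right_bounds[node.feature] = (max(lo, node.threshold), hi)
        walk(node.right, right_bounds)

    walk(root, [(-math.inf, math.inf)] * n_features)
    return rows


def predict_by_row_match(rows: List[Row], x: List[float]) -> int:
    """CAM-style lookup: each row is tested on its own; one row matches.

    This loop is sequential; the idea is that hardware could test all rows at once.
    """
    for bounds, value in rows:
        if all(lo < xi <= hi for (lo, hi), xi in zip(bounds, x)):
            return value
    raise ValueError("no row matched the input")


# A tiny hand-built tree over two features (illustrative values only).
example_tree = Node(
    feature=0, threshold=0.5,
    left=Node(value=0),
    right=Node(feature=1, threshold=2.0, left=Node(value=1), right=Node(value=2)),
)
example_rows = tree_to_rows(example_tree, n_features=2)
```

Both `predict_by_traversal(example_tree, x)` and `predict_by_row_match(example_rows, x)` give the same answer for any finite input `x`. The difference is that the row form has no dependency between comparisons, which is the property a parallel in-memory lookup can exploit.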
This past week I came across another paper that purports to get the SOTA for NNs for tabular data. Due to the extreme penchant for exaggeration in this community, I have given up on checking most of these claims, but decided to take a look at this particular work.
1/6
I decided to check how XGBoost *really* performs on the datasets used in the paper, and the results were not pretty.
2/6
The main takeaway: for all three datasets used in the paper, the reported performance of XGBoost was wildly inaccurate and the real performance was much better than their best results.
3/6