1\ While I have criticized many facets of applying AI in trading, the appeals fall on deaf ears. If you are going to use it anyway, let’s do it correctly.
A deeper look at one of the most common pitfalls of “AI quants”.

THREAD; The Curse of Dimensionality.
2\ In an earlier thread, I argued for the value of synthetic data where datasets are sparse, for example in the crypto space. This technique is particularly important when we consider applications of statistical learning algorithms.
3\ The favourite garbage AI model is one where people just spam 10 different features into a neural network to get some price signals. Let’s argue why this often fails.
4\ Let’s first familiarize ourselves with a term: the hyperspace. When a model is parameterized by a single value, that value lies on a line (univariate). Bivariate models are parameterized on a plane, and trivariate models in three-dimensional space.
5\ Now, when we extend this to higher (n) dimensions on real values, we obtain an n-dimensional “Euclidean space”. In search problem settings, we often call this the “hyperspace”.
6\ So, each set of values for the 10 features you selected for your AI algorithm exists as a point in the hyperspace, and your algorithm searches this hyperspace for potential alpha. Now that we understand our search space and objective, we can discuss the curse.
7\ Let’s pretend that our features are independent and identically distributed; for the arithmetic that follows, take them to be uniform over their range. Let’s also assume that our AI model makes predictions based on local regions of the training data.
8\ To illustrate, let’s assume that a training input is used as a reference only if it lies within a window around the test input that spans 10% of the range along every dimension.
9\ Case: 1 Feature. When we have a single dimension, 10% of the training samples are used for an arbitrary test input. If we have 1,000 training samples and we want to make a prediction, that prediction is based on about 100 of them.
10\ Case: 2 Features. When we have 2-dimensional features, 1% of the training samples are used for an arbitrary test input. This is because only 10% of the training data are near the test input along the first axis, and only 10% of those are also near it along the second: 10% × 10% = 1%.
11\ If we have 1,000 training samples and we want to make a prediction, that prediction is based on about 10 of them. Our prediction has significantly more noise. With 3 features, the prediction depends on just one training sample (0.1% of 1,000).
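The three cases above can be checked numerically. This is a minimal sketch, assuming independent features that are uniform on [0, 1) and a test point placed at the centre of the unit cube so the 10% window is never clipped by the boundary. The function names and the `WINDOW` constant are illustrative and do not come from the thread.

```python
import random

WINDOW = 0.10  # share of each feature's range treated as "nearby" (the 10% in the thread)


def expected_neighbours(n_train, n_features, window=WINDOW):
    """Expected number of training points in the local region,
    assuming independent features that are uniform over their range."""
    return n_train * window ** n_features


def simulated_neighbours(n_train, n_features, window=WINDOW, seed=0):
    """Count uniform random training points that fall inside the window
    around a test point at the centre of the unit cube."""
    rng = random.Random(seed)
    half = window / 2
    count = 0
    for _ in range(n_train):
        # A point counts only if it is close to the test point on every axis.
        if all(abs(rng.random() - 0.5) <= half for _ in range(n_features)):
            count += 1
    return count


if __name__ == "__main__":
    for d in (1, 2, 3, 10):
        print(d, round(expected_neighbours(1000, d), 6), simulated_neighbours(1000, d))
```

The simulated counts fluctuate around the expected values from run to run (change `seed` to see this), but the collapse from roughly 100 neighbours to roughly 1 as features are added is the point.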
12\ As the number of features increases, our learning algorithms require significantly more training data to produce useful results. Different algorithms respond differently to such dimensionality issues.
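To put a number on "significantly more", one can invert the same arithmetic: how many training samples are needed to keep about 100 neighbours in the local region? A rough sketch under the same uniform, independent assumptions; `required_training_size` and its parameters are hypothetical names, and the 10% window is expressed as 10 equal slices per axis so the arithmetic stays in exact integers.

```python
def required_training_size(n_features, target_neighbours=100, slices_per_axis=10):
    """Training samples needed so that, on average, `target_neighbours`
    points fall in a local region covering 1/slices_per_axis of each axis.

    Assumes independent, uniformly distributed features.
    """
    return target_neighbours * slices_per_axis ** n_features


if __name__ == "__main__":
    for d in (1, 2, 3, 10):
        print(f"{d} features -> {required_training_size(d):,} samples")
```

The requirement is multiplied by 10 for every added feature, so under these assumptions a 10-feature model would want on the order of a trillion samples to keep the same local density that 1,000 samples give with one feature.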
13\ Along with choosing the right algorithm, dimensionality reduction techniques, feature selection, eigensystem analysis and synthetic data are just some of the cures for this apparent curse, which is the demise of those who insist on intelligent trading agents.
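As one illustration of the eigensystem idea, the sketch below builds three synthetic features that are all noisy copies of a single hidden driver, then uses power iteration on their covariance matrix to find the dominant direction. It is a toy example under stated assumptions, not the author's method: the data is made up, all function names are hypothetical, and in practice a numerical library would replace the hand-rolled linear algebra.

```python
import random


def make_redundant_features(n_samples=500, seed=0):
    """Synthetic data: three features that are noisy copies of one latent driver."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        base = rng.gauss(0.0, 1.0)
        rows.append([base + rng.gauss(0.0, 0.1) for _ in range(3)])
    return rows


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length feature rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(matrix, vec):
    """Multiply a square matrix by a vector."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]


def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector via power iteration."""
    d = len(matrix)
    vec = [1.0 / d ** 0.5] * d
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    # The Rayleigh quotient gives the eigenvalue for the converged unit vector.
    value = sum(v * m for v, m in zip(vec, mat_vec(matrix, vec)))
    return value, vec


def explained_share(matrix, eigenvalue):
    """Share of total variance (the matrix trace) captured by one eigenvalue."""
    return eigenvalue / sum(matrix[i][i] for i in range(len(matrix)))


if __name__ == "__main__":
    cov = covariance_matrix(make_redundant_features())
    value, vec = top_eigenpair(cov)
    print("dominant direction:", [round(x, 3) for x in vec])
    print("share of variance:", round(explained_share(cov, value), 4))
```

When one direction carries nearly all of the variance, as it does in this constructed data, the three features can be replaced by a single projected feature, which shrinks the search space from three dimensions to one.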

FIN

