1\13 One of the biggest challenges in quant and alpha research is obtaining clean, error-free data. Models need to be built using assumptions that reduce the “dimensions” of reality for tractability.

A THREAD ▼ ▼ ▼
Machine Learning, Sparse Datasets and Error-Free Simulations
2\ Mathematical models are, by definition, built to simulate and capture some phenomenon, practical or abstract. Often they are built on data, which are themselves drawn from some unknown statistical distribution.
3\ For example, an alpha model is backtested on data where prices/returns are drawn from some distribution. In quantitative finance, prices are widely assumed to follow a log-normal distribution (equivalently, log-returns are normally distributed).
4\ However, there is extensive literature suggesting that empirical returns do not fit this assumption. In particular, asset returns display autocorrelation, while the random walk hypothesis assumes that returns are independently sampled.
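To make the assumption concrete, here is a minimal sketch of what the random walk hypothesis implies: i.i.d. Gaussian log-returns, log-normal prices, and a lag-1 autocorrelation close to zero. The drift, volatility, and starting price are illustrative assumptions, not values from this thread.

```python
import numpy as np

def simulate_gbm_log_returns(n_steps, mu=0.05, sigma=0.2, dt=1 / 252, seed=0):
    """Draw i.i.d. Gaussian log-returns, as geometric Brownian motion assumes."""
    rng = np.random.default_rng(seed)
    drift = (mu - 0.5 * sigma**2) * dt
    return drift + sigma * np.sqrt(dt) * rng.standard_normal(n_steps)

def lag_autocorr(x, lag=1):
    """Sample autocorrelation of x at the given lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

log_returns = simulate_gbm_log_returns(5000)
prices = 100.0 * np.exp(np.cumsum(log_returns))  # log-normal price path
rho1 = lag_autocorr(log_returns, lag=1)          # near zero by construction
```

Running `lag_autocorr` on empirical returns and comparing against this baseline is one simple way to see whether the independence assumption is violated.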
5\ An explanation of the challenges to the assumptions of geometric Brownian motion (GBM) by @VolQuant: a thread on Monte Carlo and fractional Brownian motion.

6\ We take a look at a tangential topic.

Problem: some assets do not have enough data to model on, particularly in the crypto space. When we fit flexible models using advanced features and non-linear approximations, we run into the “Curse of Dimensionality”.
7\ That is, we have too little data relative to the number of features, and in low-signal environments our model is bound to overfit and perform poorly out of sample.
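A small sketch of the effect, using purely synthetic data: 50 features, 60 training rows, and only one weakly informative feature. The sizes and the signal strength are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_train, n_test, n_features = 60, 1000, 50

def make_data(n):
    X = rng.standard_normal((n, n_features))
    # Only the first feature carries a (weak) signal; everything else is noise.
    y = 0.1 * X[:, 0] + rng.standard_normal(n)
    return X, y

def r_squared(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

X_train, y_train = make_data(n_train)
X_test, y_test = make_data(n_test)

# Ordinary least squares on all features.
beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
r2_in = r_squared(y_train, X_train @ beta)   # in-sample fit
r2_out = r_squared(y_test, X_test @ beta)    # out-of-sample fit
```

With nearly as many features as observations, the in-sample fit looks strong while the out-of-sample fit collapses, which is the overfitting described above.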
8\ We may apply dimension reduction techniques such as principal component analysis and inspect the eigensystem, but this usually does little to solve our problem. Pruning features is another option, but not an ideal one.
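For reference, a minimal sketch of component analysis via the eigensystem of the feature covariance matrix. The feature matrix here is hypothetical (10 features driven by 2 latent factors plus noise), and the 95% variance threshold is an arbitrary choice.

```python
import numpy as np

def pca_eigensystem(X):
    """Eigenvalues (descending) and eigenvectors of the feature covariance matrix."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices, ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(0)
# Hypothetical feature matrix: 2 latent factors, 10 observed features, small noise.
factors = rng.standard_normal((250, 2))
loadings = rng.standard_normal((2, 10))
X = factors @ loadings + 0.1 * rng.standard_normal((250, 10))

eigvals, eigvecs = pca_eigensystem(X)
explained = eigvals / eigvals.sum()
# Smallest number of components whose cumulative variance share reaches 95%.
k = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)
X_reduced = (X - X.mean(axis=0)) @ eigvecs[:, :k]
```

Note that this reduces the number of columns, not the shortage of rows, which is why it tends not to fix the underlying data scarcity.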
9\

i) If our alpha modelling depends on having clean, continuous datasets, a common solution is to simply “forward fill” the data, which mirrors real-world “trading halts”-
10\

-however, this does not really help any of our statistical learning algorithms. For shorter gaps, a decent solution is to apply a Brownian bridge.
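A sketch of the Brownian bridge idea, assuming we work in log-prices, the gap is a single contiguous run with known values on both sides, and the per-step volatility `sigma` is supplied by the user (for example, estimated from the data around the gap). The function name and the toy price series are hypothetical.

```python
import numpy as np

def brownian_bridge_fill(start, end, n_missing, sigma, seed=None):
    """Sample n_missing interior points of a Brownian bridge pinned at start and end.

    start, end: known log-prices on either side of the gap.
    sigma: assumed per-step volatility of log-returns.
    """
    rng = np.random.default_rng(seed)
    n = n_missing + 1                                   # steps across the gap
    steps = sigma * rng.standard_normal(n)
    walk = np.concatenate(([0.0], np.cumsum(steps)))    # free random walk W_0..W_n
    t = np.arange(n + 1) / n
    bridge = walk - t * walk[-1]                        # force both ends to zero
    path = start + t * (end - start) + bridge           # linear trend plus bridge noise
    return path[1:-1]                                   # interior points only

# Toy series with a three-observation gap (NaN marks missing values).
log_prices = np.log(np.array([100.0, 101.0, np.nan, np.nan, np.nan, 104.0, 103.5]))
gap = np.flatnonzero(np.isnan(log_prices))
left, right = gap[0] - 1, gap[-1] + 1                   # assumes one interior gap
filled = log_prices.copy()
filled[gap] = brownian_bridge_fill(
    log_prices[left], log_prices[right], len(gap), sigma=0.01, seed=1
)
```

Unlike forward filling, which produces a run of zero returns, the bridge yields returns with non-zero variance while still hitting the observed prices at both ends of the gap.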

ii) If the history is simply too short (e.g. the asset has traded for only a year), so that most earlier dates are missing, then we need more data.
11\

A solution is to fit a geometric random walk to the data we already have and simulate from it. However, as addressed above, when paths are generated for longer durations they are likely to depart from the true distribution. Classic stochastic models and volatility models also fall short.
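A minimal sketch of the geometric random walk approach: estimate the mean and volatility of log-returns from the short history, then simulate additional paths. The "observed" series below is a synthetic placeholder rather than real market data, and the function names are hypothetical.

```python
import numpy as np

def fit_geometric_random_walk(prices):
    """Estimate per-step mean and volatility of log-returns from observed prices."""
    log_returns = np.diff(np.log(prices))
    return log_returns.mean(), log_returns.std(ddof=1)

def simulate_paths(p0, mu, sigma, n_steps, n_paths, seed=None):
    """Simulate price paths with i.i.d. Gaussian log-returns, starting from p0."""
    rng = np.random.default_rng(seed)
    shocks = mu + sigma * rng.standard_normal((n_paths, n_steps))
    log_paths = np.log(p0) + np.cumsum(shocks, axis=1)
    return np.exp(log_paths)

# Placeholder for one year of observed daily prices (synthetic, not real data).
rng = np.random.default_rng(7)
observed = 50.0 * np.exp(np.cumsum(0.02 * rng.standard_normal(365)))

mu_hat, sigma_hat = fit_geometric_random_walk(observed)
synthetic = simulate_paths(observed[-1], mu_hat, sigma_hat, n_steps=1000, n_paths=5, seed=8)
```

Because every simulated return is independent and Gaussian, longer paths inherit none of the autocorrelation or fat tails of the real series, which is the limitation noted above.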
12\

A spectacular approach to generating synthetic data is to simulate high-frequency data and then use style transfer techniques to carry it over to longer-horizon frequencies, such as daily resolution. A denoising autoencoder is trained on the high-frequency data.
13\

Results after style transfer show that the simulated paths are more similar in autocorrelation and match the true distributions most closely up to the 4th moment.
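One way such a comparison could be set up is sketched below: compute the first four moments and the first few autocorrelations of two return series and report the gaps. This is an illustrative check, not the paper's evaluation code. The AR(1) series standing in for "real" returns and the i.i.d. series standing in for a simulation are both synthetic assumptions.

```python
import numpy as np

def summary_stats(returns, max_lag=5):
    """Mean, std, skewness, kurtosis, and autocorrelations up to max_lag."""
    r = np.asarray(returns, dtype=float)
    mean, std = r.mean(), r.std()
    z = (r - mean) / std
    acf = np.array([np.dot(z[:-k], z[k:]) / len(z) for k in range(1, max_lag + 1)])
    return {
        "mean": mean,
        "std": std,
        "skew": np.mean(z**3),
        "kurtosis": np.mean(z**4),   # equals 3 for a normal distribution
        "acf": acf,
    }

def compare(real, simulated):
    """Largest absolute gap between the two series for each statistic."""
    a, b = summary_stats(real), summary_stats(simulated)
    return {name: float(np.max(np.abs(a[name] - b[name]))) for name in a}

rng = np.random.default_rng(3)
eps = 0.01 * rng.standard_normal(5000)
real_like = np.empty_like(eps)      # synthetic stand-in with autocorrelated returns
real_like[0] = eps[0]
for t in range(1, len(eps)):
    real_like[t] = 0.3 * real_like[t - 1] + eps[t]
iid = 0.01 * rng.standard_normal(5000)   # random-walk style simulation

gaps = compare(real_like, iid)
```

Here the i.i.d. simulation matches the stand-in series reasonably well on the moments but misses its autocorrelation, the kind of gap the style transfer approach is reported to narrow.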

Paper: arxiv.org/pdf/1906.03232…
