1\13 One of the biggest challenges in quant and alpha research is obtaining clean, error-free data. Models need to be built using assumptions that reduce the “dimensions” of reality for tractability.
A THREAD ▼ ▼ ▼
Machine Learning, Sparse Datasets and Error-Free Simulations
2\ Mathematical models, by definition, are built to simulate and capture some phenomenon, practical or abstract. Often they are built on data, which are themselves drawn from some unknown statistical distribution.
3\ For example, an alpha model is backtested on data where prices/returns are drawn from some distribution. In quantitative finance, prices are widely assumed to follow a log-normal distribution (equivalently, log-returns are normal).
4\ However, there is extensive literature suggesting that empirical returns do not fit this assumption. In particular, asset prices display autocorrelative properties, while the random walk hypothesis implies that returns are sampled independently.
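The autocorrelation point is easy to check yourself. A minimal sketch (synthetic data, illustrative parameters): an IID series, as the random walk hypothesis assumes, shows lag-1 autocorrelation near zero, while a persistent (AR(1)) series does not.

```python
import numpy as np

def autocorr(x, lag=1):
    """Lag-k sample autocorrelation of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

rng = np.random.default_rng(0)

# Random-walk assumption: returns drawn independently
iid_returns = rng.normal(0, 0.01, 5000)

# Persistent returns: each draw leans on the previous one (AR(1), phi = 0.3)
ar = np.empty(5000)
ar[0] = 0.0
for t in range(1, len(ar)):
    ar[t] = 0.3 * ar[t - 1] + rng.normal(0, 0.01)

print("iid lag-1 autocorr:", round(autocorr(iid_returns, 1), 3))
print("AR  lag-1 autocorr:", round(autocorr(ar, 1), 3))
```

On empirical return series, the same one-liner is what reveals the departure from the IID assumption.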
5\ For an explanation of the challenges with the GBM assumptions, see @VolQuant's thread on Monte Carlo and fractional Brownian motion.
6\ Problem: some assets do not have enough significant data to be modelled on, particularly in the crypto space. When we fit flexible models using advanced features and non-linear approximations, we run into the “curse of dimensionality”.
7\ That is, we have too little data relative to the number of features, and in low-signal environments our model is bound to overfit and perform poorly.
8\ We may apply dimension-reduction techniques such as principal component analysis (examining the eigensystem of the covariance matrix), but this usually does little to solve our problem. Pruning features is another option, but not an ideal one.
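For concreteness, the eigensystem route can be sketched in a few lines of NumPy (the feature matrix and the 90% variance cutoff here are illustrative assumptions, not a recommendation):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical feature matrix: 250 observations x 40 features (n small vs p)
X = rng.normal(size=(250, 40))
X -= X.mean(axis=0)                         # center before PCA

# Eigensystem of the sample covariance matrix
cov = X.T @ X / (len(X) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]           # sort by explained variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep the leading components explaining ~90% of variance
k = int(np.searchsorted(np.cumsum(eigvals) / eigvals.sum(), 0.90)) + 1
X_reduced = X @ eigvecs[:, :k]
print("reduced from", X.shape[1], "to", X_reduced.shape[1], "features")
```

Note the caveat in the tweet: on noisy data the spectrum is flat, so few components can be dropped — which is exactly why this "usually does little to solve our problem".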
9\
i) If our alpha modelling depends on having clean, continuous datasets, a common solution is simply to “forward fill” the data, reflecting real-world “trading halts” -
10\
-however, this does not really help any of our statistical learning algorithms. A decent solution for shorter gaps is to apply a Brownian bridge.
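The Brownian bridge fill can be sketched as follows — a minimal version assuming interior gaps (first and last prices observed) and a known per-step log-volatility; not production code:

```python
import numpy as np

def brownian_bridge_fill(prices, sigma, rng=None):
    """Fill interior NaN gaps in a price series with a Brownian bridge.

    Works in log space; each missing point is drawn conditional on the
    last simulated point and the next observed point, so the filled path
    stays pinned to the surrounding data. `sigma` is the assumed per-step
    log-return volatility.
    """
    rng = rng or np.random.default_rng()
    logp = np.log(np.asarray(prices, dtype=float))
    missing = np.isnan(logp)
    i, n = 0, len(logp)
    while i < n:
        if missing[i]:
            b = i
            while b < n and missing[b]:
                b += 1                      # next observed index (the "pin")
            for j in range(i, b):
                remaining = b - (j - 1)     # steps left until the pin
                mean = logp[j - 1] + (logp[b] - logp[j - 1]) / remaining
                var = sigma ** 2 * (remaining - 1) / remaining
                logp[j] = rng.normal(mean, np.sqrt(var))
            i = b
        i += 1
    return np.exp(logp)

prices = [100.0, np.nan, np.nan, 104.0, 105.0]
filled = brownian_bridge_fill(prices, sigma=0.01, rng=np.random.default_rng(0))
```

Unlike a forward fill, the imputed points carry realistic variation while still hitting the observed endpoints exactly.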
ii) If the dataset is simply too short (e.g. the asset has traded for only a year, with most earlier dates missing), then we need more data.
11\
A solution is to fit a geometric random walk to the data we already have and simulate from it. However, as addressed above, when generated over longer durations this is likely to depart from the true distribution. Classic stochastic and volatility models also fall short.
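The geometric random walk baseline can be sketched like so (drift, volatility, and path counts are illustrative; in practice they would be estimated from the short history):

```python
import numpy as np

def simulate_gbm(s0, mu, sigma, n_steps, n_paths, dt=1 / 252, rng=None):
    """Simulate geometric Brownian motion price paths.

    Uses the exact discretisation:
    S_{t+dt} = S_t * exp((mu - sigma^2 / 2) * dt + sigma * sqrt(dt) * Z).
    """
    rng = rng or np.random.default_rng()
    z = rng.normal(size=(n_paths, n_steps))
    log_increments = (mu - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z
    log_paths = np.cumsum(log_increments, axis=1)
    # Prepend t=0 so every path starts exactly at s0
    return s0 * np.exp(np.hstack([np.zeros((n_paths, 1)), log_paths]))

paths = simulate_gbm(s0=100.0, mu=0.05, sigma=0.3, n_steps=252, n_paths=1000,
                     rng=np.random.default_rng(42))
```

By construction every simulated path is IID in its log-returns — which is precisely the property empirical data violates, hence the departure over longer horizons.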
12\
A spectacular approach to generating synthetic data is to simulate high-frequency data and use style-transfer techniques to map it to lower-frequency resolutions such as daily bars. A denoising autoencoder is trained on the high-frequency data.
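The full style-transfer pipeline is beyond a tweet, but the denoising-autoencoder component can be sketched with a single hidden layer in plain NumPy. Everything here — the toy data, the 20-to-8 bottleneck, the noise level, the learning rate — is an illustrative assumption, not the cited method's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for standardized high-frequency return windows:
# 500 samples of 20 returns each (real training data would be market ticks)
X = rng.normal(size=(500, 20))

n_in, n_hid = X.shape[1], 8          # bottleneck forces a compressed code
W1 = rng.normal(0, 0.1, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.1, (n_hid, n_in)); b2 = np.zeros(n_in)
lr, losses = 0.05, []

for epoch in range(300):
    X_noisy = X + rng.normal(0, 0.1, X.shape)   # corrupt the input...
    H = np.tanh(X_noisy @ W1 + b1)              # ...encode the corrupted copy
    X_hat = H @ W2 + b2                         # decode (linear output)
    err = X_hat - X                             # ...but reconstruct the CLEAN data
    losses.append(float((err ** 2).mean()))
    # Plain batch gradient descent on the reconstruction loss
    gW2 = H.T @ err / len(X); gb2 = err.mean(axis=0)
    dH = (err @ W2.T) * (1.0 - H ** 2)          # backprop through tanh
    gW1 = X_noisy.T @ dH / len(X); gb1 = dH.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1
```

The denoising objective — corrupt the input but score the reconstruction against the clean target — is what makes the learned code robust enough to reuse downstream.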
13\
After style transfer, the simulated paths were found to be more similar in autocorrelation and to match the true distributions most closely up to the 4th moment.
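"Matching up to the 4th moment" is itself easy to audit. A minimal sketch with synthetic stand-ins (fat-tailed Student-t "real" returns vs. a Gaussian simulator — both illustrative):

```python
import numpy as np

def first_four_moments(x):
    """Mean, variance, skewness, excess kurtosis of a return series."""
    x = np.asarray(x, dtype=float)
    m, s = x.mean(), x.std()
    z = (x - m) / s
    return m, s ** 2, float((z ** 3).mean()), float((z ** 4).mean() - 3.0)

rng = np.random.default_rng(7)
real = rng.standard_t(df=5, size=10_000) * 0.01    # fat-tailed stand-in
sim = rng.normal(0, real.std(), size=10_000)       # naive Gaussian simulator

for name, rv, sv in zip(("mean", "var", "skew", "ex-kurt"),
                        first_four_moments(real), first_four_moments(sim)):
    print(f"{name:8s} real={rv:+.5f}  sim={sv:+.5f}")
```

The Gaussian simulator matches mean and variance but misses the excess kurtosis entirely — the gap the style-transfer results are claimed to close.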
Why should you care?
Is it going to “break Bitcoin”?
A Trader’s Overview of the Quantum World
2\ You probably have heard of quantum teleportation. No, it is not going to make your girlfriend appear in your bed while your parents are sleeping. And whilst it cannot currently destroy your digital coins, it might just be able to, sometime soon.
3\ So, what is Quantum Mechanics? It is a fundamental theory in physics that describes the physical properties of nature at the scale of atoms and subatomic particles. The branch dealing with the manipulation and computation of quantum bits (qubits) is known as quantum computation.
It has been a fulfilling ~1 month since our launch. To celebrate hitting 1k followers, we have a special thread for you, including a premium alpha report and a case study. 🔥
MEGA Thread (N = 60+) : Robust Alpha Research Processes; HangukQuant
1) We adopt the hybrid (Type 3) approach in the alpha research process. The hybridity is reflected in our team’s dynamic, with practitioners working on the theoretical models and traders providing input on the heuristic discovery of alpha.
2) The result is a coherent product in the form of an “alpha report”, which premium subscribers get access to weekly.
1) Type 1: To keep up my knowledge of finance, I subscribe to financial literature, academic or otherwise. That means reading books/papers on finance, trading, and economics, listening to podcasts for general knowledge, and maintaining a working knowledge of “Mathematical Finance”.
2)
Type 2: My knowledge of computer science and statistical methods comes from my background in academia. I also keep up to date with new state-of-the-art research by reading the academic literature.
1) Type 1: Understanding market structure and market incentives, and the corresponding flows
Type 2: Finding statistical anomalies within price/non-price data
Type 3: A hybrid of the two.
Let’s seek to understand this further.
2) Type 1: There are many reasons for actionable price flows, such as price-insensitive liquidations, factor premia, et cetera. For an example, see a previous thread