Not sure how I feel about de Prado. Some of his stuff is just stupid, like what the fuck is the triple barrier, and other times he is a genius. Like when he suggested t-values of microstructure features. He did a meh job on the explanation, and I prefer Hasbrouck's book, but GOAT idea.
Anyways, have a skim through Advances in Financial Machine Learning and see for yourself. There are a few moments of genius and other times utter horseshit. Fractional differencing is genius, but it isn't explained properly, and the stationarity of features etc. is key.
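Since fractional differencing keeps coming up, here is a minimal sketch of the idea in Python: expanding-window binomial-series weights, truncated once they get tiny. This is a simplified illustration, not de Prado's exact fixed-width implementation, and the d/threshold values are arbitrary assumptions.

```python
import numpy as np
import pandas as pd

def frac_diff_weights(d, threshold=1e-4):
    """Binomial-series weights: w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k,
    truncated once they become negligibly small."""
    w = [1.0]
    k = 1
    while abs(w[-1]) > threshold:
        w.append(-w[-1] * (d - k + 1) / k)
        k += 1
    return np.array(w)

def frac_diff(series, d=0.4, threshold=1e-4):
    """Fractionally difference a series: keeps some memory (0 < d < 1)
    while pushing the result towards stationarity."""
    w = frac_diff_weights(d, threshold)
    width = len(w)
    out = pd.Series(index=series.index, dtype=float)
    for i in range(width, len(series)):
        window = series.iloc[i - width + 1 : i + 1].values[::-1]  # newest first
        out.iloc[i] = np.dot(w, window)
    return out.dropna()

# e.g. frac_diff(np.log(close), d=0.4), then run an ADF test on the output
```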
Some features do not need stationarity; a lot of them don't. Most wavelets need stationarity, but modern methods have non-stationary wavelets now! Here is a lecture by him I mildly agreed with:
A very passive-aggressive review from me, but I do think he has some amazing ideas. There is a great deal of filtering needed to digest them, and I recommend you implement the ideas, test them skeptically, and then do further reading elsewhere, like microstructure.
Elaborating on topological structure. A Venn diagram is a good basic example. Any point in section B of the diagram == any other point in section B, but is completely different from something in section A. This is a topological structure. Decision trees effectively do this.
A lot of the time this non-linearity lets you pick up on subtleties, but it is incredibly prone to overfitting, and it actually overfits when there is a linear relationship, because linear relationships (straight lines) aren't that topological. On a straight line, some point is in some standard sense...
different from another point. Whereas they can be the same for decision trees. This is where decision trees benefit from using logistic regressions and linear regressions and not classifying at all. Just taking the linear regression. You get a more non-linear line, but…
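To make that concrete, here is a rough sketch of "don't classify at all, just take the linear regression inside each region": a shallow tree carves up the feature space (the topological part), then an ordinary linear regression is fit per leaf. A poor man's model tree; class and parameter names are made up for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

class LeafLinearTree:
    """Shallow tree to partition the feature space, then a plain linear
    regression inside each leaf instead of a piecewise-constant prediction."""

    def __init__(self, max_depth=3, min_samples_leaf=50):
        self.tree = DecisionTreeRegressor(max_depth=max_depth,
                                          min_samples_leaf=min_samples_leaf)
        self.leaf_models = {}

    def fit(self, X, y):
        self.tree.fit(X, y)
        leaves = self.tree.apply(X)              # leaf id for every sample
        for leaf in np.unique(leaves):
            mask = leaves == leaf
            self.leaf_models[leaf] = LinearRegression().fit(X[mask], y[mask])
        return self

    def predict(self, X):
        leaves = self.tree.apply(X)
        preds = np.empty(len(X))
        for leaf, model in self.leaf_models.items():
            mask = leaves == leaf
            if mask.any():
                preds[mask] = model.predict(X[mask])
        return preds
```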
Heavily regularized kernel PCA methods are awesome. Very often PCA isn't appropriate. Offset from midprice vs execution is one feature for an execution prediction model that would not have a good time with PCA, because the relationship is a parabola, but still a very simple U shape...
The issue with the introduction of polynomials or kernel methods for dimensionality reduction is that you can easily overfit, especially when noise is heavily present. Regularization should be proportional to noise and accuracy metrics. Another comment to add is that PCA can...
destroy the structure of your data if you apply it to super HFT data. I have only given PCA so far as I don't want to go too deep into kernel methods, or give too much alpha away, but I will say that a great method is a state-based model between linear, heavily regularized...
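To make the U-shape point concrete, a minimal sketch with synthetic data and sklearn's KernelPCA. The gamma/alpha values are arbitrary assumptions, not a recommendation, and "offset"/"execution" are just stand-in names.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA

rng = np.random.default_rng(0)

# Synthetic stand-in for offset-from-midprice vs execution lying on a parabola
offset = rng.uniform(-1, 1, 2000)
execution = offset ** 2 + rng.normal(0, 0.05, 2000)     # U-shaped relationship
X = np.column_stack([offset, execution])

# Linear PCA: the single component is a straight line through a curved cloud
lin = PCA(n_components=1).fit(X)
X_lin = lin.inverse_transform(lin.transform(X))
print("linear PCA 1-D reconstruction error:",
      np.mean(np.sum((X - X_lin) ** 2, axis=1)))

# RBF kernel PCA can follow the curve; alpha regularizes the learned inverse map
kpca = KernelPCA(n_components=1, kernel="rbf", gamma=2.0,
                 alpha=1e-2, fit_inverse_transform=True)
X_k = kpca.inverse_transform(kpca.fit_transform(X))
print("kernel PCA 1-D reconstruction error:",
      np.mean(np.sum((X - X_k) ** 2, axis=1)))
```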
Factor models are basically just if you tried to decompose returns into some linear regressions and then pretended it was ergodic because of a “risk premium”. Risk premiums don’t necessarily exist, but with a lot of these there does exist increasing tail risk with…
better performance of said factors. As seen in 2008, the big brown line of momentum is eventually corrected, but that is less for a risk-premium reason and more because of the August risk of multi-strat funds getting blown up in credit and then covering with momentum, forcing…
momentum the other way. However, for other factors like quality and profitability there is little to no basis for them being risk factors, and the issue with assuming this level of ergodicity is that you may hold a position that no longer has alpha expecting to be compensated…
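For anyone unfamiliar, the "decompose returns into some linear regressions" part is literally just an OLS of excess returns on factor returns. A sketch with statsmodels; the factor column names are hypothetical.

```python
import pandas as pd
import statsmodels.api as sm

def factor_decomposition(excess_returns: pd.Series, factors: pd.DataFrame):
    """OLS of asset excess returns on factor returns (columns like
    "MKT", "SMB", "HML", "MOM" -- hypothetical names). The betas are the
    factor loadings, the intercept is the leftover 'alpha'. Treating the
    loadings/premia as stable forever is exactly the ergodicity assumption
    being complained about above."""
    X = sm.add_constant(factors)
    model = sm.OLS(excess_returns, X, missing="drop").fit()
    return model.params, model.tvalues

# params, tvals = factor_decomposition(asset_excess_returns, factor_returns)
```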
I get people who come up to me and say, what about this paper, it looks great??? If they don't show performance, a lot of the time it sucks, especially if they only show the win/loss ratio. Worse yet, you find out they levered 10% APY 10x!!! Think about strategies as components to yours.
Stack it!
No single strategy will get you alpha. Some may get you close, but it takes a level of awareness in all steps of the process: from data to features, to reduction, to ML, to sizing (use Kelly), risk mgmt, regime shifts, meta-labelling, crossing, and then super fast execution.
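On "use Kelly": the textbook discrete Kelly fraction plus the fractional-Kelly haircut most people actually run. The half-Kelly default below is my own assumption, not a rule.

```python
def kelly_fraction(win_prob, win_loss_ratio, scale=0.5):
    """Discrete Kelly: f* = p - (1 - p) / b, where b is the win/loss payoff ratio.
    'scale' applies fractional Kelly (e.g. half-Kelly), because estimation error
    in p and b makes full Kelly far too aggressive in practice."""
    f = win_prob - (1.0 - win_prob) / win_loss_ratio
    return max(0.0, f * scale)

# e.g. kelly_fraction(0.55, 1.2) -> roughly 0.09 of bankroll per bet
```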
Crossing means you net off two opposing orders. 500 SELL from ALGO1 against 300 BUY from ALGO2. Maybe ALGO1 is your ML strategy, but ALGO2 is a genetic algorithm, or maybe MM, or perhaps pairs. Combine and ensemble not just predictions, but also orderflow. Reduces size and fees.
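A toy sketch of that internal crossing (all names hypothetical): net the desired flow across strategies first, and only send the residual to the market.

```python
from collections import defaultdict

def net_orders(orders):
    """orders: list of (symbol, side, qty) coming from different internal algos.
    Opposing flow in the same symbol is crossed internally; only the net
    residual goes out, which cuts executed size and fees."""
    net = defaultdict(int)
    for symbol, side, qty in orders:
        net[symbol] += qty if side == "BUY" else -qty
    residual = []
    for symbol, signed_qty in net.items():
        if signed_qty != 0:
            side = "BUY" if signed_qty > 0 else "SELL"
            residual.append((symbol, side, abs(signed_qty)))
    return residual

# net_orders([("XYZ", "SELL", 500), ("XYZ", "BUY", 300)]) -> [("XYZ", "SELL", 200)]
```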
I’ve seen people claim pairs trading is dying. That certainly isn’t true, but it has become a feature or component and not a strategy. The mispricing index of copulas, weighted by their copula relationship strength, may serve as features to an ML model. An interesting mispricing-based…
approach is where you explicitly treat the spread as its components. This way you can trade only one leg. If you are able to use something such as LASSO and SPCA to cluster by industry and then reduce to a network of highly connected assets, then you can use this...
It makes very little sense to just try and find pairs and use them as features, since one asset and its relationship won't really make up that much of the price movement, but the weighted (by relationship strength) mispricing against the network (LASSO/SPCA reduced)...
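A crude sketch of the kind of feature I mean, using a Gaussian copula as a stand-in. The real thing would pick the copula family properly and build the neighbour network with LASSO/SPCA; everything below is a simplified assumption.

```python
import numpy as np
from scipy import stats

def gaussian_copula_mispricing(x_ret, y_ret):
    """Mispricing index of asset X against neighbour Y under a Gaussian copula:
    P(U_x <= u_x | U_y = u_y) for the latest observation. Around 0.5 means
    fairly priced relative to Y; near 0 or 1 means X looks cheap/rich."""
    # empirical ranks -> pseudo-uniforms -> normal scores
    u_x = stats.rankdata(x_ret) / (len(x_ret) + 1)
    u_y = stats.rankdata(y_ret) / (len(y_ret) + 1)
    z_x, z_y = stats.norm.ppf(u_x), stats.norm.ppf(u_y)
    rho = np.corrcoef(z_x, z_y)[0, 1]
    # conditional CDF of a bivariate standard normal at the most recent point
    cond = stats.norm.cdf((z_x[-1] - rho * z_y[-1]) / np.sqrt(1 - rho ** 2))
    return cond, abs(rho)

def network_mispricing_feature(target_ret, neighbour_rets):
    """Weight each neighbour's mispricing index by relationship strength
    (|rho| here; in practice whatever the LASSO/SPCA network gives you)."""
    scores, weights = [], []
    for nb in neighbour_rets:
        m, w = gaussian_copula_mispricing(target_ret, nb)
        scores.append(m - 0.5)      # centre so the sign says cheap vs rich
        weights.append(w)
    return float(np.average(scores, weights=weights))
```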
Whilst alt data can create an edge, it is largely overhyped. In my experience it provides very little alpha, unlike what most think. Two key reasons exist for this:
1) So much data
2) Fundamentals just aren’t that important for the short term
A word on each:
The first problem is that there is so much data out there, and whilst most firms can sort through this data with reasonable effectiveness, there is so much of it that the methods used provide very little alpha. The number of articles published each day is so large…
that the alpha coming from a better model is slowly whittled away, because a CNN-LSTM model you can make in TensorFlow can usually do a good job, or, worse, you can just import an NLP library in Python or R. The second issue is, as many will know from previous tweets, that flow is…