StatArb
Mar 12 4 tweets 1 min read
Whilst I am not a big fan of linear regressions, I do use regressions in the models I develop. Polynomial logistic regressions, for example, are effectively a smoothed decision tree surface. Regularization that limits the depth of an NN or tree is great, but introducing...

1/n
the bias that jagged jumps in the decision surface are a bad idea, by using regression-based models, massively denoises your data. (Not strictly regressions: regression NNs and regression decision trees, for example, are awesome.) As I have mentioned before...

2/n
a regularized (depth-limited) decision tree cannot replicate a linear regression. It's like trying to fit a sine wave with a Taylor series: you can get close, but a perfect replication would need infinite complexity. (For a Taylor series this would be polynomial...

3/n
order), but for a decision tree it would be depth. To replicate a linear regression, all the tree would do is add more steps and make each step smaller. If it is a regression tree, it can just have one node which is a linear regression, which gives it the same complexity as a linear regression.

4/4
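
To make the "more steps, smaller steps" point concrete, here is a minimal sketch in plain Python. It fits a greedy, piecewise-constant regression tree to a noise-free line at increasing depth and compares it with ordinary least squares. The data (`y = 2x + 1` on 256 points) and the helper names `leaf_sse`, `tree_sse` and `ols_sse` are illustrative assumptions, not anything from the thread.

```python
import math

def leaf_sse(ys):
    """Sum of squared errors when a single leaf predicts the mean of ys."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def tree_sse(ys, depth):
    """Training SSE of a greedy regression tree with at most `depth` levels of splits.

    ys must be ordered by the single feature x, so a split is just an index.
    """
    if depth == 0 or len(ys) < 2:
        return leaf_sse(ys)
    # Pick the split index that minimises the summed SSE of the two children.
    split = min(range(1, len(ys)),
                key=lambda i: leaf_sse(ys[:i]) + leaf_sse(ys[i:]))
    return tree_sse(ys[:split], depth - 1) + tree_sse(ys[split:], depth - 1)

def ols_sse(xs, ys):
    """Training SSE of a one-feature ordinary least squares fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Illustrative data: a noise-free straight line (an assumption for this sketch).
xs = [i / 255 for i in range(256)]
ys = [2.0 * x + 1.0 for x in xs]

tree_rmse = [math.sqrt(tree_sse(ys, d) / len(ys)) for d in range(6)]
linear_rmse = math.sqrt(ols_sse(xs, ys) / len(ys))

for d, err in enumerate(tree_rmse):
    print(f"tree depth {d}: RMSE = {err:.5f}")
print(f"linear regression: RMSE = {linear_rmse:.2e}")
```

Under these assumptions, each extra level of depth should roughly halve the tree's error without reaching zero, while the two-parameter linear fit is exact up to floating-point error. On a finite sample a deep enough tree will eventually memorise every point, but between those points it is still a staircase.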

More from @TerribleQuant

Mar 13
For those unaware of getting flipped:

This is when you are on the other side of the OB with an aggressive (usually) LO acting as a maker, but the price moves and now you are on the opposite side of the OB, and get matched with an order as such. You pay taker fees when this...
happens which for a lot of HFT strats means certain death. No kidding. Rebate of 1bps vs 4bps fees. Not a fun time when your alpha is likely only a few bps and your BAS is a bp maybe. This example is crypto futures. The risk of getting flipped is why you have to be...
fast as a MM, not just to prevent leaving stale quotes on the OB, but also to prevent getting flipped. The risk increases as you become more aggressive, which is why a lot of stat arb strategies (like I use) that don't even need that fast execution, have to use it because...
Read 6 tweets
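
The fee arithmetic above can be sketched in a few lines. The 1 bp rebate and 4 bp taker fee come from the tweet; everything else is a simplifying assumption for illustration: the alpha is treated as identical whether or not the order is flipped, and spread and adverse selection are ignored. The function names are hypothetical.

```python
MAKER_REBATE_BPS = 1.0  # rebate earned when filled as a maker (from the tweet)
TAKER_FEE_BPS = 4.0     # fee paid when flipped and filled as a taker (from the tweet)

def expected_edge_bps(alpha_bps, flip_prob):
    """Expected net edge per fill, in bps, given the probability of being flipped.

    Assumption: the same alpha is captured in both outcomes.
    """
    as_maker = alpha_bps + MAKER_REBATE_BPS
    as_taker = alpha_bps - TAKER_FEE_BPS
    return (1.0 - flip_prob) * as_maker + flip_prob * as_taker

def breakeven_flip_prob(alpha_bps):
    """Flip probability at which the expected edge is zero, capped at 1."""
    swing = MAKER_REBATE_BPS + TAKER_FEE_BPS  # 5 bps lost per flip
    return min(1.0, (alpha_bps + MAKER_REBATE_BPS) / swing)

alpha = 2.0  # hypothetical alpha of "a few bps"
print(f"edge as maker only: {expected_edge_bps(alpha, 0.0):+.1f} bps")
print(f"edge if always flipped: {expected_edge_bps(alpha, 1.0):+.1f} bps")
print(f"break-even flip probability: {breakeven_flip_prob(alpha):.2f}")
```

Each flip costs the full 5 bp swing between rebate and fee, which is larger than the hypothetical 2 bp alpha used here.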
Mar 12
Back to the quant topics:

Microstructural fair value!

Let's dive into it:

1/n
The usual approach is to take the midprice which is just the average between the best bid and ask. This is decent for most applications, but can definitely be improved. Decent won't cut it in HFT! ...
2/n
Let's outline a few approaches:

OB Liquidity Based:

Weighted Midprice
Exponentially Weighted Midprice
TA (MA) based variants

Microprice & Variants:

HMM Microprice (stoikov)
ARIMA Microprice
SAE Microprice
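
As a small illustration of the first two ideas, here is a sketch of the plain midprice next to a size-weighted midprice. The weighting shown (each side's price weighted by the opposite side's top-of-book size) is the common textbook definition and is an assumption about what "Weighted Midprice" refers to in the thread; the function names and quote values are hypothetical.

```python
def midprice(best_bid, best_ask):
    """Plain midprice: the average of the best bid and best ask."""
    return (best_bid + best_ask) / 2.0

def weighted_midprice(best_bid, bid_size, best_ask, ask_size):
    """Size-weighted midprice (assumed definition).

    Each price is weighted by the size resting on the opposite side, so a
    heavy bid pulls the estimate towards the ask and vice versa.
    """
    total = bid_size + ask_size
    if total <= 0:
        raise ValueError("top-of-book sizes must sum to a positive number")
    return (best_bid * ask_size + best_ask * bid_size) / total

# Hypothetical quote: three times as much size on the bid as on the ask.
print(midprice(99.0, 101.0))                    # 100.0
print(weighted_midprice(99.0, 3.0, 101.0, 1.0)) # 100.5
```

With equal sizes on both sides the two estimates coincide; an imbalance moves the weighted version towards the thinner side of the book.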
Read 17 tweets
Mar 11
A thread on Copula trading in 2022:

I previously mentioned the use of CDS pricing literature for alternative copula distributions, and for bringing assumptions to the higher dimensions without using vines.

Continuing this topic, I'll go through some other methods/tricks:
For optimizing copulas the easiest method is to use a heuristic method and then brute force the lower dimensions. This is usually bivariate only, but with smaller asset universes trivariate is possible. From there you just test all permutations that include...
2/n
your established pair or triplet. To solve for the weights you can do this with standard methods. Another note is that since copulas are effectively conditional probabilities and vine copulas form chains, the Baum-Welch algorithm can be adapted with use of non-Gaussian...
3/n
Read 9 tweets
Mar 10
Lag error:

I’ve totally made up this term, but there’s no word for the concept it captures and I’ve been using it for over a year regardless.

If you use Bollinger Bands to trade mean-reverting portfolios, your lag error is the loss of alpha from the deterministic component.
This comes in 3 forms:

Jump risk:
Large jumps in the mean cause errors because moving averages are lagged, so your estimate takes time to catch up. This is a regime-shifting-ish problem and is aided by unsupervised learning models with conservatism controls.

The next is mismatched period:

If there is a sine wave with white noise, we may attempt to use an MA to trade the noise part. This gives us lag error, as we are not accounting for the broader sine function, and that hurts our PnL. Mismatched timeframe
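
A minimal sketch of the mismatched-period effect, assuming a unit-amplitude sine with a period of 100 samples and a trailing simple moving average. Because an MA is linear, its response to the sine can be studied without the noise. The period, window lengths and function names are illustrative assumptions.

```python
import math

def trailing_sma(xs, window):
    """Trailing simple moving average; the first window - 1 outputs are None."""
    out, running = [], 0.0
    for i, x in enumerate(xs):
        running += x
        if i >= window:
            running -= xs[i - window]  # drop the value that left the window
        out.append(running / window if i >= window - 1 else None)
    return out

def lag_rmse(period, window, n=1000):
    """RMS gap between a sine wave and its own trailing moving average."""
    wave = [math.sin(2.0 * math.pi * t / period) for t in range(n)]
    ma = trailing_sma(wave, window)
    sq_errs = [(m - w) ** 2 for m, w in zip(ma, wave) if m is not None]
    return math.sqrt(sum(sq_errs) / len(sq_errs))

for window in (5, 20, 50):
    print(f"window {window:>2}: lag RMSE = {lag_rmse(100, window):.3f}")
```

In this setup the gap between the moving average and the true deterministic component grows as the window lengthens relative to the period, which is one way to see the lag error described above.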

Read 6 tweets
Mar 8
For anyone making an HFT strat, you need a simulator to see when you get filled. You can use Kalman queues, multi-queues, sim matching engines and they are all cool but usually don't properly capture effects like adversity. Then there are stochastic approaches...
1/n
This would be like simulating the Poisson process. This sucks even worse as it utterly and entirely ignores adversity. At least sim matchers give it a try, although not a great one, and an easy NN will trounce it. Plus stochastic approaches don't use historical data...
2/n
Other methods involve procedural OB sims, which basically just take the data and add a bit of stochastic overlay to the historical OB. This assumes (most of the time) that you get filled when the midprice crosses your bid/ask, which is just wrong because it neglects...
3/n
Read 16 tweets
Mar 3
Multithreading is not an instant speed booster:

A word on these three:

Async I/O
Multi-threaded CPU tasks
Hyper-threaded CPU tasks
Async I/O will only benefit speed-wise from the writing data component of the work and can slow you down if that is not significant. There is only one NAC remember, but there is not infinite cache and management is expensive.

1/3
Multi-threading is great, but not for file handling: that won't be faster, and it will split your cache once again. So for ultra-low-latency applications you usually just have one super fast core enabled so you can maximise cache.

2/3
Read 4 tweets
