Post #2033 by anchak on the Mechanical Investing board

No. of Recommendations: 3

Hi RAMc, first of all, kudos for trying out something new. That said, as Jim mentions, a LOT of it is basically old wine in a new bottle.

Traditional MI (especially the GTR1 kind) is basically an intersection of rulesets, which in a modeling sense is equivalent to partitioning the data into rectangular regions (hyper-rectangles bounded by axis-aligned hyperplanes in multidimensional space). The closest ML-type algorithm for this is the decision tree: CART, C4.5, etc.
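
To make the equivalence concrete, here is a minimal sketch. The tickers, the two fields, and the thresholds are all made up for illustration; they are not from any real screen. It only shows that "pass every rule" and "follow axis-aligned splits down a small tree" select the same rectangle.

```python
# Hypothetical universe: (ticker, pe_ratio, rel_strength). Values are invented.
stocks = [
    ("AAA", 8.0, 92.0),
    ("BBB", 25.0, 95.0),
    ("CCC", 12.0, 40.0),
    ("DDD", 14.0, 85.0),
]

def screen(rows):
    """MI-style screen: keep rows that pass every rule (an intersection)."""
    return [t for t, pe, rs in rows if pe < 15.0 and rs > 80.0]

def tree(pe, rs):
    """The same two rules written as a depth-2 tree with axis-aligned splits."""
    if pe < 15.0:
        if rs > 80.0:
            return "pick"
        return "skip"
    return "skip"

picked_by_screen = screen(stocks)
picked_by_tree = [t for t, pe, rs in stocks if tree(pe, rs) == "pick"]
print(picked_by_screen, picked_by_tree)
```

The difference in practice is that a screen's thresholds are chosen by hand, while CART-style learners pick the split points from the data.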

I have been using them for a while, but not in the traditional MI screen way. So, my 0.02:

(1) Algorithm: Don't get too hung up on this. In fact, if you understand the underlying mechanism, which Jim also alluded to, the basic issue is going to be dealing with noise and, MOST IMPORTANTLY, time-varying (i.e., non-stationary) data. On good books: I especially liked that the "Deep Learning" book clearly documents how deep learners are not exactly suited to these kinds of problems, i.e., you have to jump through some hoops to make them work. On the other hand, LSTMs do beat traditional time series (ARIMA-class) models, but what most people don't get is that time series models are not exactly great forecasters to begin with. They are like the hammer you try on every problem/nail.

Try statistical MLs and some simple deep learners; in fact, there is no harm in benchmarking NNs as well. Choose the ones with the highest percentage of stable features across Dev, Validation, Test, and Out-of-Time (not necessarily the best performance).
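
One possible way to put a number on "stable features" is sketched below. This is my own assumption about how to measure it, not a standard metric: take each split's top-k features by importance and see how many show up in every split. The feature names and importance values are invented.

```python
def stable_fraction(importances_by_split, top_k):
    """Share of the top_k slots filled by the same features in every split."""
    top_sets = []
    for imp in importances_by_split.values():
        ranked = sorted(imp, key=imp.get, reverse=True)
        top_sets.append(set(ranked[:top_k]))
    common = set.intersection(*top_sets)
    return len(common) / top_k, sorted(common)

# Hypothetical importances for one model, one dict per data split.
model_a = {
    "dev":  {"mom": 0.50, "value": 0.30, "vol": 0.15, "size": 0.05},
    "val":  {"mom": 0.45, "value": 0.35, "vol": 0.08, "size": 0.12},
    "test": {"mom": 0.40, "value": 0.38, "vol": 0.12, "size": 0.10},
    "oot":  {"mom": 0.42, "value": 0.33, "vol": 0.14, "size": 0.11},
}
# A second hypothetical model whose top features change from split to split.
model_b = {
    "dev":  {"mom": 0.60, "value": 0.05, "vol": 0.30, "size": 0.05},
    "val":  {"mom": 0.10, "value": 0.40, "vol": 0.05, "size": 0.45},
    "test": {"mom": 0.35, "value": 0.15, "vol": 0.40, "size": 0.10},
    "oot":  {"mom": 0.20, "value": 0.45, "vol": 0.25, "size": 0.10},
}

frac_a, common_a = stable_fraction(model_a, top_k=2)
frac_b, common_b = stable_fraction(model_b, top_k=2)
print(frac_a, common_a)
print(frac_b, common_b)
```

If the OOT split is included here, this check belongs at the very end, as a diagnostic only and not as something you tune against.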

(2) Features: There's going to be non-stationarity in your features; it's a given. Additionally, be very, very careful about choosing unscaled ones. A lot of the time with MLs, the absolute level of the underlying is the embedded information being used to discriminate. E.g., for whatever reason, VIX in the late 90s rarely ever breached 30, and 20 used to be a local maximum. This largely held through 2007 and even early summer 2008, and then BAM, it went close to 90. Nowadays the VIX is at some of the lowest levels it has ever been. Using the raw level directly will potentially hurt your model's ability to perform "Out-of-Sample".
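
One common way to take the absolute level out of a feature is to replace it with its percentile rank inside a trailing window. The sketch below is just one option among several (rolling z-scores and ratios to a moving average are others), and the series is invented, not real VIX data. Only past values are used for each rank, so nothing leaks from the future.

```python
def trailing_percentile(values, window):
    """Rank each value against the previous `window` values only (no look-ahead)."""
    out = []
    for i, v in enumerate(values):
        if i < window:
            out.append(None)  # not enough history yet
            continue
        past = values[i - window:i]
        out.append(sum(1 for p in past if p <= v) / window)
    return out

# Made-up volatility-index levels with a regime jump in the middle.
levels = [12, 14, 13, 15, 30, 28, 80, 20]
scaled = trailing_percentile(levels, window=4)
print(scaled)
```

A level of 30 and a level of 80 both map to 1.0 here because each is extreme relative to its own recent history, which is the point: the model sees "unusually high for now" instead of a raw number tied to one regime.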

(3) Validation/Out-of-Sample Data: You need to ensure you have rolling cross-validation samples. Have a holdout that is a contiguous time period with decent representation of both bullish and bearish periods. And have an OOT, or Out-of-Time, sample. This is like post-discovery: it should be the LAST STEP, i.e., DO NOT LET INFO LEAK into the model from it. You judge your performance after all due diligence and check that Validation and OOT performance do not diverge too much (or maybe they diverge a bit, but consistently across methods; that would indicate unhandled non-stationarity, and sometimes you just have to accept it).
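
A rough sketch of how the splits could be laid out, assuming the data is indexed by period number 0..n-1 in time order. The function name, the window lengths, and the choice to step forward by one validation block are all my own assumptions. The last `oot_len` periods are fenced off first and never appear in any fold.

```python
def walk_forward_splits(n_periods, train_len, val_len, oot_len):
    """Rolling train/validation folds, plus a final out-of-time block kept apart."""
    usable = n_periods - oot_len  # everything before the OOT block
    folds = []
    start = 0
    while start + train_len + val_len <= usable:
        train = range(start, start + train_len)
        val = range(start + train_len, start + train_len + val_len)
        folds.append((train, val))
        start += val_len  # roll the window forward by one validation block
    oot = range(usable, n_periods)
    return folds, oot

folds, oot = walk_forward_splits(n_periods=20, train_len=8, val_len=3, oot_len=4)
for train, val in folds:
    print("train", list(train), "val", list(val))
print("oot", list(oot))
```

Each validation block sits strictly after its training block, and the OOT block sits after everything. Score it once, at the end.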

Most importantly, PLEASE DO NOT SACRIFICE YOUR EFFORTS on the "ALTAR OF BEST FIT". What fits best in the lookback typically won't survive out of time, i.e., post-discovery. Try to choose stable but decent performers.
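
As a toy illustration of "stable but decent" versus "best fit", the sketch below compares two selection rules. The candidate names and scores are invented, and using the worst validation fold as the stability measure is just one assumption; other criteria (spread across folds, dev-to-validation gap) would work as well. Note that the OOT sample is deliberately not used for choosing.

```python
# Hypothetical candidates: score on the dev (lookback) data and on each
# rolling validation fold. All numbers are made up.
candidates = {
    "best_fit": {"dev": 0.30, "val_folds": [0.02, 0.25, -0.05]},
    "steady":   {"dev": 0.14, "val_folds": [0.10, 0.12, 0.09]},
    "mediocre": {"dev": 0.06, "val_folds": [0.04, 0.05, 0.03]},
}

def pick_by_best_fit(cands):
    """Choose whatever fit the lookback best."""
    return max(cands, key=lambda name: cands[name]["dev"])

def pick_stable(cands):
    """Choose the candidate whose worst validation fold is the least bad."""
    return max(cands, key=lambda name: min(cands[name]["val_folds"]))

chosen_fit = pick_by_best_fit(candidates)
chosen_stable = pick_stable(candidates)
print(chosen_fit, chosen_stable)
```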

All the best!
