Investment Strategies / Mechanical Investing
No. of Recommendations: 7
As a long-term individual Mechanical Investor, I’m always interested in what the big guns are doing.
Currently on Kaggle there is a “Jane Street Real-Time Market Data Forecasting” competition going on.
The Overview: In this competition, hosted by Jane Street, you'll build a model using real-world data derived from production systems, which offers a glimpse into the daily challenges of successful trading. This challenge highlights the difficulties in modeling financial markets, including fat-tailed distributions, non-stationary time series, and sudden shifts in market behavior.
The prize is only $120,000, but there are 25,000 entrants and currently more than 5,000 submissions.
Jane Street says they “actively trade thousands of financial products each day across 200+ trading venues around the world. While this challenge only presents a tiny fraction of the quantitative problems Jane Streeters work on daily”
I was looking at the current leaders in the competition and noticed that the top 5 are from China or Hong Kong and none of the top 10 are from the US. The US may currently be leading in the design of the hardware but software has no borders and software has most of the money.
No. of Recommendations: 0
Did you look at the scores? They are all in the 1.8%-ish range.
Best
No. of Recommendations: 8
Anchak: Did you look at the scores? They are all in the 1.8%-ish range.
Did you look at how the score is computed?
Submissions are evaluated on a scoring function defined as the sample-weighted zero-mean R-squared score (R^2) of responder_6. The formula is given by:
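Going by that description (this is my reconstruction from the wording, not a copy of the competition page), with y_i the actual responder_6 value, \hat{y}_i the submitted prediction and w_i the sample weight, a sample-weighted zero-mean R^2 would be written as:

```latex
R^2 = 1 - \frac{\sum_i w_i \,(y_i - \hat{y}_i)^2}{\sum_i w_i \, y_i^2}
```

"Zero-mean" here means the denominator uses y_i^2 directly rather than (y_i - \bar{y})^2, so a submission that always predicts zero scores exactly 0.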
The formula essentially measures the weighted squared difference between the contestant's prediction and the actual value of responder_6, relative to the weighted squared size of the actual values.
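For anyone who wants to play with the metric, here is a minimal Python sketch of a sample-weighted zero-mean R^2. The function name and the toy numbers are my own placeholders, not anything from the competition, and I'm assuming the usual form 1 - sum(w*(y - pred)^2) / sum(w*y^2).

```python
def weighted_zero_mean_r2(y_true, y_pred, weights):
    """Sample-weighted zero-mean R^2: 1 - sum(w*(y-p)^2) / sum(w*y^2)."""
    # Weighted squared error of the predictions
    num = sum(w * (y - p) ** 2 for y, p, w in zip(y_true, y_pred, weights))
    # Weighted squared size of the targets (no mean subtracted)
    den = sum(w * y ** 2 for y, w in zip(y_true, weights))
    return 1.0 - num / den


# Made-up targets and weights, purely for illustration
y_true = [0.5, -1.0, 0.2, 1.5]
weights = [1.0, 2.0, 1.0, 0.5]

score_zero = weighted_zero_mean_r2(y_true, [0.0] * 4, weights)      # always predict 0
score_perfect = weighted_zero_mean_r2(y_true, list(y_true), weights)  # predict exactly
shrunk = [0.1 * y for y in y_true]                                  # right sign, 10% of the size
score_shrunk = weighted_zero_mean_r2(y_true, shrunk, weights)

print(score_zero, score_perfect, round(score_shrunk, 4))
```

In this toy case, predicting zero everywhere scores 0, a perfect prediction scores 1, and a prediction that gets every direction right but only 10% of the magnitude scores about 0.19.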
I’ve been running ML predictions for more than a year now, and in the very noisy financial data environment the R^2 for predictions over the entire universe looks very poor. But when you look at the average returns over the top few percent of the predictions, the results are profitable, and if you add hysteresis between the buy and sell rules you even come up with very good Sharpe ratios.
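To make the buy/sell hysteresis idea concrete, here is a rough Python sketch. It is not my actual model: the function name, the tickers and the 5% / 20% thresholds are all hypothetical, and I'm assuming each period gives a percentile rank per stock where 0.0 is the best prediction. A stock is bought when it ranks in the top slice and is only sold once it falls past a looser threshold.

```python
def hysteresis_positions(rank_pct_by_period, buy_below=0.05, sell_above=0.20):
    """Track holdings with separate buy and sell thresholds.

    rank_pct_by_period: list of dicts {ticker: percentile rank}, 0.0 = best.
    Returns a list with the set of held tickers after each period.
    """
    held = set()
    history = []
    for ranks in rank_pct_by_period:
        # Sell rule: drop anything unranked or ranked worse than sell_above
        held = {t for t in held if t in ranks and ranks[t] <= sell_above}
        # Buy rule: add anything ranked within the top buy_below slice
        held |= {t for t, r in ranks.items() if r <= buy_below}
        history.append(set(held))
    return history


# Hypothetical ranks for three periods
periods = [
    {"AAA": 0.02, "BBB": 0.10, "CCC": 0.50},
    {"AAA": 0.12, "BBB": 0.04, "CCC": 0.60},
    {"AAA": 0.30, "BBB": 0.15, "CCC": 0.03},
]
holdings = hysteresis_positions(periods)
for period_number, names in enumerate(holdings, start=1):
    print(period_number, sorted(names))
```

AAA is bought in period 1, survives period 2 even though it has slipped out of the top 5%, and is only sold in period 3 when it drops past 20%. The gap between the two thresholds is what cuts down on churn.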
No. of Recommendations: 2
RamC: You are correct!
A lot of the returns in the middle are noise and typically banded around a Zero-Mean.
One of the approaches I have tried in these cases is to ignore the data in the middle (i.e., thresholding, or hysteresis as you call it) and then use a two-group classifier. Now, a lot will depend on the level of the threshold (it also needn't be symmetric) and, consequently, the model efficacy.
In fact, based on Jim's pointers, his Bottom Detector is of this class, i.e., identifying market extremes at a market sell-off. Is it enough or not: binary.
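A minimal Python sketch of the labelling step, assuming returns are simple fractional returns. The function name and the -2% / +3% cutoffs are made up for illustration (deliberately asymmetric); the classifier itself is left out.

```python
def label_extremes(returns, lower=-0.02, upper=0.03):
    """Keep only the tails: label 1 if return >= upper, 0 if return <= lower.

    Observations strictly between the two cutoffs are dropped as noise.
    Returns a list of (original_index, label) pairs.
    """
    labeled = []
    for i, r in enumerate(returns):
        if r >= upper:
            labeled.append((i, 1))
        elif r <= lower:
            labeled.append((i, 0))
    return labeled


# Hypothetical returns
sample_returns = [0.05, 0.01, -0.03, 0.0, -0.01, 0.04]
labels = label_extremes(sample_returns)
print(labels)
```

Only three of the six observations survive here. Widening the band gives cleaner labels but fewer training rows, which is the trade-off the threshold level controls.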
Best
No. of Recommendations: 5
Anchak: A lot of the returns in the middle are noise and typically banded around a Zero-Mean.
I was normalizing all my NaNs (missing values), which distorted the middle estimations, but setting them to a negative value improved my results. I’m still in learning mode, and probably will be forever, but my current model has drawdowns approximately the same as the overall universe. However, the periods coming off bottoms have historically produced the best gains. Perhaps those are the periods with the most mispricing. My model hasn’t done well the last few months; it hasn’t adapted to the current uncertainty.
The Kaggle contest was focused on very short-term mispricing, significantly different from my goal of finding stocks that will have at least 3 months, and ideally a year, of above-average return. But it is still informative, as the overall market environment is similar.
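On the missing-value point, a small Python sketch of the difference. I'm assuming here that "normalizing" means something like filling with the mean of the values that are present, and that the feature is non-negative so a negative sentinel cannot collide with real data. The function names and the -1.0 sentinel are hypothetical.

```python
import math


def is_missing(v):
    """True for None or a float NaN."""
    return v is None or (isinstance(v, float) and math.isnan(v))


def fill_with_mean(values):
    """Replace missing entries with the mean of the present ones."""
    present = [v for v in values if not is_missing(v)]
    mean = sum(present) / len(present)
    return [mean if is_missing(v) else v for v in values]


def fill_with_sentinel(values, sentinel=-1.0):
    """Replace missing entries with a value outside the feature's normal range."""
    return [sentinel if is_missing(v) else v for v in values]


# Hypothetical non-negative feature with two gaps
feature = [0.25, float("nan"), 0.75, None]
mean_filled = fill_with_mean(feature)
sentinel_filled = fill_with_sentinel(feature)
print(mean_filled)
print(sentinel_filled)
```

With the mean fill, the gaps land right in the middle of the distribution and look like ordinary observations. With the sentinel, a tree-based model can split them off as their own group.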