Post #1128 by Aussi on the Mechanical Investing board

No. of Recommendations: 7

Following are two posts by Robbie about P123 compared to GTR1. (Reference Datahelper)

I'm splitting this topic off from http://boards.fool.com/Message.aspx?mid=32145718 as a separate thread because it has nothing to do with the main topic there. People are free to discuss whatever they want here, but I'll be focusing on a few anomalies in the Portfolio123 database that I saw when a couple of its users posted some test results on the GTR1 Helper mailing list a year ago. Since I'm not a customer, I don't anticipate having much to contribute after that, other than describing how I would go about validating Portfolio123 (or any backtester) against benchmarks, as described in the post I just linked to.

Universe Composition

The first thing I would do is determine exactly what the P123 universe consists of from the non-OTC stock market. We did this almost exactly one year ago on the GTR1_Helper mailing list and found the following:

1. The P123 non-OTC universe as of 3/6/2015 included only nine stocks that were not part of the GTR1 universe on the same day. Five of them were in fact OTC, three of them were ADRs representing preferred stock (which P123 appeared to generally exclude), and one was a unit bundling common stock with warrants (which they also appeared to generally exclude), where the common stock would qualify for the GTR1 universe only once they were unbundled. So the non-OTC P123 universe was essentially a subset of the GTR1 universe, with the tiny handful of exceptions appearing to be the result of security mistyping or exchange listing information that was out-of-date.

2. The P123 universe excluded absolutely all ETFs and CEFs, i.e., stocks with security type codes 14, 15, 44, 45, 73, 74 and 75 in the GTR1 field file styp.a.

3. 53 stocks excluded from the P123 universe were issues of companies with another share class that was included in the P123 universe. It would be helpful to know how consistent P123 was in including only a single share class for each company, but I don't appear to have determined that. But it appears that this was their intent, and if they are simply defining their universe according to Capital IQ Snapshot coverage, then that would be the result.

4. Coincidentally, there were also 53 stocks excluded from P123 for unapparent reasons.
-- Exactly one of these was a REIT, and thus the only exception to P123 generally including all REITs.
-- Five were ADRs, and thus exceptions to P123 generally including all ADRs representing common stock.
-- Twelve were unit trusts, a.k.a. "K-1 stocks" of various types (Limited Partnerships, LLCs, Royalty Trusts, etc). Since they included the vast majority of unit trusts, these once again appeared to be the exceptions to generally including them all.
-- The remaining 35 missing stocks were ordinary common shares. Many of them were shell corporations aiming to take private companies public through acquisition, but this did not disqualify other stocks from inclusion in the P123 universe.

The bottom line: One could expect the non-OTC P123 universe to consist of all non-ETF/CEF stocks in the GTR1 universe, excluding redundant share classes, and perhaps excluding a few dozen stocks for unknown reasons.

This conclusion more than likely still holds today, but here is a URL that customers can use to test it against the current P123 universe: http://gtr1.backtest.org/2013/?lf-1lp-1h1::styp.a:...... Just click "Run Screener" and download the spreadsheet report.
-- The field labeled "MultiClass" indicates whether a stock is one among more than one share class for the same company (1 = yes, 0 = no).
-- The field labeled class.a gives the share class number that the GTR1 backtester assigns to each stock. A stock with a share class number of 1 is the primary class identified by the backtester, which is likely, but not always, the share class included by P123.
-- The field styp.a gives security type as a code explained by borisnand here http://boards.fool.com/share-type-styp-counts-3190... .
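
For customers who download that report, the comparison in points 1 through 4 above amounts to a set difference followed by some bucketing. Here is a minimal Python sketch of that idea. The tickers, the tuple layout, the placeholder type code 10 and the function name diff_universes are all hypothetical; only the ETF/CEF type codes come from the list above.

```python
# Sketch only: tickers and data layout are invented for illustration.
ETF_CEF_CODES = {14, 15, 44, 45, 73, 74, 75}  # styp.a codes named in the post


def diff_universes(gtr1, p123):
    """Compare two universes.

    gtr1: dict mapping ticker -> (styp code, MultiClass flag)
    p123: set of tickers
    Returns (only_in_p123, missing), where missing buckets the GTR1 stocks
    absent from P123 into 'etf_cef', 'multi_class' and 'unexplained'.
    """
    gtr1_tickers = set(gtr1)
    only_in_p123 = sorted(p123 - gtr1_tickers)
    missing = {"etf_cef": [], "multi_class": [], "unexplained": []}
    for ticker in sorted(gtr1_tickers - p123):
        styp, multi_class = gtr1[ticker]
        if styp in ETF_CEF_CODES:
            missing["etf_cef"].append(ticker)
        elif multi_class:
            missing["multi_class"].append(ticker)
        else:
            missing["unexplained"].append(ticker)
    return only_in_p123, missing


# Hypothetical toy data; type code 10 stands in for "not an ETF/CEF".
gtr1 = {
    "AAA": (10, 0),
    "BBB.A": (10, 1),
    "BBB.B": (10, 1),
    "ETF1": (73, 0),
    "CCC": (10, 0),
}
p123 = {"AAA", "BBB.A", "OTCX"}

only_in_p123, missing = diff_universes(gtr1, p123)
print(only_in_p123)
print(missing)
```

Anything left in the 'unexplained' bucket is what would be worth asking P123 about.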

Historical Universe Size

After determining the rules for current universe composition, the next thing I would check for is that they were consistently applied in the past. Since I don't think P123 allows historical screening, all we can do to determine this is to count non-OTC stocks over the course of a P123 backtest from universe inception and compare the counts to what the GTR1 backtester predicts. To get the GTR1 stock counts, go to http://gtr1.backtest.org/2013/?lf-1lp-1::styp.a:ne...... and click "Count Stocks". This screen selects the primary active share class for each company among all non-CEF/ETF stocks. While the primary share class may not match what P123 includes, that shouldn't matter for the sake of counting.

What we found a year ago was that at the beginning of the backtest on 1/4/1999, P123 was missing 1,031 non-ETF/CEF stocks, but this steadily diminished to 441 missing stocks on 6/9/2003. Then, on 6/16/2003, the number of missing stocks abruptly dropped to just 25. The discontinuity was due almost entirely to a jump from 5,583 to 5,995 in the number of stocks in the P123 non-OTC universe, whereas the GTR1 count was essentially unchanged from 6,024 to 6,020.

Such a pattern, where stocks are missing from the beginning of a backtest but the number of missing stocks gradually decreases over time, could signify some survivorship bias. What I think is quite possible is that the P123 database is based on some source other than Compustat/S&P Capital IQ before 6/16/2003, because the abrupt change in the number of stocks does not appear in my 2010 vintage Compustat Point-In-Time database. The count of CPiT stocks in the GTR1 database can be obtained from http://gtr1.backtest.org/2013/?lf-1lp-1::cprc:gt0 . A year ago, P123 was missing 682 CPiT stocks on 1/4/1999, which gradually diminished to 393 missing CPiT stocks on 6/9/2003, and then abruptly fell to -24 (meaning P123 somehow contained 24 more stocks than were actually active in CPiT) on 6/16/2003. But there is no need to speculate--customers should be able to get an explanation for this anomaly from Marco Salerno.
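
The discontinuity stands out once the missing-stock counts are expressed as a rate of change per week. The sketch below uses only the counts quoted above; the 50-stocks-per-week cutoff and the function name are assumptions chosen for illustration.

```python
from datetime import date

# Non-OTC stock counts quoted in the post for the two weeks around the jump.
counts = {
    date(2003, 6, 9): {"gtr1": 6024, "p123": 5583},
    date(2003, 6, 16): {"gtr1": 6020, "p123": 5995},
}
missing = {d: c["gtr1"] - c["p123"] for d, c in counts.items()}
# For the start of the backtest the post gives only the difference.
missing[date(1999, 1, 4)] = 1031


def weekly_rate_of_change(series):
    """Return (start, end, change per week) for consecutive observations."""
    dates = sorted(series)
    rates = []
    for start, end in zip(dates, dates[1:]):
        weeks = (end - start).days / 7
        rates.append((start, end, (series[end] - series[start]) / weeks))
    return rates


THRESHOLD = 50  # assumed cutoff in stocks per week, not from the post
rates = weekly_rate_of_change(missing)
jumps = [r for r in rates if abs(r[2]) > THRESHOLD]

for start, end, rate in rates:
    print(start, end, round(rate, 2))
print("abrupt changes:", jumps)
```

With these inputs the 1999 to 2003 stretch averages a decline of a few stocks per week, while the single week ending 6/16/2003 accounts for a drop of 416.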

Missing Returns

The particular P123 backtest that was done for me was simply the entire non-OTC universe with the minimum holding period of one week. That backtest, oddly, was missing forward 1-week returns for the following dates:

20000703
20010910
20011224
20011231
20060703
20071224
20071231
20121224
20121231

The stock market appears to have been closed on all of these days, and perhaps the prior week's forward returns are actually two-week forward returns. I would imagine that any P123 customer would be able to confirm this.
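
One cheap first check on these dates is which weekday each falls on, since a weekly backtest normally steps from one fixed weekday to the next. The sketch below only does calendar arithmetic. It says nothing about whether the exchange was actually open on a given date, which would need a market calendar. The helper name weekday_name is hypothetical.

```python
from datetime import datetime

# Dates with missing forward 1-week returns, as listed in the post.
MISSING_DATES = [
    "20000703", "20010910", "20011224", "20011231", "20060703",
    "20071224", "20071231", "20121224", "20121231",
]

WEEKDAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def weekday_name(yyyymmdd):
    """Return the English weekday name for a YYYYMMDD date string."""
    return WEEKDAY_NAMES[datetime.strptime(yyyymmdd, "%Y%m%d").weekday()]


weekdays = {d: weekday_name(d) for d in MISSING_DATES}
for d, name in weekdays.items():
    print(d, name)
```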

Implausible Returns

It appears that a separate backtest of the entire non-OTC universe held in equal weight with a holding period of one market day was also done for me. In this backtest, I noticed two daily returns that were totally implausible: A 1-day forward return of 118.7816% from the close of market day 1/30/2015, and a 1-day forward return of 10.1288% from the close of 10/22/2002. One of the P123 customers can let us know whether either of these errors has been corrected.
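
A screen for this kind of error can be as simple as flagging any 1-day return of the equal-weighted universe beyond some cutoff. In the sketch below the two large values are the ones quoted above; the other entries are invented ordinary days for contrast, and the 8% cutoff is an assumption, not a tested rule.

```python
def flag_implausible(daily_returns_pct, threshold_pct):
    """Return the dates whose absolute 1-day return exceeds threshold_pct."""
    return sorted(
        d for d, r in daily_returns_pct.items() if abs(r) > threshold_pct
    )


returns = {
    "20021022": 10.1288,   # quoted in the post
    "20150130": 118.7816,  # quoted in the post
    "20021023": 0.4,       # hypothetical ordinary day
    "20150202": -0.9,      # hypothetical ordinary day
}

THRESHOLD_PCT = 8.0  # assumed cutoff for an equal-weighted broad universe
suspects = flag_implausible(returns, THRESHOLD_PCT)
print(suspects)
```

Dates flagged this way still need to be checked by hand against another data source.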

This is enough for today (and there isn't much more I can do without further assistance), but it will be interesting to see how far this discussion goes.

Robbie Geary

And then this

Rayvt:
I would question the validity of RSI(2).

How can RSI(2) be invalid? It is just a simple computation based on total returns over the last two market days. Any backtester should be able to calculate it correctly and report accurate results for screens that use it to select stocks. If Portfolio123's results are not credible, then I think it's the backtester or data that is suspect, not the RSI(2) itself.

I suspect it is overtuned and/or on a hair-trigger.

I don't see how you could possibly claim that the strategy Randy backtested is over-tuned. RSI(2) is just the simplest non-degenerate case of an already simple formula.

However, it should be noted that Portfolio123's documentation on RSI (http://www.portfolio123.com/doc/doc_detail.jsp?fac......) is far from clear on what kind of RSI it is, Wilder's or Cutler's. On the one hand, it claims that it is Wilder's RSI. But in the description of the calculation, it states that RSI(N) uses averages (not exponential moving averages, as used by Wilder's RSI) over exactly N bars, which is Cutler's RSI. But then later the documentation talks about a trade-off between accuracy and performance, which is an issue with exponential moving averages. Cutler's RSI should not be a performance drag, unless the documentation was written twenty years ago. Perhaps it was one kind of RSI in the past and then changed to the other, and only part of the documentation was updated.

One way to resolve this: Does the RSI function require N to be a whole number? If not (and you get different results as you vary it in fractional increments), then it is Wilder's RSI, not Cutler's RSI. The converse might not necessarily be true, i.e., it might be Wilder's RSI after all but be arbitrarily limited to whole number values of N.
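
To make the distinction concrete, here is a minimal Python sketch of the Cutler-style calculation as described above: plain averages of gains and losses over exactly N bars. It is not Portfolio123's or GTR1's code. The function name cutler_rsi, the use of daily total returns as input, and the value 50 for a window with no movement are assumptions.

```python
def cutler_rsi(returns, n):
    """Cutler-style RSI from the last n bars of a return series.

    Uses simple (not exponential) averages of gains and losses, so n must
    be a whole number of bars.
    """
    if not isinstance(n, int) or n < 1:
        raise ValueError("n must be a positive whole number of bars")
    if len(returns) < n:
        raise ValueError("need at least n bars")
    window = returns[-n:]
    avg_gain = sum(max(r, 0.0) for r in window) / n
    avg_loss = sum(max(-r, 0.0) for r in window) / n
    total = avg_gain + avg_loss
    if total == 0.0:
        return 50.0  # assumed convention for a completely flat window
    # Equivalent to 100 - 100 / (1 + avg_gain / avg_loss)
    return 100.0 * avg_gain / total


print(cutler_rsi([0.01, 0.02], 2))    # two up days
print(cutler_rsi([-0.01, -0.02], 2))  # two down days
print(cutler_rsi([0.01, -0.03], 2))   # mixed
```

Because the window is a whole number of bars, a fractional N has no meaning in this version, which is the basis of the test suggested above.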

If we can pin down what this thing is, then we have a good opportunity for a direct comparison between the P123 and GTR1 backtesters. Nothing can be more straightforward than historical Dow Jones Industrial membership, and once we know what kind of RSI P123 uses, that part should be straightforward as well. The fact that the strategy has high turnover and we are not accounting for commissions, spreads, slippage etc is irrelevant as far as a backtester's computational accuracy is concerned. Both backtesters should be comparable on a frictionless basis.

In the meantime, the GTR1 backtester can calculate both kinds of RSI, as well as a third kind called Connor's RSI. Here is the GTR1 backtest of Randy's RSI(2) <= 20 variant (the one with the highest Sharpe ratio) using Cutler's RSI(2) with a lag of one market day:

DJI Member, rsi(1,2) <= 20, 5-day Hold
19981231 to 20160606
Metric             Avg      Min      Max       SD
CAGR:             9.95     7.77    13.52     2.04
TR:             448.89   267.49   807.53   195.03
GSD(20):         26.94    23.33    31.53     2.64
GDD(20; 0%):     16.64    14.24    20.57     2.31
GDDD3:           15.97     9.63    23.89     4.88
MDD:            -61.77   -79.30   -48.84    11.44
UI(20):          20.39    11.65    31.75     6.92
Sharpe(20):       0.44     0.36     0.54     0.07
Beta(20):         1.01     0.91     1.11     0.08
TI(20):          10.35     9.20    12.69     1.37
AT:              38.75    38.39    39.04     0.21
(1) http://gtr1.net/2013/?s19981231h5::dji.a:et1:rsi%2......

And with a lag of zero market days:

DJI Member, rsi(0,2) <= 20, 5-day Hold
19981231 to 20160606
Metric             Avg      Min      Max       SD
CAGR:            10.27     8.38    11.45     1.37
TR:             459.53   305.46   558.85   113.79
GSD(20):         27.53    25.11    29.36     1.49
GDD(20; 0%):     16.65    14.93    18.09     1.19
GDDD3:           16.23    11.95    19.80     2.91
MDD:            -66.00   -73.89   -56.97     6.96
UI(20):          22.24    15.32    30.29     5.54
Sharpe(20):       0.44     0.38     0.51     0.05
Beta(20):         1.09     0.99     1.19     0.08
TI(20):          10.07     8.57    11.30     1.01
AT:              38.75    38.40    39.03     0.20
(2) http://gtr1.net/2013/?s19981231h5::dji.a:et1:rsi%2......

I know some users of weekly-data backtesters are believers in the magic of Monday trading (which becomes evident in the posts here when rebel2011 or Bill2m are running late), so here's a backtest of a single cycle that calculates Cutler's RSI(2) through Friday's close (or the last market day of the week) and trades at Monday's open (or the open of the first market day of the week):

DJI Member, rsi(1,2) <= 20, 5-day Hold
19981231 to 20160606
Metric             Avg      Min      Max       SD
CAGR:            11.56    11.56    11.56     0.00
TR:             570.64   570.64   570.64     0.00
GSD(20):         22.88    22.88    22.88     0.00
GDD(20; 0%):     15.65    15.65    15.65     0.00
GDDD3:           13.82    13.82    13.82     0.00
MDD:            -62.49   -62.49   -62.49     0.00
UI(20):          17.99    17.99    17.99     0.00
Sharpe(20):       0.54     0.54     0.54     0.00
Beta(20):         0.94     0.94     0.94     0.00
TI(20):          11.99    11.99    11.99     0.00
AT:              39.55    39.55    39.55     0.00
(3) http://gtr1.net/2013/?s19981231o::dji.a:et1:rsi%28......
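
The TR and CAGR rows of the single-cycle report (3) can be sanity-checked against each other with a few lines of Python. This is only a sketch: it assumes plain calendar-day annualization with 365.25 days per year, which may not be exactly how the GTR1 backtester annualizes, so a small difference from the reported 11.56 would not be surprising.

```python
from datetime import date


def cagr_from_total_return(tr_pct, start, end, days_per_year=365.25):
    """Annualize a cumulative total return (in percent) over [start, end]."""
    years = (end - start).days / days_per_year
    growth = 1.0 + tr_pct / 100.0
    return (growth ** (1.0 / years) - 1.0) * 100.0


# TR and report dates taken from backtest (3) above.
cagr = cagr_from_total_return(570.64, date(1998, 12, 31), date(2016, 6, 6))
print(round(cagr, 2))
```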

Indeed it does seem that the CAGR of 20.7 and Sharpe of 0.93 that P123 reports are way outside the range of random variation across GTR1 trading cycles. Either (a) there's a bug in the GTR1 rsi function, (b) P123 is buggy or has data problems, or (c) P123 uses a better version of RSI than Cutler's RSI that isn't described well by the documentation.

Regarding (a), it wouldn't shock me if a little-used function like rsi were buggy. If so, this post should bring the bugs to light and I'll have them fixed the same day they're reported. You can get the RSI(2) values for current DJI members with http://gtr1.net/2013/?s19981231h5::dji.a:et1:rsi%2...... by clicking "Run Screener"--no screening password is required. You can remove the "dji.a = 1" step and get current RSI(2) for all stocks in the GTR1 universe. To examine rsi(1,2) for a specific Yahoo! ticker symbol over time, use http://gtr1.net/2013/?h1::rsi%281,2%29al0%7bU:%7bS... , which is for SPY. Select "Signal Values" for the downloadable report, click "Run Backtest" and when it completes, download the spreadsheet report, open it and scroll down to "Daily Signal Values". As always, pay careful attention to the effective lags reported in the Command Translation. Also note that for Cutler's RSI(2), values of 0 and 100 are common, because all it takes to get 100 is two up days in a row, and all it takes to get 0 is two down days in a row.

(b) wouldn't shock me either, but I am surprised at how little interest P123 users have in this possibility. There used to be a user on another MI forum who would post P123 backtest results for weekly-traded strategies with CAGRs in the hundreds of percent, even with OTC exclusions and what seemed like reasonable liquidity requirements. Needless to say, I could never get the GTR1 backtester to report CAGRs anywhere near that high. As a result, I'm generally suspicious of P123 backtests for trading more frequently than monthly. Frequently-traded strategies that exploit short-term price reversals act as error magnets for any daily stock price database. All price errors, and all mis-dated dividends, present false arbitrage opportunities to backtesters. For example, consider what happens if a special dividend is incorrectly ex-dated one market day after the actual ex-dividend date. On the actual ex-dividend date, the stock appears to take a big drop, which would put it at the top of a screen using RSI(2) in a contrarian way. If the backtest picks this stock, then the next day, the portfolio gets credit for the special dividend, and also the stock appears to have popped, putting its RSI(2) at a sell level. But simple price errors would be the most common, and it would take tons of them to inflate CAGRs for frequently-traded contrarian trading strategies.
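
The mis-dated special dividend case can be worked through with made-up numbers. In the sketch below a hypothetical $100 stock pays a $10 special dividend and its price falls by exactly that amount on the true ex-date; the only difference between the two runs is the day on which the database credits the dividend. All prices and the function name are assumptions for illustration.

```python
def total_returns(closes, dividends):
    """Daily total returns, where dividends[i] is cash credited on day i."""
    return [
        (closes[i] + dividends[i]) / closes[i - 1] - 1.0
        for i in range(1, len(closes))
    ]


closes = [100.0, 90.0, 90.0]   # hypothetical: true ex-date is day 1
correct = [0.0, 10.0, 0.0]     # dividend credited on the true ex-date
misdated = [0.0, 0.0, 10.0]    # dividend credited one market day late

print([round(r, 4) for r in total_returns(closes, correct)])
print([round(r, 4) for r in total_returns(closes, misdated)])
```

With the correct ex-date both daily returns are zero. With the late ex-date the database shows a 10% drop followed by an 11.1% gain, which is exactly the pattern a contrarian RSI(2) screen is built to buy.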

As for (c), it's possible P123 is really using Wilder's RSI but that the documentation's description of the computation is just incorrect. The GTR1 backtester doesn't have a built-in Wilder's RSI function, but since the results are so far from P123's, I did the work of building Wilder's RSI (RSIW) out of more elementary field functions. Since it uses the time-series function tsema (exponential moving average), it's imperative that you use the field function importf to import the RSIW calculation into your backtest, rather than attempt to build your screen on top of the screen that calculates RSIW, or calculate RSIW within the screen that uses it.
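
For readers who want to see what an EMA-based RSI looks like outside the backtester, here is a minimal Python sketch. It is not the GTR1 field-function construction and not Portfolio123's code. The function names, the seeding of the EMAs with the first bar, the sample returns and the value 50 for a flat series are all assumptions. The weight on the most recent bar is passed in directly, and the 2/(N + 1) mapping from a lookback N is the convention this post uses.

```python
def ema_rsi(returns, alpha):
    """RSI built from exponential moving averages of gains and losses.

    alpha is the weight given to the most recent bar. The EMAs are seeded
    with the first bar, which is only one of several possible conventions.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not returns:
        raise ValueError("need at least one bar")
    avg_gain = max(returns[0], 0.0)
    avg_loss = max(-returns[0], 0.0)
    for r in returns[1:]:
        avg_gain = alpha * max(r, 0.0) + (1.0 - alpha) * avg_gain
        avg_loss = alpha * max(-r, 0.0) + (1.0 - alpha) * avg_loss
    total = avg_gain + avg_loss
    if total == 0.0:
        return 50.0  # assumed convention for a completely flat series
    return 100.0 * avg_gain / total


def weight_from_lookback(n):
    """Weight on the most recent bar for an EMA with lookback n."""
    return 2.0 / (n + 1.0)


returns = [0.01, -0.02, 0.015, -0.005, 0.01]  # hypothetical daily returns
wrsi3 = ema_rsi(returns, weight_from_lookback(3))       # weight 0.5
wrsi2 = ema_rsi(returns, weight_from_lookback(2))       # weight 2/3
wrsi_frac = ema_rsi(returns, weight_from_lookback(2.5))  # fractional lookback
print(round(wrsi3, 2), round(wrsi2, 2), round(wrsi_frac, 2))
```

Unlike the simple-average version, nothing here requires the lookback to be a whole number, and a fractional lookback gives a different result from its whole-number neighbors.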

Here's another backtest of (1) that uses WRSI3 in place of rsi(1,2):

DJI Member, WRSI3 (lag 1) <= 20, 5-day Hold
19981231 to 20160606
Metric             Avg      Min      Max       SD
CAGR:            12.95     8.90    14.90     2.10
TR:             768.59   340.63  1020.86   229.31
GSD(20):         28.19    23.07    41.35     6.79
GDD(20; 0%):     18.14    11.88    34.16     8.17
GDDD3:           15.64     6.25    36.44    10.67
MDD:            -56.10   -87.39   -34.69    17.52
UI(20):          19.65     7.61    47.26    14.34
Sharpe(20):       0.57     0.42     0.64     0.08
Beta(20):         0.91     0.70     1.06     0.15
TI(20):          15.14    11.22    18.35     2.52
AT:              38.91    38.16    39.65     0.49
(4) http://gtr1.net/2013/?s19981231h5::dji.a:et1:dspo%......

You're probably wondering why I used "WRSI3" instead of "WRSI2". The reason is that if Portfolio123 is using Wilder's RSI, then it is very likely that RSI(2) is actually 3-day Wilder's RSI. It is a common mistake (and I make a presumption of guilt) to assume that an "N-bar" EMA (which of course uses infinitely many bars, not N bars) uses 1/N for the weight of the most recent bar. Actually, the correct weight for the most recent bar is 2/(N + 1). The reason EMA(9), EMA(19), EMA(39) etc are so common in technical analysis is not that technical analysts love the number 9, but that these EMAs, when calculated correctly, use nice round weights of 0.2, 0.1 and 0.05, respectively. Likewise, WRSI(3) uses a weight of 2/(3 + 1) = 0.5 for the most recent bar in its EMAs.

The issue is not just a semantic one. The "lookback" for a correctly defined EMA has a precise mathematical property, namely, that for EMA(N), the most recent N terms capture at least 1 - e^-2 ~= 86.47% of the total weight of the infinite sum. And of course since the tail is itself another EMA, the property can be re-applied to the same infinite sum indefinitely: Another N terms always captures at least 86.47% of what's left of the infinite sum. This property is important for converting an EMA of one bar granularity (e.g., daily) to another (e.g., weekly). For example, a weekly EMA with 1/3 weight applied to the current week does not behave like a daily EMA with 1/(5 * 3) = 1/15 as the weight applied to the current daily bar. Instead, the 1/3-weight weekly EMA has a "lookback" of 5 weeks (because 2/(5 + 1) = 1/3), which is roughly 25 market days. Thus the correct current weight for the corresponding daily EMA would be 2/(25 + 1) = 1/13.
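
The arithmetic in this paragraph is easy to verify numerically. The Python sketch below computes the 2/(N + 1) weight, the share of total weight carried by the most recent N bars (the weights are a geometric series, so the share is 1 - (1 - weight)^N), and the weekly-to-daily conversion. The function names and the 5-market-days-per-week default are assumptions.

```python
import math


def ema_weight(lookback):
    """Weight on the most recent bar for an EMA with the given lookback."""
    return 2.0 / (lookback + 1.0)


def lookback_from_weight(weight):
    """Inverse of ema_weight."""
    return 2.0 / weight - 1.0


def captured_weight(lookback):
    """Share of total EMA weight carried by the most recent `lookback` bars.

    Bar k (0 = most recent) has weight w * (1 - w)**k, so the first N bars
    sum to 1 - (1 - w)**N. `lookback` is taken to be a whole number here.
    """
    w = ema_weight(lookback)
    return 1.0 - (1.0 - w) ** lookback


def weekly_to_daily_weight(weekly_weight, days_per_week=5):
    """Daily EMA weight matching the lookback of a weekly EMA."""
    weeks = lookback_from_weight(weekly_weight)
    return ema_weight(weeks * days_per_week)


floor = 1.0 - math.exp(-2.0)
for n in (9, 19, 39):
    print(n, round(ema_weight(n), 4), round(captured_weight(n), 4))
print("floor:", round(floor, 4))
print("daily weight for 1/3 weekly:", round(weekly_to_daily_weight(1.0 / 3.0), 6))
```

The captured share approaches the 1 - e^-2 floor from above as N grows, and the converted daily weight comes out at 1/13 rather than 1/15.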

This property also provides a way to define a correspondence between SMAs and EMAs. Since an SMA gives 86.47% weight to the most recent N * 0.8647 bars, the EMA with the same property has a "lookback" of N * 0.8647. This means, for example, that Cutler's RSI(50) should behave most similarly to Wilder's RSI(43).
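
As a quick check of that correspondence, the sketch below scales an SMA length by 1 - e^-2 and rounds to the nearest whole bar. The function name is hypothetical, and rounding to a whole number is an assumption made only so the result can be compared with the RSI(43) figure above.

```python
import math

CAPTURE = 1.0 - math.exp(-2.0)  # about 0.8647


def ema_lookback_for_sma(n_sma):
    """EMA lookback whose weight profile roughly matches an n_sma-bar SMA."""
    return n_sma * CAPTURE


equivalent = ema_lookback_for_sma(50)
print(round(equivalent, 2), round(equivalent))
```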

Anyway, you can check the backtester's current WRSI(3) calculations with "Run Screener". To examine the backtester's WRSI(3) calculation for a specific Yahoo! ticker symbol over time, use a URL such as http://gtr1.net/2013/?lf-1lp-1h1::WRSI3:al0:WRSI3:...... and download daily signal values as already described.

In case P123 does not make the mistake just mentioned, here are the backtest results for actual WRSI(2), which uses a weight of 2/(2 + 1) = 0.666... for the latest bar:

DJI Member, WRSI2 (lag 1) <= 20, 5-day Hold
19981231 to 20160606
Metric             Avg      Min      Max       SD
CAGR:            13.39    12.00    14.44     0.86
TR:             797.44   618.40   944.38   115.26
GSD(20):         25.37    22.67    32.20     3.46
GDD(20; 0%):     16.00    14.11    20.98     2.59
GDDD3:           14.00     9.63    24.41     5.29
MDD:            -58.70   -79.78   -50.63    10.80
UI(20):          16.63    10.00    27.44     6.03
Sharpe(20):       0.60     0.53     0.64     0.04
Beta(20):         0.93     0.87     1.06     0.08
TI(20):          14.37    13.79    14.83     0.35
AT:              39.13    38.59    39.63     0.38
(5) http://gtr1.net/2013/?s19981231h5::dji.a:et1:dspo%......

Results are only slightly better than (4), but they add more than three CAGR points to (1).

Finally, Connor's RSI. A couple of years ago someone offered to pay me to construct a GTR1 URL that calculates this. I figured that if they thought it was that good, I had better investigate it myself for free. It appeared to be slightly better than Wilder's RSI and Cutler's RSI, but nothing to get excited about. I spent way more time than I should have trying to match StockCharts, but what I came up with follows the definitions I was provided with as closely as possible. Here are the results of (1) with Connor's RSI in place of Cutler's RSI(2):

DJI Member, Connor's RSI (lag 1) <= 20, 5-day Hold
19981231 to 20160606
Metric             Avg      Min      Max       SD
CAGR:            14.61     3.43    21.36     6.64
TR:            1455.95    79.89  2799.65  1112.51
GSD(20):         26.23    20.22    36.51     5.63
GDD(20; 0%):     15.73    11.44    25.34     5.14
GDDD3:           12.38     7.89    24.88     6.31
MDD:            -52.44   -75.40   -36.29    12.70
UI(20):          15.10     8.14    34.77     9.96
Sharpe(20):       0.64     0.22     0.89     0.24
Beta(20):         0.76     0.59     0.93     0.13
TI(20):          19.24     7.20    24.48     6.41
AT:              35.21    33.78    36.10     0.84
(6) http://gtr1.net/2013/?s19981231h5::dji.a:et1:dspo%......

We now have an example of one kind of RSI with one trading cycle that produces results almost as good as what P123 reports for an unknown version of RSI. But of course, one of the five cycles has a CAGR of only 3.43. This is yet another demonstration of the importance of using daily-cycled backtesting, and that even weekly data is not good enough for assessing strategies like Randy's.

Robbie Geary

PS All backtests in this post actually start at the beginning of 1926. I have inserted 19981231 as the report starting date in all the URLs so that results are comparable with Portfolio123; to get the full results, remove it.

PPS Connor's RSI is a bit computationally intensive due to the many EMAs involved, so (6) may take a few minutes to run for the first time each day. It actually calculates Connor's RSI for every stock on every day back to 1926, not just the DJI stocks. Once these calculations have been saved to a temporary field file automatically accessed with importf, subsequent backtests of other screens that use Connor's RSI through the same importf call should be fast.
