I'm startled by how good the new AI tools are at generating the code I need to do backtests. I asked
claude.ai to write a Python program to test a market timing strategy that I first came across in 2015. The strategy, called
"The Buy and Sell Portfolio," was one of three timing strategies described in the following blog post:
How To Create Portfolios That Adapt To Market Changes
https://thetaoofwealth.wordpress.com/2015/02/09/ho...
The Buy and Sell strategy has the following rules:
1. Buy the S&P 500 whenever it closes at a 6 month high.
2. Sell the S&P 500 whenever it closes at a 1 year low, and put the proceeds into 5 year treasuries.
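For anyone who wants to see the two rules in code form, here is a minimal sketch using pandas. It is my own illustration, not the program Claude generated. The function name and the window lengths (126 trading days for 6 months, 252 for 1 year) are assumptions, and it takes any series of daily closing prices as input.

```python
import pandas as pd

def buy_and_sell_positions(close: pd.Series,
                           high_window: int = 126,
                           low_window: int = 252) -> pd.Series:
    """Label each date 'SPY' or 'IEI' according to the two rules.

    high_window and low_window are assumed trading-day approximations
    of 6 months and 1 year. The strategy starts out invested in SPY.
    """
    # Rolling extremes include the current day's close, so a close that
    # equals the rolling max (min) is a new high (low) for that window.
    rolling_high = close.rolling(high_window).max()
    rolling_low = close.rolling(low_window).min()

    holding = "SPY"
    positions = []
    for price, high, low in zip(close, rolling_high, rolling_low):
        if holding == "SPY" and pd.notna(low) and price <= low:
            holding = "IEI"   # rule 2: close at a 1-year low, move to treasuries
        elif holding == "IEI" and pd.notna(high) and price >= high:
            holding = "SPY"   # rule 1: close at a 6-month high, move back to stocks
        positions.append(holding)
    return pd.Series(positions, index=close.index)
```

Note that the signal is only known at the close, so a careful backtest would apply each new position starting with the next trading day's return.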
The author (Dick Stoken) claims the following results for this strategy.
In-sample backtest of Buy and Sell market timing strategy
Backtest period: 1972 - 2010 (39 years)
CAGR: 13.8%
Max drawdown: -6.1%
Max drawdown for buy and hold S&P 500: -37.6%
I had bookmarked this blog post 9 years ago, because I thought the above max drawdown figure looked really good (almost too good to be true).
So I decided to do a post-discovery (out of sample) backtest for the period 2010 - 2024. To do this, I needed to write a program that could fetch daily prices for SPY and implement the buy and sell rules. I couldn't find an ETF that held only 5-year Treasuries during this period, so I settled on using IEI (which holds 3- to 7-year Treasuries) as a proxy.
To write the backtest program, I went to
https://claude.ai and entered the following instructions in plain English:
"I want to back test the following investment strategy. Buy the SPY ETF on 12/31/2009. Sell SPY whenever the price hits a 52-week low and purchase the IEI ETF with the cash from the sale. Sell IEI and repurchase SPY whenever the SPY price hits a new 26-week high. Please provide the Python code to test this strategy from 12/31/2009 to the present using an API to a data source that provides daily closing prices for the SPY ETF."
In a few seconds, Claude generated the Python program I needed. It even offered to calculate the Sharpe ratio and max drawdown stats for me. After running the program, I asked Claude to also calculate the CAGR and compare it to the CAGR of buying and holding SPY.
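For reference, max drawdown and the Sharpe ratio can each be computed in a few lines from a daily series of portfolio values. This is a rough sketch of the standard formulas, not the code Claude produced. The function names, the 252 trading days per year, and the zero risk-free rate default are my assumptions.

```python
import numpy as np
import pandas as pd

def max_drawdown(values: pd.Series) -> float:
    """Largest peak-to-trough decline, returned as a negative fraction."""
    running_peak = values.cummax()
    drawdowns = values / running_peak - 1.0
    return float(drawdowns.min())

def sharpe_ratio(values: pd.Series,
                 risk_free_rate: float = 0.0,
                 periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio from daily portfolio values.

    risk_free_rate is an annual rate; 0.0 is an assumed default.
    """
    daily_returns = values.pct_change().dropna()
    excess = daily_returns - risk_free_rate / periods_per_year
    return float(np.sqrt(periods_per_year) * excess.mean() / excess.std())

# Tiny made-up example: the value falls from a peak of 120 to 90.
example = pd.Series([100.0, 120.0, 90.0, 130.0])
print(max_drawdown(example))  # -0.25
```

A different risk-free rate assumption would shift the Sharpe ratio, so figures from two programs are only comparable if they use the same convention.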
I noticed that the program used the yfinance Python package to pull prices from Yahoo Finance. I had to install that package on my MacBook using pip install. Here are the results I got after I ran the program:
Post-discovery (out of sample) backtest of Buy and Sell market timing strategy
Backtest Results (2009-12-31 to 2025-01-01):
Trading Strategy Performance:
Final Value: $39,012.64
Total Return: 290.13%
CAGR: 9.50%
Sharpe Ratio: 0.58
Maximum Drawdown: -19.78%
Max Drawdown Period: 2020-02-19 to 2020-03-18
Buy and Hold SPY Performance:
Final Value: $69,283.30
Total Return: 592.83%
CAGR: 13.77%
Sharpe Ratio: 0.73
Maximum Drawdown: -33.72%
Max Drawdown Period: 2020-02-19 to 2020-03-23
Strategy Trade Summary:
Number of Trades: 11
Trade History:
2009-12-31: Switch to SPY
2011-10-03: Switch to IEI
2012-01-25: Switch to SPY
2016-02-11: Switch to IEI
2016-04-18: Switch to SPY
2018-12-19: Switch to IEI
2019-04-05: Switch to SPY
2020-03-12: Switch to IEI
2020-08-10: Switch to SPY
2022-05-09: Switch to IEI
2023-04-28: Switch to SPY
The post-discovery results are disappointing compared to the in-sample test period. This is a common outcome for market timing strategies that were discovered around 2010, right after the Great Recession ended.
The CAGR is only 9.5% compared to SPY's 13.8%, and investors in the market timing strategy were still exposed to a -20% drawdown in 2020 - nowhere near as good as the in-sample results.
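As a sanity check, the two CAGR figures can be reproduced from the final values alone. The sketch below assumes a $10,000 starting balance (inferred from the reported final values and total returns, not stated in the program output) and treats 2009-12-31 to 2025-01-01 as roughly 15 years.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

START_VALUE = 10_000.00  # assumption: inferred from final value / (1 + total return)
YEARS = 15.0             # assumption: 2009-12-31 to 2025-01-01, rounded

strategy_cagr = cagr(START_VALUE, 39_012.64, YEARS)
spy_cagr = cagr(START_VALUE, 69_283.30, YEARS)

print(f"Strategy CAGR:     {strategy_cagr:.2%}")
print(f"Buy and hold CAGR: {spy_cagr:.2%}")
```

Both values come out close to the reported 9.50% and 13.77%, which suggests the program's CAGR calculation is consistent with its own final values.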
I hope this post inspires others here to try out claude.ai (and other chatbots) to generate their own backtesting programs.