Ernest Stokley was kind enough to spend time learning a new software language in order to run back-tests on ETFs used to populate ITA portfolios.
There is an abundance of noise or uncertainty when analyzing data for back-testing. Here are a few examples.
- Actual Buy and Sell prices will not match back-testing Buy and Sell prices.
- Variations in start and end dates will impact results.
- Back-tests use end-of-day prices, while portfolio managers are buying and selling securities throughout the trading day.
- The selection of the review period (33 days for the ITA portfolios) will impact results.
- Portfolio managers using limit orders will lag the orders assumed in back-testing studies, since a limit order may not be filled for several days, if at all.
- The inclusion and frequent use of TLT may be an outlier as it is highly unlikely TLT will repeat the performance it had over the past eight years. This is a specific example of how back-tested results can be misleading.
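To make the limit-order point in the list above concrete, here is a minimal sketch in Python. The prices are made up for illustration, and `first_fill_day` is a hypothetical helper, not part of any back-testing package. It assumes a buy limit order is filled on the first day whose low touches the limit price, which is only a rough daily-bar approximation.

```python
def first_fill_day(daily_lows, limit_price):
    """Return the index of the first day a buy limit order would fill, or None.

    Assumption: the order fills on the first day whose low is at or below
    the limit price.
    """
    for day, low in enumerate(daily_lows):
        if low <= limit_price:
            return day
    return None


# Hypothetical prices for five trading days after a buy signal on day 0.
closes = [50.00, 50.40, 50.90, 50.10, 49.60]
lows = [49.80, 50.10, 50.50, 49.90, 49.20]
limit = 49.50

fill_day = first_fill_day(lows, limit)

# The back-test assumes the order fills at the day-0 closing price.
print("Back-test fill: day 0 at", closes[0])
if fill_day is None:
    print("Limit order never filled in this window")
else:
    print("Limit order fill: day", fill_day, "at", limit)
```

In this made-up case the back-test is invested from day 0, while the limit order sits unfilled for several days. Whether the delay helps or hurts depends on what the price does in between, which is exactly the kind of noise a single back-test hides.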
Ernie ran a number of tests that show us that we need to be careful not to “hang our hats” on one set of results. In other words, be skeptical of back-testing results.
I’ve extracted two paragraphs from Mr. Stokely’s article, which is no longer available.
“The conclusion of both noise tests is that Monte Carlo runs should be made when looking at the absolute return numbers from any single back-test of a given portfolio management technique. Serendipitous choices of date sequences can result in either highly inflated expectations of a given technique, or it can give unrealistically low expectations for that technique.
One of the things highlighted by this study is the sensitivity of the method to noisy changes in the parameters. Because of the compounding effect, noisy influences can become magnified. A side note of this point is that small fortuitous investment decisions can pay off handsomely in the long run through the compounding effect, a fact that is well known but was not fully appreciated by the author until this study.”
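The original tests are no longer available, so the following is only a rough sketch of the idea in Python, not a reconstruction of the method used in the article. It assumes synthetic daily returns drawn from a normal distribution (the mean, standard deviation, window length, and number of runs are all made-up parameters) and then compounds the returns over many randomly chosen start dates to show how widely a single back-test result can vary.

```python
import random
import statistics


def simulate_daily_returns(n_days, mean=0.0003, stdev=0.01, seed=42):
    """Generate synthetic daily returns (an assumption, not market data)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, stdev) for _ in range(n_days)]


def compound(returns):
    """Compound a sequence of periodic returns into one total return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def monte_carlo_windows(returns, window, n_runs, seed=0):
    """Total return for n_runs randomly chosen start dates, fixed window length."""
    rng = random.Random(seed)
    last_start = len(returns) - window
    results = []
    for _ in range(n_runs):
        start = rng.randint(0, last_start)  # randint includes both endpoints
        results.append(compound(returns[start:start + window]))
    return results


returns = simulate_daily_returns(2520)  # roughly ten years of trading days
outcomes = monte_carlo_windows(returns, window=756, n_runs=500)

print("Worst three-year return: ", round(min(outcomes), 3))
print("Median three-year return:", round(statistics.median(outcomes), 3))
print("Best three-year return:  ", round(max(outcomes), 3))
```

The gap between the worst and best outcomes comes entirely from the choice of start date, since every run uses the same return series and the same window length.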
Anyone who has worked with the uncertainty of measurement understands exactly how the “noise” is compounded, particularly when values are multiplied or divided.
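For readers who have not worked with measurement uncertainty, here is a small Python sketch of the standard first-order rule: when independent quantities are multiplied or divided, their relative uncertainties add in quadrature. The monthly growth factor and its uncertainty below are hypothetical numbers chosen only for illustration, and the rule assumes the errors are small and independent.

```python
import math


def relative_uncertainty_of_product(values, uncertainties):
    """First-order relative uncertainty of a product of independent values.

    Each term contributes (uncertainty / value) squared; the result is the
    square root of the sum.
    """
    return math.sqrt(sum((u / v) ** 2 for v, u in zip(values, uncertainties)))


monthly_factor = 1.01   # hypothetical growth factor of 1% per month
monthly_sigma = 0.002   # hypothetical uncertainty in that factor

for months in (1, 12, 120):
    rel = relative_uncertainty_of_product(
        [monthly_factor] * months, [monthly_sigma] * months
    )
    print(months, "months: relative uncertainty of about", round(100 * rel, 2), "%")
```

With identical factors the relative uncertainty grows with the square root of the number of periods compounded, so the longer the back-test, the wider the band around its final value.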
As a counterpoint, remember that back-testing is about as good as it gets when one is trying to develop robust portfolio management models.