My Trading Research Tools
I’ve been posting my trading research without realizing that most people do not really have a statistics background, or are not used to looking at trading and market data in a statistical fashion.
Just as a way of getting everyone on the same page, here’s a brief outline of how I measure the validity of a trading idea.
Start With A Question
All kinds of knowledge begin with a question: What, Why, When, How. In my experience with testing ideas, it helps to ask the following questions:
1. What is the phenomenon/criterion being tested?
- examples of “what”:
a. Technical: Price crossing above a moving average
b. Fundamental: A Low Price-Earnings Ratio
c. Seasonal: August
d. Economic: GDP growth
2. How does the criterion interact with the market? Or, what is the effect of the criterion on market prices?
- following the examples above:
a. Technical: After a cross-over, prices tend to continue in the direction of the crossover
b. Fundamental: Low PE stocks outperform high PE stocks.
c. Seasonal: Markets trend down every August
d. Economic: Markets follow the trend of the GDP of the country
You can have more elaborate questions than above, but I believe it pays to be simple and specific. We try to find some relationships between a predictor variable (#1) and the market movement (#2).
Gather Data
Once you have formulated a question, the grunt work is gathering and processing data to observe the relationship. Most of the time, this kind of research involves historical data. Although numerous software packages are available to test market relationships, I have found that the most robust and adaptable tool is a simple spreadsheet program like Microsoft Excel. This allows me to organize data transparently: each observation is a row on the spreadsheet, while each criterion is a new column.
For example, my spreadsheet might look like this:

The first few columns are my price data, then one column to test for the condition (DOJI) and the last column is the result I am testing (gain tomorrow).
The condition column contains a simple IF-THEN statement that marks the condition as true or false, while the result column contains a formula that computes the result. Both columns are calculated from the price data in the first columns.
e.g. DOJI is true IF the Open and Close are less than 0.2% apart. Gain tomorrow is tomorrow’s close less today’s close.
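As a rough sketch of the same two columns outside a spreadsheet, the snippet below builds them in Python. The price bars are made-up numbers, and measuring the 0.2% gap relative to the open is an assumption, since the rule above does not say what the percentage is taken of.

```python
# Hypothetical daily bars: (open, high, low, close). Values are illustrative only.
bars = [
    (100.0, 101.0, 99.0, 100.1),
    (100.1, 102.5, 100.0, 102.0),
    (102.0, 102.4, 101.5, 102.1),
    (102.1, 102.2, 100.5, 101.0),
]

DOJI_THRESHOLD = 0.002  # open and close less than 0.2% apart


def is_doji(open_price, close_price, threshold=DOJI_THRESHOLD):
    """Condition column: True when open and close are less than `threshold` apart.

    Assumption: the gap is measured relative to the open.
    """
    return abs(close_price - open_price) / open_price < threshold


rows = []
for i, (o, h, l, c) in enumerate(bars):
    # Result column: tomorrow's close less today's close.
    # The last row has no "tomorrow", so its result is left empty (None).
    gain_tomorrow = bars[i + 1][3] - c if i + 1 < len(bars) else None
    rows.append({"open": o, "close": c, "doji": is_doji(o, c), "gain_tomorrow": gain_tomorrow})

for row in rows:
    print(row)
```

Each dictionary in `rows` plays the role of one spreadsheet row, with the condition and the result sitting next to the price data they were computed from.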
Interpreting the Results
Once all the data is processed similarly, it’s fairly easy to summarize the results into a tabular or graphical format.

In this table, I summarized for instance the result of a combination of two conditions:
a. having a DOJI yesterday
b. the gain/loss on today’s opening
So the permutations are: 1) no doji yesterday, 2) doji yesterday and market opened up, 3) doji yesterday and the market opened down.
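The three groups above can be sketched as a small classification step. The day records here are invented for illustration, and putting a perfectly flat open in the "opened down" group is an assumption, since the original breakdown does not mention flat opens.

```python
from collections import defaultdict


def classify(doji_yesterday, open_gain_today):
    """Assign a day to one of the three groups.

    Assumption: an unchanged open (gain of exactly 0) is counted as "opened down".
    """
    if not doji_yesterday:
        return "no doji yesterday"
    if open_gain_today > 0:
        return "doji yesterday, opened up"
    return "doji yesterday, opened down"


# Hypothetical observations: (doji_yesterday, open_gain_today, day_result)
days = [
    (False, 0.5, 1.0),
    (True, 0.3, 2.0),
    (True, -0.4, -1.5),
    (False, -0.2, -0.5),
    (True, 0.1, -0.3),
]

# Collect each day's result under the group it belongs to.
groups = defaultdict(list)
for doji_yesterday, open_gain_today, day_result in days:
    groups[classify(doji_yesterday, open_gain_today)].append(day_result)

for name, results in groups.items():
    print(name, results)
```

Once the results are bucketed this way, every statistic in the summary table is just a calculation over one of these lists.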
The typical results I track are as follows (following the chart above):
Trades – this is simply a count of how many days qualified under each specific condition. This tells me immediately how rare or common the criterion is. This gives me an idea of whether the pattern/event happens often enough to be useful in practical trading, and also whether the other results depend on only a few observations, which would make them less significant. In general, the more observations, the more valuable the insight.
Results – This is simply the sum of the total market movement/return for the days grouped under each condition. One good way to look at this is to imagine the sample set of all trading days as a random mix of good and bad days. Once the criterion being tested is used to segregate the days, a positive result means that the criterion was able to group profitable days together, while the opposite is true if the result is negative. A near zero or breakeven result means that the criterion was not effective in grouping the results together.
Win Rate – Apart from the total returns, this statistic shows you how many of those days that are grouped by the criteria were actually net positive. This helps put the total returns in perspective. If the returns are positive, but the win rate is less than 50%, this suggests that the positive return may be due to just a few exceptionally good trades, but in general the ability of the pattern/criteria to group profitable days together is not very good.
Average Win, Average Loss, Win/Loss - These numbers support the relationship of the total results and the win rate. They are computed as follows:
Average Win = Total Returns of Winners / Total Winning Days
Average Loss = Total Losses of Losers / Total Losing Days
Win/Loss = Average Win / Absolute Value(Average Loss)
These numbers show the characteristics of the winning and losing trades that populate the total returns of the criteria. In our earlier example, if results are positive, and win rate is below 50%, this is further supported if average wins are larger than average losses. Therefore, the conclusion is that the criteria may not result in many winning trades, but the winners are considerably larger than the losers.
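To make the arithmetic concrete, here is a minimal sketch that computes these statistics for one group of days. The returns are invented, and treating a day with exactly zero return as neither a winner nor a loser is an assumption.

```python
def summarize(returns):
    """Compute the summary statistics for one group of daily results.

    Assumption: days with a return of exactly 0 count as trades,
    but not as winners or losers.
    """
    winners = [r for r in returns if r > 0]
    losers = [r for r in returns if r < 0]
    trades = len(returns)
    avg_win = sum(winners) / len(winners) if winners else 0.0
    avg_loss = sum(losers) / len(losers) if losers else 0.0  # negative number
    return {
        "trades": trades,
        "results": sum(returns),
        "win_rate": len(winners) / trades if trades else 0.0,
        "avg_win": avg_win,
        "avg_loss": avg_loss,
        # Win/Loss is undefined when there are no losing days.
        "win_loss": avg_win / abs(avg_loss) if avg_loss else None,
    }


# Hypothetical group: two large winners and three smaller losers.
stats = summarize([30.0, -10.0, -10.0, 50.0, -20.0])
for name, value in stats.items():
    print(name, value)
```

This sample reproduces the case described above: the total result is positive even though only 40% of the days are winners, because the average win is about three times the size of the average loss.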
Expectancy – This is a final observation that combines our knowledge of the winning percentage and the win/loss ratio into a measure we can use to interpret the robustness of the results. It’s computed as:
Winning Percentage x Average Win + (1 – Winning Percentage) x Average Loss
If the number is not only positive, but a large or double digit positive number, this implies that the combination of the winning percentage and win/loss ratio tends to produce a positive market movement. The opposite is true for negative expectancies. In the chart above, a down close following a doji has a -11 expectancy, which reflects a tendency for the market to fall given that condition and the performance measures resulting from that condition.
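The expectancy formula is short enough to sketch directly. The inputs below are illustrative numbers, not figures from the table, and the average loss is passed in as a negative number, consistent with the Win/Loss formula taking its absolute value.

```python
def expectancy(win_rate, avg_win, avg_loss):
    """Winning Percentage x Average Win + (1 - Winning Percentage) x Average Loss.

    `win_rate` is a fraction between 0 and 1; `avg_loss` is negative.
    """
    return win_rate * avg_win + (1 - win_rate) * avg_loss


# Hypothetical group: wins 40% of the time, average win 40, average loss -10.
positive_case = expectancy(0.4, 40.0, -10.0)

# Hypothetical group: wins 60% of the time, but the losses are much larger.
negative_case = expectancy(0.6, 10.0, -40.0)

print(positive_case, negative_case)
```

The first case comes out to roughly +10 despite the low win rate, and the second to roughly -10 despite the high one, which is why expectancy is more informative than the win rate alone.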
Testing Relationships
Sometimes, the data will provide you two results which can be related to each other to form further conclusions. For example in the study of dojis above, the summary table implies a relationship between the opening on the market day following a doji day and the close for that day. Using the value for opening and closing on the days following the doji, we can create a scatterplot of the results.
A scatterplot is simply a chart, with the X-axis representing the first value (e.g. the opening result) and the Y-axis representing the second value (e.g. the closing result). Each observation is a dot coordinate on the plot, and the shape of the dots should line up if a relationship exists:

Although not all dots will align perfectly, if a relationship exists between the two values, most dots will align to represent that relationship. For more clarity, a trendline can be automatically generated from the plot to best summarize the fit of all dots. The slope of the trendline reflects the relationship: if it is upward sloping, the relationship is positive (i.e. a higher first value tends to come with a higher second value); if downward, the relationship is negative (i.e. a higher first value tends to come with a lower second value).
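For readers who want to see what the trendline is doing, here is a minimal sketch that fits a straight line by ordinary least squares, which is the usual method behind a linear spreadsheet trendline. The opening and closing results are invented points, not data from the doji study.

```python
def trendline(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y move together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: opening result (X) and closing result (Y).
open_result = [-2.0, -1.0, 0.0, 1.0, 2.0]
close_result = [-3.0, -1.0, 0.0, 2.0, 2.0]

slope, intercept = trendline(open_result, close_result)
print("slope:", slope, "intercept:", intercept)
```

A positive slope here means that higher opens tended to come with higher closes in the sample; it says nothing yet about how tightly the dots cluster around the line.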
Insight And Application
Armed with the results above, the tester will then be able to hypothesize first the existence of the relationship, secondly the strength of the relationship, and finally, the conditions contributing to the relationship. This now allows anyone to answer the questions originally formulated.
This kind of analysis is not difficult to understand or do, although admittedly it requires some patience in gathering and summarizing data. However, once the results are in, it is a simpler matter to draw some full or partial conclusions about what the data suggests.
Very few investment or trading books go to this length in establishing empirical proof of the relationships they describe, and most investment advice simply relies on vague generalizations and rationalizations, or sometimes even outright lies, just to promote an idea. This kind of reasoning is closer to blind faith and religion than it is to actual scientific fact, and sadly most traders and investors do not have the initiative or courage to ask simple questions of validation.
However, testing is not the ultimate holy grail, because it relies on one thing: historical data. Past data reflects past relationships, and those seldom repeat exactly in the future. On the other hand, having knowledge of past relationships is much better than having no knowledge at all. This is even more true if one is risking hard-earned funds on an idea: it’s comforting to have the benefit of testing and experience prior to a trade commitment.
Finally, historical data is also not perfect, in that it cannot represent everything that happened in the past. Some market ideas are difficult to reduce to objective questions, or to translate into arguments that can be tested by past data (e.g. "buy low, sell high" leads to: what is low, what is high?). These are the types of ideas that persist solely on belief. In my experience, all ideas that are worth knowing are at least partly supported by past evidence, and if I come across ideas that I cannot break down into arguments testable by data, these ideas are usually less reliable than those that can be tested.
About this entry
You’re currently reading “My Trading Research Tools,” an entry on Mark T. Market(tm)
- Published: July 13, 2008 / 4:00 pm
- Category: Trading
- Tags: research, statistics