Exploratory Data Analysis

Exploratory data analysis gives us a broad understanding of data. It is important as a first step in the modelling process, but can also be useful in its own right. In the plots to the right we have examples of different ways to look at profitability that may lead to insights in the trading process. Exploratory data analysis can and should be performed on other variables. 


You know my method. It is founded upon the observation of trifles.
- Sherlock Holmes, Sir Arthur Conan Doyle
 

Profitability for a particular trader, plotted against wind generation over the course of a week, shows that performance tracks the amount of wind generation closely. In particular, low-wind hours have carried more downside risk than high-wind hours, and beyond 10 GW of wind, expected profitability rises as wind increases. This knowledge is essential for building an algorithmic trader that automatically adapts to its weaknesses, and it is also useful for inspecting a manual trader's profit to identify persistent problems.
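The comparison of profit against wind can also be checked numerically by grouping hours into wind bins. The following is a minimal sketch only: the function name `summarize_by_wind`, the 5 GW bin width, and the sample figures are illustrative assumptions, not the data or method behind the plot.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_wind(records, bin_width_gw=5.0):
    """Group (wind_gw, profit) pairs into wind bins and summarize each bin.

    Returns a dict keyed by the lower edge of each bin (in GW), holding the
    number of hours, the mean profit, and the worst single-hour profit.
    """
    bins = defaultdict(list)
    for wind_gw, profit in records:
        lower_edge = int(wind_gw // bin_width_gw) * bin_width_gw
        bins[lower_edge].append(profit)

    summary = {}
    for lower_edge in sorted(bins):
        profits = bins[lower_edge]
        summary[lower_edge] = {
            "n": len(profits),
            "mean": mean(profits),
            "worst": min(profits),  # crude proxy for downside risk
        }
    return summary


# Made-up illustrative data: (wind generation in GW, hourly profit in $).
sample = [
    (2.0, -300.0), (3.5, 50.0),
    (7.0, 20.0), (8.0, -40.0),
    (11.0, 120.0), (14.0, 200.0),
]
wind_summary = summarize_by_wind(sample)
for lower_edge, stats in wind_summary.items():
    print(f"{lower_edge:>4.0f}+ GW  n={stats['n']}  "
          f"mean={stats['mean']:.0f}  worst={stats['worst']:.0f}")
```

A bin whose worst hour is far below its mean is a candidate for closer inspection. With a real data set, the minimum would likely be replaced by a more robust downside measure such as a low percentile.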

Below is a plot that many are likely used to seeing. This view of a trader's overall trends over longer periods of time can assist in dynamically allocating capital, in understanding the role of seasonality, and in assessing current risk levels relative to historical profit.
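The quantities behind such a long-run view are cumulative profit and the drawdown from its running peak. The sketch below assumes a simple list of daily profits. The helper names and the sample values are hypothetical and serve only to illustrate the calculation.

```python
from itertools import accumulate


def cumulative_pnl(daily_profits):
    """Running total of daily profit, i.e. the curve usually plotted over time."""
    return list(accumulate(daily_profits))


def max_drawdown(cumulative):
    """Largest drop from any running peak of the cumulative curve."""
    peak = float("-inf")
    worst = 0.0
    for value in cumulative:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


# Made-up illustrative daily profits in $.
daily = [100.0, -50.0, 200.0, -300.0, 150.0]
equity = cumulative_pnl(daily)
drawdown = max_drawdown(equity)
print(equity)
print(drawdown)
```

Comparing the current drawdown with the largest historical one is one simple way to put present risk levels in context.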


At left we show profitability and trades by hour ending. The marginal density at the top of the plot shows that the trades over this time period have been spread almost exactly uniformly across the hours. We might ask ourselves whether this is desirable in a capital-constrained environment. Over this short time horizon, the evening peak has provided the most volatility and has been profitable in spite of two large drawdowns. There is a concentration of moderately profitable hours, highlighted by the bluish density at around $200 between hours ending 10 and 15, signifying that those hours are more consistently profitable in this time horizon. Naturally, a seven-day look-back is not long enough to justify making trading decisions, but it may be enough to identify a trend that persists in the monthly or even yearly case.
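The same reading of the hour-ending plot can be sketched as a per-hour summary: trade counts stand in for the marginal density, and the spread of profit stands in for volatility. The function name `summarize_by_hour` and the sample trades are assumptions made for illustration and do not reproduce the figures discussed above.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_by_hour(trades):
    """Summarize (hour_ending, profit) pairs per hour ending.

    Returns, for each hour, the number of trades, the mean profit, and the
    population standard deviation of profit as a simple volatility measure.
    """
    by_hour = defaultdict(list)
    for hour_ending, profit in trades:
        by_hour[hour_ending].append(profit)

    return {
        hour: {
            "trades": len(profits),
            "mean": mean(profits),
            "spread": pstdev(profits),
        }
        for hour, profits in sorted(by_hour.items())
    }


# Made-up illustrative trades: (hour ending, profit in $).
sample_trades = [
    (10, 210.0), (10, 190.0),
    (18, 900.0), (18, -700.0), (18, 400.0),
]
hourly = summarize_by_hour(sample_trades)
for hour, stats in hourly.items():
    print(f"HE{hour:02d}  trades={stats['trades']}  "
          f"mean={stats['mean']:.0f}  spread={stats['spread']:.0f}")
```

In this toy data both hours have the same mean profit, but the evening hour has a far larger spread, which is the distinction between "consistently profitable" and "profitable but volatile" drawn above.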

There are tools both simpler and more complex than those outlined here. Oftentimes, the best insights come from looking into more than two dimensions, though that precludes most standard visualizations. In spite of this, creating a custom suite of visualizations is a fast and easy way to check for small-but-relevant changes in the behavior of variables of interest. Such visualizations are an indispensable tool for gaining quick insights, particularly as the scope of the project gets large. It is tedious to examine reports and tables covering multiple markets and products, but with the correct visualizations, a quick glance is all that is required to justify further investigation.