Trading Stochastic Difference Equations

If you haven’t read Part 1, do that first.
\(\newcommand{\Var}{\mathrm{Var}}\)
\(\newcommand{\Cov}{\mathrm{Cov}}\)
\(\newcommand{\Expect}{{\rm I\kern-.3em E}}\)
\(\newcommand{\me}{\mathrm{e}}\)
For a quick review, recall that our system looks like:
\begin{align}
x_{n+1}&=ax_n+w_n \\
w_n &\sim N(0, \sigma^2) \\
|a| & < 1.
\end{align}
Next, let’s define a time constant, \(\tau\), that gives us an idea of how the noise term, \(w_n\), is filtered. A time constant is generally the time a response takes to decay to \(1/\me\) of its initial value, so:
\begin{align}
&\frac{1}{\me}=a^\tau \\
&-1=\tau \ln a\\
&\tau=\frac{-1}{\ln a}\\
\text{or}\\
&a=\me^{-1/ \tau}
\end{align}

Now, our goal is to understand how we could trade such a stochastic system profitably. What does this mean? We might try something like “buy low, sell high,” using two thresholds:
\begin{align}
x_n \leq T_1&\Rightarrow \text{buy}\\
x_n \geq T_2&\Rightarrow \text{sell}
\end{align}
Our profit on each round-trip trade would be at least \(T_2-T_1\), so we need to choose \(T_2 > T_1\).

Thresholds

Since our system is linear, scaling the input (\(w_n\)) scales the output. Thus, if we set thresholds as a multiple of \(\sigma\), we do not have to explore different noise powers. Put another way, if we fix, say, \(\Var(x)\equiv 1\), then we can experiment with different thresholds, and the results will hold for those thresholds expressed as multiples of the standard deviation of \(x\), \(\sqrt{\Var(x)}\).

Time Durations

Similarly, if we are counting events (trades) over time, we want to be able to scale everything to a useful value.
Consider (from Part 1):

\begin{align}
\Var(x_k-x_0)&=2 \sigma^2 \frac{1-a^k}{1-a^2} \\
&=2\Var(x)(1-a^k)\\
&=2\Var(x)(1-\me^{-k/\tau})
\end{align}
We can see that the variance of increments depends on the number of time constants that we go out. So, if we count trades per time constant interval, we will have standardized our results. Note: The astute reader should note that this is not a proof (not even close), but we will leave that for later. This is more of a “seems plausible.”
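As a rough worked example of the formula above (plain arithmetic on the last line, not a new result), going out one and three time constants gives:
\begin{align}
k=\tau:&\quad \Var(x_k-x_0)=2\Var(x)(1-\me^{-1})\approx 1.26\,\Var(x)\\
k=3\tau:&\quad \Var(x_k-x_0)=2\Var(x)(1-\me^{-3})\approx 1.90\,\Var(x)
\end{align}
So after three time constants the increment variance is already within about 5% of its limiting value \(2\Var(x)\), whatever the particular value of \(\tau\).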

Some Simulations

Let’s perform a few tests to get started. We will pick an \(a\) corresponding to \(\tau\) of [300, 600, 1200] time steps (think seconds), and then fix \(\sigma\) to make \(\Var(x)\equiv 1\). Then we will perform Monte Carlo simulations over enough time steps to get statistically significant results.
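The parameter choice described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it assumes the relations \(a=\me^{-1/\tau}\) and \(\Var(x)=\sigma^2/(1-a^2)\) from this series.

```python
import math

def ar1_coefficient(tau):
    """Return a = exp(-1/tau) for a time constant tau (in time steps)."""
    return math.exp(-1.0 / tau)

def noise_std_for_unit_variance(a, var_x=1.0):
    """Return sigma such that Var(x) = sigma^2 / (1 - a^2) equals var_x."""
    return math.sqrt(var_x * (1.0 - a * a))

# Print the (a, sigma) pair for each time constant used in the simulations.
for tau in (300, 600, 1200):
    a = ar1_coefficient(tau)
    sigma = noise_std_for_unit_variance(a)
    print(f"tau={tau:5d}  a={a:.5f}  sigma={sigma:.3f}")
```

Because \(\sigma\) is derived from \(a\), changing the time constant never changes the target \(\Var(x)\), which is what lets the thresholds be compared across runs.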

Here are the results.
Figure: round-trip trades vs. threshold

There are 2 types of trading algorithms simulated here:

  • -t/+t: buy when x<-t and sell when x>+t, counting each action as 0.5 trades
  • -2t/0/+2t: buy when x<-2t and sell when x>0; plus sell when x>2t and buy when x<0

Each of these has a 2t profit per round trip trade, but the second option occurs more often for smaller thresholds.  For larger thresholds, the rapid Gaussian falloff causes there to be fewer trades.
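Here is a minimal sketch of how the two rules could be counted in a simulation. It is one plausible reading of the rules above, not the code used for the plots: all function names are hypothetical, and a round trip is counted only once both the opening and the closing action have happened (equivalent to 0.5 per action).

```python
import math
import random

def simulate_ar1(n_steps, a, sigma, seed=0):
    """Simulate x_{n+1} = a*x_n + w_n with w_n ~ N(0, sigma^2), starting at 0."""
    rng = random.Random(seed)
    x = 0.0
    path = []
    for _ in range(n_steps):
        x = a * x + rng.gauss(0.0, sigma)
        path.append(x)
    return path

def count_round_trips(path, buy_below, sell_above):
    """Count completed buy-then-sell round trips for a long-only threshold rule."""
    holding = False
    trips = 0
    for x in path:
        if not holding and x < buy_below:
            holding = True            # open the position
        elif holding and x > sell_above:
            holding = False           # close it: one round trip done
            trips += 1
    return trips

def round_trips_symmetric(path, t):
    """The -t/+t rule: buy below -t, sell above +t."""
    return count_round_trips(path, -t, t)

def round_trips_two_sided(path, t):
    """The -2t/0/+2t rule: a long leg (-2t -> 0) plus a mirrored short leg."""
    long_leg = count_round_trips(path, -2 * t, 0.0)
    # By symmetry, the short leg is the long rule applied to the negated path.
    short_leg = count_round_trips([-x for x in path], -2 * t, 0.0)
    return long_leg + short_leg

def trades_per_time_constant(tau, t, n_steps=200_000, seed=0):
    """Return round trips per time constant for both rules, with Var(x) = 1."""
    a = math.exp(-1.0 / tau)
    sigma = math.sqrt(1.0 - a * a)
    path = simulate_ar1(n_steps, a, sigma, seed)
    n_tc = n_steps / tau
    return (round_trips_symmetric(path, t) / n_tc,
            round_trips_two_sided(path, t) / n_tc)
```

Calling `trades_per_time_constant(300, 0.5)` and the same for 600 and 1200 gives the kind of per-time-constant comparison discussed here; the exact numbers depend on the seed and the run length.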

Note you can see that the three different time constant (TC) values have lines very close to each other, justifying our supposition that events per TC are the relevant experimental parameter.  This means we can do our simulations for a single TC, rather than a whole series.

If we multiply the number of trades by the profit per trade, we get the following result.  Of course, this ignores commissions, which would drag down the profits more on the lower thresholds (more trades).

Figure: total profit vs. threshold

The simulations used 200,000 time steps, and the following parameters.

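\[
\begin{array}{rccc}
\text{time constant (steps)} & a & \text{std dev of } w & \text{std dev of } x \\
\hline
300 & .99667 & .082 & 1.031 \\
600 & .99833 & .058 & 1.029 \\
1200 & .99917 & .041 & 1.005
\end{array}
\]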

All About Stochastic Difference Equations, Part 1

“All about X, part 1” is really a fancy way of saying a little bit about X.

How do you eat an elephant?
One bite at a time 1

So let’s eat stochastic difference equations one bite at a time. Let’s consider a first-order linear equation driven by a Gaussian random variable.

\begin{align}
x_{n+1}&=ax_n+w_n \\
w_n &\sim N(0, \sigma^2) \\
|a| & < 1.
\end{align}

Let’s dissect this one bit at a time. \(x_n\) is our system’s state, or value, at time \(n\). Our time variable \(n\) takes on integer values: -2, -1, 0, 1, 2, and so on. \(w_n\) is our noise term, a Gaussian (normally distributed) random variable with mean 0 and standard deviation \(\sigma\) at each time \(n\). The values at different times are uncorrelated. We require \(|a|<1\) so that our equation is stable, that is, it does not blow up.
What does this series look like? Here is a sample run of 5000 steps for \(a=0.9967\) and \(\sigma=0.082\). The significance of these numbers will be revealed later.

Figure: sample run of 5000 steps (a=0.9967, \(\sigma\)=0.082)

Let’s derive some basic properties of our stochastic series \(x_n\). Recall that \(E()\) is the expectation operator, meaning it computes the expected value of its random variable argument.

\(\newcommand{\Var}{\mathrm{Var}}\)
\(\newcommand{\Cov}{\mathrm{Cov}}\)
\(\newcommand{\Expect}{{\rm I\kern-.3em E}}\)

What is the average (expected) value of \(x_n\)? Since our process is stationary (that is, \(a\) is fixed and the statistics of \(w_n\) do not vary over time), \(\Expect(x_n)\) is constant; let’s call it \(\Expect(x)\). Then

\begin{align}
\Expect(x_{n+1})&=a\Expect(x_n)+\Expect(w_n) \\
\Expect(x)(1-a)&=\Expect(w_n) \\
\Expect(x)&=0
\end{align}
where we used \(\Expect(w_n)=0\) for the final step.

What about the variance of \(x\)? Because \(w_n\) is independent of \(x_n\), the variances add:
\begin{align}
\Var(x_{n+1})&=a^2\Var(x_n)+\Var(w_n) \\
\Var(x)(1-a^2)&=\sigma^2 \\
\Var(x)&=\frac{\sigma^2}{1-a^2}
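\end{align}
The covariance of successive values follows in the same way, using \(\Expect(x)=0\) and \(\Expect[w_n x_n]=0\):
\begin{align}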
\Cov(x_{n+1},x_n)&=\Expect[(x_{n+1}-\Expect(x_{n+1}))(x_{n}-\Expect(x_{n}))]\\
&=\Expect[x_{n+1}x_{n}]\\
&=\Expect[(ax_n+w_n)x_n]\\
&=\Expect[ax_n^2]\\
&=\frac{a\sigma^2}{1-a^2}
\end{align}
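These three results (mean, variance, lag-1 covariance) are easy to sanity-check by simulation. The sketch below is only an illustration: the function names are invented for this example, and \(a=0.5\), \(\sigma=1\) are arbitrary values chosen so that the estimates converge quickly.

```python
import random

def simulate_ar1(n_steps, a, sigma, seed=0):
    """Simulate x_{n+1} = a*x_n + w_n with w_n ~ N(0, sigma^2), starting at 0."""
    rng = random.Random(seed)
    x = 0.0
    path = []
    for _ in range(n_steps):
        x = a * x + rng.gauss(0.0, sigma)
        path.append(x)
    return path

def sample_moments(path):
    """Return (mean, variance, lag-1 covariance) estimated from one long path."""
    n = len(path)
    mean = sum(path) / n
    var = sum((x - mean) ** 2 for x in path) / n
    cov1 = sum((path[i + 1] - mean) * (path[i] - mean)
               for i in range(n - 1)) / (n - 1)
    return mean, var, cov1

a, sigma = 0.5, 1.0
mean, var, cov1 = sample_moments(simulate_ar1(100_000, a, sigma))
print("mean:", mean, " theory:", 0.0)
print("var: ", var, " theory:", sigma**2 / (1 - a**2))
print("cov: ", cov1, " theory:", a * sigma**2 / (1 - a**2))
```

The estimates should land close to the theoretical values, with the agreement improving as the run gets longer.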

Let’s look at the increments:
\begin{align}
\Delta x_{n+1}&=x_{n+1}-x_n\\
&=(a-1)x_n+w_n\\
\Var(\Delta x)&=(a-1)^2\Var(x)+\sigma^2\\
\Var(\Delta x)&=\sigma^2[\frac{(1-a)^2}{1-a^2}+1]\\
&=\sigma^2\frac{2}{1+a}.
\end{align}
Note: if \(a\) is close to 1, then the variance of the increments is close to \(\sigma^2\).

Now, let’s get a little more involved and compute the variance over k steps. From some basic system theory (or inspection):
\begin{align}
x_k=x_0 a^k + \sum_{s=0}^{k-1}a^{k-s-1}w_s
\end{align}
To make sure we have the right limits, let’s check for k=1:
\[
x_1=a x_0 + w_0
\]
That looks right, so let’s continue. Since the \(w_s\) are independent of \(x_0\) and of each other, the variances add:
\begin{align}
\Var(x_k-x_0)=(a^k-1)^2\Var(x)+\sigma^2\sum_{s=0}^{k-1}(a^{k-s-1})^2
\end{align}
Now, using the Sum of a Geometric Series, we have
\begin{align}
\Var(x_k-x_0)&=\frac{(a^k-1)^2}{1-a^2}\sigma^2 + \sigma^2\frac{1-a^{2k}}{1-a^2}\\
&=\sigma^2 \left[ \frac{(a^k-1)^2 + 1 - a^{2k}}{1-a^2} \right]\\
&=2 \sigma^2 \frac{1-a^k}{1-a^2}.
\end{align}
Setting \(k=1\) gives \(2\sigma^2/(1+a)\), which matches our earlier result for the increments.
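The algebra can also be checked numerically. The following sketch (hypothetical function names, arbitrary test values) evaluates the un-simplified expression term by term and compares it with the closed form:

```python
def increment_variance_direct(a, sigma, k):
    """(a^k - 1)^2 Var(x) + sigma^2 * sum_{s=0}^{k-1} a^(2(k-s-1)), summed term by term."""
    var_x = sigma**2 / (1 - a**2)
    noise_part = sigma**2 * sum(a ** (2 * (k - s - 1)) for s in range(k))
    return (a**k - 1) ** 2 * var_x + noise_part

def increment_variance_closed_form(a, sigma, k):
    """The simplified result 2 sigma^2 (1 - a^k) / (1 - a^2)."""
    return 2 * sigma**2 * (1 - a**k) / (1 - a**2)

# Compare the two expressions for a few arbitrary parameter choices.
for a, sigma, k in [(0.9, 1.0, 1), (0.9967, 0.082, 300), (-0.5, 2.0, 7)]:
    direct = increment_variance_direct(a, sigma, k)
    closed = increment_variance_closed_form(a, sigma, k)
    print(a, sigma, k, direct, closed)
```

The two columns should agree to floating-point precision, including for negative \(a\).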

In Part 2 we will look more closely at some simulations and trading results on these series.

Notes:

  1. Generally credited to General Creighton Williams Abrams, Jr., Chief of Staff of the United States Army 1972-1974, but an earlier reference to the concept is Frank Cody, Detroit’s Superintendent of Schools, in 1921.

Sum of a Geometric Series

What is the sum of the following geometric series?
\[
\sum_{i=0}^{k}a^i=1+a+a^2+\cdots+a^k
\]
We will frequently need a simple formula for this finite series. It is called a geometric series because each term is a fixed multiple of the previous one. If each term differed from the previous one by a fixed amount, it would be called an arithmetic series; but that’s for another day.
Let us define the sum as \(S\). Then writing the equation for \(S\) and \(aS\) with some clever alignment:

\begin{array}{rrrcccc}
S=&1&+a&+a^2&+\cdots&+a^k& \\
aS=&&a&+a^2&+\cdots&+a^k&+a^{k+1}
\end{array}
Subtracting the second equation from the first, everything cancels except the first and last terms:
\[
S(1-a)=1-a^{k+1}
\]
or
\begin{align}
S=
\begin{cases}
\frac{1-a^{k+1}}{1-a} &a \neq 1 \\
k+1 &\text{otherwise}
\end{cases}
\end{align}
where we have to be careful to divide by \(1-a\) only if \(a\neq 1\), and the answer for \(a=1\) is determined by inspection.
Q.E.D.
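As a quick check of the formula, here is a small Python sketch (the function name is just for illustration) that compares the closed form against a term-by-term sum, including the \(a=1\) case:

```python
def geometric_sum(a, k):
    """Return 1 + a + a^2 + ... + a^k using the closed form."""
    if a == 1:
        return k + 1          # every term equals 1, and there are k + 1 of them
    return (1 - a ** (k + 1)) / (1 - a)

# Compare with the brute-force sum for a few arbitrary values.
for a, k in [(2, 3), (0.5, 2), (1, 4), (-3, 5)]:
    brute = sum(a**i for i in range(k + 1))
    print(a, k, geometric_sum(a, k), brute)
```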

Trading Bitcoins

This is part 2 of our Bitcoin analysis.  Following part 1, “Do Bitcoin Prices Walk Randomly?”, we can apply some simple analysis to Bitcoin trading profitability.

Recall that in the previous post, we learned that Bitcoin prices are slightly negatively correlated over various time periods. Thus we might ask the question, what if we followed a very simple strategy: Sell when prices rose in the prior period, and Buy when they fell. Our percentage profits would be roughly proportional to \(\sigma r / p\) where \(\sigma\) is the standard deviation of the price change, \(r\) is the correlation coefficient, and \(p\) is the current price. We analyzed the Bitcoin prices from the BTC-e exchange for 2015-2016. Here is the variation data (legend 2015 indicates aggregated data 2015-2016):

Bitcoin data for different time periods over 20167
Note that for short time periods (1 minute), there is more correlation in prices, but since the variance (standard deviation) is smaller, the profit on each trade will be smaller. This is where transaction costs come into play.

We now add two more things:

  1. A threshold that a price change must exceed in order to make the trade
  2. A transaction fee of 0.20%.  This is typical on Bitcoin exchanges.  Some might be less, depending on trading volume, and also Maker fees can be lower (where you place a limit order rather than execute against a current order)

profit for different periods and thresholds

You can see that we cannot quite get a positive profit on this admittedly simple trading strategy.  The correlation between price changes is just not big enough to overcome the transaction fees.  You can see that as we raise the threshold, the profit per trade increases, but does not quite rise to the level of 0.20%, our transaction costs.
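For concreteness, the strategy with a threshold and a fee might be sketched as follows. This is an illustrative reconstruction, not the code behind the plots: the function name is hypothetical, it uses simple fractional price changes, each position is held for exactly one period, and one 0.20% fee is charged per trade (all of these are assumptions).

```python
def contrarian_backtest(prices, threshold, fee=0.002):
    """Return (number of trades, mean net fractional profit per trade).

    Sell after a rise larger than `threshold`, buy after a fall larger than
    `threshold`, and close the position one period later.
    """
    profits = []
    for i in range(1, len(prices) - 1):
        prior_change = (prices[i] - prices[i - 1]) / prices[i - 1]
        next_change = (prices[i + 1] - prices[i]) / prices[i]
        if prior_change > threshold:
            profits.append(-next_change - fee)   # sold after a rise
        elif prior_change < -threshold:
            profits.append(next_change - fee)    # bought after a fall
    if not profits:
        return 0, 0.0
    return len(profits), sum(profits) / len(profits)

# Toy price series, purely for illustration (not real Bitcoin data).
n_trades, mean_net = contrarian_backtest([100, 102, 101, 103], threshold=0.005)
print(n_trades, mean_net)
```

Raising `threshold` trades less often but only after larger moves, which is the trade-off the plots explore; whether the mean net profit ends up positive depends entirely on the data.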


Do Bitcoin Prices Walk Randomly?

Or, A Random Walk Down Bitcoin Avenue

A process follows a random walk if it has changes that are independent from one time interval to the next.  A common question is “Does the price of X follow a random walk?”  In particular, people are curious about the prices of stocks because they want to make money. Without getting too technical yet, if a price follows a random walk, you can’t make money trying to predict future prices.  If you can show that it does not follow such a random walk,  then it might be possible to make money at it.  So, do stock prices follow a random walk?

NO YES
A Random Walk Down Wall Street 1
A Non-Random Walk Down Wall Street 2

Bitcoin is a cryptocurrency, one of the better known ones, which uses something called Blockchain technology [look for future post on this topic].  There are several exchanges that allow one to buy and sell Bitcoins.  So, a natural question would be “Do those prices follow a random walk?”

Let’s follow Prof. Andrew Lo’s excellent analysis from Chapter 2 of his book and see what conclusions we can draw about Bitcoin prices.  Chapter 2 says that if the price change at period N+1 is independent of the price change at period N, then the variance over 2 periods should be twice the variance of changes over 1 period.  Let \(X_n\) be the log-price of bitcoins for period \(n\).

Let’s derive the variance relationship.  Let \(x, y\) be two random variables, and recall that the variance is the expected value of the squared deviation from the mean:

\[
var(x) \equiv E[ (x - \bar{x})^2 ]
\]

The variance of a sum of random variables is then

\begin{align}
var(x+y)=&E[ (x+y-\bar{x}-\bar{y})^2 ] \\
=&E[ (x-\bar{x})^2 + (y-\bar{y})^2 ] + 2 E[(x-\bar{x})(y-\bar{y})] \\
=&var(x) + var(y) + 2 cov(x,y)
\end{align}

Following Lo, let \(X_0,\ldots,X_{nq} \) be the sequence of log-prices, and we define the variance ratio for period \(q\) as the variance at q steps divided by the variance at 1 step, divided again by q. This ratio should be 1.0 for all q if the steps are uncorrelated. Formally, letting \(\hat{\mu}\) be the mean increment, \(\bar{\sigma}_a^2\) be the variance estimate for 1 step, and \(\bar{\sigma}_c^2\) be the variance estimate for 1 step based on q-step differences, we have

\begin{gather}
\hat{\mu}=\frac{1}{nq} \sum_{k=1}^{nq}(X_k-X_{k-1})=\frac{1}{nq}(X_{nq}-X_0) \\
\bar{\sigma}_a^2= \frac{1}{nq-1} \sum_{k=1}^{nq}(X_k-X_{k-1}-\hat{\mu})^2 \\
\bar{\sigma}_c^2(q)= \frac{1}{m} \sum_{k=q}^{nq}(X_k-X_{k-q}-q\hat{\mu})^2 \\
m=q(nq-q+1)\left( 1 - \frac{q}{nq}\right)
\end{gather}
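The estimators above translate fairly directly into code. The sketch below is a plain-Python reading of these formulas, with the variance ratio taken as \(\bar{\sigma}_c^2(q)/\bar{\sigma}_a^2\) (our interpretation of the verbal definition). The function name is hypothetical, and it does not compute the \(z^*\) statistic.

```python
import random

def variance_ratio(log_prices, q):
    """Variance ratio sigma_c^2(q) / sigma_a^2 for a list of log-prices X_0..X_nq."""
    nq = len(log_prices) - 1                      # number of one-step increments
    mu = (log_prices[-1] - log_prices[0]) / nq    # mean one-step increment
    var_a = sum((log_prices[k] - log_prices[k - 1] - mu) ** 2
                for k in range(1, nq + 1)) / (nq - 1)
    m = q * (nq - q + 1) * (1 - q / nq)
    var_c = sum((log_prices[k] - log_prices[k - q] - q * mu) ** 2
                for k in range(q, nq + 1)) / m
    return var_c / var_a

# Synthetic random walk (independent Gaussian steps): the ratio should be near 1.
rng = random.Random(0)
walk = [0.0]
for _ in range(20_000):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))
for q in (2, 4, 8):
    print(q, variance_ratio(walk, q))
```

A series whose steps tend to reverse gives ratios below 1; the extreme case of strictly alternating steps gives a ratio of 0 for \(q=2\).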

The results of running these calculations on Bitcoin prices from the BTC-e exchange from 2011, when trading started, to the present time (October 2016) are shown in table 1. Also shown in the table is the z-statistic \(z^*\) from Theorem 2.3 of Lo, which measures the significance of the variance ratio result. If \(|z^*|\) is greater than 2, then the variance ratio result is significant at the .05 level, meaning that if the null hypothesis (a variance ratio of 1) were true, there would be only about a 5% chance of seeing a deviation this large.

First, price and volume data for the time period of interest for the BTC-e exchange:

Figure: BTC-e price and volume

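Table 1: variance ratios, with \(z^*\) in parentheses
\[
\begin{array}{llrccc}
\text{Observation Period} & \text{Years} & \text{Num obs } (nq) & q=2 & q=4 & q=8 \\
\hline
\text{5 min} & \text{2011-4} & 241{,}059 & .729\ (-3.39) & .540\ (-5.67) & .411\ (-7.04) \\
 & \text{2015-6} & 187{,}118 & .853\ (-17.04) & .735\ (-22.49) & .670\ (-21.80) \\
\text{1 hour} & \text{2011-4} & 28{,}079 & .791\ (-6.47) & .620\ (-9.06) & .568\ (-8.20) \\
 & \text{2015-6} & 15{,}659 & .940\ (-2.40) & .880\ (-3.85) & .855\ (-3.46) \\
\text{1 day} & \text{2011-4} & 1{,}234 & .977\ (-.27) & .917\ (-.82) & .991\ (-.07) \\
 & \text{2015-6} & 659 & .983\ (-.14) & .869\ (-.96) & .862\ (-.92)
\end{array}
\]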

What conclusions can we draw from this table?

  1. For 1 day periods, the variance ratio does not indicate any non-randomness.
  2. For shorter periods, there is significant non-randomness showing.  In general, it looks like prices are negatively correlated with the prior period (because the variance ratio < 1.0).  At 5 minute intervals, the correlation coefficient is approximately -0.15.
  3. The non-randomness has decreased since 2015: the variance ratios for 2015-6 are closer to 1 than those for 2011-4.
  4. Note that in Lo’s book, (weekly) stock prices have variance ratios > 1 (around 1.3), meaning that price changes were positively correlated, whereas the Bitcoin price changes are generally negatively correlated.
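To see where a correlation figure like the one in item 2 can come from, here is a short worked step using the variance-of-a-sum formula. It assumes successive increments \(d_1, d_2\) have the same variance \(\sigma^2\) and correlation coefficient \(\rho\) (symbols introduced just for this illustration):
\begin{align}
var(d_1+d_2)&=var(d_1)+var(d_2)+2cov(d_1,d_2)\\
&=2\sigma^2(1+\rho)\\
\frac{var(d_1+d_2)}{2\sigma^2}&=1+\rho
\end{align}
So the \(q=2\) variance ratio is approximately \(1+\rho\). For the 5 minute data in 2015-6, a ratio of .853 corresponds to \(\rho \approx -0.147\), in line with the roughly -0.15 quoted above.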

Our next post will look at whether you can profit from this non-randomness.

Notes:

  1. A Random Walk Down Wall Street, 11th ed., Burton G. Malkiel, 2016
  2. A Non-Random Walk Down Wall Street, Andrew W. Lo & A. Craig MacKinlay