1. Home
2. Article Catalog
3. ARMA Modelling with Arbitrary/Non-Consecutive Lags and Exogenous Variables in Python using Sympy: From Matrix Mathematics and Statistics to Parallel Computing

ACADEMIC ARTICLE SERIES: ECONOMICS AND FINANCE INVESTIGATION:

ARMA Modelling with Arbitrary/Non-Consecutive Lags and Exogenous Variables in Python using Sympy: From Matrix Mathematics and Statistics to Parallel Computing: Part 1

Author:
Jonathan Legrand

This article constructs a Python function allowing anyone to discern all Maximum Likelihood Estimates of any ARMA(p,q) model using matrices, using the Python library 'Sympy'. We go through the fundamentals of ARMA models, the Log-Likelihood estimation method, matrix constructions in Sympy, and then put it all together to build our generalised model. In this 1st Part/Draft, we lay the foundations needed before continuing and make one function to encapsulate any ARMA model, with or without arbitrary lags.

## Setting up Sympy Variables, Matrices and First Order Conditions (FOCs)

First we need to import our primary libraries:

    	
import sys
print("This code is running on Python version: " + sys.version)





This code is running on Python version: 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 22:45:29) [MSC v.1916 32 bit (Intel)]

    	
import sympy
print("The Sympy library imported in this code is version: " + sympy.__version__)





The Sympy library imported in this code is version: 1.6.2

    	
# Setting up our innitial arguments:
p, q, c, T, symb_func_list = 1, 1, True, 6, []
Â
if c is True:
Â  Â  _c = 1
Â
# This is how we create Sympy variables:
t, sigma_s = sympy.symbols("t, \hat{\sigma}^2")





Now let's create our $T$x1 regressand matrix, $Y$:

    	
Y = sympy.MatrixSymbol('Y', T-max(p, q)-1, 1)
Y.shape





(4, 1)

Lets evaluate our $Y$ matrix, call it ' Ytm ' - the matrix full of our regressands ($y_{t-i}$), it is called such as it stands for 'Y for t minus' i for each i

    	
Ytm = sympy.Matrix([sympy.Symbol('y_{t-' + str(_t) + '}')
Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  for _t in range(T - max(p, q) - 1)])
Ytm





$\displaystyle \left[\begin{matrix}y_{t-0}\\y_{t-1}\\y_{t-2}\\y_{t-3}\end{matrix}\right]$

Now let's see our $\left[T - max(p,q)\right]$ x $k$ regressor matrix, $X$, where $k = \begin{Bmatrix} p + q & \text{if the model has no constant, c} \\ p + q + 1 & \text{if the model has a constant, c} \end{Bmatrix}$ :

    	
X = sympy.MatrixSymbol('X', T-max(p, q)-1, p+q+c)
X.shape





(4, 3)

Let's evaluate $X$ and name the resulting matrix simillarly to its $Y$ counterpart:

    	
Xtm_list = [[sympy.Symbol('y_{t-' + str(_t - i + p) + '}')
Â  Â  Â  Â  Â  Â  Â for _t in range(max(p, q), T-1)] for i in range(p, 0, -1)]
for j in range(q, 0, -1):
Â  Â  Xtm_list.append([sympy.Symbol('{\epsilon}_{t-' + str(_t - j + q) + '}')
Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â for _t in range(max(p, q), T-1)])
Xtm = sympy.Matrix(Xtm_list).TÂ  # ' .T ' transposes the matrix.
if c is True:
Â  Â  Xtm = Xtm.col_insert(0, sympy.ones(T - max(p, q) - 1, 1))
Xtm





$\displaystyle \left[\begin{matrix}1 & y_{t-1} & {\epsilon}_{t-1}\\1 & y_{t-2} & {\epsilon}_{t-2}\\1 & y_{t-3} & {\epsilon}_{t-3}\\1 & y_{t-4} & {\epsilon}_{t-4}\end{matrix}\right]$

In order to get the inverse of our X'X matrix, its determinant must be non-zero. Let's check our determinant:

    	
(Xtm.T * Xtm).det() == 0




False

Now let's see our $T$x1 stochastic error matrix, $\epsilon$:

    	
Epsilon = sympy.MatrixSymbol('\epsilon', T-max(p,q)-1, 1)
Epsilon.shape





(4, 1)

Now let's see our $k$ x $\left[T - max(p,q)\right]$ regressor matrix, $\beta$, where $k = \begin{Bmatrix} p + q & \text{if the model has no constant, c} \\ p + q + 1 & \text{if the model has a constant, c} \end{Bmatrix}$ :

    	
Beta = sympy.MatrixSymbol('beta', p+q+c, 1)
Beta.shape





(3, 1)

Let's evaluate that $\beta$ matrix with its estimate, $\hat{\beta}$:

    	
coef_list = [sympy.Symbol('\hat{\phi_' + str(i) + '}') for i in range(1, p+1)]
coef_list.extend([sympy.Symbol('\hat{\psi_' + str(j) + '}') for j in range(1, q+1)])
_Beta_hat = sympy.Matrix([coef_list]).T
if c == True:
Â  Â  _Beta_hat = _Beta_hat.row_insert(0, sympy.Matrix([sympy.Symbol('\hat{c}')]))
_Beta_hat





$\displaystyle \left[\begin{matrix}\hat{c}\\\hat{\phi_1}\\\hat{\psi_1}\end{matrix}\right]$

Now let's have a look at our matrix of all estimators, $\Omega$:

    	
Omega = sympy.MatrixSymbol('\Omega', p+q+c+1, 1)
Â
# Let's evaluate it too:
_Omega_hat = _Beta_hat.row_insert(Beta.shape[0], sympy.Matrix([sigma_s]))
_Omega_hat





$\displaystyle \left[\begin{matrix}\hat{c}\\\hat{\phi_1}\\\hat{\psi_1}\\\hat{\sigma}^2\end{matrix}\right]$

We now have all the components for our log likelihood equation remember that we are -this far - keeping the $T$ because we did not normalise the log likelihood; but the T is simply a constant and bears no impact on its maximising. Remember - also - that the superscipt $T$ does not stand for 'the poser of $T$', but 'transpose instead; i.e.: $X^T \neq X$ to the power of $T$.

    	
log_lik = (
Â  Â  (
Â  Â  Â  Â  - (T / 2) * sympy.log(2 * sympy.pi) - (T / 2) * sympy.log(sigma_s)
Â  Â  )*sympy.Identity(1) - (1 / (2 * sigma_s)) * (
Â  Â  Â  Â  (Y - (X * Beta)).T*(Y - (X * Beta))
Â  Â  )
)
display(log_lik)





$\displaystyle - \frac{1}{2} \frac{1}{\hat{\sigma}^2} \left(Y^{T} - \beta^{T} X^{T}\right) \left(- X \beta + Y\right) + \left(- 3.0 \log{\left(\hat{\sigma}^2 \right)} - 3.0 \log{\left(2 \pi \right)}\right) \mathbb{I}$

Let's evaluate our log likelihood equation too:

    	
_log_lik = (
Â  Â  (
Â  Â  Â  Â  -(T / 2) * sympy.log(2 * sympy.pi) - (T / 2) * sympy.log(sigma_s)
Â  Â  )*sympy.Identity(1) - (1 / (2 * sigma_s)) * (
Â  Â  Â  Â  (Ytm - (Xtm * _Beta_hat)).T * (Ytm - (Xtm * _Beta_hat))
Â  Â  )
)
display(_log_lik)





$\displaystyle \left(- 3.0 \log{\left(\hat{\sigma}^2 \right)} - 3.0 \log{\left(2 \pi \right)}\right) \mathbb{I} + \left[\begin{matrix}- \frac{\left(- \hat{\phi_1} y_{t-1} - \hat{\psi_1} {\epsilon}_{t-1} - \hat{c} + y_{t-0}\right)^{2} + \left(- \hat{\phi_1} y_{t-2} - \hat{\psi_1} {\epsilon}_{t-2} - \hat{c} + y_{t-1}\right)^{2} + \left(- \hat{\phi_1} y_{t-3} - \hat{\psi_1} {\epsilon}_{t-3} - \hat{c} + y_{t-2}\right)^{2} + \left(- \hat{\phi_1} y_{t-4} - \hat{\psi_1} {\epsilon}_{t-4} - \hat{c} + y_{t-3}\right)^{2}}{2 \hat{\sigma}^2}\end{matrix}\right]$

As per the above, our FOC $\hat{\beta}$ is $(X' X)^{-1} X'Y$:

    	
FOC_Beta_hat = (X.T * X).inverse() * X.T * Y
FOC_Beta_hat





$\displaystyle \left(X^{T} X\right)^{-1} X^{T} Y$

We can evaluate it:

    	
_FOC_Beta_hat = FOC_Beta_hat.subs(Y, Ytm)
_FOC_Beta_hat = _FOC_Beta_hat.subs(X, Xtm)
_FOC_Beta_hat





$\displaystyle \left(\left(\left[\begin{matrix}1 & y_{t-1} & {\epsilon}_{t-1}\\1 & y_{t-2} & {\epsilon}_{t-2}\\1 & y_{t-3} & {\epsilon}_{t-3}\\1 & y_{t-4} & {\epsilon}_{t-4}\end{matrix}\right]\right)^{T} \left[\begin{matrix}1 & y_{t-1} & {\epsilon}_{t-1}\\1 & y_{t-2} & {\epsilon}_{t-2}\\1 & y_{t-3} & {\epsilon}_{t-3}\\1 & y_{t-4} & {\epsilon}_{t-4}\end{matrix}\right]\right)^{-1} \left(\left[\begin{matrix}1 & y_{t-1} & {\epsilon}_{t-1}\\1 & y_{t-2} & {\epsilon}_{t-2}\\1 & y_{t-3} & {\epsilon}_{t-3}\\1 & y_{t-4} & {\epsilon}_{t-4}\end{matrix}\right]\right)^{T} \left[\begin{matrix}y_{t-0}\\y_{t-1}\\y_{t-2}\\y_{t-3}\end{matrix}\right]$

    	
_sol_Beta_hat = (Xtm.T * Xtm)**(-1) * Xtm.T * YtmÂ  # Note that with T < 5, the rank of the X'X matrix is too low and its determinant is 0; thus it would not be invertible. You can verify this with ' (Xtm.T * Xtm).rank() '




Now let's have a look at our Score matrix:

    	
Score = log_lik.diff(Omega)
Score





$\displaystyle \frac{\partial}{\partial \Omega} \left(- \frac{1}{2} \frac{1}{\hat{\sigma}^2} \left(Y^{T} - \beta^{T} X^{T}\right) \left(- X \beta + Y\right) + \left(- 3.0 \log{\left(\hat{\sigma}^2 \right)} - 3.0 \log{\left(2 \pi \right)}\right) \mathbb{I}\right)$

Let's evaluate it too:

    	
_Score = _log_lik.diff(_Omega_hat)
_Score





$\displaystyle \left[\begin{matrix}\left[\begin{matrix}- \frac{2 \hat{\phi_1} y_{t-1} + 2 \hat{\phi_1} y_{t-2} + 2 \hat{\phi_1} y_{t-3} + 2 \hat{\phi_1} y_{t-4} + 2 \hat{\psi_1} {\epsilon}_{t-1} + 2 \hat{\psi_1} {\epsilon}_{t-2} + 2 \hat{\psi_1} {\epsilon}_{t-3} + 2 \hat{\psi_1} {\epsilon}_{t-4} + 8 \hat{c} - 2 y_{t-0} - 2 y_{t-1} - 2 y_{t-2} - 2 y_{t-3}}{2 \hat{\sigma}^2}\end{matrix}\right] + \mathbb{0}\\\left[\begin{matrix}- \frac{- 2 y_{t-1} \left(- \hat{\phi_1} y_{t-1} - \hat{\psi_1} {\epsilon}_{t-1} - \hat{c} + y_{t-0}\right) - 2 y_{t-2} \left(- \hat{\phi_1} y_{t-2} - \hat{\psi_1} {\epsilon}_{t-2} - \hat{c} + y_{t-1}\right) - 2 y_{t-3} \left(- \hat{\phi_1} y_{t-3} - \hat{\psi_1} {\epsilon}_{t-3} - \hat{c} + y_{t-2}\right) - 2 y_{t-4} \left(- \hat{\phi_1} y_{t-4} - \hat{\psi_1} {\epsilon}_{t-4} - \hat{c} + y_{t-3}\right)}{2 \hat{\sigma}^2}\end{matrix}\right] + \mathbb{0}\\\left[\begin{matrix}- \frac{- 2 {\epsilon}_{t-1} \left(- \hat{\phi_1} y_{t-1} - \hat{\psi_1} {\epsilon}_{t-1} - \hat{c} + y_{t-0}\right) - 2 {\epsilon}_{t-2} \left(- \hat{\phi_1} y_{t-2} - \hat{\psi_1} {\epsilon}_{t-2} - \hat{c} + y_{t-1}\right) - 2 {\epsilon}_{t-3} \left(- \hat{\phi_1} y_{t-3} - \hat{\psi_1} {\epsilon}_{t-3} - \hat{c} + y_{t-2}\right) - 2 {\epsilon}_{t-4} \left(- \hat{\phi_1} y_{t-4} - \hat{\psi_1} {\epsilon}_{t-4} - \hat{c} + y_{t-3}\right)}{2 \hat{\sigma}^2}\end{matrix}\right] + \mathbb{0}\\- 3.0 \frac{1}{\hat{\sigma}^2} \mathbb{I} + \left[\begin{matrix}\frac{\left(- \hat{\phi_1} y_{t-1} - \hat{\psi_1} {\epsilon}_{t-1} - \hat{c} + y_{t-0}\right)^{2} + \left(- \hat{\phi_1} y_{t-2} - \hat{\psi_1} {\epsilon}_{t-2} - \hat{c} + y_{t-1}\right)^{2} + \left(- \hat{\phi_1} y_{t-3} - \hat{\psi_1} {\epsilon}_{t-3} - \hat{c} + y_{t-2}\right)^{2} + \left(- \hat{\phi_1} y_{t-4} - \hat{\psi_1} {\epsilon}_{t-4} - \hat{c} + y_{t-3}\right)^{2}}{2 \left(\hat{\sigma}^2\right)^{2}}\end{matrix}\right]\end{matrix}\right]$

From there, we cound rearage to get - for e.g. - $\sigma^2$:

    	
sympy.solve(_Score[-1][0], sigma_s)[sigma_s]




$\displaystyle 0.166666666666667 \hat{\phi_1}^{2} y_{t-1}^{2} + 0.166666666666667 \hat{\phi_1}^{2} y_{t-2}^{2} + 0.166666666666667 \hat{\phi_1}^{2} y_{t-3}^{2} + 0.166666666666667 \hat{\phi_1}^{2} y_{t-4}^{2} + 0.333333333333333 \hat{\phi_1} \hat{\psi_1} y_{t-1} {\epsilon}_{t-1} + 0.333333333333333 \hat{\phi_1} \hat{\psi_1} y_{t-2} {\epsilon}_{t-2} + 0.333333333333333 \hat{\phi_1} \hat{\psi_1} y_{t-3} {\epsilon}_{t-3} + 0.333333333333333 \hat{\phi_1} \hat{\psi_1} y_{t-4} {\epsilon}_{t-4} + 0.333333333333333 \hat{\phi_1} \hat{c} y_{t-1} + 0.333333333333333 \hat{\phi_1} \hat{c} y_{t-2} + 0.333333333333333 \hat{\phi_1} \hat{c} y_{t-3} + 0.333333333333333 \hat{\phi_1} \hat{c} y_{t-4} - 0.333333333333333 \hat{\phi_1} y_{t-0} y_{t-1} - 0.333333333333333 \hat{\phi_1} y_{t-1} y_{t-2} - 0.333333333333333 \hat{\phi_1} y_{t-2} y_{t-3} - 0.333333333333333 \hat{\phi_1} y_{t-3} y_{t-4} + 0.166666666666667 \hat{\psi_1}^{2} {\epsilon}_{t-1}^{2} + 0.166666666666667 \hat{\psi_1}^{2} {\epsilon}_{t-2}^{2} + 0.166666666666667 \hat{\psi_1}^{2} {\epsilon}_{t-3}^{2} + 0.166666666666667 \hat{\psi_1}^{2} {\epsilon}_{t-4}^{2} + 0.333333333333333 \hat{\psi_1} \hat{c} {\epsilon}_{t-1} + 0.333333333333333 \hat{\psi_1} \hat{c} {\epsilon}_{t-2} + 0.333333333333333 \hat{\psi_1} \hat{c} {\epsilon}_{t-3} + 0.333333333333333 \hat{\psi_1} \hat{c} {\epsilon}_{t-4} - 0.333333333333333 \hat{\psi_1} y_{t-0} {\epsilon}_{t-1} - 0.333333333333333 \hat{\psi_1} y_{t-1} {\epsilon}_{t-2} - 0.333333333333333 \hat{\psi_1} y_{t-2} {\epsilon}_{t-3} - 0.333333333333333 \hat{\psi_1} y_{t-3} {\epsilon}_{t-4} + 0.666666666666667 \hat{c}^{2} - 0.333333333333333 \hat{c} y_{t-0} - 0.333333333333333 \hat{c} y_{t-1} - 0.333333333333333 \hat{c} y_{t-2} - 0.333333333333333 \hat{c} y_{t-3} + 0.166666666666667 y_{t-0}^{2} + 0.166666666666667 y_{t-1}^{2} + 0.166666666666667 y_{t-2}^{2} + 0.166666666666667 y_{t-3}^{2}$

### Step 1: Compute $\epsilon_{max(p,q)+p}$

#### In Scalar Terms

Since we need the first $p$ and $q$ values for $y$ for our first modelled $\hat{y}$, we need to start at $\hat{y}_{max(p,q)+1}$ carrying on for $p$ steps (i.e.: from $max(p,q)+1$ to $max(p,q)+p$) such that for each $1 \leq i \leq p$:

$$\label{eq:3} \hat{y}_{max(p,q)+i} = \hat{c} + \left(\sum_{k=1}^{p} \hat{\phi_k} y_{max(p,q)+i-k}\right) + \left( \sum_{j=1}^{q} \hat{\psi_j} \epsilon_{max(p,q)+i-j} \right) \tag{3}$$ and \begin{align} y_{max(p,q)+i} &= c + \left(\sum_{k=1}^{p}\phi_k y_{max(p,q)+i-k}\right) + \epsilon_{max(p,q)+i} + \left( \sum_{j=1}^{q} \psi_j \epsilon_{max(p,q)+i-j} \right) \\ \label{eq:4} &= c + \left( \sum_{k=1}^{p}\phi_k y_{max(p,q)+i-k} \right)+ \epsilon_{max(p,q)+i} \tag{4} \end{align} since $\epsilon_1, \epsilon_2, ..., \epsilon_{max(p,q)+p-1} = 0$. Therefore, the optimal values of $c, \phi_1, \phi_2, ..., \phi_p$ minimising $\epsilon_{max(p,q)+p} = \hat{y}_{max(p,q)+p} - y_{max(p,q)+p}$ are the ones abiding the $p$ FOC equations. Remember that this is the whole point of the FOC equations; they were deducted from our MLE method to compute estimates ($\hat{\Theta}$) that would maximise the likelyness of observing the regressands we have in our data ($Y$), which equates to the estimates reducing our error term ($\epsilon_{max(p,q)+p}$). Letting $m = max(p,q)$, the FOC equations in question are: $$\begin{array}{ll} 0 &= \sum_{t=m+1}^{m + p}{ y_{t-i} \left[- y_t + \widehat{c} + { \left( \sum_{k=1}^{p} \widehat{\phi_k} y_{t-k} \right) } + \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right]} \\ &\label{eq:5}= \sum_{t=m+1}^{m + p}{ y_{t-i} \left[- y_t + \widehat{c} + { \left( \sum_{k=1}^{p} \widehat{\phi_k} y_{t-k} \right) } \right]} \\ \tag{5} \end{array}$$ for each $1 \leq i \leq p$, where $$\begin{array}{ll} \widehat{c} &= - \frac{1}{m + p} \sum_{t=m+1}^{m + p}{ \left[ - y_t + \left( \sum_{r=1}^{p} \widehat{\phi_r} y_{t-r} \right) + \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } \\ \label{eq:6} &= - \frac{1}{m + p} \sum_{t=m+1}^{m + p}{ \left[ - y_t + \left( \sum_{r=1}^{p} \widehat{\phi_r} y_{t-r} \right) \right] \tag{6} } \end{array}$$ This - as per \eqref{eq:5} - equates to - for each $1 \leq i \leq p$: $$\begin{array}{ll} \label{eq:7} 0 &= \sum_{t=m+1}^{m + p}{ y_{t-i} \begin{Bmatrix}- y_t - \frac{1}{m + p} \left[ \sum_{\tau=m+1}^{m + p}{ - y_{\tau} + \left( \sum_{r=1}^{p} \widehat{\phi_r} y_{\tau-r} \right) }\right] + { \left( \sum_{k=1}^{p} \widehat{\phi_k} y_{t-k} \right) } \end{Bmatrix}} \tag{7} \\ \end{array}$$ From equations \eqref{eq:3} to \eqref{eq:7}, the usual $i$ denoting elements related to the AR part of our model was replaced with $k$ and $r$ here to eleviate from possible confusion as to which $i$ is used where, similarly to the substitution of $t$ and $\tau$. Now that we have our estimate values, we can compute $\hat{y}_{max(p,q)+p} = \hat{c} + \left( \sum_{i=1}^{p}\hat{\phi_i} y_{max(p,q)+p-i} \right)$, from which we can compute $\epsilon_{max(p,q)+p} = \hat{y}_{max(p,q)+p} - y_{max(p,q)+p}$,*i.e*: $$\epsilon_{m+p} = \hat{y}_{m+p} - y_{m+p}$$

Example 1.scalar: ARMA(1,1) with constant: Step 1:

For an ARMA(1,1) with constant model, $p = 1$ and $q = 1$. Thus, as per the above, we let $\epsilon_1 = \epsilon_2 = ... = \epsilon_{m+p-1} = 0$, which is to say $\epsilon_1 = 0$, and $m = max(p,q) = 1$ such that: $$\begin{array}{ll} & y_2 &= c + \phi_1 y_1 + \epsilon_2 + \psi_1 \epsilon_1 & \\ & &= c + \phi_1 y_1 + \epsilon_2 & \text{ since } \epsilon_1 = 0 \\ => & \hat{y_2} &= \hat{c} + \hat{\phi_1} y_1 \\ \text{and} & \epsilon_2 &= \hat{y_2} - y_2 \end{array}$$ Furthermore, as per \eqref{eq:7}: $$\begin{array}{ll} & 0 &= y_1 \begin{Bmatrix} - y_2 - \frac{1}{2} \left[- y_2 + \left( \widehat{\phi_1} y_1 \right) \right] + \left( \widehat{\phi_1} y_1 \right) \end{Bmatrix} \\ & &= \frac{y_1}{2} \begin{Bmatrix} \widehat{\phi_1} y_1 - y_2 \end{Bmatrix} \\ => & y_2 &= \hat{\phi_1} y_1 \\ => & \hat{\phi_1} &= \frac{y_2}{y_1} \end{array}$$ and as per \eqref{eq:6}: $$\begin{array}{ll} & \hat{c} &= -\frac{1}{2} ( - y_2 + \hat{\phi} y_1) \\ => & \hat{c} &= -\frac{1}{2} ( - y_2 + \frac{y_2}{y_1} y_1) & \text{ since } \hat{\phi_1} = \frac{y_2}{y_1}\\ & &= 0 \end{array}$$ Now: $$\begin{array}{ll} \epsilon_2 &= \hat{y_2} - y_2 \\ &= \hat{c} + \hat{\phi_1} y_1 - y_2 \\ &= 0 + \frac{y_2}{y_1} y_1 - y_2 \\ &= y_2 - y_2 \\ &= 0 \end{array}$$ It is natural for $c$ and $\epsilon$ of an ARMA(1,1) to originally be computed as 0, we hope it to change into a useful value as we go up the steps. **Note** that the result here is the same with a model without a constant.

#### In Matrix Terms:

In matrix term, \eqref{eq:4} boils down to

$$\begin{bmatrix} y_{m + 1} \\ y_{m + 2} \\ ... \\ y_{m + p} \end{bmatrix} = \begin{bmatrix} 1 & y_{m} & y_{m-1} & ... & y_{m-p+1} & \epsilon_{m} & \epsilon_{m-1} & ... & \epsilon_{m-q+1} \\ 1 & y_{m+1} & y_{m} & ... & y_{m-p+2} & \epsilon_{m+1} & \epsilon_{m} & ... & \epsilon_{m-q+2} \\ ... & ... & ... & ... & ... & ... & ... & ... & ... \\ 1 & y_{m+p-1} & y_{m+p-2} & ... & y_{m} & \epsilon_{m+p-1} & \epsilon_{m+p-2} & ... & \epsilon_{m} \end{bmatrix} \begin{bmatrix} c \\ \phi_1 \\ ... \\ \phi_p \\ \psi_1 \\ ... \\ \psi_q \end{bmatrix} + \begin{bmatrix} \epsilon_{m + 1} \\ \epsilon_{m + 2} \\ ... \\ \epsilon_{m + p} \end{bmatrix}$$

which, since $\epsilon_1, \epsilon_2, ..., \epsilon_{m+p-1} = 0$, equates to:

$$\begin{bmatrix} y_{m + 1} \\ y_{m + 2} \\ ... \\ y_{m + p} \end{bmatrix} = \begin{bmatrix} 1 & y_{m} & y_{m-1} & ... & y_{m-p+1} \\ 1 & y_{m+1} & y_{m} & ... & y_{m-p+2} \\ ... & ... & ... & ... & ... \\ 1 & y_{m+p-1} & y_{m+p-2} & ... & y_{m} \end{bmatrix} \begin{bmatrix} c \\ \phi_1 \\ ... \\ \phi_p \end{bmatrix} + \begin{bmatrix} 0 \\ ... \\ 0 \\ \epsilon_{m + p} \end{bmatrix}$$

Then - as per \eqref{eq:3}: $$\begin{bmatrix} \hat{y}_{m + 1} \\ \hat{y}_{m + 2} \\ ... \\ \hat{y}_{m + p} \end{bmatrix} = \begin{bmatrix} 1 & y_{m} & y_{m-1} & ... & y_{m-p+1} \\ 1 & y_{m+1} & y_{m} & ... & y_{m-p+2} \\ ... & ... & ... & ... & ... \\ 1 & y_{m+p-1} & y_{m+p-2} & ... & y_{m} \end{bmatrix} \begin{bmatrix} \hat{c} \\ \hat{\phi_1} \\ ... \\ \hat{\phi_p} \end{bmatrix}$$

Thus, letting $\hat{Y} = \begin{bmatrix} \hat{y}_{m + 1} \\ \hat{y}_{m + 2} \\ ... \\ \hat{y}_{m + p} \end{bmatrix}$, $X = \begin{bmatrix} 1 & y_{m} & y_{m-1} & ... & y_{m-p+1} \\ 1 & y_{m+1} & y_{m} & ... & y_{m-p+2} \\ ... & ... & ... & ... & ... \\ 1 & y_{m+p-1} & y_{m+p-2} & ... & y_{m} \end{bmatrix}$ and $\hat{\bf{\beta}} = \begin{bmatrix} \hat{c} \\ \hat{\phi_1} \\ \hat{\phi_2} \\ ... \\ \hat{\phi_p} \end{bmatrix}$, we can write $\hat{Y} = \hat{X} \hat{\bf{\beta}}$ and our FOC equations equate to $\hat{\bf{\beta}} = (X'X)^{-1} X' \hat{Y}$ which - using actual $y$ values (i.e.: $Y$ as opposed to $\hat{Y}$) - renders: $\hat{\bf{\beta}} = (X'X)^{-1} X' Y$.

We revert back to scalars: now that we have our estimate values, we can compute $\hat{y}_{max(p,q)+p} = \hat{c} + \left( \sum_{i=1}^{p}\hat{\phi_i} y_{max(p,q)+p-i} \right)$, from which we can compute $\epsilon_{max(p,q)+p} = \hat{y}_{max(p,q)+p} - y_{max(p,q)+p}$,i.e: $$\epsilon_{m+p} = \hat{y}_{m+p} - y_{m+p}$$

    	
# Import relevent Libraries
import datetime
import numpy as np
import pandas as pd
Â
# Use Eikon in Python
import eikon as ek # Eikon or Refinitiv Workspace must run in the background for this library to work.
eikon_key = open("eikon.txt","r") # This will fetch my Eikon app_key stored in a text file alongside my code.
eikon_key.close() # We must close the .txt file we opened.
Â
for w,z in zip(["numpy", "pandas", "eikon"], [np, pd, ek]):
Â  Â  print("The " + w + " library imported in this code is version: " + z.__version__)





The numpy library imported in this code is version: 1.21.2
The pandas library imported in this code is version: 1.2.4
The eikon library imported in this code is version: 1.1.8

    	
# Collect real world data for our example (for more info. see 'Part 2' bellow)
SPX = ek.get_data(instruments=".SPX",Â  # Looking at the S&P 500, thus the ' SPX '; indecies are preceded by a full stop ' . '.
Â  Â  Â  Â  Â  Â  Â  Â  Â  fields=["TR.CLOSEPRICE.timestamp",
Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  "TR.CLOSEPRICE"],
Â  Â  Â  Â  Â  Â  Â  Â  Â  parameters={'SDate': '2018-01-01',
Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  'EDate': '2019-01-01',
Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  Â  'FRQ': 'D'}
Â  Â  Â  Â  Â  Â  Â  Â  Â )[0]Â  # ' ek.get_data ' returns 2 objects, the 0th is the data-frame that we're interested in, thus the ' [0] '.
SPX = SPX.dropna()Â  # We want to remove all non-business days, thus the ' .dropna() '.




    	
# Import the relevent plotting libraries:
Â
import plotly
import plotly.express
from plotly.graph_objs import *
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
Â
init_notebook_mode(connected = False)
Â
# cufflinks.set_config_file(offline = True, world_readable = True)
Â
# seaborn is needed to plot plotly line graphs from pandas data-frames
import seaborn
Â
import matplotlib
import matplotlib.pyplot as pltÂ  # the use of ' as ... ' (specifically here: ' as plt ') allows us to create a shorthand for a module (here: ' matplotlib.pyplot ')
%matplotlib inline
Â
Â
import chart_studio
chart_studio.tools.set_config_file(
Â  Â  plotly_domain='https://chart-studio.plotly.com',
Â  Â  plotly_api_domain='https://plotly.com')
Â
for i,j in zip(["plotly", "cufflinks", "seaborn", "matplotlib"], [plotly, cufflinks, seaborn, matplotlib]):
Â  Â  print("The " + str(i) + " library imported in this code is version: " + j.__version__)





The plotly library imported in this code is version: 4.14.3
The cufflinks library imported in this code is version: 0.17.3
The seaborn library imported in this code is version: 0.11.1
The matplotlib library imported in this code is version: 3.2.1

    	
pd.DataFrame(
Â  Â  data=list(SPX["Close Price"]),
Â  Â  index=SPX["Timestamp"]).iplot(
Â  Â  title="S&P500's Close Prices",
Â  Â  theme="solar",Â  # The theme ' solar ' is what gives the graph its dark theme.
Â  Â  yTitle="Nominal $", Â Â xTitle='Date')     import statsmodels from statsmodels.tsa.stattools import adfuller print("The statsmodels library imported in this code is version: " + statsmodels.__version__)     def adf_table(series=SPX["Close Price"]): Â Â """This function returns a Pandas data-frame with the results of the Â Â Augmented Dickeyâ€“Fuller test from the statsmodels library. Â Â Â Arguments: Â Â Â series (Pandas series): The series of numbers on which the test will be performed Â Â Â Dependencies: Â Â Â pandas 1.1.4 Â Â numpy 1.18.5 Â Â statsmodels 0.11.1 Â Â """ Â Â Â _adf_stationarity_test = adfuller(x = series) Â Â Â columns = [ Â Â Â Â ("ADF test statistic", ""),Â # The ', ""' is there for the Pandas MultiIndex Â Â Â Â ("MacKinnon's approximate p-value", ""), Â Â Â Â ("The number of lags used", ""), Â Â Â Â ("Number of observations used for the ADF regression and calculation of the critical values", ""), Â Â Â Â ("Critical values for the test statistic", '1%'), Â Â Â Â ("Critical values for the test statistic", '5%'), Â Â Â Â ("Critical values for the test statistic", '10%'), Â Â Â Â ("A dummy class with results attached as attributes", "")] Â Â Â columns = pd.MultiIndex.from_tuples(columns) Â Â Â _adf_stationarity_test = ( Â Â Â Â list(_adf_stationarity_test[0:4]) + Â Â Â Â [_adf_stationarity_test[4][i] for i in _adf_stationarity_test[4]] + Â Â Â Â [_adf_stationarity_test[5]]) Â Â Â adf_stationarity_test = pd.DataFrame( Â Â Â Â data=np.array(_adf_stationarity_test)[np.newaxis], Â Â Â Â columns=columns) Â Â Â return adf_stationarity_test     adf_table(series=SPX["Close Price"])     SPX["1d Close Price"] = SPX["Close Price"].diff()Â # ' 1d ' for the 1st lag's difference. SPX    Instrument Timestamp Close Price 1d Close Price 1 .SPX 2018-01-02T00:00:00Z 2695.81 2 .SPX 2018-01-03T00:00:00Z 2713.06 17.25 3 .SPX 2018-01-04T00:00:00Z 2723.99 10.93 4 .SPX 2018-01-05T00:00:00Z 2743.15 19.16 5 .SPX 2018-01-08T00:00:00Z 2747.71 4.56 ... ... ... ... ... 247 .SPX 2018-12-24T00:00:00Z 2351.1 -65.52 248 .SPX 2018-12-26T00:00:00Z 2467.7 116.6 249 .SPX 2018-12-27T00:00:00Z 2488.83 21.13 250 .SPX 2018-12-28T00:00:00Z 2485.74 -3.09 251 .SPX 2018-12-31T00:00:00Z 2506.85 21.11   adf_table(series=SPX["1d Close Price"].dropna())    ADF test statistic MacKinnon's approximate p-value The number of lags used Number of observations used for the ADF regression and calculation of the critical values Critical values for the test statistic A dummy class with results attached as attributes 1% 5% 10% 0 -15.63264 1.68E-28 0 249 -3.456888 -2.873219 -2.572994 2240.947043   # for ease, let's remove our 1st observation as it includes an 'NaN' value: SPX = SPX.iloc[1:].reset_index() SPX_Y_1 = sympy.Matrix([SPX["1d Close Price"].iloc[2], Â Â Â Â Â Â Â Â Â Â Â Â SPX["1d Close Price"].iloc[3]])Â # Remember that Python tends to be 0 indexed, so the 2nd element of our (1 indexed) list is the 1st (of its 0 indexed one) Â SPX_X_1 = sympy.Matrix([[1, SPX["1d Close Price"].iloc[1], Â Â Â Â Â Â Â Â Â Â Â Â Â SPX["1d Close Price"].iloc[0]], Â Â Â Â Â Â Â Â Â Â Â Â [1, SPX["1d Close Price"].iloc[2], Â Â Â Â Â Â Â Â Â Â Â Â Â SPX["1d Close Price"].iloc[1]]]) Â SPX_FOC_Beta_hat_1 = (((SPX_X_1.T * SPX_X_1)**(-1)) * Â Â Â Â Â Â Â Â Â Â Â SPX_X_1.T * SPX_Y_1)     newX_1 = sympy.Matrix([1, SPX["1d Close Price"].iloc[2], Â Â Â Â Â Â Â Â Â Â Â Â SPX["1d Close Price"].iloc[1]]).T SPX_epsilon4 = ( Â Â newX_1 * SPX_FOC_Beta_hat_1)[0] - SPX["1d Close Price"].iloc[3]Â # The ' [0] ' is needed here to pick out the scalar out of the 1x1 matrix.   $$\begin{array}{ll} 0 &= \sum_{t=m+1}^{T}{ y_{t-i} \begin{Bmatrix} - y_t - \frac{1}{T} \left[ \sum_{\tau=m+1}^{T}{ - y_{\tau} + \left( \sum_{r=1}^{p} \widehat{\phi_r} y_{\tau-r} \right) }\right] + { \left( \sum_{k=1}^{p} \widehat{\phi_k} y_{t-k} \right) } \end{Bmatrix}} & \\ &= \label{eq:8} \sum_{t=m+1}^{m+p+1}{ y_{t-i} \begin{Bmatrix} - y_t - \frac{1}{m+p+1} \left[ \sum_{\tau=m+1}^{m+p+1}{ - y_{\tau} + \left( \sum_{r=1}^{p} \widehat{\phi_r} y_{\tau-r} \right) }\right] + { \left( \sum_{k=1}^{p} \widehat{\phi_k} y_{t-k} \right) } \end{Bmatrix}} & \text{ since } T = m+p+1 \tag{8} \\ &= \sum_{t=m+1}^{m+1+1}{ y_{t-1} \begin{Bmatrix} - y_t - \frac{1}{m+1+1} \left[ \sum_{\tau=m+1}^{m+1+1}{ - y_{\tau} + \left( \sum_{r=1}^{1} \widehat{\phi_r} y_{\tau-r} \right) }\right] + { \left( \sum_{k=1}^{1} \widehat{\phi_k} y_{t-k} \right) } \end{Bmatrix}} & \text{ since } p = 1 \text{ and thus } i = 1 \\ &= \sum_{t=1+1}^{1+1+1}{ y_{t-1} \begin{Bmatrix} - y_t - \frac{1}{1+1+1} \left[ \sum_{\tau=1+1}^{1+1+1}{ - y_{\tau} + \left( \sum_{r=1}^{1} \widehat{\phi_r} y_{\tau-r} \right) }\right] + { \left( \sum_{k=1}^{1} \widehat{\phi_k} y_{t-k} \right) } \end{Bmatrix}} & \text{ since } m = 1 \\ &= \sum_{t=2}^{3}{ y_{t-1} \begin{Bmatrix} - y_t - \frac{1}{3} \left[ \sum_{\tau=2}^{3}{ - y_{\tau} + \left( \sum_{r=1}^{1} \widehat{\phi_r} y_{\tau-r} \right) }\right] + { \left( \sum_{k=1}^{1} \widehat{\phi_k} y_{t-k} \right) } \end{Bmatrix}} & \\ &= y_1 \begin{Bmatrix} - y_2 - \frac{1}{3} \left[ - y_2 + \left( \widehat{\phi_1} y_1 \right) - y_3 + \left( \widehat{\phi_1} y_2 \right) \right] + \left( \widehat{\phi_1} y_1 \right) \end{Bmatrix} \\ & \text{ } + y_2 \begin{Bmatrix} - y_3 - \frac{1}{3} \left[ - y_2 + \left( \widehat{\phi_1} y_1 \right) - y_3 + \left( \widehat{\phi_1} y_2 \right) \right] + \left( \widehat{\phi_1} y_{2} \right) \end{Bmatrix} \end{array}$$   y1, y2, y3, phi1 = sympy.symbols("y_1, y_2, y_3, \phi_1") equation = y1*(-y2 - ((1/3)*(-y2 + phi1*y1 - y3 + phi1*y2)) + phi1*y1 Â Â Â Â Â Â Â ) + y2*(-y3 - ((1/3)*(-y2 + phi1*y1 - y3 + phi1*y2)) + phi1*y2) phi_1_equation = sympy.solve(equation, phi1)[0] phi_1_equation  $\displaystyle \frac{y_{1} y_{2} - 0.5 y_{1} y_{3} - 0.5 y_{2}^{2} + y_{2} y_{3}}{y_{1}^{2} - y_{1} y_{2} + y_{2}^{2}}$Thus$\hat{\phi_1} = \frac{y_1 y_2 - 0.5 y_1 y_3 - 0.5 {y_2}^2 + y_2 y_3}{{y_1}^2 - y_1 y_2 + {y_2}^2}$and, as per (6) which becomes $$\begin{array}{ll} \widehat{c} &= - \frac{1}{T} \sum_{t=m+1}^{T}{ \left[ - y_t + \left( \sum_{r=1}^{p} \widehat{\phi_r} y_{t-r} \right) + \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } & \\ &= - \frac{1}{3} \sum_{t=2}^{3}{ \left[ - y_t + \left( \sum_{r=1}^{1} \widehat{\phi_r} y_{t-r} \right) + \left( \sum_{j=1}^{1} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } & \text{ since } p=q=m=1 \text{ and thus } T = m+p+1 = 3 \\ &= - \frac{1}{3} \sum_{t=2}^{3}{ \left[ - y_t + \left(\widehat{\phi_1} y_{t-1} \right) \right] } & \text{ since } \epsilon_1 = \epsilon_2 = 0 \\ &= - \frac{1}{3} \begin{Bmatrix} \left[ - y_2 + \left(\widehat{\phi_1} y_1 \right) \right] + \left[ - y_3 + \left(\widehat{\phi_1} y_2 \right) \right] \end{Bmatrix} & \\ &= - \frac{1}{3} \left[ \widehat{\phi_1} \left( y_1 + y_2 \right) - y_2 - y_3 \right] & \end{array}$$   c_hat_equation = (-1/3)*(phi_1_equation*(y1+y2)-y2-y3) c_hat_equation  $\displaystyle 0.333333333333333 y_{2} + 0.333333333333333 y_{3} - \frac{0.333333333333333 \left(y_{1} + y_{2}\right) \left(y_{1} y_{2} - 0.5 y_{1} y_{3} - 0.5 y_{2}^{2} + y_{2} y_{3}\right)}{y_{1}^{2} - y_{1} y_{2} + y_{2}^{2}}$Thus$\hat{c} = \frac{y_2 + y_3}{3} - \frac{y_1 + y_2}{3} \frac{y_1 y_2 - 0.5 y_1 y_3 - 0.5 {y_2}^2 + y_2 y_3}{y_1^2 - y_1 y_2 + y_2^2}$Now:$\epsilon_3 = \hat{y_3} - y_3 = \hat{c} + \hat{\phi_1} y_2 + \hat{\psi_1} \epsilon_2 - y_3$  y_hat_3 = c_hat_equation + phi_1_equation * y2 epsilon_3 = y_hat_3 - y3 epsilon_3  $\displaystyle 0.333333333333333 y_{2} + \frac{y_{2} \left(y_{1} y_{2} - 0.5 y_{1} y_{3} - 0.5 y_{2}^{2} + y_{2} y_{3}\right)}{y_{1}^{2} - y_{1} y_{2} + y_{2}^{2}} - 0.666666666666667 y_{3} - \frac{0.333333333333333 \left(y_{1} + y_{2}\right) \left(y_{1} y_{2} - 0.5 y_{1} y_{3} - 0.5 y_{2}^{2} + y_{2} y_{3}\right)}{y_{1}^{2} - y_{1} y_{2} + y_{2}^{2}}$  y_hat_3_evaluated = y_hat_3.subs(y1, SPX["1d Close Price"].iloc[0] Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â ).subs(y2, SPX["1d Close Price"].iloc[1] Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â ).subs(y3, SPX["1d Close Price"].iloc[2]) epsilon_3_evaluated = y_hat_3_evaluated - SPX["1d Close Price"].iloc[2]Â # Remember, y3 is the 2nd SPX_close_prices element since the latter is 0 indexed. epsilon_3_evaluated   âˆ’7.96667082813824   SPX_Y_2 = sympy.Matrix([SPX["1d Close Price"].iloc[i] for i in [2, 3, 4]])Â # Remember that Python tends to be 0 indexed, so the 2nd element of our (1 indexed) list is the 1st (of its 0 indexed one) SPX_X_2 = sympy.Matrix( Â Â [[1] + [SPX["1d Close Price"].iloc[i] for i in [1, 0]] + [0],Â # ' [1] + ' inserts the integer ' 1 ' at the start of the list; similarly to ' [0] ' a the end. Â Â Â [1] + [SPX["1d Close Price"].iloc[i] for i in [2, 1]] + [0], Â Â Â [1] + [SPX["1d Close Price"].iloc[i] for i in [3, 2]] + [SPX_epsilon4]]) SPX_FOC_Beta_hat_2 = ((SPX_X_2.T * SPX_X_2)**(-1)) * SPX_X_2.T * SPX_Y_2 newX_2 = sympy.Matrix([1] + [SPX["1d Close Price"].iloc[i] for i in [3, 2]] + [SPX_epsilon4]).T SPX_epsilon5 = (newX_2 * SPX_FOC_Beta_hat_2)[0] - SPX["1d Close Price"].iloc[4]Â # The ' [0] ' is needed here to pick out the scalar out of the 1x1 matrix. SPX_epsilon5   âˆ’32.3306317768573   SPX    index Instrument Timestamp Close Price 1d Close Price 0 2 .SPX 2018-01-03T00:00:00Z 2713.06 17.25 1 3 .SPX 2018-01-04T00:00:00Z 2723.99 10.93 2 4 .SPX 2018-01-05T00:00:00Z 2743.15 19.16 3 5 .SPX 2018-01-08T00:00:00Z 2747.71 4.56 4 6 .SPX 2018-01-09T00:00:00Z 2751.29 3.58 ... ... ... ... ... ... 245 247 .SPX 2018-12-24T00:00:00Z 2351.1 -65.52 246 248 .SPX 2018-12-26T00:00:00Z 2467.7 116.6 247 249 .SPX 2018-12-27T00:00:00Z 2488.83 21.13 248 250 .SPX 2018-12-28T00:00:00Z 2485.74 -3.09 249 251 .SPX 2018-12-31T00:00:00Z 2506.85 21.11 250 rows Ã— 5 columns   SPX_Y_3 = sympy.Matrix([SPX["1d Close Price"].iloc[i] for i in [2, 3, 4, 5]])Â # Remember that Python tends to be 0 indexed, so the 2nd element of our (1 indexed) list is the 1st (of its 0 indexed one) Â SPX_X_3 = sympy.Matrix( Â Â [[1] + [SPX["1d Close Price"].iloc[i] for i in [1, 0]] + [0] + [0], # ' [1] + ' inserts the integer ' 1 ' at the start of the list; similarly to ' [0] ' a the end. Â Â Â [1] + [SPX["1d Close Price"].iloc[i] for i in [2, 1]] + [0] + [0], Â Â Â [1] + [SPX["1d Close Price"].iloc[i] for i in [3, 2]] + [SPX_epsilon4] + [0], Â Â Â [1] + [SPX["1d Close Price"].iloc[i] for i in [4, 3]] + [SPX_epsilon4] + [SPX_epsilon5]]) Â SPX_FOC_Beta_hat_3 = ((SPX_X_3.T * SPX_X_3)**(-1)) * SPX_X_3.T * SPX_Y_3 newX_3 = sympy.Matrix([1] + [SPX["1d Close Price"].iloc[i] for i in [4, 3]] + [SPX_epsilon4] + [SPX_epsilon5]).T SPX_epsilon6 = (newX_3 * SPX_FOC_Beta_hat_3)[0] - SPX["1d Close Price"].iloc[5] # The ' [0] ' is needed here to pick out the scalar out of the 1x1 matrix. SPX_epsilon6   âˆ’38.685458159182   c, p, q = True, 3, 2 m = max(p, q)   Setting up empty lists:   SPX_X, SPX_Y, SPX_epsilon, SPX_FOC_Beta_hat, _SPX_epsilon, Timestamp, SPX_Y_hat = [], [], [], [], [], [], []   Let's set '$C$' to be 0 if there is no constant:   if c is True: Â Â C = 1 elif c is False: Â Â C = 0 else: Â Â print("argument 'c' as to be Boolean; either 'True' or 'False'.")     for a in range(10): Â Â SPX_Y_component = [SPX["1d Close Price"].iloc[b] Â Â Â Â Â Â Â Â Â Â Â Â for b in [m+d for d in range(a+p)]]Â # This will start at the 'm'th. Â Â Timestamp_component = [SPX["Timestamp"].iloc[b] Â Â Â Â Â Â Â Â Â Â Â Â Â Â for b in [m+d for d in range(a+p)]] Â Â SPX_Y.append(SPX_Y_component) Â Â Timestamp.append(Timestamp_component)     SPX_Y_component = [SPX["1d Close Price"].iloc[b] for b in [2+d for d in range(2+2)]] SPX_Y_component   [19.16000000000031, 4.559999999999945, 3.5799999999999272, -3.0599999999999454]   SPX_Y_hat_component_Timestamp = SPX["Timestamp"].iloc[2+2+2-1] SPX_Y_hat_component_Timestamp   '2018-01-10T00:00:00Z'   SPX.head(10)    index Instrument Timestamp Close Price 1d Close Price 0 2 .SPX 2018-01-03T00:00:00Z 2713.06 17.25 1 3 .SPX 2018-01-04T00:00:00Z 2723.99 10.93 2 4 .SPX 2018-01-05T00:00:00Z 2743.15 19.16 3 5 .SPX 2018-01-08T00:00:00Z 2747.71 4.56 4 6 .SPX 2018-01-09T00:00:00Z 2751.29 3.58 5 7 .SPX 2018-01-10T00:00:00Z 2748.23 -3.06 6 8 .SPX 2018-01-11T00:00:00Z 2767.56 19.33 7 9 .SPX 2018-01-12T00:00:00Z 2786.24 18.68 8 10 .SPX 2018-01-16T00:00:00Z 2776.42 -9.82 9 11 .SPX 2018-01-17T00:00:00Z 2802.56 26.14   p, q, a = 2, 6, 10 m = max(p, q) [[f+g for g in range(p)][::-1] for f in range(m-p, m+a)]   [[5, 4], [6, 5], [7, 6], [8, 7], [9, 8], [10, 9], [11, 10], [12, 11], [13, 12], [14, 13], [15, 14], [16, 15]]   p = 2 q = 1 c = True m = max(p, q) SPX_X, SPX_Y, SPX_epsilon, SPX_FOC_Beta_hat, _SPX_epsilon, SPX_Y_hat, SPX_Y_hat_Timestamp = [], [], [], [], [], [], [] Â if c is True: C = 1 elif c is False: C = 0 else: print("argument 'c' as to be Boolean; either 'True' or 'False'.") Â for a in range(200): Â Â Â # 1st: Set up the Y matrix: Â Â SPX_Y_component = [SPX["1d Close Price"].iloc[b] Â Â Â Â Â Â Â Â Â Â Â Â for b in [m+d for d in range(a+p)]]Â # This will start at the 'm'th. Remember also that this range goes up to - but does not include the 'm+a+p'th element of the 0 indexed ' SPX["1d Close Price"] ' pandas data-frame. Â Â SPX_Y_hat_component_Timestamp = SPX["Timestamp"].iloc[m+a+p-1]Â # ' -1 ' because it is 0 indexed Â Â # # You can check the timestamp in the senario where q is 1 and p & a is 2: Â Â # SPX_Y_component_test = [SPX["1d Close Price"].iloc[b] for b in [2+d for d in range(2+2)]] Â Â # display(SPX_Y_component_test) Â Â # SPX_Y_hat_component_Timestamp_test = SPX["Timestamp"].iloc[2+2+2-1] Â Â # display(SPX_Y_hat_component_Timestamp_test) Â Â Â Â # 2nd: Setup the AR component of our X matrix used in the FOC equation: Â Â SPX_X_component = [] Â Â for f in range(m-p, m+a):Â # ' m-p ' is there in case q > p. Â Â Â Â Â F = [f+g for g in range(p)][::-1]Â # ' [::-1] ' allows us to go backwards through our list. Note that using ' range(p) ' starts at 0 and ends at p-1 inclusive. Â Â Â Â # # You can check F via the below: Â Â Â Â # p, q, a = 2, 6, 10 Â Â Â Â # m = max(p, q) Â Â Â Â # [[f+g for g in range(p)][::-1] for f in range(m-p, m+a)] Â Â Â Â Â SPX_X_component.append( Â Â Â Â Â Â [SPX["1d Close Price"].iloc[h] for h in F] + min(q, a)*[0])Â # ' min(q,a)*[0] ' adds 0s at the end of our rows, only enough for our MA components. Â Â Â Â Â if c is True: Â Â Â Â Â Â SPX_X_component[f-m+p] = [1] + SPX_X_component[f-m+p]Â # The ' -m+p ' is just there to normalise the index of our list; note that ' f-m+p ' has to start at 0. Â Â Â if a == 0: Â Â Â Â _SPX_epsilon = [[0]*j for j in range(1, p+1)] Â Â elif a > 0: Â Â Â Â _SPX_epsilon.append([0]*p + SPX_epsilon[::-1][0:min(q, a+1)]) Â Â Â Â for col in range(p+C, len(SPX_X_component[0])): Â Â Â Â Â Â for row in range(p, len(SPX_X_component)): Â Â Â Â Â Â Â Â try: Â Â Â Â Â Â Â Â Â Â if SPX_X_component[row][col] == 0: Â Â Â Â Â Â Â Â Â Â Â Â SPX_X_component[row][col] = _SPX_epsilon[row][col-C] Â Â Â Â Â Â Â Â except: Â Â Â Â Â Â Â Â Â Â pass Â Â Â SPX_newX_component = SPX_X_component[-1] Â Â SPX_X_component_M = sympy.Matrix(SPX_X_component) Â Â Â # If p = 1, we have to use our scalar method for the following 2 steps: Â Â if p == 1 and a == 0: Â Â Â Â if c is True: Â Â Â Â Â Â SPX_FOC_Beta_hat_component = [0, Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â SPX["1d Close Price"].iloc[m+p+a-1] / Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â SPX["1d Close Price"].iloc[m+p+a-2]] Â Â Â Â SPX_epsilon_component = 0 Â Â Â Â SPX_Y_hat_component = SPX["1d Close Price"].iloc[m+p+a-1] Â Â Â else: Â Â Â Â if p == 1 and a == 1: Â Â Â Â Â Â SPX_FOC_Beta_hat_component = [ Â Â Â Â Â Â Â Â phi_1_equation.subs(y1, SPX["1d Close Price"].iloc[m-p] Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â ).subs(y2, SPX["1d Close Price"].iloc[m-p+1] Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â ).subs(y3, SPX["1d Close Price"].iloc[m-p+2])] Â Â Â Â Â Â if c is True: Â Â Â Â Â Â Â Â SPX_FOC_Beta_hat_component = [ Â Â Â Â Â Â Â Â Â Â c_hat_equation.subs(y1, SPX["1d Close Price"].iloc[m-p]).subs( Â Â Â Â Â Â Â Â Â Â Â Â y2, SPX["1d Close Price"].iloc[m-p+1]).subs( Â Â Â Â Â Â Â Â Â Â Â Â y3, SPX["1d Close Price"].iloc[m-p+2])] + SPX_FOC_Beta_hat_component Â Â Â Â Â Â SPX_FOC_Beta_hat_component = sympy.Matrix(SPX_FOC_Beta_hat_component) Â Â Â Â Â Â if q > 0: Â Â Â Â Â Â Â Â SPX_newX_component = SPX_newX_component[:-1]Â # We need to ignore the previous MA part since epsilon was 0 then Â Â Â Â Â elif p == 1 and q > 0 and a <= q:Â # We need to ignore the previous MA part since epsilon was 0 then: Â Â Â Â Â Â _SPX_X_component_M = SPX_X_component_M Â Â Â Â Â Â _SPX_X_component_M.col_del(-1)Â # We are deleting this column as we are looking at an extra case where epsilon = 0. Â Â Â Â Â Â SPX_FOC_Beta_hat_component = ( Â Â Â Â Â Â Â Â ((_SPX_X_component_M.T * _SPX_X_component_M)**(-1)) * Â Â Â Â Â Â Â Â _SPX_X_component_M.T * sympy.Matrix(SPX_Y_component)) Â Â Â Â Â Â SPX_newX_component = SPX_newX_component[:-1] Â Â Â Â else: Â Â Â Â Â Â SPX_FOC_Beta_hat_component = ((( Â Â Â Â Â Â Â Â SPX_X_component_M.T *SPX_X_component_M)**(-1)) * Â Â Â Â Â Â Â Â SPX_X_component_M.T * sympy.Matrix(SPX_Y_component)) Â Â Â Â Â SPX_Y_hat_component = (sympy.Matrix(SPX_newX_component).T * SPX_FOC_Beta_hat_component)[0] Â Â Â Â SPX_epsilon_component = SPX["1d Close Price"].iloc[m+p+a-1] - SPX_Y_hat_component Â Â Â Â # Note that ' SPX["1d Close Price"].iloc[m+p+a-1] ' is the same as ' SPX["1d Close Price"][SPX["Timestamp"] == SPX_Y_hat_component_Timestamp] '. Â Â Â Â SPX_X.append(SPX_X_component) Â Â SPX_Y.append(SPX_Y_component) Â Â SPX_FOC_Beta_hat.append(SPX_FOC_Beta_hat_component) Â Â SPX_epsilon.append(SPX_epsilon_component) Â Â SPX_Y_hat_Timestamp.append(SPX_Y_hat_component_Timestamp) Â Â SPX_Y_hat.append(SPX_Y_hat_component) Â # Our method above recursively computed elements, but now that we've trained our model up to max(a), we can use our elements to compute our non-recursive estimates: non_recursive_SPX_Y_hat = (SPX_X_component_M * SPX_FOC_Beta_hat_component)[-len(SPX_Y_hat):]Â # both elements on the right-hand-side are the latest in our loop through ' a ' above. The ' -len(SPX_Y_hat) ' gets us the last x elements, where x = len(SPX_Y_hat) Â SPX_Y_hat_df = pd.DataFrame(index = SPX_Y_hat_Timestamp, Â Â Â Â Â Â Â Â Â Â Â Â Â Â data = { Â Â "SPX 1d Close Price Recursive Estimate (SPXdPRE)": [float(i) for i in SPX_Y_hat], # This 'float' business is needed here to convert our data in a way that pyplot "enjoys". Â Â "SPX 1d Close Price Recursive Estimate's Error (SPXdCPREE)": [float(i) for i in SPX_epsilon], Â Â "SPXdPRE - SPXdCPREE": [float(SPX_Y_hat[k] + SPX_epsilon[k]) for k in range(len(SPX_Y_hat))], Â Â "SPX 1d Close Price Estimate (SPXdPE)": [float(i) for i in non_recursive_SPX_Y_hat], Â Â "SPX 1d Close Price Estimate's Error (SPXdCPEE)": [float(i) - float(j) for i, j in zip(SPX["1d Close Price"].iloc[m+p-1 : m+p-1+len(SPX_Y_hat)], non_recursive_SPX_Y_hat)], Â Â "SPX 1d close prices Timestamp": list(SPX["Timestamp"].iloc[m+p-1 : m+p-1+len(SPX_Y_hat)]), Â Â "SPX 1d close prices": [float(i) for i in SPX["1d Close Price"].iloc[m+p-1 : m+p-1+len(SPX_Y_hat)]]})     SPX_Y_hat_df[["SPX 1d Close Price Recursive Estimate (SPXdPRE)", Â Â Â Â Â Â Â "SPX 1d Close Price Recursive Estimate's Error (SPXdCPREE)", Â Â Â Â Â Â Â "SPX 1d Close Price Estimate (SPXdPE)", Â Â Â Â Â Â Â "SPX 1d Close Price Estimate's Error (SPXdCPEE)", Â Â Â Â Â Â Â "SPX 1d close prices"] Â Â Â Â Â Â Â ].iplot(title = f"ARMA({p},{q})", theme = "solar") SPX_Y_hat_df    SPX 1d Close Price Recursive Estimate (SPXdPRE) SPX 1d Close Price Recursive Estimate's Error (SPXdCPREE) SPXdPRE - SPXdCPREE SPX 1d Close Price Estimate (SPXdPE) SPX 1d Close Price Estimate's Error (SPXdCPEE) SPX 1d close prices Timestamp SPX 1d close prices 2018-01-08T00:00:00Z -18.88065 2.34E+01 4.56 -8.121789 12.681789 2018-01-08T00:00:00Z 4.56 2018-01-09T00:00:00Z -28.750632 3.23E+01 3.58 9.196967 -5.616967 2018-01-09T00:00:00Z 3.58 2018-01-10T00:00:00Z -3.06 -4.97E-14 -3.06 15.262632 -18.322632 2018-01-10T00:00:00Z -3.06 2018-01-11T00:00:00Z 20.363122 -1.03E+00 19.33 1.823871 17.506129 2018-01-11T00:00:00Z 19.33 2018-01-12T00:00:00Z 14.152048 4.53E+00 18.68 -7.596277 26.276277 2018-01-12T00:00:00Z 18.68 ... ... ... ... ... ... ... ... 2018-10-16T00:00:00Z -5.404896 6.45E+01 59.13 -5.982395 65.112395 2018-10-16T00:00:00Z 59.13 2018-10-17T00:00:00Z 9.675247 -1.04E+01 -0.71 9.788073 -10.498073 2018-10-17T00:00:00Z -0.71 2018-10-18T00:00:00Z -9.035374 -3.14E+01 -40.43 -8.846948 -31.583052 2018-10-18T00:00:00Z -40.43 2018-10-19T00:00:00Z 2.139711 -3.14E+00 -1 2.064127 -3.064127 2018-10-19T00:00:00Z -1 2018-10-22T00:00:00Z 2.890344 -1.48E+01 -11.9 2.890344 -14.790344 2018-10-22T00:00:00Z -11.9   SPX_FOC_Beta_hat_component  $\displaystyle \left[\begin{matrix}0.821332045111788\\-0.421155540544277\\-0.0799433729437008\\0.504585818451687\end{matrix}\right]\$

## To Come

The above was Part 1; in Part 2, I will update this article and include (i) Ex-Ante Parameter Identification (Autocorrelation and Partial Autocorrelation Functions), (ii) a Python function to create ARMA Models with Arbitrary/Non-Consecutive Lags and Exogenous Variables using Sympy and (iii) an investigation into parallel computing's use in this example.