Academic Article Series: Economics and Finance Investigation:

Information Demand and Stock Return Predictability (Coded in R): Part 2: Econometric Modelling of Non-Linear Variances

January 06,2021

Jonathan Legrand

IDaSRP_041-Part2

This article addresses well established return forecasting challenges via frameworks that focus on the sign of the change in asset index excess returns using a family of GARCH models. It investigates them in the literature's original S&P 500 index to study the predictive power of information demand proxied by Google's internet search vector index and finds evidence suggesting that an efficient trading strategy stemming from this study can be constructed. This article is aimed at academics from undergraduate level up, and thus will explain all mathematical notations to ensure that there is no confusion and so that anyone - no matter their expertise on the subject - can follow.

Breakdown

In this article, I primarily study the frameworks outlined in CPV to investigate the explanatory and forecasting powers of information demand on the SPX. $$ \\ $$ Financial applications of my models are studied via the perspective of a single profit maximising investor in a manner akin to CPV (Kandel and Stambaugh (1996)). In the interest of time, I only investigate the scenario where the investing agent's risk aversion implies taking a risk as soon as its expected reward is positive. Later we will look into a framework to optimise this risk aversion level to allow for a truly profit maximising strategy.

Part 1: Data Retrieval and Analysis: In this part, we start the R code, construct some rudimentary functions and collect data from Refinitiv all while explaining their use. We also go through the concept of risk-free rates and excess-returns as well as realised variance & realised volatility in addition to Google’s Search Vector Index before lightly analysing the retrieved data.
Part 2: Econometric Modelling of Non-Linear Variances: In this section, there is no coding. Instead we go – step by step and including working outs – through ARMA models, their parameter estimation process (via log-likelihood), the concepts of stationarity, Taylor’s expansion, the calculus’ chain rule, first order conditions, and GARCH models (including the GACH, GJRGARCH and EGARCH models). Finally, we outline the way in which we construct error values from which to evaluate the performance of our models.
Part 3: Coding Non-Linear Variance Econometric Models: In this section we encode the mathematical models and concepts in Part 2. We build our code, explaining it step-by-step, using in-series (computationally slow) and in-parallel (computationally fast) methods. Finally we expose the Diebold-Mariano test results for each pair of model performances, with and without using SVIs.
Part 4: Sign Forecast Frameworks: Here we outline the Christopherson and Diebold model to estimate the probability of a positive returns, we compare the predictive performance of each model via rudimentary methods, Brier Scores and Diebold and Mariano statistics.
Part 5: Financial significance: Finally, we put ourselves in the shoes of a profit maximising investor who reinvests – on a daily basis - in the risk-free asset or the S&P500 if the model suggests that its excess return will be positive with more probability than not. We graph the cumulative returns of such an investor for each model-following-strategy and discuss the results with the help of Sharpe-Ratios before concluding.

Development Tools & Resources

Refinitiv's DataStream Web Services for R (DatastreamDSWS2R): Access to DataStream data. A DataStream or Refinitiv Workspace IDentification (ID) will be needed to run the code bellow.

$$ \\ $$

Part 2: Econometric Modelling of Non-Linear Variances

Part 2 Contents

ARMA models
- ARMA models' Stationarity
  - Taylor's expansion
- Generic ARMA(p,q) model Estimation
GARCH models
- Modelling Level Changes
- Modelling Non-Linear Variances
Part 2 References

ARMA models

An AutoRegressive Moving Average model of order $p$ and $q$ (ARMA($p,q$)) represents $y_t$ in terms of $p$ of its own lags and $q$ lags of a white noise process such that:

$$\begin{align} y_t &= c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + ... + \phi_p y_{t-p} + \epsilon_t + \psi_1 \epsilon_{t-1} + \psi_2 \epsilon_{t-2} + ... + \psi_q \epsilon_{t-q} \\ y_t &= c + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right)+ \epsilon_t + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \\ y_t - \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) &= c + \epsilon_t + \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \\ y_t - \phi_1 y_{t-1} - \phi_2 y_{t-2} - ... - \phi_p y_{t-p} &= c + \epsilon_t + \psi_1 \epsilon_{t-1} + \psi_2 \epsilon_{t-2} + ... + \psi_q \epsilon_{t-q} \\ y_t \left( 1 - \phi_1 L^1 - \phi_2 L^2 - ... - \phi_p L^{p} \right) &= c + \epsilon_t \left( 1 + \psi_1 L^1 + \psi_2 L^2 + ... + \psi_q L^{q} \right) \\ \label{eq:ARMApq} \tag{1} y_t \left( 1 - \sum_{i=1}^{p}\phi_i L^{i} \right) &= c + \epsilon_t \left( 1 + \sum_{j=1}^{q} \psi_j L^{j} \right) \end{align}$$

where $L$ is the Lag opperator such that $x_t L^i = x_{t-i} $ $ \forall $ $ x, t, i \in \mathbb{R}$.

It is important to note that due to the fact that we use $p$ autoregressive and $q$ moving average componants, we use $p$ lags of our dependent variable - $y$ - and $q$ lags of our error term - $\epsilon$. This means that the first forecast produced by an ARMA($p,q$) model must use at least $max(p,q)$ lags and we thus loose these observations from our observation data-set.

$\textbf{Example:}$ Let $p = 2$ and $q = 3$, then
$$\begin{align} y_t &= c + \left( \sum_{i=1}^{2}\phi_i y_{t-i} \right)+ \epsilon_t + \left( \sum_{j=1}^{3} \psi_j \epsilon_{t-j} \right) \\ y_t - \left( \sum_{i=1}^{2}\phi_i y_{t-i} \right) &= c + \epsilon_t + \sum_{j=1}^{3} \psi_j \epsilon_{t-j} \\ y_t - \phi_1 y_{t-1} - \phi_2 y_{t-2} &= c + \epsilon_t + \psi_1 \epsilon_{t-1} + \psi_2 \epsilon_{t-2} + \psi_q \epsilon_{t-3} \\ y_t \left( 1 - \phi_1 L^1 - \phi_2 L^2 \right) &= c + \epsilon_t \left( 1 + \psi_1 L^1 + \psi_2 L^2 \psi_q L^{3} \right) \\ y_t \left( 1 - \sum_{i=1}^{2}\phi_i L^{i} \right) &= c + \epsilon_t \left( 1 - \sum_{j=1}^{3} \psi_j L^{j} \right) \end{align}$$

ARMA models' Stationarity

Now we can see that \eqref{eq:ARMApq} can be re-written such that

$$\begin{align} y_t \left( 1 - \sum_{i=1}^{p}\phi_i L^{i} \right) &= c + \epsilon_t \left( 1 + \sum_{j=1}^{q} \psi_j L^{j} \right) \\ y_t \Phi_p(L) &= c + \epsilon_t \Psi_q(L) \end{align}$$

where the AR (AutoRegressive) component of our model is encapsulated in $\Phi_p(z) \equiv (1 - \sum_{i=1}^{p}\phi_i z^{i})$ and the MA (Moving Average)'s in $\Psi_q(z) \equiv (1 + \sum_{j=1}^{q} \psi_j z^{j})$ $\forall p,q,z \in \mathbb{R}$. From there, we can see that

$$\begin{align} y_t &= \frac{c + \epsilon_t \Psi_q(L)}{\Phi_p(L)} \\ y_t &= \frac{c}{\Phi_p(L)} + \frac{\epsilon_t \Psi_q(L)}{\Phi_p(L)} \end{align}$$

$c$ is just an arbitrary constant, so we can use an arbitrary $c_{_{\Psi}}$ for $\frac{c}{\Psi_q(L)}$ such that:

$$\begin{align} y_t &= c_{_{\Psi}} + \epsilon_t \frac{\Psi_q(L)}{\Phi_p(L)} \\ &= c_{_{\Psi}} + \epsilon_t \Psi_q(L) \frac{1}{\Phi_p(L)} \\ &= c_{_{\Psi}} + \epsilon_t \Psi_q(L) \frac{1}{1 - \phi_1 L^1 - \phi_2 L^2 - ... - \phi_p L^p} \end{align}$$

N.B.: Taylor's expansion

By Tylor's expansion, $\frac{1}{1 - x} = \sum_{n=0}^{∞}{x^n}$, thus $\frac{1}{1 - x} \rightarrow ∞ \text{ if } \begin{Bmatrix} x > 1 & \text{if } x \in \mathbb{R} \\ x \text{ lies outside the unit circle } & \text{if } x \in \mathbb{Z} \end{Bmatrix}$.

$\textbf{Example 1:}$ The last equality only applies thanks to Taylor's expansion: $$\begin{equation} \frac{1}{\Phi_1(L)} = \frac{1}{1 - \phi_1 L^1} = \sum_{n=0}^{∞}{(\phi_1 L)^n} \end{equation}$$

$\textbf{Example 2:}$ The last equality only applies thanks to Taylor's expansion: $$\begin{equation} \frac{1}{\Phi_p(L)} = \frac{1}{1 - \phi_1 L^1 - \phi_2 L^2 - ... - \phi_p L^p} = \frac{1}{1 - (\phi_1 L^1 + \phi_2 L^2 + ... + \phi_p L^p)} = \sum_{n=0}^{∞}{(\phi_1 L^1 + \phi_2 L^2 + ... + \phi_p L^p)^n} \end{equation}$$ similarly to how $$\begin{equation} \frac{1}{\Phi_p(L)} = \frac{1}{1 - \sum_{i=1}^{p}\phi_i L^{i}} = \sum_{n=0}^{∞}{\left(\sum_{i=1}^{p}\phi_i L^{i} \right)^n} \end{equation}$$

Thus, $\forall (\sum_{i=1}^{p}\phi_i L^{i})$ lying outside the unit circle, we end up with $\frac{1}{\Phi_p(L)} \rightarrow ∞$; i.e.:

$ \\ $

An AR process is called **stationary** if the roots of its equation $ \Phi_p(z)$ ( denoted with a star: $z^*$ ) lie outside the unit circle

In practice, an ARMA($p$, $q$) model (${y_t \Phi_p(L) = c + \epsilon_t \Psi_q(L)}$) is stationary if its AR part ($\Phi_p(L)$) is stationary, which itself only occurs if its roots lie outside the unit circle (i.e.: ${|z^*| > 1}$). More explicitly, for $\Phi_p(z)$:

$$\begin{align} \Phi_p(z) &= 0 \\ 1 - (\phi_1 z^1 + \phi_2 z^2 + ... + \phi_p z^p) &= 0 \\ \label{eq:roots}\phi_1 z^1 + \phi_2 z^2 + ... + \phi_p z^p &= 1 \tag{2} \end{align}$$

The z value abiding by \eqref{eq:roots} are its roots, ${|z^*|}$, and there often are more than one. As long as they are all greater than one, then the process is stationary.

$\textbf{Example:}$ For AR(1) processes, (i.e.: when $p = 1$) $$\begin{align} \Phi_1(z) &= 0 \\ 1 - \phi_1 z^1 &= 0 \\ \phi_1 z &= 1 \\ z &= \frac{1}{\phi_1} \\ \end{align}$$

Then the modulus is simply the absolute value; and so ${|z^*| > 1}$ if ${\phi_1 < 1}$, by extention: when $\sum_{i=1}^{p}\phi_i L^{i}$ is lying inside the unit circle, $\frac{1}{\Phi_p(L)}$ is finite (or, with notational abuse: $\frac{1}{\Phi_p(L)} < ∞$).

Generic ARMA(p,q) model Estimation

Method of parameter estimation

In this study, we will use the Maximum Likelihood method of estimation of our unknown parameters.

The log likelihood function of independantly and identically distributed $y$ variables is defined as $L_T(\Theta) \equiv \prod_{t=1}^T{f(y_t|y_{1}, y_{2}, ... , y_{t}; \Theta)}.$ It is often normalised (without loss of information or generalisation) by $T$ such that $L_T(\Theta) \equiv \frac{1}{T} \prod_{t=1}^T{f(y_t|y_{1}, y_{2}, ... , y_{t}; \Theta)}.$

Likelihood function with normally, identically and independently distributed independent-variables and only with known parameters

Looking into the likelyhood function with normally, identically and independently distributed independent-variables and only with known parameters $\Theta = $[${\mu, \sigma^2}$], we start by asumming independently and normaly distributed independent variables {$y_t$}$^T_1$, i.e.: assuming that ${y_t \stackrel{iid}{\sim} N \left( \mu, \sigma^2 \right)}$, then - with known parameters $\Theta = $[${\mu, \sigma^2}$]:

$$\begin{array}{ll} f(y_t;\Theta) = \frac{1}{\sqrt[2]{2 \pi \sigma^2}} exp \left[ -\frac{(y_t - \mu)^2}{2 \sigma^2} \right] & & \text{where $exp[\cdot] \equiv e^{[\cdot]}$, since ${y_t \sim N(\mu, \sigma^2)}$}, \\ \prod_{t=1}^T{f(y_t|y_1;\Theta)} = \prod_{t=1}^T{f(y_t|y_{t-1}, y_{t-2}, ... , y_{1}; \Theta) } = \prod_{t=1}^T{\frac{1}{\sqrt[2]{2 \pi \sigma^2}} exp \left[ -\frac{(y_t - \mu)^2}{2 \sigma^2} \right]} & & \text{due to the independance in $y_t$'s distributions ${ \left( \text{i.e.: } y_t \stackrel{\textbf{iid}}{\sim} N(\mu, \sigma^2) \right) }$,} \\ L_T(\Theta) = \prod_{t=1}^T{\frac{1}{\sqrt[2]{2 \pi \sigma^2}} exp \left[ -\frac{(y_t - \mu)^2}{2 \sigma^2} \right]} & & \text{since the likelihood function is } L_T(\Theta) \equiv \prod_{t=1}^T{f(y_t|y_{1}, y_{2}, ... , y_{t}; \Theta)}. \end{array}$$

Then, we define the log-likelihood functin as ${ln[ L_T(\Theta) ]}$ such that: $$\begin{align} ln[ L_T(\Theta) ] \equiv ln[ f(y_T|y_{T-1}, y_{T-2}, ... , y_{1}; \Theta) ] &= ln \left( \prod_{t=1}^T{\frac{1}{\sqrt[2]{2 \pi \sigma^2}} exp \left[ -\frac{(y_t - \mu)^2}{2 \sigma^2} \right]} \right) \\ &= ln \left[ \prod_{t=1}^T{f(y_t;\Theta)} \right] \\ &= \sum_{t=1}^T{ln [ f(y_t;\Theta) ] } \\ \end{align}$$

Without loss of generality, and to take an individualistic aproach - normalising by $T$, we can write

$$\begin{align} ln[ L_T(\Theta) ] &= \frac{1}{T} \sum_{t=1}^T{ln [ f(y_t;\Theta) ] } \\ &= \frac{1}{T} \sum_{t=1}^T{ln \left( \frac{1}{\sqrt[2]{2 \pi \sigma^2}} exp \left[ -\frac{(y_t - \mu)^2}{2 \sigma^2} \right] \right) } \\ &= \frac{1}{T} \sum_{t=1}^T{ln \left( \left[2 \pi \sigma^2 \right]^{ - \frac{1}{2}} exp \left[ -\frac{(y_t - \mu)^2}{2 \sigma^2} \right] \right) } \\ &= \frac{1}{T} \sum_{t=1}^T{ \left[ - \frac{1}{2} ln(2 \pi) - \frac{1}{2} ln \left( \sigma^2 \right) - \frac{(y_t - \mu)^2}{2 \sigma^2} \right]} \\ &= - \frac{1}{2} ln(2 \pi) - \frac{1}{2} ln \left( \sigma^2 \right) - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ (y_t - \mu)^2} \\ \end{align}$$

With unknown parameters

We may have situations where we do not know all the parameters in $\Theta$ (which, as a reminder, previously was [${\mu, \sigma^2}$]). Then, we may estimate it as:

$ \begin{equation} { \hat{\Theta} = \stackrel{\Theta}{arg max} ln[L_T(\Theta)]}. \end{equation} $

There, the notation $\stackrel{x}{arg max} f(x)$ denotes 'the value of $x$ maximising the function $f(x)$', i.e.: $\hat{\Theta} = \stackrel{\Theta}{arg max}[f(\Theta)]$ denotes 'the value of $\Theta$ maximising the function $f(\Theta)$'.

With normally, identically and independently distributed independent-variables and unknown ARMA parameters

With normally, identically and independently distributed independent-variables and unknown ARMA parameters (i.e.: $\Theta = $[${c, \phi_1, ... , \phi_p, \psi_1, ... , \psi_q, \sigma^2}$]), the assumption of 'normally, identically and independently distributed independent-variables' translates to: $ \epsilon_t \stackrel{\textbf{iid}}{\sim} N \left( 0, \sigma^2 \right) $, which itself implies that $ \epsilon_{t \vert t-1} \sim N \left( 0, \sigma^2 \right) $

We may define our (fixed but) unknown parameters as $\Theta = \left[ {c, \phi_1, ... , \phi_p, \psi_1, ... , \psi_q, \sigma^2}\right]$ and estimate them as $\hat{\Theta} = \left[ {\hat{c}, \hat{\phi_1}, ... , \hat{\phi_p}, \hat{\psi_1}, ... , \hat{\psi_q}, \hat{\sigma^2}} \right]$:

$$\begin{array}{ll} \hat{\Theta} & = \stackrel{\Theta}{arg max} ln[L_T(\Theta)] \\ & = \stackrel{\Theta}{arg max} \left[ - \frac{ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ (y_t - \mu)^2} \right] & \text{since } ln[L_T(\Theta)] = \left[ - \frac{ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ (y_t - \mu)^2} \right] \\ & = \stackrel{\Theta}{arg max} \left[ - \frac{ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ (y_t - \mathbb{E} \left( y_t \vert y_{t-1} \right) )^2} \right] & {\text{since $\mu = \mathbb{E} \left( y_t \vert y_{t-1} \right)$ and where $ \vert $ is the standard notation for 'knowing'} \\ \text{Here we only look at conditional Likelihood.}} \\ & = \stackrel{\Theta}{arg max} \left[ - \frac{ ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ \left( y_t - \mathbb{E} \left[ c + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) \\ + \epsilon_t + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \vert y_{t-1} \right] \right)^2 } \right] & \text{since } y_t = c + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right)+ \epsilon_t + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \\ & = \stackrel{\Theta}{arg max} \left[ - \frac{ ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ \left( y_t - \left[ \mathbb{E}(c \vert y_{t-1} ) + \left( \sum_{i=1}^{p} \mathbb{E}( \phi_i y_{t-i} \vert y_{t-1}) \right) \\ + \mathbb{E}(\epsilon_t \vert y_{t-1} ) + \left( \sum_{j=1}^{q} \mathbb{E} (\psi_j \epsilon_{t-j} \vert y_{t-1} ) \right) \right] \right)^2 } \right] & \text{since the Expectation of summs is the sum of Expectations} \\ & = \stackrel{\Theta}{arg max} \left[ - \frac{ ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ \left( y_t - c + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) \\ + \mathbb{E}(\epsilon_t \vert y_{t-1} ) + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right)^2 } \right] & \text{since only } \epsilon_t \text{ is random in } y_t \text{ (knowing } y_{t-1}\text{)} \\ & = \stackrel{\Theta}{arg max} \left[ - \frac{ ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ \left( y_t - c + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) \\ + 0 + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right)^2 } \right] & \text{since } \mathbb{E}(\epsilon_t \vert y_{t-1} ) = \mathbb{E}(\epsilon_{t \vert t-1} ) = 0 \text{, since } \epsilon_t \sim N \left( 0, \sigma^2 \right) \\ & = \stackrel{\Theta}{arg max} \left[ - \frac{ ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ \left( c + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right)+ \epsilon_t + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \\ - c + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right)^2 } \right] & \text{since } y_t = c + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right)+ \epsilon_t + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \\ & = \stackrel{\Theta}{arg max} \left[ - \frac{ ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{\epsilon_t^2 } \right] & \text{rearanging} \end{array}$$

Thus:

$$ \begin{array}{ll} \hat{\Theta} &= \stackrel{\Theta}{arg max} \begin{Bmatrix} - \frac{ ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ \left( {\left[ y_t - c - \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) - \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right]}^2 \right)} \end{Bmatrix} \\ \hat{\Theta} &= \stackrel{\Theta}{arg max} \begin{Bmatrix} - \frac{ ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ \left( \left[ y_t - c - \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) - \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right] \left[ y_t - c - \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) - \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right] \right)} \end{Bmatrix} \\ \\ &= \stackrel{\Theta}{arg max} \begin{Bmatrix} - \frac{ ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ \left[ {y_t}^2 - y_t c - y_t \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) - y_t \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \\ - c y_t + c^2 + c \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) + c \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \\ - \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) {y_t} + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) c + {\left( \sum_{i=1}^{p}\phi_i y_{t-i} \right)}^2 + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \\ - \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) {y_t} + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) c + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) + {\left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right)}^2 \right] } \end{Bmatrix} \\ \\ &= \stackrel{\Theta}{arg max} \begin{Bmatrix} - \frac{ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ \left[ {y_t}^2 - 2 y_t c - 2 y_t \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) - 2 y_t \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \\ + c^2 + 2 c \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) + 2 c \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \\ + { \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) }^2 + 2 \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \\ + { \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) }^2 \right] } \end{Bmatrix} \end{array} $$

N.B.: The Chain Rule states that:

if $f$ and $g$ are both differntiatable and $F(x)$ is the composite funcition defined by $F(x) = f(g(x))$, then $F$ is diferentiatable and its 1st derivative (i.e.: $F'$) is given by the product:

$$ F'(x) = f'(g(x))g'(x) $$

To maximise $- \frac{ln(2 \pi)}{2} - \frac{ln \left( \sigma^2 \right)}{2} - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{\epsilon_t^2}$ with respect to $\Theta$, we differentiate with respect to $\Theta$ and equate it to $0$, this is known as First Order Condition maximisation; thus:

First Order Condition with Respect to the constant

First Order Condition with Respect to $c$: $$ \begin{array}{ll} \frac{\delta ln[L_T(\Theta)]}{\delta c} &= - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ \left[ - 2 y_t + 2 c + 2 \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) + 2 \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right] } \\ &= - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ 2 \left[ - y_t + c + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right] } \\ &= - \frac{2}{2 \sigma^2 T} \sum_{t=1}^T{ \left[ - y_t + c + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right] } \\ &= - \frac{1}{\sigma^2 T} \sum_{t=1}^T{ \left[ - y_t + c + \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) + \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right] } \end{array} $$

by FOC, $\Theta = \widehat{\Theta}$ when $\frac{\delta ln[L_T(\Theta)]}{\delta c} = 0$, therefore $\frac{\delta ln[L_T(\widehat{\Theta})]}{\delta \widehat{c}} = 0$, and:

$$ \begin{array}{ll} \frac{\delta ln[L_T(\widehat{\Theta})]}{\delta \widehat{c}} = 0 &= - \frac{1}{\widehat{\sigma^2} T} \sum_{t=1}^T{ \left[ - y_t + \widehat{c} + \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) + \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } & \\ &= - \frac{1}{T} \sum_{t=1}^T{ \left[ - y_t + \widehat{c} + \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) + \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } & \text{multiplying both sides by } \widehat{\sigma^2} \\ &= - \frac{T \widehat{c}}{T} - \frac{1}{T} \sum_{t=1}^T{ \left[ - y_t + \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) + \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } & \\ &= - \widehat{c} - \frac{1}{T} \sum_{t=1}^T{ \left[ - y_t + \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) + \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } \end{array} $$

then:

$$ \begin{array}{ll} \widehat{c} &= - \frac{1}{T} \sum_{t=1}^T{ \left[ - y_t + \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) + \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } \end{array} $$

First Order Condition with Respect to our AR coefficients

First Order Condition with Respect to $\phi_i$:

The notation here might seem a litle confusing, but it is actually ergonomic. We are going to investigate the $ln[L_T(\Theta)]$ function with respect to any one $\phi_i$ for any applicable $i$. The bellow is thus for any one applicable $i$:

$$ \begin{array}{ll} \frac{\delta ln[L_T(\Theta)]}{\delta \phi_i} &= - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ \left[- 2 y_t y_{t-i} + 2 c y_{t-i} + 2 { \left( \sum_{k=1}^{p}\phi_k y_{t-k} \right) } y_{t-i} + 2 y_{t-i} \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right] } \\ &= - \frac{2}{2 \sigma^2 T} \sum_{t=1}^T{ \left[- y_t y_{t-i} + c y_{t-i} + { \left( \sum_{k=1}^{p}\phi_k y_{t-k} \right) } y_{t-i} + y_{t-i} \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right] } \\ &= - \frac{1}{\sigma^2 T} \sum_{t=1}^T{ \left[- y_t y_{t-i} + c y_{t-i} + { \left( \sum_{k=1}^{p}\phi_k y_{t-k} \right) } y_{t-i} + y_{t-i} \left( \sum_{j=1}^{q} \psi_j \epsilon_{t-j} \right) \right] } \end{array} $$

by FOC, $\Theta = \widehat{\Theta}$ when $\frac{\delta ln[L_T(\Theta)]}{\delta \phi_i} = 0$, therefore $\frac{\delta ln[L_T(\widehat{\Theta})]}{\delta \phi_i} = 0$, and (again, for any applicable $i$) :

$$ \begin{array}{ll} \frac{\delta ln[L_T(\widehat{\Theta})]}{\delta \widehat{\phi_i}} = 0 &= - \frac{1}{\widehat{\sigma^2} T} \sum_{t=1}^T{ \left[- y_t y_{t-i} + \widehat{c} y_{t-i} + { \left( \sum_{k=1}^{p} \widehat{\phi_k} y_{t-k} \right) } y_{t-i} + y_{t-i} \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } \\ &= \sum_{t=1}^T{ \left[- y_t y_{t-i} + \widehat{c} y_{t-i} + { \left( \sum_{k=1}^{p} \widehat{\phi_k} y_{t-k} \right) } y_{t-i} + y_{t-i} \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } \\ &= \sum_{t=1}^T{ y_{t-i} \left[- y_t + \widehat{c} + { \left( \sum_{k=1}^{p} \widehat{\phi_k} y_{t-k} \right) } + \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } \end{array} $$

First Order Condition with Respect to MA coefficients

First Order Condition with Respect to $\psi_j$:

The notation here might seem a litle confusing, but it is actually ergonomic. We are going to investigate the $ln[L_T(\Theta)]$ function with respect to any one $\psi_j$ for any applicable $j$. The bellow is thus for any one applicable $j$:

$$ \begin{array}{ll} \frac{\delta ln[L_T(\Theta)]}{\delta \psi_j} &= - \frac{1}{2 \sigma^2 T} \sum_{t=1}^T{ \left[- 2 y_t \epsilon_{t-j} + 2 c \epsilon_{t-j} + 2 { \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) } \epsilon_{t-j} + 2 \left( \sum_{k=1}^{q} \psi_k \epsilon_{t-k} \right) \epsilon_{t-j} \right] } \\ &= - \frac{2}{2 \sigma^2 T} \sum_{t=1}^T{ \left[- y_t \epsilon_{t-j} + c \epsilon_{t-j} + { \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) } \epsilon_{t-j} + \left( \sum_{k=1}^{q} \psi_k \epsilon_{t-k} \right) \epsilon_{t-j} \right] } \\ &= - \frac{1}{\sigma^2 T} \sum_{t=1}^T{ \left[- y_t \epsilon_{t-j} + c \epsilon_{t-j} + { \left( \sum_{i=1}^{p}\phi_i y_{t-i} \right) } \epsilon_{t-j} + \left( \sum_{k=1}^{q} \psi_k \epsilon_{t-k} \right) \epsilon_{t-j} \right] } \end{array} $$

by FOC, $\Theta = \widehat{\Theta}$ when $\frac{\delta ln[L_T(\Theta)]}{\delta \psi_j} = 0$, therefore $\frac{\delta ln[L_T(\widehat{\Theta})]}{\delta \psi_j} = 0$, and (again, for any applicable $j$) :

$$ \begin{array}{ll} \frac{\delta ln[L_T(\widehat{\Theta})]}{\delta \widehat{\psi_j}} = 0 &= - \frac{1}{\widehat{\sigma^2} T} \sum_{t=1}^T{ \left[- y_t \epsilon_{t-j} + \widehat{c} \epsilon_{t-j} + { \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) } \epsilon_{t-j} + \left( \sum_{k=1}^{q} \widehat{\psi_k} \epsilon_{t-k} \right) \epsilon_{t-j} \right] } \\ &= \sum_{t=1}^T{ \left[- y_t \epsilon_{t-j} + \widehat{c} \epsilon_{t-j} + { \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) } \epsilon_{t-j} + \left( \sum_{k=1}^{q} \widehat{\psi_k} \epsilon_{t-k} \right) \epsilon_{t-j} \right] } \\ &= \sum_{t=1}^T{ \epsilon_{t-j} \left[- y_t + \widehat{c} + { \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) } + \left( \sum_{k=1}^{q} \widehat{\psi_k} \epsilon_{t-k} \right) \right] } \end{array} $$

First Order Condition with Respect to our Variance

First Order Condition with Respect to $\sigma^2$:

Letting

$$ S(\widehat{c}, \widehat{\phi_i}, \widehat{\psi_j}) = \sum_{t=1}^T { \left[ {y_t}^2 - 2 y_t \widehat{c} - 2 y_t \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) - 2 y_t \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \\ + \widehat{c}^2 + 2 \widehat{c} \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) + 2 \widehat{c} \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \\ + { \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) }^2 + 2 \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \\ + { \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) }^2 \right] } $$

then:

$$ \begin{array}{ll} \frac{\delta ln[L_T(\Theta)]}{\delta \sigma^2} &= - \frac{1}{2 \sigma^2} + \frac{1}{2 {\left( \sigma^2 \right)}^2 T} S(\widehat{c}, \widehat{\phi_i}, \widehat{\psi_j}) \end{array} $$

Note that the $+$ in $+ \frac{1}{2 {\left( \sigma^2 \right)}^2 T}$ comes from the differential of $- \frac{1}{2 \sigma^2 T}$, i.e.: of $- \frac{(\sigma^2)^{-1}}{2 T}$

by FOC, $\Theta = \widehat{\Theta}$ when $\frac{\delta ln[L_T(\Theta)]}{\delta \sigma^2} = 0$, therefore $\frac{\delta ln[L_T(\widehat{\Theta})]}{\delta \sigma^2} = 0$, and:

$ \begin{array}{ll} \frac{\delta ln[L_T(\widehat{\Theta})]}{\delta \widehat{\sigma^2}} = 0 &= - \frac{1}{2 \widehat{\sigma^2}} + \frac{1}{2 {\left( \widehat{\sigma^2} \right)}^2 T} S(\widehat{c}, \widehat{\phi_i}, \widehat{\psi_j}) \end{array} $

Thus:

$ \begin{array}{ll} \frac{1}{2 \widehat{\sigma^2}} &= \frac{1}{2 {\left( \widehat{\sigma^2} \right)}^2 T} S(\widehat{c}, \widehat{\phi_i}, \widehat{\psi_j}) \end{array} $

then:

$ \begin{array}{ll} \frac{1}{\widehat{\sigma^2}} &= \frac{1}{{\left( \widehat{\sigma^2} \right)}^2 T} S(\widehat{c}, \widehat{\phi_i}, \widehat{\psi_j}) \end{array} $

then:

$$ \begin{array}{ll} \widehat{\sigma^2} &= \frac{1}{T} S(\widehat{c}, \widehat{\phi_i}, \widehat{\psi_j}) \\ &= \frac{1}{T} \sum_{t=1}^T { \left[ {y_t}^2 - 2 y_t \widehat{c} - 2 y_t \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) - 2 y_t \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \\ + \widehat{c}^2 + 2 \widehat{c} \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) + 2 \widehat{c} \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \\ + { \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) }^2 + 2 \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \\ + { \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) }^2 \right] } \\ \end{array} $$

$ \\ $

First Order Conditions

Resuming; our FOC equations produced:

$$ \begin{equation} \widehat{c} = - \frac{1}{T} \sum_{t=1}^T{ \left[ - y_t + \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) + \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } \end{equation} $$$$ \\ $$

$$\begin{align} 0 = \sum_{t=1}^T{ y_{t-i} \left[- y_t + \widehat{c} + { \left( \sum_{k=1}^{p} \widehat{\phi_k} y_{t-k} \right) } + \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \right] } & \text{ for all } i \leq p \end{align}$$$$ \\ $$

$$\begin{align} 0 = \sum_{t=1}^T{ \epsilon_{t-j} \left[- y_t + \widehat{c} + { \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) } + \left( \sum_{k=1}^{q} \widehat{\psi_k} \epsilon_{t-k} \right) \right] } & \text{ for all } j \leq q \end{align}$$$$ \\ $$

$$ \begin{equation} \widehat{\sigma^2} = \frac{1}{T} \sum_{t=1}^T { \left[ {y_t}^2 - 2 y_t \widehat{c} - 2 y_t \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) - 2 y_t \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \\ + \widehat{c}^2 + 2 \widehat{c} \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) + 2 \widehat{c} \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \\ + { \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) }^2 + 2 \left( \sum_{i=1}^{p} \widehat{\phi_i} y_{t-i} \right) \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) \\ + { \left( \sum_{j=1}^{q} \widehat{\psi_j} \epsilon_{t-j} \right) }^2 \right] } \\ \end{equation} $$

From there one can only compute the parameters numerically. It does not look like it due to the large amounts of substitutions that would be required but these FOC equations offer a closed system solvable itteratively from the data we have. Remember that the data we have consist of $y_t, y_{t-1}, ... , y_1, \epsilon_{t-1}, \epsilon_{t-2}, ..., \epsilon_{1}$, i.e.: everything without a hat.

GARCH models

Modelling Level Changes

Let $\{R_t\}_{t \in \mathbb{Z}}$ be the level change in index excess return such that (i.e.: $\textbf{:}$) $R_t \in \mathbb{R}$ at time $t$. Then one may attempt to encapsulate the movement in $R_t$ via the discrete-time stochastic process $$\begin{equation} \label{eq:AR1} R_t = C + \phi_1 R_{t-1} + \epsilon_t \tag{3} \end{equation}$$ where $C \in \mathbb{R}$ is an unknown constant, $\phi_1$ is the cofactor of the regressor $R_{t-1}$ - the 1$^{st}$ lag of $R_{t}$, and the error term $\epsilon_{t} \stackrel{iid}{\sim} N(0,\sigma^2)$ has standard deviation $\sigma \in \mathbb{R}$. This is an AR process of order one - AR(1). It is then easy to incorporate SVI - enabling us to study its effect on $R_t$ - with the cofactor $\phi_2 \in \mathbb{R}$ via the AR(1)-SVI model: $$\begin{equation} \label{eq:AR1SVI} R_t = C + \phi_1 R_{t-1} + \phi_2 \Delta SVI_{t-1} + \epsilon_t \text{ .} \tag{4} \end{equation}$$

These two models will be used throughout this article to model the level change in index excess returns.

Modeling Non-Linear Variances

ARCH Models

As aforementioned, the lack of time-dependence in $\sigma$ results in the AR(1) model outlined in the above AR(1) Equation \ref{eq:AR1} and \ref{eq:AR1SVI} failing to represent SFFR (iii). To account for this heteroscedasticity, ARCH models of order $q \in \mathbb{Z}$ where $q > 0$ can be used. These ARCH($q$) models render $\sigma$ into time-dependent $\sigma_{t}$ $\textbf{:}$ $\epsilon_t \stackrel{iid}{\sim} N(0,\sigma^2_{t})$. To do so, let $\Omega_t$ denote the information set at time $t$; let the conditional mean and variance of $R_{t+1}$ respectively be $\mu_{t+1|t} = E(R_{t+1} | \Omega_{t})$ and $\sigma_{t+1|t}^2 = Var(R_{t+1} | \Omega_t)$ where $E$ and $Var$ are respectively the expectation and variance operators; let

$$\begin{equation} \epsilon_t = \sigma_{t|t-1} \eta_t \end{equation}$$

where $\eta_t$ is independent from $\Omega_t$ and $\eta_t \stackrel{iid}{\sim} N(0,1)$; and let

$$\begin{equation} \sigma_{t|t-1}^2 = c + \sum_{i=1}^{q}\alpha_i \epsilon_{t-i}^2 \end{equation}$$

where $c,\alpha_i \in \mathbb{R}$, $c>0$ and $\alpha_i \geq 0 \quad \forall i=(1,2,...,q)$ to allow only for positive - thus valid - variances.

Several expansions were applied to the ARCH model; the ones of interest here are namely the GARCH, GJRGARCH and EGARCH models of order one.

GARCH models

GARCH(1,1) account for SFFR (iv) and (v) in addition to (iii) and non-normal unconditional distributions in levels, which works in parallel with SFFR (i). To do so, an MA component is added to the ARCH(1):

$$\begin{equation} \sigma_{t|t-1}^2 = c + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2 \end{equation}$$

where elements are as aforementioned, $\beta \in \mathbb{R}$ is positive and in this specific case: $(\alpha + \beta) < 1$ for stationarity. I implement the SVI factor with positive $\delta \in \mathbb{R}$ - enabling us to study its effect on $R_t$'s variance - via:

$$\begin{equation} \sigma_{t|t-1}^2 = c + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2 + \delta \Delta SVI_{t-1} \end{equation}$$

Whilst GARCH models are a natural fit in this study by the virtues mentioned, “in general, models that incorporate volatility asymmetry such as EGARCH and GJR-GARCH perform better” (Poon and Granger (2003), p.507). Indeed, GARCH models cannot account for SFFR (v) of leverage effects, whereas GJRGARCH and EGACH models may; CPV therefore considered them too - as do I. The thicker bottom left tail of the Empirical Probability Distribution Functions (EPDFs) suggest such leverage effects.

GJRGARCH Models

GJRGARCH models account for asymmetry (SFFR (i)) via the last summand in

$$\begin{equation} \sigma_{t|t-1}^2 = c + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-i}^2 + \gamma I(\epsilon_{t-1}<0) \epsilon_{t-1}^2 \end{equation}$$

and second to last in $$\begin{equation} \sigma_{t|t-1}^2 = c + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma I(\epsilon_{t-1}<0) \epsilon_{t-1}^2 + \delta \Delta SVI_{t-1} \end{equation}$$ where a leverage case would result in $\gamma > 0$, and $I(\cdot)$ is the indicator function returning $1$ if its bracketed statement is true, $0$ otherwise - in this case:

$$\begin{equation} I(\epsilon_{t-1}<0) = \begin{Bmatrix} 1 & \text{if } \epsilon_{t-1}<0 \\ 0 & \text{otherwise} \end{Bmatrix} . \end{equation}$$

EGARCH Models

The use of the logarithm function insures non-negativity in EGARCH models

$$\begin{equation} ln(\sigma_{t|t-1}^2) = c + \alpha \Bigg[ \frac{|\epsilon_{t-1}|}{\sigma_{t-1}} - \sqrt{\frac{2}{\pi}} \Bigg] + \beta ln(\sigma_{t-1}^2) + \gamma \frac{\epsilon_{t-1}}{\sigma_{t-1}} \end{equation}$$

and

where ln($\cdot$) and |$\cdot$| denote the logarithm base $e$ and the absolute value functions respectively and where $c, \alpha, \beta, \gamma \in \mathbb{R}$ are not constrained $\textbf{: } c + \alpha + \beta + \gamma > 0$ . If $Var(R_t|\Omega_{t-1})$ and $R_t$ react negatively to each other, $\beta$ would be negative, and vice versa.

Generic ARMA(p,q)-Non-Linear-Variance Model Estimation

ARMA(p,q)-GARCH(P,Q), ARMA(p,q)-GJRGARCH(P,Q) and ARMA(p,q)-EGARCH(P,Q) models' estimates can be computed via the same method as the generic ARMA(p,q) model above, but substituting in the modeled value for $\sigma_{t|t-1}^2$. This would then be a (Conditional) Quazi Maximum Likelihood estimation method where (assuming normality in the errors):

$$ L_T(\Theta) = \prod_{t=1}^T{\frac{1}{\sqrt[2]{2 \pi \sigma^2}} exp \left[ -\frac{(y_t - \mu)^2}{2 \sigma^2} \right]} $$

where $\sigma^2$ is recursively computed $\forall t \geq 1$ as dictated by the non-linear variance model GARCH(P,Q), GJRGARCH(P,Q) or EGARCH(P,Q).

Error Functions

To asses the accuracy of volatility and positive sign probability forecasts, this sub-section specifies error functions.

Variance forecasts from the models GARCH to EGARCH-SVI are simply compared against realised volatilities to compute variance forecast errors as $$ \hat{\sigma}_{t|t-1} - RV_t \text{ .}$$

Sign forecast errors are computed as

$$\begin{equation} \label{eq:SignForecastErrors} \hat{\pi}_{t+1}^{j} - \mathbb{P}(R_{t+1} > 0 | \Omega_{t \textbf{+1}}) \end{equation}$$

where $j = (C\&D_M, Naive)$, and where $M$ can be replaced by a model name ranging from GARCH to EGARCH-SVI (Note the bold `$+1$' in $\Omega_{t \textbf{+1}}$ here to emphasise how $\mathbb{P}(R_{t+1}>0|\Omega_{t \textbf{+1}})=I(R_{t+1}>0)$.) and also where:

Naïve Framework

A Naïve model of sign change is used as a benchmark to construct comparisons. It is formulated as $$ \hat{\pi}_{t+1}^{Naive} = \hat{\mathbb{P}}_{Naive}(R_{t+1|t}>0) = \frac{1}{t} \sum_{k=1}^{t} I (R_k > 0) \text{ .}$$

$ \ $