
Friday, September 4, 2015

Tsay Ch10 - Multivariate Volatility Models and their applications

Generalize the univariate volatility models of chapter 3 to the multivariate case, while simplifying the dynamic relationship between the volatility processes of multiple asset returns to address the curse of dimensionality and time-varying correlations. Consider a multivariate return series $r_t$ given by $r_t = \mu_t + a_t$, where $\mu_t = E(r_t|F_{t-1})$ is the conditional expectation of $r_t$ given the past information and $a_t$ is the shock at time $t$. The mean equation of $r_t$ can be modeled as a multivariate time series process (ch 8), e.g. a simple VARMA process $\mu_t = \Gamma x_t + \sum_{i=1}^{p} \Phi_i r_{t-i} - \sum_{i=1}^{q} \Theta_i a_{t-i}$, where $x_t$ denotes the $m$-dimensional vector of exogenous (explanatory) variables with $x_{1t}=1$, $\Gamma$ is a $k \times m$ matrix, and $p$ and $q$ are non-negative integers.
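As a minimal illustration of the decomposition $r_t = \mu_t + a_t$, here is a sketch that simulates a bivariate series with a VAR(1) mean equation and i.i.d. Gaussian shocks; all numbers (intercepts, coefficient matrix, shock covariance) are made up for illustration.

```python
import numpy as np

# Minimal sketch (made-up numbers): a bivariate return series with a VAR(1)
# mean equation, r_t = Gamma * x_t + Phi_1 r_{t-1} + a_t, where x_t = 1.
rng = np.random.default_rng(0)
k, T = 2, 500
Gamma = np.array([0.01, 0.02])           # intercept term (Gamma x_t with x_t = 1)
Phi1 = np.array([[0.2, 0.1],
                 [0.0, 0.3]])            # VAR(1) coefficient matrix
Sigma_a = 1e-4 * np.eye(k)               # constant shock covariance, for simplicity
r = np.zeros((T, k))
for t in range(1, T):
    mu_t = Gamma + Phi1 @ r[t - 1]                        # mu_t = E(r_t | F_{t-1})
    a_t = rng.multivariate_normal(np.zeros(k), Sigma_a)   # shock a_t
    r[t] = mu_t + a_t                                     # r_t = mu_t + a_t
```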

The conditional covariance matrix of $a_t$ given $F_{t-1}$ is a $k \times k$ positive-definite matrix $\Sigma_t$ defined by $\Sigma_t = \mathrm{Cov}(a_t|F_{t-1})$. Multivariate volatility modeling is concerned with the time evolution of $\Sigma_t$. This is referred to as the volatility equation of $r_t$.

Exponentially weighted estimate

An equally weighted estimate of the unconditional covariance matrix of the innovations is $\hat{\Sigma} = \frac{1}{t-1}\sum_{j=1}^{t-1} a_j a_j^T$. To allow for a time-varying covariance matrix with emphasis on recent information, one can use exponential smoothing: $\hat{\Sigma}_t = \frac{1-\lambda}{1-\lambda^{t-1}} \sum_{j=1}^{t-1} \lambda^{j-1} a_{t-j} a_{t-j}^T$, where $0<\lambda<1$. For sufficiently large $t$ such that $\lambda^{t-1} \approx 0$, the equation becomes $\hat{\Sigma}_t = (1-\lambda) a_{t-1} a_{t-1}^T + \lambda \hat{\Sigma}_{t-1}$. This is called the EWMA estimate of the covariance matrix. The parameters, along with $\lambda$, can be jointly estimated by maximum likelihood, with the log-likelihood evaluated recursively. A value of $\lambda$ around 0.94 (roughly a 30-day window) commonly comes out as optimal.
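A minimal sketch of the EWMA recursion, assuming the shocks $a_t$ are already available as a (T, k) array and that $\lambda$ is fixed at 0.94 rather than estimated by maximum likelihood; the recursion is started from the equally weighted sample covariance.

```python
import numpy as np

def ewma_cov(a, lam=0.94):
    """EWMA covariance recursion: Sigma_t = (1 - lam) a_{t-1} a_{t-1}' + lam Sigma_{t-1}.

    a is a (T, k) array of shocks; the recursion starts from the equally
    weighted sample covariance. Returns path[t] = estimate of Cov(a_t | F_{t-1})."""
    T, k = a.shape
    sigma = np.cov(a, rowvar=False, bias=True)   # starting value
    path = np.empty((T, k, k))
    for t in range(T):
        path[t] = sigma
        sigma = (1 - lam) * np.outer(a[t], a[t]) + lam * sigma
    return path
```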

Some multivariate GARCH models

  1. Diagonal Vectorization model (VEC): a generalization of the exponentially weighted moving-average approach in which each element of $\Sigma_t$ follows a GARCH(1,1)-type model. It may not produce a positive-definite covariance matrix and does not model the dynamic dependence between volatility series. 
  2. BEKK model: the Baba-Engle-Kraft-Kroner model (1995) guarantees the positive-definite constraint. It has too many parameters but does model dynamic dependence between the volatility series; a sketch of its covariance recursion follows this list. 
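This is not estimation code, just a sketch of one common BEKK(1,1) parameterization of the covariance recursion; the parameter matrices C, A, B are placeholders that would in practice be estimated by maximum likelihood.

```python
import numpy as np

def bekk11_step(sigma_prev, a_prev, C, A, B):
    """One step of a BEKK(1,1) covariance recursion (one common parameterization):
    Sigma_t = C C' + A a_{t-1} a_{t-1}' A' + B Sigma_{t-1} B'.
    With C lower triangular and of full rank, Sigma_t stays positive definite."""
    return C @ C.T + A @ np.outer(a_prev, a_prev) @ A.T + B @ sigma_prev @ B.T
```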

Reparameterization

$\Sigma_t$ is reparameterized by making use of its symmetry.
  1. Use of correlations - The covariance matrix can be represented in terms of the variances and the lower-triangular correlations, which can then be jointly modeled. Specifically, we write $\Sigma_t$ as $D_t \rho_t D_t$, where $\rho_t$ is the conditional correlation matrix of $a_t$, and $D_t$ is a $k \times k$ diagonal matrix consisting of the conditional standard deviations of the elements of $a_t$. To model the volatility of $a_t$, it suffices to consider the conditional variances and the correlation coefficients of $a_{it}$. These form the $k(k+1)/2$-dimensional vector $\Xi_t = (\sigma_{11,t}, \ldots, \sigma_{kk,t}, \varrho_t^T)^T$, where $\varrho_t$ is a $k(k-1)/2$-dimensional vector obtained by stacking the columns of the correlation matrix $\rho_t$, using only the elements below the main diagonal, i.e. $\varrho_t = (\rho_{21,t}, \ldots, \rho_{k1,t} \,|\, \rho_{32,t}, \ldots, \rho_{k2,t} \,|\, \ldots \,|\, \rho_{k,k-1,t})^T$. To illustrate, for $k=2$ we have $\varrho_t = \rho_{21,t}$ and $\Xi_t = (\sigma_{11,t}, \sigma_{22,t}, \rho_{21,t})^T$, a 3-dimensional vector. The approach has weaknesses: the likelihood function becomes complicated when the dimension is greater than 2, and the approach requires constrained maximization to ensure positive definiteness. 
  2. Cholesky decomposition - This requires no constrained maximization. It is an orthogonal transformation, so the resulting likelihood is extremely simple. Because $\Sigma_t$ is positive definite, there exist a lower triangular matrix $L_t$ with unit diagonal elements and a diagonal matrix $G_t$ with positive diagonal elements such that $\Sigma_t = L_t G_t L_t^T$. A feature of the decomposition is that the lower off-diagonal elements of $L_t$ and the diagonal elements of $G_t$ have close connections with linear regression. Using the Cholesky decomposition amounts to doing an orthogonal transformation from $a_t$ to $b_t$, where $b_{1t} = a_{1t}$, and $b_{it}$, for $1 < i \le k$, is defined recursively by the least-squares regression $a_{it} = q_{i1,t} b_{1t} + q_{i2,t} b_{2t} + \ldots + q_{i(i-1),t} b_{(i-1)t} + b_{it}$, where $q_{ij,t}$ is the $(i,j)$th element of the lower triangular matrix $L_t$ for $1 \le j < i$. We can write this transformation as $a_t = L_t b_t$, where $L_t$ is the lower triangular matrix with unit diagonal elements. The covariance matrix of $b_t$ is $G_t$. The parameter vector relevant to volatility modeling under such a transformation becomes $\Xi_t = (g_{11,t}, \ldots, g_{kk,t}, q_{21,t}, q_{31,t}, q_{32,t}, \ldots, q_{k1,t}, \ldots, q_{k(k-1),t})^T$, which is also a $k(k+1)/2$-dimensional vector. The likelihood function also simplifies drastically. There are several advantages to this transformation. First, $\Sigma_t$ can be kept positive definite simply by modeling $\ln(g_{ii,t})$. Second, the elements of $\Xi_t$ are simply the coefficients and residual variances of the multiple linear regressions that orthogonalize the shocks to the returns. Third, the correlation coefficient between $a_{1t}$ and $a_{2t}$, which is $\rho_{21,t} = q_{21,t}\sqrt{\sigma_{11,t}/\sigma_{22,t}}$, is time-varying. Finally, $\sigma_{ij,t} = \sum_{v=1}^{j} q_{iv,t} q_{jv,t} g_{vv,t}$ for $j \le i$, with $q_{jj,t} = 1$. (See the numerical sketch after this list.)
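A small numerical sketch of the decomposition, using a made-up 3x3 covariance matrix: the unit-lower-triangular $L$ and diagonal $G$ are recovered from the standard Cholesky factor, and the correlation identity for the first two shocks is checked.

```python
import numpy as np

# Recover the unit-lower-triangular L and diagonal G of Sigma = L G L' from the
# standard Cholesky factor, and check the identities quoted above.
Sigma = np.array([[1.0, 0.3, 0.2],
                  [0.3, 2.0, 0.5],
                  [0.2, 0.5, 1.5]])      # made-up covariance matrix
C = np.linalg.cholesky(Sigma)            # Sigma = C C', C lower triangular
d = np.diag(C)                           # d_j = sqrt(g_jj)
L = C / d                                # divide each column j by d_j => unit diagonal
G = np.diag(d ** 2)
assert np.allclose(L @ G @ L.T, Sigma)   # Sigma = L G L'

# rho_21 = q_21 * sqrt(sigma_11 / sigma_22) equals the usual correlation.
rho_21 = L[1, 0] * np.sqrt(Sigma[0, 0] / Sigma[1, 1])
assert np.isclose(rho_21, Sigma[0, 1] / np.sqrt(Sigma[0, 0] * Sigma[1, 1]))
```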

GARCH models for bivariate returns

Thursday, September 3, 2015

Multivariate Normal distribution

This is useful background, especially if you want to understand the Kalman filter.

A $k$-dimensional random vector $x = (x_1, \ldots, x_k)^T$ follows a multivariate normal distribution with mean $\mu = (\mu_1, \ldots, \mu_k)^T$ and positive-definite covariance matrix $\Sigma = [\sigma_{ij}]$ if its probability density function is $f(x|\mu,\Sigma) = \frac{1}{(2\pi)^{k/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)$. This is denoted by $x \sim N_k(\mu, \Sigma)$. A square matrix $A$ ($m \times m$) is positive definite if $A$ is symmetric and all eigenvalues of $A$ are positive. Alternatively, $A$ is positive definite if for any nonzero $m$-dimensional vector $b$ we have $b^T A b > 0$. A positive-definite matrix $A$ can be decomposed as $A = P \Lambda P^T$, where $\Lambda$ is a diagonal matrix consisting of all eigenvalues of $A$ and $P$ is an $m \times m$ matrix consisting of the $m$ right eigenvectors of $A$; $P$ is an orthogonal matrix when the eigenvalues are distinct (and can always be chosen orthogonal since $A$ is symmetric).
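A small sketch checking both statements numerically: the density is evaluated directly from the formula above (scipy.stats.multivariate_normal would do the same), and positive definiteness of a made-up symmetric matrix is verified through its spectral decomposition.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Multivariate normal density evaluated directly from the formula above."""
    k = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)          # (x-mu)' Sigma^{-1} (x-mu)
    return np.exp(-0.5 * quad) / ((2 * np.pi) ** (k / 2) * np.sqrt(np.linalg.det(Sigma)))

# Positive definiteness and the spectral decomposition A = P Lambda P'.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])                              # made-up symmetric matrix
eigvals, P = np.linalg.eigh(A)                          # symmetric eigendecomposition
assert np.all(eigvals > 0)                              # all eigenvalues positive
assert np.allclose(P @ np.diag(eigvals) @ P.T, A)       # A = P Lambda P'
```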

For a symmetric matrix $A$, there exists a lower triangular matrix $L$ with diagonal elements equal to 1 and a diagonal matrix $G$ such that $A = L G L^T$. If $A$ is positive definite, then the diagonal elements of $G$ are positive. In this case we can write $A = (L G^{1/2})(L G^{1/2})^T$, where $L G^{1/2}$ is again a lower triangular matrix. Such a decomposition is called the Cholesky decomposition of $A$. It shows that a positive-definite matrix $A$ can be diagonalized as $L^{-1} A (L^T)^{-1} = L^{-1} A (L^{-1})^T = G$.

Let $c = (c_1, \ldots, c_k)^T$ be a nonzero vector, and partition $x$ as $x = (x_1^T, x_2^T)^T$, where $x_1$ has dimension $p$ and $x_2$ has dimension $k-p$, so that $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \sim N\left(\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}\right)$. Some properties of $x$ are:

  1. $c^T x \sim N(c^T\mu, c^T\Sigma c)$: any nonzero linear combination of $x$ is univariate normal, and vice versa.
  2. The marginal distribution of $x_i$ is normal, $x_i \sim N(\mu_i, \Sigma_{ii})$ for $i = 1, 2$.
  3. $\Sigma_{12} = 0$ if and only if $x_1$ and $x_2$ are independent.
  4. The variable $(x-\mu)^T \Sigma^{-1} (x-\mu)$ follows a chi-squared distribution with $k$ degrees of freedom.
  5. The conditional distribution of $x_1$ given $x_2 = b$ is also normal: $(x_1 | x_2 = b) \sim N\left(\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(b - \mu_2),\; \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\right)$.
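A numerical sketch of property 5, assuming a made-up 3-dimensional normal partitioned into blocks of sizes $p = 2$ and $k - p = 1$; the conditional mean and covariance are computed directly from the formula.

```python
import numpy as np

# Property 5 numerically: conditional mean and covariance of x1 given x2 = b.
mu = np.array([0.0, 1.0, 2.0])
Sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.5, 0.4],
                  [0.3, 0.4, 1.0]])
p = 2
mu1, mu2 = mu[:p], mu[p:]
S11, S12 = Sigma[:p, :p], Sigma[:p, p:]
S21, S22 = Sigma[p:, :p], Sigma[p:, p:]

b = np.array([2.5])                                      # observed value of x2
cond_mean = mu1 + S12 @ np.linalg.solve(S22, b - mu2)    # mu1 + S12 S22^{-1} (b - mu2)
cond_cov = S11 - S12 @ np.linalg.solve(S22, S21)         # S11 - S12 S22^{-1} S21
```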
Suppose that $x$, $y$, and $z$ are three random vectors such that their joint distribution is multivariate normal. In addition, assume that the diagonal block covariance matrix $\Sigma_{ww}$ is nonsingular for $w = x, y, z$, and that $\Sigma_{yz} = 0$. Then,

  1. $(x|y) \sim N\left(\mu_x + \Sigma_{xy}\Sigma_{yy}^{-1}(y - \mu_y),\; \Sigma_{xx} - \Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}\right)$
  2. $(x|y,z) \sim N\left(E(x|y) + \Sigma_{xz}\Sigma_{zz}^{-1}(z - \mu_z),\; \mathrm{Var}(x|y) - \Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}\right)$
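A sketch checking the second property numerically: with $\Sigma_{yz} = 0$, conditioning on $y$ first and then updating with $z$ agrees with conditioning on $(y, z)$ jointly (this sequential-update form is what the Kalman filter exploits). All block matrices and observed values below are made-up examples.

```python
import numpy as np

# Sequential conditioning (property 2) vs joint conditioning on w = (y, z).
Sxx = np.array([[2.0, 0.3], [0.3, 1.0]])
Sxy = np.array([[0.5], [0.2]])
Sxz = np.array([[0.4], [0.1]])
Syy, Szz = np.array([[1.5]]), np.array([[1.2]])
mu_x, mu_y, mu_z = np.zeros(2), np.zeros(1), np.zeros(1)
y, z = np.array([0.7]), np.array([-0.4])

# Condition on y first, then update with z (valid because Sigma_yz = 0).
Ex_y = mu_x + Sxy @ np.linalg.solve(Syy, y - mu_y)
Vx_y = Sxx - Sxy @ np.linalg.solve(Syy, Sxy.T)
Ex_yz = Ex_y + Sxz @ np.linalg.solve(Szz, z - mu_z)
Vx_yz = Vx_y - Sxz @ np.linalg.solve(Szz, Sxz.T)

# Joint conditioning on w = (y, z) for comparison.
Sxw = np.hstack([Sxy, Sxz])
Sww = np.block([[Syy, np.zeros((1, 1))], [np.zeros((1, 1)), Szz]])
w, mu_w = np.concatenate([y, z]), np.concatenate([mu_y, mu_z])
assert np.allclose(Ex_yz, mu_x + Sxw @ np.linalg.solve(Sww, w - mu_w))
assert np.allclose(Vx_yz, Sxx - Sxw @ np.linalg.solve(Sww, Sxw.T))
```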