Weak stationarity and cross-correlation matrices
A $k$-dimensional time series $\mathbf{r}_t = [r_{1t}, \ldots, r_{kt}]^T$ is weakly stationary if its first and second moments are time-invariant: $\boldsymbol{\mu} = E(\mathbf{r}_t)$ and $\boldsymbol{\Gamma}_0 = E[(\mathbf{r}_t - \boldsymbol{\mu})(\mathbf{r}_t - \boldsymbol{\mu})^T]$. Let $\mathbf{D}$ be the $k \times k$ diagonal matrix of the standard deviations of the $r_{it}$. The lag-zero cross-correlation matrix is defined as $\boldsymbol{\rho}_0 = \mathbf{D}^{-1}\boldsymbol{\Gamma}_0\mathbf{D}^{-1}$, which is just the correlation matrix. The lag-$l$ cross-covariance matrix of $\mathbf{r}_t$ is defined as $\boldsymbol{\Gamma}_l = E[(\mathbf{r}_t - \boldsymbol{\mu})(\mathbf{r}_{t-l} - \boldsymbol{\mu})^T]$. For a weakly stationary series, the cross-covariance matrix $\boldsymbol{\Gamma}_l$ is a function of $l$ only, not of the time index $t$. The lag-$l$ cross-correlation matrix (CCM) is defined as $\boldsymbol{\rho}_l = \mathbf{D}^{-1}\boldsymbol{\Gamma}_l\mathbf{D}^{-1}$. Elementwise,
$$\rho_{ij}(l) = \frac{\Gamma_{ij}(l)}{\sqrt{\Gamma_{ii}(0)\Gamma_{jj}(0)}} = \frac{\mathrm{Cov}(r_{it}, r_{j,t-l})}{\mathrm{std}(r_{it})\,\mathrm{std}(r_{jt})},$$
which is the correlation between $r_{it}$ and $r_{j,t-l}$. If $\rho_{ij}(l) \neq 0$ for some $l > 0$, we say that the series $r_{jt}$ leads the series $r_{it}$ at lag $l$. Similarly, if $\rho_{ji}(l) \neq 0$ for some $l > 0$, we say that the series $r_{it}$ leads the series $r_{jt}$ at lag $l$. The diagonal element $\rho_{ii}(l)$ is simply the lag-$l$ autocorrelation coefficient of $r_{it}$.
Some remarks (for $l > 0$):
1) In general $\rho_{ij}(l) \neq \rho_{ji}(l)$ for $i \neq j$, because the two measure different lag relationships; hence $\boldsymbol{\Gamma}_l$ and $\boldsymbol{\rho}_l$ are in general not symmetric.
2) It is easy to see that $\boldsymbol{\Gamma}_l = \boldsymbol{\Gamma}_{-l}^T$ and $\boldsymbol{\rho}_l = \boldsymbol{\rho}_{-l}^T$. Hence it suffices in practice to consider the cross-correlation matrices $\boldsymbol{\rho}_l$ for $l \geq 0$.
For the cross-correlation matrices $\{\boldsymbol{\rho}_l \mid l = 0, 1, \ldots\}$, the diagonal elements $\{\rho_{ii}(l) \mid l = 0, 1, \ldots\}$ form the autocorrelation function of $r_{it}$, the off-diagonal element $\rho_{ij}(0)$ measures the concurrent linear relationship between $r_{it}$ and $r_{jt}$, and for $l > 0$ the off-diagonal element $\rho_{ij}(l)$ measures the linear dependence of $r_{it}$ on the past value $r_{j,t-l}$. Depending on the values in these matrices one can identify:
1) no linear relationship ($\rho_{ij}(l) = \rho_{ji}(l) = 0$ for all $l \geq 0$),
2) concurrent correlation ($\rho_{ij}(0) \neq 0$),
3) no lead-lag relationship ($\rho_{ij}(l) = \rho_{ji}(l) = 0$ for all $l > 0$),
4) unidirectional relationship ($\rho_{ij}(l) = 0$ for all $l > 0$, but $\rho_{ji}(v) \neq 0$ for some $v > 0$), or
5) feedback relationship ($\rho_{ij}(l) \neq 0$ for some $l > 0$ and $\rho_{ji}(v) \neq 0$ for some $v > 0$).
Sample cross-correlation matrices can be estimated as $\hat{\boldsymbol{\rho}}_l = \hat{\mathbf{D}}^{-1}\hat{\boldsymbol{\Gamma}}_l\hat{\mathbf{D}}^{-1}$, where
$$\hat{\boldsymbol{\Gamma}}_l = \frac{1}{T}\sum_{t=l+1}^{T}(\mathbf{r}_t - \bar{\mathbf{r}})(\mathbf{r}_{t-l} - \bar{\mathbf{r}})^T, \quad l \geq 0.$$
Bootstrapping can be used to get confidence intervals in finite samples.
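A minimal NumPy sketch of this estimator (the function name and simulated data are illustrative):

```python
import numpy as np

def sample_ccm(r, lag):
    """Sample lag-l cross-correlation matrix: D_hat^-1 Gamma_hat_l D_hat^-1."""
    T, k = r.shape
    rc = r - r.mean(axis=0)                    # demean the series
    gamma_l = (rc[lag:].T @ rc[:T - lag]) / T  # Gamma_hat_l with divisor T
    d_inv = np.diag(1.0 / rc.std(axis=0))      # D_hat^-1 from sample std devs
    return d_inv @ gamma_l @ d_inv

# example: lag-1 CCM of simulated bivariate white noise (should be near zero)
rng = np.random.default_rng(0)
print(sample_ccm(rng.standard_normal((500, 2)), 1))
```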
Multivariate Portmanteau tests: Also known as the multivariate Ljung-Box test, with statistic $Q_k(m)$. The null hypothesis is $H_0: \boldsymbol{\rho}_1 = \cdots = \boldsymbol{\rho}_m = \mathbf{0}$ against $H_a: \boldsymbol{\rho}_i \neq \mathbf{0}$ for some $i \in \{1, \ldots, m\}$. The test statistic takes the form
$$Q_k(m) = T^2\sum_{l=1}^{m}\frac{1}{T-l}\,\mathrm{tr}\left(\hat{\boldsymbol{\Gamma}}_l^T\hat{\boldsymbol{\Gamma}}_0^{-1}\hat{\boldsymbol{\Gamma}}_l\hat{\boldsymbol{\Gamma}}_0^{-1}\right)$$
and under some regularity conditions asymptotically follows a chi-squared distribution with $k^2m$ degrees of freedom.
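A direct implementation of the statistic, reusing the sample cross-covariances above (a sketch, not a substitute for library tests):

```python
import numpy as np
from scipy.stats import chi2

def portmanteau_q(r, m):
    """Multivariate Ljung-Box statistic Q_k(m) and its asymptotic p-value."""
    T, k = r.shape
    rc = r - r.mean(axis=0)
    gamma = lambda l: (rc[l:].T @ rc[:T - l]) / T  # sample cross-covariance
    g0_inv = np.linalg.inv(gamma(0))
    q = T**2 * sum(np.trace(gamma(l).T @ g0_inv @ gamma(l) @ g0_inv) / (T - l)
                   for l in range(1, m + 1))
    return q, chi2.sf(q, df=k * k * m)             # chi^2 with k^2 m dof under H0
```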
Vector autoregressive models (VAR)
A multivariate time series $\mathbf{r}_t$ is a VAR process of order 1, or VAR(1) for short, if it follows the model $\mathbf{r}_t = \boldsymbol{\phi}_0 + \boldsymbol{\Phi}\mathbf{r}_{t-1} + \mathbf{a}_t$, where $\boldsymbol{\phi}_0$ is a $k$-dimensional vector, $\boldsymbol{\Phi}$ is a $k \times k$ matrix, and $\{\mathbf{a}_t\}$ is a sequence of serially uncorrelated random vectors with mean zero and covariance matrix $\boldsymbol{\Sigma}$, which is positive definite (generally assumed to be multivariate normal).
A positive-definite matrix is a symmetric matrix with all eigenvalues positive; equivalently, for any nonzero vector $\mathbf{b}$ we have $\mathbf{b}^T\mathbf{A}\mathbf{b} > 0$. Such matrices can be decomposed as $\mathbf{A} = \mathbf{P}\boldsymbol{\Lambda}\mathbf{P}^T$, where $\boldsymbol{\Lambda}$ is a diagonal matrix of the eigenvalues of $\mathbf{A}$ and $\mathbf{P}$ is a square matrix whose columns are the eigenvectors of $\mathbf{A}$. These eigenvectors are orthogonal to each other, so $\mathbf{P}$ is an orthogonal matrix, and this decomposition is referred to as the spectral decomposition. For a symmetric matrix $\mathbf{A}$, there exists a lower triangular matrix $\mathbf{L}$ with unit diagonal elements and a diagonal matrix $\mathbf{G}$ such that $\mathbf{A} = \mathbf{L}\mathbf{G}\mathbf{L}^T$. If $\mathbf{A}$ is positive definite, the diagonal elements of $\mathbf{G}$ are positive, and we have $\mathbf{A} = \mathbf{L}\sqrt{\mathbf{G}}\sqrt{\mathbf{G}}\mathbf{L}^T = \mathbf{M}\mathbf{M}^T$, where $\mathbf{M} = \mathbf{L}\sqrt{\mathbf{G}}$ is a lower triangular matrix. This is called the Cholesky decomposition. Notice that it implies $\mathbf{L}^{-1}\mathbf{A}(\mathbf{L}^{-1})^T = \mathbf{G}$.
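A quick NumPy check of these identities (the numeric values are arbitrary):

```python
import numpy as np

A = np.array([[4.0, 2.0], [2.0, 3.0]])    # a positive-definite matrix
M = np.linalg.cholesky(A)                 # A = M M^T, M lower triangular
sqrt_G = np.diag(np.diag(M))              # sqrt(G) is the diagonal of M
L = M @ np.linalg.inv(sqrt_G)             # unit-diagonal lower-triangular factor
G = sqrt_G @ sqrt_G                       # diagonal matrix with positive entries
L_inv = np.linalg.inv(L)
print(np.allclose(L @ G @ L.T, A))        # A = L G L^T
print(np.round(L_inv @ A @ L_inv.T, 12))  # L^-1 A (L^-1)^T = G
```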
Reduced and Structural form: In general the off-diagonal elements of $\boldsymbol{\Sigma}$ show the concurrent relationship between the component series (e.g. between $r_{1t}$ and $r_{2t}$), while the matrix $\boldsymbol{\Phi}$ measures the dynamic dependence of $\mathbf{r}_t$. This is called the reduced-form model because it does not show explicitly the concurrent dependence between the component series. An explicit expression of the concurrent relationship (for the last series, and hence for any series by reordering) can be deduced by a simple linear transformation. Using the Cholesky decomposition (possible because $\boldsymbol{\Sigma}$ is a positive-definite symmetric matrix) we can find a lower triangular matrix $\mathbf{L}$ with unit diagonal elements such that $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{G}\mathbf{L}^T$, where $\mathbf{G}$ is a diagonal matrix. If we define $\mathbf{b}_t = \mathbf{L}^{-1}\mathbf{a}_t$, then $E(\mathbf{b}_t) = \mathbf{L}^{-1}E(\mathbf{a}_t) = \mathbf{0}$ and $\mathrm{Cov}(\mathbf{b}_t) = E(\mathbf{b}_t\mathbf{b}_t^T) = \mathbf{L}^{-1}\boldsymbol{\Sigma}(\mathbf{L}^T)^{-1} = \mathbf{G}$. Since $\mathbf{G}$ is diagonal, the components of $\mathbf{b}_t$ are uncorrelated.
Pre-multiplying the reduced form by $\mathbf{L}^{-1}$, to uncouple the equations, we get
$$\mathbf{L}^{-1}\mathbf{r}_t = \mathbf{L}^{-1}\boldsymbol{\phi}_0 + \mathbf{L}^{-1}\boldsymbol{\Phi}\mathbf{r}_{t-1} + \mathbf{L}^{-1}\mathbf{a}_t = \boldsymbol{\phi}^*_0 + \boldsymbol{\Phi}^*\mathbf{r}_{t-1} + \mathbf{b}_t.$$
The last row of $\mathbf{L}^{-1}$ has 1 as its last element; write it as $(w_{k1}, w_{k2}, \ldots, w_{k,k-1}, 1)$. The structural equation for the last ($k$th) time series then becomes
$$r_{kt} + \sum_{i=1}^{k-1} w_{ki}\, r_{it} = \phi^*_{k,0} + \sum_{i=1}^{k} \Phi^*_{ki}\, r_{i,t-1} + b_{kt}.$$
This works because the covariance matrix of $\mathbf{b}_t$ is diagonal, so the equations are uncoupled. The reduced form is commonly used for two reasons: it is easier to estimate, and concurrent correlations cannot be used in forecasting anyway.
Stationarity condition and moments of a VAR(1) model: All eigenvalues of $\boldsymbol{\Phi}$ should be less than 1 in modulus for $\mathbf{r}_t$ to be weakly stationary, provided the covariance matrix of $\mathbf{a}_t$ exists. Further, we have $\boldsymbol{\Gamma}_l = \boldsymbol{\Phi}\boldsymbol{\Gamma}_{l-1}$ for $l > 0$, where $\boldsymbol{\Gamma}_j$ is the lag-$j$ cross-covariance matrix of $\mathbf{r}_t$. By repeated substitution we get $\boldsymbol{\Gamma}_l = \boldsymbol{\Phi}^l\boldsymbol{\Gamma}_0$. For $\boldsymbol{\Upsilon} = \mathbf{D}^{-1}\boldsymbol{\Phi}\mathbf{D}$ (with $\mathbf{D}$ the diagonal matrix of standard deviations, as before), we get $\boldsymbol{\rho}_l = \boldsymbol{\Upsilon}^l\boldsymbol{\rho}_0$. A VAR($p$) model is generally converted to a VAR(1) model using the companion matrix and then analyzed like a VAR(1) model.
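To make this concrete, a sketch that checks stationarity and computes the moments for an assumed $\boldsymbol{\Phi}$ and $\boldsymbol{\Sigma}$; note that $\boldsymbol{\Gamma}_0$ solves the discrete Lyapunov equation $\boldsymbol{\Gamma}_0 = \boldsymbol{\Phi}\boldsymbol{\Gamma}_0\boldsymbol{\Phi}^T + \boldsymbol{\Sigma}$:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

Phi = np.array([[0.5, 0.2], [0.1, 0.4]])    # assumed VAR(1) coefficient matrix
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])  # assumed innovation covariance

# weak stationarity: every eigenvalue of Phi has modulus < 1
print(np.abs(np.linalg.eigvals(Phi)))

# Gamma_0 from Gamma_0 = Phi Gamma_0 Phi^T + Sigma, then Gamma_l = Phi^l Gamma_0
Gamma0 = solve_discrete_lyapunov(Phi, Sigma)
Gamma3 = np.linalg.matrix_power(Phi, 3) @ Gamma0
```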
To find the order of a VAR model, one can use the multivariate equivalent of the PACF with hypothesis tests on successive residuals. The $i$th equation in this sequence is $\mathbf{r}_t = \boldsymbol{\phi}_0 + \boldsymbol{\Phi}_1\mathbf{r}_{t-1} + \cdots + \boldsymbol{\Phi}_i\mathbf{r}_{t-i} + \mathbf{a}_t$. The parameters of these equations can be estimated by OLS. For the $i$th equation, let the OLS estimates of the coefficients be $\hat{\boldsymbol{\Phi}}^{(i)}_j$ and $\hat{\boldsymbol{\phi}}^{(i)}_0$, where the superscript $(i)$ denotes the VAR($i$) model. The residual is then $\hat{\mathbf{a}}^{(i)}_t = \mathbf{r}_t - \hat{\boldsymbol{\phi}}^{(i)}_0 - \hat{\boldsymbol{\Phi}}^{(i)}_1\mathbf{r}_{t-1} - \cdots - \hat{\boldsymbol{\Phi}}^{(i)}_i\mathbf{r}_{t-i}$. We then test hypotheses sequentially to identify the order of the VAR model: for the $i$th and $(i-1)$th equations we test $H_0: \boldsymbol{\Phi}_i = \mathbf{0}$ versus $H_a: \boldsymbol{\Phi}_i \neq \mathbf{0}$, with test statistic
$$M(i) = -\left(T - k - i - \tfrac{3}{2}\right)\ln\left(\frac{|\hat{\boldsymbol{\Sigma}}_i|}{|\hat{\boldsymbol{\Sigma}}_{i-1}|}\right).$$
Asymptotically, $M(i)$ follows a chi-squared distribution with $k^2$ degrees of freedom, where $k$ is the dimension of the asset universe.
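A sketch of the sequential statistic using statsmodels OLS fits (approximate, since each fit drops its first $p$ observations rather than using a common sample):

```python
import numpy as np
from scipy.stats import chi2
from statsmodels.tsa.api import VAR

def m_statistic(data, i):
    """LR statistic M(i) for H0: Phi_i = 0, comparing VAR(i) against VAR(i-1)."""
    T, k = data.shape
    det_sigma = {}
    for p in (i - 1, i):
        if p == 0:
            resid = data - data.mean(axis=0)  # VAR(0): mean only
        else:
            resid = VAR(data).fit(p).resid    # OLS residuals of a VAR(p) fit
        det_sigma[p] = np.linalg.det(resid.T @ resid / len(resid))
    m = -(T - k - i - 1.5) * np.log(det_sigma[i] / det_sigma[i - 1])
    return m, chi2.sf(m, df=k * k)            # asymptotically chi^2 with k^2 dof
```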
Equivalent AIC and BIC methods can also be employed. OLS or ML is generally used to estimate the parameters, the two methods being asymptotically equivalent. Once a model is fit, the residuals should be tested for adequacy using the $Q_k(m)$ statistic (with $k^2m - g$ degrees of freedom, $g$ being the number of estimated AR parameters). Forecasting is similar to the univariate case. The impulse response function comes from the MA representation and can be derived to look at the decay rate. The MA equation is premultiplied by $\mathbf{L}^{-1}$ to get the impulse response function of $\mathbf{r}_t$ with respect to the orthogonal innovations $\mathbf{b}_t$. But different orderings of the components may lead to different response functions, which is a drawback.
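The whole workflow (order selection, fitting, residual diagnostics, impulse responses) is available in statsmodels; a sketch on simulated placeholder data:

```python
import numpy as np
from statsmodels.tsa.api import VAR

# placeholder data: replace with your (T, k) array of stationary returns
rng = np.random.default_rng(1)
r = rng.standard_normal((500, 2))

model = VAR(r)
print(model.select_order(maxlags=8).summary())  # AIC/BIC/HQIC across lag orders
fit = model.fit(maxlags=8, ic="aic")            # OLS fit at the chosen order
print(fit.test_whiteness(nlags=10).summary())   # Portmanteau test on residuals
irf = fit.irf(10)                               # impulse response functions
irf.plot(orth=True)                             # orthogonalized (Cholesky) IRFs
```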
Vector moving-average models (VMA): A VMA(1) model is given by $\mathbf{r}_t = \boldsymbol{\theta}_0 + \mathbf{a}_t - \boldsymbol{\Theta}\mathbf{a}_{t-1}$,
where $\boldsymbol{\theta}_0$ is a $k$-dimensional vector and $\boldsymbol{\Theta}$ is a $k \times k$ matrix. As in the univariate case, the cross-correlations cut off at lag 1 and can be used to identify the order. Estimation of VMA models is a lot more involved; conditional or exact MLE approaches can be used.
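One way to estimate a pure VMA(1) is exact MLE via the state-space form in statsmodels; a sketch on simulated data (the coefficient matrix is illustrative, and statsmodels parameterizes the MA part with a plus sign, so the reported coefficients estimate $-\boldsymbol{\Theta}$):

```python
import numpy as np
from statsmodels.tsa.statespace.varmax import VARMAX

# simulate a bivariate VMA(1): r_t = a_t - Theta a_{t-1}
rng = np.random.default_rng(2)
Theta = np.array([[0.5, 0.2], [0.1, 0.4]])  # invertible (eigenvalues 0.6, 0.3)
a = rng.standard_normal((501, 2))
r = a[1:] - a[:-1] @ Theta.T

fit = VARMAX(r, order=(0, 1)).fit(disp=False)  # order=(p=0, q=1): pure VMA(1)
print(fit.summary())
```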
Vector ARMA models (VARMA): In generalizing from the univariate to the vector case, ARMA models run into the issue of identifiability: the model may not be uniquely defined. Some constraints need to be imposed, via a structural specification. These models are hardly used in practice.
Marginal models of components: Given a vector model, the implied models of the individual components are called the marginal models. For a $k$-dimensional ARMA($p$,$q$) model, the marginal models are ARMA($kp$, $(k-1)p+q$) models.
Unit root nonstationarity and cointegration: When modeling several unit-root nonstationary time series jointly, one may encounter cointegration, arising from a common trend or a unit root shared among the components. In other words, one can find a linear combination of the series that is stationary. Let $h$ be the number of unit roots (or common trends) in the $k$-dimensional series $\mathbf{x}_t$. Cointegration exists if $0 < h < k$, and the quantity $k - h$ is called the number of cointegrating factors: the distinct linear combinations that are stationary. The linear combinations producing these stationary processes are called the cointegrating vectors. Two cointegrated price series share a common underlying trend, and we lose this information if we take the first difference of each series: one difference per unit root preserves the useful information, but under cointegration there are more nonstationary series than unit roots, so differencing everything overdifferences. Overdifferencing leads to unit roots in the MA matrix polynomial, causing problems with invertibility and estimation. Also, an apparent cointegrating relationship between asset prices can be artificial if transaction costs and exchange-rate risk are not accounted for.
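A tiny simulation helps: two $I(1)$ series sharing one random-walk trend ($k = 2$, $h = 1$), with a stationary spread (the numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 1000
trend = np.cumsum(rng.standard_normal(T))  # the shared unit-root component
x1 = trend + rng.standard_normal(T)
x2 = 0.5 * trend + rng.standard_normal(T)
spread = x1 - 2.0 * x2                     # cointegrating vector (1, -2)
print(x1.var(), spread.var())              # spread variance stays bounded
```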
Error correction form: To overcome the difficulty of noninvertible VARMA models one can use this form. A VARMA($p$,$q$) model is
$$\mathbf{x}_t = \sum_{i=1}^{p}\boldsymbol{\Phi}_i\mathbf{x}_{t-i} + \mathbf{a}_t - \sum_{j=1}^{q}\boldsymbol{\Theta}_j\mathbf{a}_{t-j}.$$
With $\Delta\mathbf{x}_t = \mathbf{x}_t - \mathbf{x}_{t-1}$, we can subtract $\mathbf{x}_{t-1}$ from both sides of the VARMA equation to get the error correction form
$$\Delta\mathbf{x}_t = \boldsymbol{\alpha}\boldsymbol{\beta}^T\mathbf{x}_{t-1} + \sum_{i=1}^{p-1}\boldsymbol{\Phi}^*_i\Delta\mathbf{x}_{t-i} + \mathbf{a}_t - \sum_{j=1}^{q}\boldsymbol{\Theta}_j\mathbf{a}_{t-j},$$
where $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ are $k \times m$ full-rank matrices, $k$ is the total asset dimension, and $m$ is the number of cointegrating factors ($m < k$). The term $\boldsymbol{\alpha}\boldsymbol{\beta}^T\mathbf{x}_{t-1}$ is called the error-correction term, as it compensates for the overdifferencing. Further, $\boldsymbol{\beta}^T\mathbf{x}_{t-1}$ is stationary. Also,
$$\boldsymbol{\Phi}^*_j = -\sum_{i=j+1}^{p}\boldsymbol{\Phi}_i, \quad j = 1, \ldots, p-1,$$
$$\boldsymbol{\alpha}\boldsymbol{\beta}^T = \boldsymbol{\Phi}_p + \cdots + \boldsymbol{\Phi}_1 - \mathbf{I}.$$
The time series $\boldsymbol{\beta}^T\mathbf{x}_t$ is unit-root stationary, and the columns of $\boldsymbol{\beta}$ are the cointegrating vectors of $\mathbf{x}_t$.
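This form can be fit directly; a sketch using statsmodels' VECM on the simulated cointegrated pair from above:

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import VECM

rng = np.random.default_rng(3)
trend = np.cumsum(rng.standard_normal(1000))
x = np.column_stack([trend + rng.standard_normal(1000),
                     0.5 * trend + rng.standard_normal(1000)])

fit = VECM(x, k_ar_diff=1, coint_rank=1, deterministic="n").fit()
print(fit.alpha)  # adjustment (loading) matrix alpha, k x m
print(fit.beta)   # cointegrating vectors beta, k x m (first block normalized)
print(fit.gamma)  # short-run coefficients Phi*_i
```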
Co-integrated VAR models: To better understand cointegration, we focus here on VAR models for their simplicity in estimation. A $k$-dimensional VAR($p$) model is
$$\mathbf{x}_t = \boldsymbol{\mu}_t + \boldsymbol{\Phi}_1\mathbf{x}_{t-1} + \cdots + \boldsymbol{\Phi}_p\mathbf{x}_{t-p} + \mathbf{a}_t,$$
where $\boldsymbol{\mu}_t = \boldsymbol{\mu}_0 + \boldsymbol{\mu}_1 t$. Or equivalently, using the backshift operator $B$,
$$(\mathbf{I} - \boldsymbol{\Phi}_1 B - \cdots - \boldsymbol{\Phi}_p B^p)\mathbf{x}_t = \boldsymbol{\mu}_t + \mathbf{a}_t.$$
The characteristic polynomial in the above is denoted $\boldsymbol{\Phi}(B)$. For a unit-root nonstationary process, 1 is a root, making $|\boldsymbol{\Phi}(1)| = 0$. An error-correction form for this can be obtained by subtracting $\mathbf{x}_{t-1}$ from both sides of the equation, giving
$$\Delta\mathbf{x}_t = \boldsymbol{\mu}_t + \boldsymbol{\Pi}\mathbf{x}_{t-1} + \boldsymbol{\Phi}^*_1\Delta\mathbf{x}_{t-1} + \cdots + \boldsymbol{\Phi}^*_{p-1}\Delta\mathbf{x}_{t-p+1} + \mathbf{a}_t,$$
where $\boldsymbol{\Pi} = \boldsymbol{\Phi}_1 + \cdots + \boldsymbol{\Phi}_p - \mathbf{I} = -\boldsymbol{\Phi}(1)$ and $\boldsymbol{\Phi}^*_j = -\sum_{i=j+1}^{p}\boldsymbol{\Phi}_i$ for $j = 1, \ldots, p-1$. If $\mathrm{Rank}(\boldsymbol{\Pi}) = 0$, then $\mathbf{x}_t$ is not cointegrated; if $\mathrm{Rank}(\boldsymbol{\Pi}) = k$, the ECM is not informative and one studies $\mathbf{x}_t$ directly. Finally, if $0 < \mathrm{Rank}(\boldsymbol{\Pi}) = m < k$, then one can write $\boldsymbol{\Pi} = \boldsymbol{\alpha}\boldsymbol{\beta}^T$, where $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ are both of rank $m$. This is the case of cointegration with $m$ linearly independent cointegrating vectors, $\mathbf{w}_t = \boldsymbol{\beta}^T\mathbf{x}_t$, and $k - m$ unit roots or common trends.
To obtain the $(k-m)$-dimensional vector of common trends $\mathbf{y}_t = \boldsymbol{\alpha}_\perp^T\mathbf{x}_t$, we compute the $k \times (k-m)$ orthogonal complement matrix $\boldsymbol{\alpha}_\perp$ satisfying $\boldsymbol{\alpha}_\perp^T\boldsymbol{\alpha} = \mathbf{0}$. To uniquely identify $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ we require $\boldsymbol{\beta}^T = [\mathbf{I}_m, \boldsymbol{\beta}_1^T]$, where $\mathbf{I}_m$ is the $m \times m$ identity matrix and $\boldsymbol{\beta}_1$ is a $(k-m) \times m$ matrix. A few more conditions are required for the process $\mathbf{w}_t = \boldsymbol{\beta}^T\mathbf{x}_t$ to be unit-root stationary.
The rank of $\boldsymbol{\Pi}$ in the ECM is the number of cointegrating vectors. Thus, to test for cointegration, one can examine the rank of $\boldsymbol{\Pi}$; this is the approach taken in the Johansen test.
Deterministic function: The limiting distributions of cointegration tests depend on the deterministic term $\boldsymbol{\mu}_t$:
1) $\boldsymbol{\mu}_t = \mathbf{0}$: All component series of $\mathbf{x}_t$ are $I(1)$ without drift, and the stationary series $\mathbf{w}_t = \boldsymbol{\beta}^T\mathbf{x}_t$ has mean zero.
2) $\boldsymbol{\mu}_t = \boldsymbol{\alpha}\mathbf{c}_0$: Components of $\mathbf{x}_t$ are $I(1)$ without drift, but $\mathbf{w}_t$ has nonzero mean $-\mathbf{c}_0$; this is called the restricted constant.
3) $\boldsymbol{\mu}_t = \boldsymbol{\mu}_0$: Component series are $I(1)$ with drift $\boldsymbol{\mu}_0$, and $\mathbf{w}_t$ may have a nonzero mean.
4) $\boldsymbol{\mu}_t = \boldsymbol{\mu}_0 + \boldsymbol{\alpha}\mathbf{c}_1 t$: Components of $\mathbf{x}_t$ are $I(1)$ with drift $\boldsymbol{\mu}_0$, and $\mathbf{w}_t$ has a linear time trend; this is called the restricted trend.
5) $\boldsymbol{\mu}_t = \boldsymbol{\mu}_0 + \boldsymbol{\mu}_1 t$: Both the constant and the trend are unrestricted. The components of $\mathbf{x}_t$ are $I(1)$ with a quadratic time trend, and $\mathbf{w}_t$ has a linear trend.
Maximum likelihood estimation: Estimation of a cointegrated VAR($p$) is quite involved. One first regresses $\Delta\mathbf{x}_t$ and $\mathbf{x}_{t-1}$ separately on the deterministic term and the lagged differences $\Delta\mathbf{x}_{t-i}$, obtaining residuals $\mathbf{u}_t$ and $\mathbf{v}_t$ respectively. A related eigenvalue problem is then solved, and maximizing the resulting likelihood gives the estimates of the coefficients.
Johansen test for cointegration: This essentially tests the rank of the matrix $\boldsymbol{\Pi}$, for a specified deterministic term $\boldsymbol{\mu}_t$. The number of nonzero eigenvalues of $\boldsymbol{\Pi}$ can be obtained if a consistent estimate of $\boldsymbol{\Pi}$ is available. From the ECM equation it is clear that $\boldsymbol{\Pi}$ is related to the covariance between $\mathbf{x}_{t-1}$ and $\Delta\mathbf{x}_t$ after adjusting for the effects of the deterministic trend term and $\Delta\mathbf{x}_{t-i}$ for $i = 1, \ldots, p-1$. Using canonical correlation analysis between the two adjusted quantities, the squared correlations $\hat{\lambda}_i$ are computed. There are two versions of the Johansen test:
1) Trace cointegration test: $H_0$: $\mathrm{Rank}(\boldsymbol{\Pi}) = m$ versus $H_a$: $\mathrm{Rank}(\boldsymbol{\Pi}) > m$. The likelihood ratio (LR) statistic is
$$LK_{tr}(m) = -(T-p)\sum_{i=m+1}^{k}\ln(1 - \hat{\lambda}_i).$$
Due to the presence of unit roots, the asymptotic distribution of the statistic is not chi-squared but a function of standard Brownian motions. Thus, the critical values must be obtained via simulation.
2) Sequential test: $H_0$: $\mathrm{Rank}(\boldsymbol{\Pi}) = m$ versus $H_a$: $\mathrm{Rank}(\boldsymbol{\Pi}) = m+1$. The LR test statistic, called the maximum eigenvalue statistic, is
$$LK_{max}(m) = -(T-p)\ln(1 - \hat{\lambda}_{m+1}).$$
Again, the critical values of the test statistics are nonstandard and must be evaluated via simulation.
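Both versions are available through statsmodels' coint_johansen; a sketch on the simulated cointegrated pair (here det_order=-1 corresponds to case 1, no deterministic term):

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

rng = np.random.default_rng(4)
trend = np.cumsum(rng.standard_normal(1000))
x = np.column_stack([trend + rng.standard_normal(1000),
                     0.5 * trend + rng.standard_normal(1000)])

res = coint_johansen(x, det_order=-1, k_ar_diff=1)
print(res.lr1, res.cvt)  # trace statistics with 90/95/99% critical values
print(res.lr2, res.cvm)  # max-eigenvalue statistics with critical values
```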
Left out sections: 8.7, 8.8