
Tuesday, June 30, 2015

Tsay Ch5 - High Frequency Data Analysis and Market Microstructure

Non-synchronous trading

For daily stock returns, non-synchronous trading can introduce
1) cross correlations between stock returns
2) serial correlation in a portfolio return
3) sometimes negative serial correlations in the return series of a stock.

Bid-ask spread

Introduces negative lag-1 serial correlation in an asset return, known as the bid-ask bounce.
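The bounce can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not in the notes above: the fundamental value is held constant, each trade hits the bid or the ask with equal probability, and the spread of 0.10 and the function names are made up for illustration. Under these assumptions the lag-1 autocorrelation of price changes is -0.5 in theory.

```python
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    return cov / var

def simulate_bounce_changes(n, spread=0.10, fundamental=50.0, seed=0):
    """Price changes when each trade hits the bid or the ask at random.

    The fundamental value is constant (illustrative assumption), so every
    observed price change comes from the bid-ask bounce alone.
    """
    rng = random.Random(seed)
    prices = [fundamental + rng.choice((-1, 1)) * spread / 2 for _ in range(n)]
    return [prices[t] - prices[t - 1] for t in range(1, n)]

changes = simulate_bounce_changes(20000)
rho1 = lag1_autocorr(changes)
print(round(rho1, 2))  # should be near the theoretical value of -0.5
```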

Empirical characteristics

Transactions data have characteristics that do not show up once the data are aggregated over time:
1) unequally spaced time intervals - the duration between trades might contain useful information about market microstructure, e.g. trading intensity
2) discrete-valued prices - prices move in multiples of a tick size, and exchanges may also set limits on price changes
3) existence of a daily periodic pattern - trading is thinner during the lunch hour
4) multiple transactions within a single second

Overnight stock returns differ substantially from intraday returns (Stoll and Whaley 1990). Intraday trading has grown to the point where multiple transactions occur within a single second.

Models for price change

The discreteness of price changes and their concentration on 'no change' make intraday price changes difficult to model. Two models are covered here: the ordered probit model (Hausman, Lo and MacKinlay 1992) and a decomposition model (McCulloch and Tsay 2000). Prediction remains difficult with either model; they are used mainly to understand the price-change process rather than to forecast it.
Ordered Probit model - Let $P_t^*$ be the fundamental value of the asset in a frictionless market and $P_t$ the observed price. Define $y_i^* = P_{t_i}^* - P_{t_{i-1}}^*$ and model $y_i^*$ as a continuous random variable given by $y_i^* = x_i\beta + \epsilon_i$. The observed price change $y_i = P_{t_i} - P_{t_{i-1}}$ takes values in an ordered set $s_1,...,s_k$, with $y_i = s_j$ when $y_i^*$ falls in the $j$th of $k$ ordered intervals. Generally a normal distribution is assumed for $\epsilon_i$. The model can be estimated by maximum likelihood or MCMC methods. Explanatory variables $x_i$ can be time duration, lagged prices, lagged SP500 price, bid-ask spread and direction, and lagged volume. Volatility (the conditional variance of $\epsilon_i$) can be explained using duration and bid-ask spread as well.
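As a rough sketch of how the ordered probit turns the latent variable into category probabilities: with normal errors, the probability of each category is a difference of normal CDFs evaluated at the interval boundaries. The names `cutpoints`, `x_beta` and `sigma`, and the numbers used, are assumptions for illustration only, not values from Tsay.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(x_beta, sigma, cutpoints):
    """Return [P(y_i = s_1), ..., P(y_i = s_k)] for k - 1 increasing cutpoints.

    The latent y_i* is N(x_beta, sigma^2); category j is observed when
    y_i* falls between cutpoint j-1 and cutpoint j.
    """
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [norm_cdf((b - x_beta) / sigma) for b in bounds]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

# Three categories (down, no change, up) with hypothetical cutpoints.
probs = ordered_probit_probs(x_beta=0.0, sigma=1.0, cutpoints=[-0.5, 0.5])
print([round(p, 4) for p in probs])
```

A larger `sigma` (for example after a long duration or with a wide spread) moves probability mass away from the middle 'no change' category, which is how volatility enters the model.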
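A decomposition model (ADS) - The price change is decomposed into an indicator for a price change $A_i$, the direction of the price change $D_i$, and the size of the price change $S_i$, so that $y_i = P_{t_i} - P_{t_{i-1}} = A_i D_i S_i$. The ordering is important because each component is modeled conditional on the ones before it. Each of these terms is modeled by a logistic-type regression on explanatory variables, and the parameters are estimated by maximizing the log likelihood.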
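The decomposition itself is mechanical, and a minimal sketch is given here. The tick size of 0.01 and the function name are assumptions for illustration; the data in Tsay may use a different tick size.

```python
def decompose_price_changes(prices, tick=0.01):
    """Split each price change into (A_i, D_i, S_i), with y_i = A_i * D_i * S_i ticks."""
    parts = []
    for prev, cur in zip(prices, prices[1:]):
        ticks = round((cur - prev) / tick)   # price change in whole ticks
        a = 1 if ticks != 0 else 0           # A_i: 1 if the price changed
        d = (ticks > 0) - (ticks < 0)        # D_i: +1 up, -1 down, 0 if no change
        s = abs(ticks)                       # S_i: size of the change in ticks
        parts.append((a, d, s))
    return parts

prices = [10.00, 10.00, 10.02, 10.01, 10.01]
parts = decompose_price_changes(prices)
print(parts)
```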

Duration models - ACD

Duration models are concerned with the time intervals between trades. Longer durations indicate a lack of trading activity, which in turn suggests a period with no new information. Before the durations can be modeled, the diurnal pattern has to be removed from the time series. This is done by calculating the adjusted time duration $\Delta t_i^* = \Delta t_i / f(t_i)$, where $f(t_i)$ is a deterministic function of the time of day. We can then fit the autoregressive conditional duration model to the adjusted durations. $f(t_i)$ is commonly estimated using smoothing splines. Another way is to use a combination of quadratic functions and indicator variables to take care of the deterministic components of daily trading activities, with the $f_j$ chosen to cover the first minutes after the open, the last 30 minutes, and the middle of the day, depending on the asset and its profile:
$$f(t_i) = e^{d(t_i)}, \qquad d(t_i) = \beta_0 + \sum_{j=1}^{7} \beta_j f_j(t_i)$$

where $f_1, f_2, f_3, f_4$ are quadratic functions fitted for the specific data (Tsay pg 225), $f_5$ and $f_6$ are indicator variables for the first and second 5 minutes after the market opens, and $f_7$ is the indicator for the last 30 minutes of daily trading. The coefficients can be determined by the least squares method
$$\ln(\Delta t_i) = \beta_0 + \sum_{j=1}^{7} \beta_j f_j(t_i) + \epsilon_i$$
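A simplified sketch of the adjustment step follows. Note the swap: instead of the quadratic and indicator regressors described above, it uses only coarse time-of-day bins as indicators, in which case the least squares fit of log durations reduces to the mean log duration in each bin. The 30-minute bin width, the function name and the toy data are assumptions for illustration.

```python
import math
from collections import defaultdict

def diurnal_adjust(times, durations, bin_seconds=1800):
    """Return adjusted durations dt_i / f(t_i), with f estimated per time-of-day bin.

    times: seconds since the market open for each trade
    durations: raw durations in seconds, which must be positive for the log
    """
    logs = defaultdict(list)
    for t, dt in zip(times, durations):
        logs[int(t // bin_seconds)].append(math.log(dt))
    # Least squares of ln(dt) on bin indicators gives the mean log duration per bin.
    d_hat = {b: sum(v) / len(v) for b, v in logs.items()}
    return [dt / math.exp(d_hat[int(t // bin_seconds)])
            for t, dt in zip(times, durations)]

times = [0, 600, 1900, 2500]
durations = [2.0, 8.0, 1.0, 4.0]
adjusted = diurnal_adjust(times, durations)
print([round(a, 3) for a in adjusted])
```

After the adjustment the mean log duration within each bin is zero, so the remaining variation is what the ACD model is asked to explain.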

The autoregressive conditional duration (ACD) model uses the idea of GARCH models to study the dynamic structure of the adjusted duration $\Delta t_i^*$. For $x_i = \Delta t_i^*$ and $\psi_i = E(x_i | F_{i-1})$, the model is defined as $x_i = \psi_i \epsilon_i$, where $\epsilon_i$ follows a standard exponential (EACD) or a standard Weibull (WACD) distribution. Further, similar to GARCH, we have the ACD(r,s) model
$$\psi_i = \omega + \sum_{j=1}^{r} \gamma_j x_{i-j} + \sum_{j=1}^{s} \omega_j \psi_{i-j}$$

with $\gamma_j = 0$ for $j > r$ and $\omega_j = 0$ for $j > s$. For stationarity, $\omega > 0$ and $1 > \sum_j (\gamma_j + \omega_j)$.

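EACD(1,1) model: $\epsilon_i$ follows a standard exponential distribution, $x_i = \psi_i \epsilon_i$ and $\psi_i = \omega + \gamma_1 x_{i-1} + \omega_1 \psi_{i-1}$. We have $E(\epsilon_i) = 1$ and $Var(\epsilon_i) = 1$, implying $E(\epsilon_i^2) = 2$. Assuming weak stationarity, $E(x_i) = E(\psi_i) = \mu_x$, and taking expectations of the $\psi_i$ equation gives $\mu_x = \omega + (\gamma_1 + \omega_1)\mu_x$, so
$$E(x_i) = \mu_x = \frac{\omega}{1 - \gamma_1 - \omega_1}.$$

$$Var(x_i) = \mu_x^2 \, \frac{1 - \omega_1^2 - 2\gamma_1\omega_1}{1 - \omega_1^2 - 2\gamma_1^2 - 2\gamma_1\omega_1}.$$

Hence, weak stationarity requires $1 > 2\gamma_1^2 + \omega_1^2 + 2\gamma_1\omega_1$.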
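The moment formulas can be checked by simulation. The sketch below simulates an EACD(1,1) process and compares the sample mean with the theoretical mean; the parameter values (0.3, 0.2, 0.5), the burn-in length and the function name are arbitrary choices for illustration that satisfy the stationarity condition.

```python
import random

def simulate_eacd11(n, omega, gamma1, omega1, seed=0, burn_in=1000):
    """Simulate x_i = psi_i * eps_i with psi_i = omega + gamma1*x_{i-1} + omega1*psi_{i-1}."""
    rng = random.Random(seed)
    mean = omega / (1.0 - gamma1 - omega1)
    x_prev, psi_prev = mean, mean          # start at the unconditional mean
    out = []
    for i in range(n + burn_in):
        psi = omega + gamma1 * x_prev + omega1 * psi_prev
        x = psi * rng.expovariate(1.0)     # standard exponential innovation
        if i >= burn_in:
            out.append(x)
        x_prev, psi_prev = x, psi
    return out

omega, gamma1, omega1 = 0.3, 0.2, 0.5
x = simulate_eacd11(100000, omega, gamma1, omega1)

sample_mean = sum(x) / len(x)
theory_mean = omega / (1 - gamma1 - omega1)
theory_var = theory_mean ** 2 * (1 - omega1 ** 2 - 2 * gamma1 * omega1) / (
    1 - omega1 ** 2 - 2 * gamma1 ** 2 - 2 * gamma1 * omega1)
sample_var = sum((v - sample_mean) ** 2 for v in x) / len(x)
print(sample_mean, theory_mean)
print(sample_var, theory_var)
```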

Bivariate models for price change and duration - PCD

The price change and duration (PCD) model jointly models the price change and the associated duration process. It focuses on transactions that result in a price change, $P_{t_i} = P_{t_{i-1}} + D_i S_i$, where $D_i$ is the direction of the price change and $S_i$ is the size of the price change. This reduces the number of data points, and there is no diurnal pattern in the time durations between price changes. The PCD model decomposes the joint distribution of $(\Delta t_i, N_i, D_i, S_i)$ given $F_{i-1}$ as
$$f(\Delta t_i, N_i, D_i, S_i | F_{i-1}) = f(S_i | D_i, N_i, \Delta t_i, F_{i-1}) \, f(D_i | N_i, \Delta t_i, F_{i-1}) \, f(N_i | \Delta t_i, F_{i-1}) \, f(\Delta t_i | F_{i-1})$$

where the $i$th observation consists of the duration $\Delta t_i$, the number of trades $N_i$ in that period that result in no price change, the direction of the price change $D_i$, and the size of the price change $S_i$ in ticks. There are many ways to specify the conditional distributions, depending on the asset under study. Following McCulloch and Tsay (2000), we use generalized linear models for the discrete-valued variables and a time series model for the continuous variable $\ln(\Delta t_i)$, which gives
$$\ln(\Delta t_i) = \beta_0 + \beta_1 \ln(\Delta t_{i-1}) + \beta_2 S_{i-1} + \sigma \epsilon_i.$$

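The log transformation is used to ensure that durations are positive. Due to the concentration of $N_i$ at 0, we partition the model for $N_i$ into two parts. The first part is
$$p(N_i = 0 | \Delta t_i, F_{i-1}) = \text{logit}[\alpha_0 + \alpha_1 \ln(\Delta t_i)]$$

where $\text{logit}(x) = e^x/(1+e^x)$, whereas the second part of the model is
$$N_i | (N_i > 0, \Delta t_i, F_{i-1}) \sim 1 + g(\lambda_i), \qquad \lambda_i = \frac{e^{\gamma_0 + \gamma_1 \ln(\Delta t_i)}}{1 + e^{\gamma_0 + \gamma_1 \ln(\Delta t_i)}}$$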

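where $g(\lambda)$ denotes a geometric distribution with parameter $\lambda$, which lies in the interval (0,1). The model for the direction $D_i$ is
$$D_i | (N_i, \Delta t_i, F_{i-1}) = \text{sign}(\mu_i + \sigma_i \epsilon)$$

where $\epsilon$ is a N(0,1) random variable, and
$$\mu_i = \omega_0 + \omega_1 D_{i-1} + \omega_2 \ln(\Delta t_i),$$

$$\ln(\sigma_i) = \beta |D_{i-1} + D_{i-2} + D_{i-3} + D_{i-4}|$$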
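The count and direction components above can be sketched in a few lines. Two assumptions are worth flagging: the geometric distribution is taken to have probability $\lambda(1-\lambda)^m$ for $m = 0, 1, 2, ...$, so that $1 + g(\lambda)$ counts trials up to the first success, and all parameter values and function names are made up for illustration. Since $\epsilon$ is standard normal, the direction model implies $P(D_i = 1) = \Phi(\mu_i/\sigma_i)$, which `prob_up` computes.

```python
import math
import random

def logistic(x):
    """e^x / (1 + e^x), the function the text calls logit."""
    return 1.0 / (1.0 + math.exp(-x))

def sample_n(duration, alpha, gamma, rng):
    """Draw N_i from the two-part model: a point mass at 0, else 1 + geometric."""
    p_zero = logistic(alpha[0] + alpha[1] * math.log(duration))
    if rng.random() < p_zero:
        return 0
    lam = logistic(gamma[0] + gamma[1] * math.log(duration))
    n = 1                      # 1 + g(lam): trials until the first success
    while rng.random() >= lam:
        n += 1
    return n

def prob_up(d_lags, duration, omega, beta):
    """P(D_i = +1) given d_lags = (D_{i-1}, D_{i-2}, D_{i-3}, D_{i-4})."""
    mu = omega[0] + omega[1] * d_lags[0] + omega[2] * math.log(duration)
    sigma = math.exp(beta * abs(sum(d_lags)))
    return 0.5 * (1.0 + math.erf(mu / (sigma * math.sqrt(2.0))))

rng = random.Random(0)
# A duration of 1 gives ln(duration) = 0, so p_zero = lam = 0.5 here.
draws = [sample_n(1.0, (0.0, 0.5), (0.0, 0.5), rng) for _ in range(20000)]
frac_zero = sum(1 for n in draws if n == 0) / len(draws)
positives = [n for n in draws if n > 0]
mean_positive = sum(positives) / len(positives)

p_neutral = prob_up((1, -1, 1, -1), 1.0, (0.0, 0.0, 0.0), 0.3)
p_reversal = prob_up((1, 1, 1, 1), 1.0, (0.0, -0.5, 0.0), 0.3)
print(frac_zero, mean_positive, p_neutral, p_reversal)
```

A negative coefficient on $D_{i-1}$, as in the hypothetical `p_reversal` call, makes a price increase less likely right after a previous increase.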

To allow for different dynamics between positive and negative price movements, we use different models for the size of a price change.
$$S_i | (D_i = -1, N_i, \Delta t_i, F_{i-1}) \sim p(\lambda_{d,i}) + 1, \qquad \ln(\lambda_{d,i}) = \eta_{d,0} + \eta_{d,1} N_i + \eta_{d,2} \ln(\Delta t_i) + \eta_{d,3} S_{i-1}$$

$$S_i | (D_i = 1, N_i, \Delta t_i, F_{i-1}) \sim p(\lambda_{u,i}) + 1, \qquad \ln(\lambda_{u,i}) = \eta_{u,0} + \eta_{u,1} N_i + \eta_{u,2} \ln(\Delta t_i) + \eta_{u,3} S_{i-1}$$

where $p(\lambda)$ denotes a Poisson distribution with parameter $\lambda$, and 1 is added to the size because the minimum size is 1 tick when there is a price change. Estimation can be done either by maximum likelihood or MCMC methods.
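A minimal sketch of drawing the size of a price change from this specification is given below. The Poisson sampler uses Knuth's multiplication method, which is adequate for the small rates typical of tick sizes; the coefficient values and function names are assumptions for illustration only.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_size(direction, n_i, duration, s_prev, eta_down, eta_up, rng):
    """Draw S_i = 1 + Poisson(lambda), with separate coefficients for up and down moves."""
    eta = eta_up if direction == 1 else eta_down
    log_lam = eta[0] + eta[1] * n_i + eta[2] * math.log(duration) + eta[3] * s_prev
    return 1 + sample_poisson(math.exp(log_lam), rng)

rng = random.Random(0)
eta_down = (0.0, 0.0, 0.0, 0.0)   # hypothetical: lambda = 1 for down moves
eta_up = (-1.0, 0.0, 0.0, 0.0)    # hypothetical: lambda = exp(-1) for up moves
down_sizes = [sample_size(-1, 0, 1.0, 1, eta_down, eta_up, rng) for _ in range(20000)]
up_sizes = [sample_size(1, 0, 1.0, 1, eta_down, eta_up, rng) for _ in range(20000)]
mean_down = sum(down_sizes) / len(down_sizes)
mean_up = sum(up_sizes) / len(up_sizes)
print(mean_down, mean_up)
```

With these made-up coefficients the expected sizes are $1 + 1 = 2$ ticks for down moves and $1 + e^{-1}$ ticks for up moves, which shows how the two equations allow asymmetric dynamics.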

Left out sections: 5.6
