Forward algorithm for hidden semi-Markov models with periodically varying transition probability matrices

Hidden semi-Markov models (HSMMs) are a flexible extension of HMMs that can be approximated by HMMs on an enlarged state space (of size \(M\)) and with structured transition probabilities. Recently, this inference procedure has been generalised to allow either the dwell-time distributions or the conditional transition probabilities to depend on external covariates such as the time of day. This special case is implemented here. This function allows for that, by expecting a transition probability matrix for each time point in a period, and an integer valued (\(1, \dots, L\)) time variable that maps the data index to the according time.

Usage

forward_sp(delta, Gamma, allprobs, sizes, tod)

Arguments

delta

initial or stationary distribution of length N, or matrix of dimension c(k,N) for k independent tracks, if trackID is provided

Gamma

array of transition probability matrices of dimension c(M,M,L).

Here we use the definition \(\Pr(S_t=j \mid S_{t-1}=i) = \gamma_{ij}^{(t)}\) such that the transition probabilities between time point \(t-1\) and \(t\) are an element of \(\Gamma^{(t)}\).

allprobs

matrix of state-dependent probabilities/ density values of dimension c(n, N) which will automatically be converted to the appropriate dimension.

sizes

state aggregate sizes that are used for the approximation of the semi-Markov chain.

tod

(Integer valued) variable for cycle indexing in 1, ..., L, mapping the data index to a generalised time of day (length n). For half-hourly data L = 48. It could, however, also be day of year for daily data and L = 365.

Value

log-likelihood for given data and parameters

Examples

## generating data from homogeneous 2-state HSMM
mu = c(0, 6)
beta = matrix(c(log(4),log(6),-0.2,0.2,-0.1,0.4), nrow=2)
# time varying mean dwell time
Lambda = exp(cbind(1, trigBasisExp(1:24, 24, 1))%*%t(beta))
omega = matrix(c(0,1,1,0), nrow = 2, byrow = TRUE)
# simulation
# for a 2-state HSMM the embedded chain always alternates between 1 and 2
s = rep(1:2, 100)
C = x = numeric(0)
tod = rep(1:24, 50) # time of day variable
time = 1
for(t in 1:100){
  dt = rpois(1, Lambda[tod[time], s[t]])+1 # dwell time depending on time of day
  time = time + dt
  C = c(C, rep(s[t], dt))
  x = c(x, rnorm(dt, mu[s[t]], 1.5)) # fixed sd 2 for both states
}
tod = tod[1:length(x)]

## negative log likelihood function
mllk = function(theta.star, x, sizes, tod){
  # parameter transformations for unconstraint optimization
  omega = matrix(c(0,1,1,0), nrow = 2, byrow = TRUE) # omega fixed (2-states)
  mu = theta.star[1:2]
  sigma = exp(theta.star[3:4])
  beta = matrix(theta.star[5:10], nrow=2)
  # time varying mean dwell time
  Lambda = exp(cbind(1, trigBasisExp(1:24, 24, 1))%*%t(beta))
  dm = list()
  for(j in 1:2){
    dm[[j]] = sapply(1:sizes[j]-1, dpois, lambda = Lambda[,j])
  }
  Gamma = tpm_phsmm2(omega, dm)
  delta = stationary_p(Gamma, tod[1])
  # calculate all state-dependent probabilities
  allprobs = matrix(1, length(x), 2)
  for(j in 1:2){ allprobs[,j] = dnorm(x, mu[j], sigma[j]) }
  # return negative for minimization
  -forward_sp(delta, Gamma, allprobs, sizes, tod)
}

## fitting an HSMM to the data
theta.star = c(1, 4, log(2), log(2), # state-dependent parameters
                 log(4), log(6), rep(0,4)) # state process parameters dm
# mod = nlm(mllk, theta.star, x = x, sizes = c(10, 15), tod = tod, stepmax = 5)