Skip to contents

For HMMs, pseudo-residuals are used to assess the goodness-of-fit of the model. These are based on the cumulative distribution function (CDF) $$F_{X_t}(x_t) = F(x_t \mid x_1, \dots, x_{t-1}, x_{t+1}, \dots, x_T)$$ and can be used to quantify whether an observation is extreme relative to its model-implied distribution.

This function calculates such residuals via probability integral transform, based on the local state probabilities obtained by stateprobs or stateprobs_g and the respective parametric family.

Usage

pseudo_res(
  obs,
  dist,
  par,
  stateprobs = NULL,
  mod = NULL,
  normal = TRUE,
  discrete = NULL,
  randomise = TRUE,
  seed = NULL
)

Arguments

obs

vector of continuous-valued observations (of length n)

dist

character string specifying which parametric CDF to use (e.g., "norm" for normal or "pois" for Poisson)

par

named parameter list for the parametric CDF

Names need to correspond to the parameter names in the specified distribution (e.g. list(mean = c(1,2), sd = c(1,1)) for a normal distribution and 2 states). This argument is as flexible as the parametric distribution allows. For example you can have a matrix of parameters with one row for each observation and one column for each state.

stateprobs

matrix of local state probabilities for each observation (of dimension c(n,N), where N is the number of states) as computed by stateprobs, stateprobs_g or stateprobs_p

mod

optional model object containing initial distribution delta, transition probability matrix Gamma, matrix of state-dependent probabilities allprobs, and potentially a trackID variable

If you are using automatic differentiation either with RTMB::MakeADFun or qreml and include forward, forward_g or forward_p in your likelihood function, the objects needed for state decoding are automatically reported after model fitting. Hence, you can pass the model object obtained from running RTMB::report() or from qreml directly to this function and avoid calculating local state proabilities manually. In this case, a call should look like pseudo_res(obs, "norm", par, mod = mod).

normal

logical, if TRUE, returns Gaussian pseudo residuals

These will be approximately standard normally distributed if the model is correct.

discrete

logical, if TRUE, computes discrete pseudo residuals (which slightly differ in their definition)

By default, will be determined using dist argument, but only works for standard discrete distributions. When used with a special discrete distribution, set to TRUE manually. See pseudo_res_discrete for details.

randomise

for discrete pseudo residuals only. Logical, if TRUE, return randomised pseudo residuals. Recommended for discrete observations.

seed

for discrete pseudo residuals only. Integer, seed for random number generation

Value

vector of pseudo residuals

Details

When used for discrete pseudo-residuals, this function is just a wrapper for pseudo_res_discrete.

Examples

## continuous-valued observations
obs = rnorm(100)
stateprobs = matrix(0.5, nrow = 100, ncol = 2)
par = list(mean = c(1,2), sd = c(1,1))
pres = pseudo_res(obs, "norm", par, stateprobs)

## discrete-valued observations
obs = rpois(100, lambda = 1)
stateprobs = matrix(0.5, nrow = 100, ncol = 2)
par = list(lambda = c(1,2))
pres = pseudo_res(obs, "pois", par, stateprobs)
#> Discrete pseudo-residuals are calculated
#> Randomised between lower and upper