Factor analysis is a well known statistical method to describe the variability among observed variables in terms of a smaller number of unobserved latent variables called factors. While dealing with multivariate time series, the temporal correlation structure of data may be modeled by including correlations in latent factors, but a crucial choice is the covariance function to be implemented. We show that analyzing multivariate time series in terms of latent Gaussian processes, which are mutually independent but with each of them being characterized by exponentially decaying temporal correlations, leads to an efficient implementation of the expectation–maximization algorithm for the maximum likelihood estimation of parameters, due to the properties of block-tridiagonal matrices. The proposed approach solves an ambiguity known as the identifiability problem, which renders the solution of factor analysis determined only up to an orthogonal transformation. Samples with just two temporal points are sufficient for the parameter estimation: hence the proposed approach may be applied even in the absence of prior information about the correlation structure of latent variables by fitting the model to pairs of points with varying time delay. Our modeling allows one to make predictions of the future values of time series and we illustrate our method by applying it to an analysis of published gene expression data from cell culture HeLa.

Inverse problem for multivariate time series using dynamical latent variables

Zamparo, Marco;
2012-01-01

Abstract

Factor analysis is a well known statistical method to describe the variability among observed variables in terms of a smaller number of unobserved latent variables called factors. While dealing with multivariate time series, the temporal correlation structure of data may be modeled by including correlations in latent factors, but a crucial choice is the covariance function to be implemented. We show that analyzing multivariate time series in terms of latent Gaussian processes, which are mutually independent but with each of them being characterized by exponentially decaying temporal correlations, leads to an efficient implementation of the expectation–maximization algorithm for the maximum likelihood estimation of parameters, due to the properties of block-tridiagonal matrices. The proposed approach solves an ambiguity known as the identifiability problem, which renders the solution of factor analysis determined only up to an orthogonal transformation. Samples with just two temporal points are sufficient for the parameter estimation: hence the proposed approach may be applied even in the absence of prior information about the correlation structure of latent variables by fitting the model to pairs of points with varying time delay. Our modeling allows one to make predictions of the future values of time series and we illustrate our method by applying it to an analysis of published gene expression data from cell culture HeLa.
File in questo prodotto:
File Dimensione Formato  
physica_zamparo.pdf

file disponibile solo agli amministratori

Licenza: DRM non definito
Dimensione 709.05 kB
Formato Adobe PDF
709.05 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11579/173843
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 2
  • ???jsp.display-item.citation.isi??? 1
social impact