Interpretable time-series modelling using Gaussian processes

Submitted 2024-05-10 01:30:04

Welcome! If it wasn’t already obvious, I love time-series analysis, so we are going to dive deep into it here. Another thing I love is Gaussian processes (GPs), so why not combine the two? We’ll get into the details later, but GPs are an insanely powerful tool that can model an absurd range of data (both continuous and discrete) in an interpretable way, giving the modeller a large degree of control over how the technique learns from data. This makes GPs an invaluable tool in any applied statistician’s or machine learning practitioner’s toolkit.

It is important to remember that using a GP for time series converts the problem from one of modelling a generative process (as we would with, say, an ARIMA model) to what is essentially a curve-fitting exercise. GPs are a Bayesian method, so if you are unfamiliar with Bayesian inference, please go check out some resources on that, such as this great video by Richard McElreath. The key bits to remember for this post are that in Bayesian inference, we are interested in the posterior distribution, which is proportional to the product of the prior (our beliefs about an event before seeing the data) and the likelihood (the probability of the data given model parameters). Throughout this post, we are going to build a basic GP from scratch to highlight the intuition and mechanics underlying the modelling process.
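To give a flavour of the "from scratch" approach before we get into the details, here is a minimal sketch of GP regression as curve fitting: we place a prior over functions via a covariance (kernel) function, then condition on observed points to get a posterior mean and covariance. The RBF kernel, its hyperparameters, and the toy sine data are all illustrative choices of mine, not something prescribed by the post.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    # Squared-exponential (RBF) covariance between two 1-D point sets.
    sq_dists = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dists / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    # Condition the GP prior on the observed data to obtain the
    # posterior mean and covariance at the test points.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    K_inv = np.linalg.inv(K)  # fine for a toy example; use Cholesky in practice
    mean = K_s.T @ K_inv @ y_train
    cov = K_ss - K_s.T @ K_inv @ K_s
    return mean, cov

# Toy "time series": noiseless sine observations on a coarse grid.
x_train = np.linspace(0.0, 5.0, 10)
y_train = np.sin(x_train)
x_test = np.linspace(0.0, 5.0, 50)
mean, cov = gp_posterior(x_train, y_train, x_test)
```

The posterior mean interpolates the observations smoothly, and the diagonal of the posterior covariance quantifies uncertainty, shrinking near the observed points and growing between them. Interpretability and control come from the kernel choice, which we will explore later.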

In practice, we would most likely fit GPs using a purpose-built tool, such as coding one in the probabilistic programming language Stan to utilise Markov chain Monte Carlo for sampling, or using a dedicated GP package such as GPy, GauPro, or even TensorFlow Probability.
