Conducting Time Series Bayesian Analysis using PyMC

Charles Copley
3 min readJun 16, 2023

In the field of data analysis, we often encounter time series data. From stock prices and weather forecasts to customer demand, these sequential data points indexed in time order carry significant information for various analytical tasks.

Bayesian analysis is a powerful statistical technique that allows us to express uncertainty about our models. With the help of Bayesian analysis, we can build and refine our models as more data becomes available. The combination of Bayesian analysis with time series can yield potent insights.

In this blog post, we will walk through a simple example of performing a Bayesian analysis on time series data using PyMC, a Python library for probabilistic programming.

Generate Time Series Data

For simplicity, let’s generate some synthetic time series data. We will generate a sinusoidal function with some Gaussian noise.

import numpy as np
import matplotlib.pyplot as plt
import pymc as pm
n_data_points = 100 # number of data points
x = np.linspace(0, 10, n_data_points)
y = np.sin(x) + np.random.normal(0, 0.2, size=n_data_points)

plt.plot(x, y, 'o')
plt.title('Generated time series data')

Building the Bayesian Model

Now that we have our time series data, let’s proceed with the Bayesian analysis. We’ll use a simple model here, where we’re fitting a sinusoidal function plus some Gaussian noise.

with pm.Model() as model:
# Priors
amplitude = pm.HalfNormal('amplitude', sigma=1)
frequency = pm.Normal('frequency', mu=1, sigma=1)
phase = pm.Uniform('phase', lower=0, upper=np.pi)
noise_sd = pm.HalfNormal('noise_sd', sigma=1)

# Expected value
y_hat = amplitude * pm.math.sin(2 * np.pi * frequency * x + phase)

# Likelihood
y_obs = pm.Normal('y_obs', mu=y_hat, sigma=noise_sd, observed=y)

# Sampling
trace = pm.sample(2000, tune=1000)

We have defined our priors and likelihood. We then sample from the posterior distribution of the parameters using Markov Chain Monte Carlo (MCMC).

Model Diagnostics

PyMC provides built-in functions for model diagnostics and posterior analysis. We can plot the trace of our parameters and visualize the posterior distributions.



Plotting our estimates

# Get posterior samples
amplitude_samples = trace.posterior['amplitude'].values
frequency_samples = trace.posterior['frequency'].values
phase_samples = trace.posterior['phase'].values

# Add a new axis to x
x_new = x[None, None, :]

# Generate predicted values
y_hat = amplitude_samples[..., None] * np.sin(2 * np.pi * frequency_samples[..., None] * x_new + phase_samples[..., None])

# Compute mean and standard deviation along the first two axes (chains and samples)
y_hat_mean = y_hat.mean(axis=(0, 1))
y_hat_std = y_hat.std(axis=(0, 1))
# Plotting
plt.figure(figsize=(10, 5))
plt.plot(x, y, 'o', label='Observed data')
plt.plot(x, y_hat_mean, color='C1', label='Predicted value')
plt.fill_between(x, y_hat_mean - 2*y_hat_std, y_hat_mean + 2*y_hat_std, color='C1', alpha=0.3, label='Confidence interval')
plt.title('Time Series Bayesian Analysis with PyMC')

This simple example demonstrated how we can utilize Bayesian analysis to understand time series data. By specifying our beliefs through priors and updating these beliefs based on observed data, Bayesian analysis provides a powerful framework for learning from data. With PyMC, the process becomes much easier and more intuitive, allowing us to focus more on model building and interpretation.

Remember, the quality of the Bayesian analysis highly depends on the chosen priors and the model. Hence, having domain knowledge can lead to more sensible priors and yield more accurate and interpretable results.

The field of Bayesian time series analysis is vast and offers many exciting opportunities. With the right tools and knowledge, you can start uncovering deep insights from your time series data today. Happy Bayesian modeling!