Navigating Bayesian Analysis with PyMC

Charles Copley
2 min read · Jun 16, 2023


Have you ever wondered how you could use probability distributions to understand uncertain events or predict future outcomes? Bayesian analysis provides a powerful framework for doing just this, and PyMC is a Python library that makes implementing Bayesian analysis straightforward and efficient.

First, it’s important to understand that Bayesian analysis is all about updating your prior beliefs with new data to obtain a posterior distribution. The power of PyMC lies in its ability to perform this complex process using Markov Chain Monte Carlo (MCMC) methods.
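In symbols, the posterior is the prior reweighted by how well each candidate parameter value explains the data:

P(p | data) = P(data | p) × P(p) / P(data)

In other words, posterior ∝ likelihood × prior. PyMC's job is to approximate this posterior for us when it cannot be worked out by hand.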

Let’s start with installing PyMC. You can do this by using pip:
pip install pymc

Let’s demonstrate a simple use-case of Bayesian analysis with PyMC using the example of a coin toss.

import pymc as pm

# Observations: 60 Heads, 40 Tails
observations = [1]*60 + [0]*40

with pm.Model() as model:
    # Prior
    p = pm.Uniform('p', lower=0, upper=1)

    # Likelihood
    obs = pm.Bernoulli('obs', p, observed=observations)

    # Inference
    trace = pm.sample(2000, tune=1000, target_accept=0.95)

In the code above, we first import PyMC. Our observations are represented as a list with ‘1’ for heads and ‘0’ for tails, which we’ve observed 60 and 40 times respectively.

Within the model context (with pm.Model() as model:), we first define our prior distribution. In this case, we don't have any strong belief about the fairness of the coin, so we use a Uniform distribution ('p') over the interval [0,1].
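As a hypothetical variation (not part of the example above), a stronger prior belief that the coin is close to fair could be encoded with a Beta distribution instead:

# Hypothetical alternative prior: Beta(10, 10) concentrates belief around p = 0.5
p = pm.Beta('p', alpha=10, beta=10)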

The likelihood function is defined as a Bernoulli distribution, which is suitable for binary outcomes. We pass our observations to the observed parameter.

Finally, we infer the posterior distribution using MCMC sampling with pm.sample. Here, 2000 is the number of posterior draws kept per chain, tune=1000 specifies warm-up iterations that are discarded while the sampler adapts, and target_accept=0.95 asks the NUTS sampler to take smaller, more careful steps, which helps avoid divergences.
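As a quick sketch (assuming PyMC v4 or later, where pm.sample returns an ArviZ InferenceData object by default), the posterior draws for p can be pulled out directly:

# The posterior draws live in trace.posterior as an xarray Dataset
p_samples = trace.posterior['p']      # dimensions: (chain, draw)
print(p_samples.mean().item())        # posterior mean, roughly 0.60 for this data
print(p_samples.std().item())         # posterior standard deviation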

You can now analyze the sampling results with ArviZ, the exploratory-analysis library that PyMC uses for plotting and diagnostics:

import arviz as az

az.plot_trace(trace)
az.summary(trace).round(2)

The plot_trace function displays a kernel density estimate and the sampled values for each variable, while summary provides a tabular overview of the mean, standard deviation, and credible intervals for each parameter.
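Because a Uniform(0, 1) prior is the same as a Beta(1, 1), this particular model also has an exact answer we can check against. Here is a minimal sketch of that sanity check using SciPy:

from scipy import stats

# Conjugate update: Beta(1, 1) prior + 60 heads and 40 tails -> Beta(61, 41) posterior
posterior = stats.beta(1 + 60, 1 + 40)
print(posterior.mean())            # about 0.598, which the MCMC estimate should match
print(posterior.interval(0.94))    # central 94% credible interval

The MCMC summary above should agree with these numbers to within sampling error.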

This simple example demonstrates how Bayesian analysis with PyMC can provide robust probabilistic models, enhancing the inferential and predictive power of your analyses. The flexibility of PyMC’s syntax allows it to scale from simple examples like the one above to more complex, real-world problems. With PyMC, Bayesian analysis is accessible to anyone with a basic understanding of Python and probability theory. Happy Bayesian modeling!
