Using PyMC for A/B testing experiments

Charles Copley
Jun 16, 2023

Let’s consider a hypothetical case where a company is testing two different website designs: Design A and Design B, and wants to determine which one leads to more user conversions.

Firstly, we collect our data. For simplicity’s sake, let’s say 1,000 users are shown Design A, with 150 users converting (taking the desired action on the website). Meanwhile, 1,300 users are shown Design B, with 210 users converting.
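As a quick sanity check before any modeling, the raw observed conversion rates can be computed directly (plain Python, nothing PyMC-specific):

```python
# Observed counts from the experiment
n_A, conv_A = 1000, 150   # users shown Design A, and conversions
n_B, conv_B = 1300, 210   # users shown Design B, and conversions

rate_A = conv_A / n_A   # 0.15
rate_B = conv_B / n_B   # ~0.1615

print(f"Design A: {rate_A:.2%}, Design B: {rate_B:.2%}")
```

Design B's raw rate is about one percentage point higher; the Bayesian analysis below tells us how confident we should be that this gap is real rather than noise.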

We want to use Bayesian analysis to compare the conversion rates of the two designs.

import pymc as pm

# Observed data
n_A = 1000      # users shown Design A
n_B = 1300      # users shown Design B
conv_A = 150    # conversions for Design A
conv_B = 210    # conversions for Design B

with pm.Model() as model:
    # Prior distributions for the conversion probabilities p_A and p_B
    p_A = pm.Beta('p_A', alpha=2, beta=2)
    p_B = pm.Beta('p_B', alpha=2, beta=2)

    # Deterministic delta to track the difference between p_A and p_B
    delta = pm.Deterministic('delta', p_A - p_B)

    # Observed conversions are modeled as Binomial outcomes
    obs_A = pm.Binomial('obs_A', n=n_A, p=p_A, observed=conv_A)
    obs_B = pm.Binomial('obs_B', n=n_B, p=p_B, observed=conv_B)

    # Perform Markov Chain Monte Carlo sampling
    trace = pm.sample(draws=2000, tune=1000, cores=2)

The prior distributions for p_A and p_B are modeled as Beta distributions. We've chosen alpha=2 and beta=2 because it yields a prior that is close to uniform while gently down-weighting, without ruling out, extreme values of p_A and p_B.

delta is declared as a pm.Deterministic so that the difference p_A - p_B is recorded in the trace alongside the sampled variables.

The observed data is modeled as a Binomial distribution, where n_A and n_B are the number of trials (the number of users shown each design), and p_A and p_B are the unknown true probabilities of conversion for each design.

Finally, we perform MCMC sampling over this model, to infer the posterior distributions of p_A, p_B, and delta.
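As an aside, because the Beta prior is conjugate to the Binomial likelihood, the posteriors in this particular model are available in closed form: p_A ~ Beta(2 + 150, 2 + 850) and p_B ~ Beta(2 + 210, 2 + 1090). A minimal NumPy sketch (no PyMC needed) that can serve as a cross-check on the MCMC result:

```python
import numpy as np

rng = np.random.default_rng(42)

# Conjugate posteriors: Beta(alpha + conversions, beta + non-conversions)
post_A = rng.beta(2 + 150, 2 + 1000 - 150, size=100_000)
post_B = rng.beta(2 + 210, 2 + 1300 - 210, size=100_000)

print(f"posterior mean p_A: {post_A.mean():.4f}")  # close to 152/1004 ≈ 0.1514
print(f"posterior mean p_B: {post_B.mean():.4f}")  # close to 212/1304 ≈ 0.1626
```

The MCMC posterior means for p_A and p_B should land very close to these analytic values; MCMC earns its keep when the model grows beyond conjugate cases (hierarchies, covariates, non-conjugate priors).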

We can visualize and examine the results using PyMC’s built-in functions:

import arviz as az

az.plot_forest(trace, kind='ridgeplot', var_names=['p_A', 'p_B', 'delta'], combined=True)

The forest (ridge) plot displays the posterior distributions for p_A, p_B, and delta; ArviZ's plot_posterior offers a similar per-variable view. This gives us a probabilistic understanding of the conversion rates of Design A and Design B, and of their difference.

For instance, if most of the delta distribution lies below zero (as we see here), we can be reasonably confident that Design B has the higher conversion rate. If delta spans both positive and negative values with substantial mass on each side, the data do not clearly favor either design.
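A single number often wanted in practice is the posterior probability that Design B beats Design A, i.e. P(delta < 0), which with a PyMC trace could be read off the posterior samples of delta directly. A self-contained sketch of that decision summary using the conjugate Beta posteriors (Beta(2 + conversions, 2 + non-conversions), matching the model above):

```python
import numpy as np

rng = np.random.default_rng(0)
draws = 200_000

# Monte Carlo draws from the conjugate Beta posteriors
p_A = rng.beta(2 + 150, 2 + 850, size=draws)
p_B = rng.beta(2 + 210, 2 + 1090, size=draws)
delta = p_A - p_B

# Probability that B's true conversion rate exceeds A's
prob_B_better = (delta < 0).mean()
print(f"P(Design B converts better): {prob_B_better:.3f}")

# Expected loss (in conversion-rate points) of shipping B if A is in fact better
expected_loss_B = np.maximum(delta, 0).mean()
print(f"Expected loss of shipping B: {expected_loss_B:.4f}")
```

The expected-loss summary is a common Bayesian A/B stopping criterion: ship B once the expected loss of doing so falls below a threshold you care about (say, 0.1 percentage points).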

In conclusion, Bayesian A/B testing powered by PyMC offers a robust and flexible framework for analyzing experiments and making data-driven decisions. Instead of merely providing a point estimate or a binary outcome, it provides a full probability distribution over the parameters of interest. This allows us to quantify the uncertainty associated with our estimates, thus giving a richer understanding of the results.

In our example, the posterior distributions of p_A, p_B, and delta not only give insights into the conversion rates of Design A and Design B, but also their comparative efficacy. By visualizing these distributions, we gain a nuanced view of the conversion dynamics, a step-up from traditional A/B testing methods.

As companies and data scientists strive to become increasingly data-driven, Bayesian methods facilitated by tools like PyMC are set to become even more critical. Whether it’s conversion rate optimization, customer behavior analysis, or any other realm where uncertainty reigns, Bayesian analysis enables us to tackle these challenges head-on.

This flexibility, combined with Python’s simplicity, makes PyMC an invaluable tool for anyone looking to harness the power of Bayesian analysis. So why wait? Start your Bayesian journey today, and unlock a new level of understanding from your data.