Computational Statistics for Bayesian Inference with PyMC3¶
This series of notebooks and material is being put together by Dr. Srijith Rajamohan, with the introductory lectures on the foundations of probability and Bayes' Theorem offered by Dr. Robert Settlage. The purpose of this series is to teach the basics of Bayesian statistics for performing inference. It is not intended to be a comprehensive course on the basics of statistics and probability, nor does it cover Frequentist statistical techniques based on Null Hypothesis Significance Testing (NHST). What it does cover is:
The basics of Bayesian probability
Understanding Bayesian inference and how it works
The bare-minimum set of tools and a body of knowledge required to perform Bayesian inference in Python, i.e. the PyData stack of NumPy, Pandas, SciPy, Matplotlib, Seaborn and Plot.ly
A scalable Python-based framework for performing Bayesian inference, i.e. PyMC3
With this goal in mind, the content is divided into the following three main sections (courses).
Introduction to Bayesian Statistics
Introduction to Monte Carlo Methods
PyMC3 for Bayesian Modeling and Inference
Please read the section titled ‘The What, Why and Whom…’.
Note
This work draws a lot of inspiration from some brilliant people in this field; their names will be listed here as this work is developed.
PyMC3 can be found here.
Course Outline¶
Course 1 - Introduction to Bayesian Statistics¶
The Foundations of Probability
Distributions, Central Tendencies and Shape Parameters
Parameter Estimation
Introduction to the Bayes Theorem
Inference and Decisions
Bayesian and Frequentist Approaches
Distributions
Data Generation and Parameter Estimation
Gaussian Mixture Models
Information Criterion
Non-parametric Methods and Kernel Density Estimation
Introduction to Sampling
Sampling from Discrete Distributions
Inverse Transform Method
Rejection Sampling Method
Importance Sampling Method
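To give a flavor of the sampling topics listed above, the Inverse Transform Method can be sketched in a few lines of NumPy. This is a minimal illustration (the exponential target and the sample size are chosen here for demonstration, not taken from the course notebooks): draw u from Uniform(0, 1) and push it through the inverse CDF of the target distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def inverse_transform_exponential(lam, n):
    """Sample from Exponential(lam) via the inverse CDF.

    CDF: F(x) = 1 - exp(-lam * x), so F^{-1}(u) = -ln(1 - u) / lam.
    """
    u = rng.uniform(size=n)          # u ~ Uniform(0, 1)
    return -np.log(1.0 - u) / lam    # apply the inverse CDF

samples = inverse_transform_exponential(lam=2.0, n=100_000)
print(samples.mean())  # close to the true mean 1 / lam = 0.5
```

The same recipe works for any distribution whose CDF can be inverted in closed form; when it cannot, the rejection and importance sampling methods above take over.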
Course 2 - Introduction to Monte Carlo Methods¶
R2 and Explained Variance
Underfitting vs. Overfitting, Simplicity vs. Accuracy
Cross Validation
Log-likelihood and Deviance
AIC and WAIC
Entropy
KL Divergence
Model Averaging
Stationarity and Ergodicity
Building blocks - Markov Chains
Building blocks - Why does it work?
Foundations of Bayesian Inference
Outline of the Metropolis Algorithm
Building the Inferred Distribution
Python Code for the Metropolis Algorithm
Introduction to the Metropolis-Hastings Algorithm
Introduction to Gibbs Sampling
Details of the Gibbs Sampling Algorithm - Part 1
Details of the Gibbs Sampling Algorithm - Part 2
Hamiltonian Monte Carlo
Characteristics of MCMC
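The Metropolis Algorithm outlined in this course can be sketched compactly in NumPy. This is a hedged illustration, not the course's own implementation: the target (a standard normal, known only up to its log density), the step size, and the chain length are all placeholder choices.

```python
import numpy as np

rng = np.random.default_rng(42)

def log_target(x):
    # Unnormalized log density of a standard normal target
    return -0.5 * x * x

def metropolis(n_samples, step=1.0, x0=0.0):
    """Random-walk Metropolis: propose a local move, accept or reject."""
    samples = np.empty(n_samples)
    x = x0
    for i in range(n_samples):
        proposal = x + step * rng.normal()
        # Accept with probability min(1, p(proposal) / p(x)),
        # computed in log space for numerical stability
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x = proposal
        samples[i] = x  # on rejection, the current state is repeated
    return samples

chain = metropolis(50_000)
print(chain.mean(), chain.std())  # roughly 0 and 1 for this target
```

Because only ratios of the target density appear, the normalizing constant is never needed, which is exactly what makes the method useful for Bayesian posteriors.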
Course 3 - PyMC3 for Bayesian Modeling and Inference¶
Introduction to PyMC3
Introduction to PyMC3 with Linear Regression
Introduction to PyMC3 - Traces
Composition of Distributions for Uncertainty
Highest Posterior Density and Region of Practical Equivalence
Credible Intervals and Confidence Intervals
Modeling with a Gaussian Distribution
Using PyMC3 to Model a Phenomenon with a Gaussian distribution
Posterior Predictive Checks
Robust Models with a Student’s t-Distribution
Hierarchical/Multilevel Models
Hierarchical Models - Shrinkage
Linear Regression
Mean-centering for Linear Regression
Hierarchical Linear Regression
Polynomial Regression for Non-linear Data
Multiple Linear Regression
Logistic Regression
Multiple Logistic Regression
Multiclass Classification
Inferring Rate Change with a Poisson Distribution
Tuning
Mixing and Potential Scale Reduction Factor
Centered vs. Non-centered Parameterization
Autocorrelation and Effective Sample Size
Monte Carlo Error
Divergence
Revisiting the Multiclass Classification problem
PyMC3 metrics
Diagnosing and Debugging MCMC with PyMC3
ArviZ Data Representation
PyMC3 COVID project