pymc-bayesian-modeling

Bayesian modeling with PyMC: build hierarchical models, run MCMC (NUTS) and variational inference, compare models with LOO/WAIC, and perform posterior checks for probabilistic programming and inference.

Install


Download and extract to your skills directory

Or copy the command below and send it to OpenClaw to auto-install:

Download and install this skill https://openskills.cc/api/download?slug=k-dense-ai-scientific-skills-pymc&locale=en&source=copy

PyMC Bayesian Modeling Skills - Python Probabilistic Programming and Inference Tools

Skill Overview


Use PyMC for Bayesian modeling and probabilistic programming, supporting hierarchical models, MCMC sampling (NUTS), variational inference, model comparison, and a complete diagnostic workflow.

Applicable Scenarios

1. Small-sample Data and Uncertainty Quantification


When data are limited or predictive uncertainty must be quantified, Bayesian methods hold advantages over traditional frequentist statistics. They apply in medical research, the social sciences, A/B testing, and similar settings where combining prior information with data yields more robust estimates.

2. Hierarchical Structure and Multilevel Data Analysis


Handle nested data structures such as student-class-school, patient-hospital, and repeated-measures designs. The skill provides non-centered parameterization templates that avoid sampling divergences and effectively estimate between-group and within-group variation.

3. Complex Models and Model Selection


Build various model types including linear regression, logistic regression, Poisson regression, time series (AR models), and more. Compare models using LOO/WAIC information criteria, support model averaging, and automate prior predictive checks, fit diagnostics, and posterior predictive checks across the full workflow.

Core Features

1. Modern Bayesian Modeling Workflow


Built on the PyMC 5.x+ API, using named dimensions (dims) instead of shape to improve code readability. Full standard workflow: data standardization → prior predictive checks → MCMC fitting → diagnostics (R-hat, ESS, divergences) → posterior predictive checks → prediction and inference.

2. Sampling Diagnostics and Troubleshooting


Automated diagnostic scripts check convergence (R-hat < 1.01), effective sample size (ESS > 400), and divergences, and suggest targeted fixes: increasing the target_accept parameter, switching to non-centered parameterization, initializing with ADVI, etc. Variational inference (ADVI) is also supported for quick exploration and large-scale models.

3. Model Comparison and Distribution Guidance


Uses ArviZ for LOO/WAIC model comparison and automatically checks Pareto-k values for reliability. Also provides a complete guide to distribution choices: priors (HalfNormal or Exponential for scale parameters; Beta for probabilities; LKJCorr for correlation matrices) and likelihoods (Normal, StudentT, Poisson, NegativeBinomial, etc.).

Frequently Asked Questions

What types of data analysis is PyMC suitable for?


PyMC is suitable for scenarios that require uncertainty quantification, have limited data, or have hierarchical structure. Typical applications include Bayesian regression (linear/logistic/Poisson), hierarchical models, time series forecasting, missing-data imputation, A/B testing, and more. When confidence intervals from traditional methods are insufficient or prior knowledge needs to be incorporated, Bayesian methods are advantageous.

How to resolve divergences in Bayesian sampling?


Divergences usually indicate problematic posterior geometry or an inappropriate step size. Solutions include: 1) increasing target_accept to 0.95–0.99; 2) using non-centered parameterization for hierarchical models; 3) standardizing predictors; 4) initializing with ADVI; 5) adding stronger prior constraints. The diagnostic report automatically counts divergences and offers recommendations.

How to choose between LOO and WAIC for model comparison?


Both estimate out-of-sample predictive error. LOO (leave-one-out cross-validation) is more accurate but computationally intensive; WAIC is faster. Prefer LOO but check Pareto-k values: LOO is reliable when k < 0.7; if k > 0.7 consider WAIC or k-fold cross-validation. Δloo < 2 indicates models are similar; > 10 provides strong evidence in favor of the better model.