pgmpy provides the building blocks for causal and probabilistic reasoning using graphical models. It implements data structures for a range of causal and graphical models such as DAGs, PDAGs, MAGs, PAGs, Bayesian Networks, Dynamic Bayesian Networks, and Structural Equation Models, along with algorithms for various tasks such as causal discovery, causal identification, causal and probabilistic inference, model validation, parameter estimation, simulations, and more.

Algorithms for each task follow a unified composable API, making them modular and extensible. They are also scikit-learn compatible when possible. They can be used directly, combined in sklearn pipelines, or used to build higher-level tools on top of them.

Documentation · Examples · Tutorials

Key Features

| Feature | Description |
| --- | --- |
| Causal Discovery / Structure Learning | Learn the model structure from data, with optional integration of expert knowledge. |
| Causal Validation | Assess how compatible the causal structure is with the data. |
| Parameter Learning | Estimate model parameters (e.g., conditional probability distributions) from observed data. |
| Probabilistic Inference | Compute posterior distributions conditioned on observed evidence. |
| Causal Inference | Compute interventional and counterfactual distributions using do-calculus. |
| Simulations | Generate synthetic data under specified evidence or interventions. |
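To make "probabilistic inference" concrete, here is a toy illustration of inference by enumeration on a two-node network. This is plain Python, not pgmpy's implementation (the variable names and probabilities are made up); pgmpy's inference engines generalize this idea to large networks.

```python
# Two-node network Rain -> WetGrass, with a prior P(Rain) and a
# conditional P(WetGrass | Rain). All numbers here are illustrative.
p_rain = {True: 0.2, False: 0.8}
p_wet_given_rain = {
    True: {True: 0.9, False: 0.1},
    False: {True: 0.2, False: 0.8},
}

def posterior_rain(wet_observed):
    """P(Rain | WetGrass = wet_observed) via Bayes' rule."""
    # Joint P(Rain = r, WetGrass = wet_observed) for each value of Rain.
    joint = {r: p_rain[r] * p_wet_given_rain[r][wet_observed] for r in (True, False)}
    # Normalize by the evidence probability P(WetGrass = wet_observed).
    z = sum(joint.values())
    return {r: joint[r] / z for r in joint}

post = posterior_rain(True)
# Rain becomes much more likely once wet grass is observed
# (posterior ~0.53 vs. prior 0.2 with these numbers).
```

pgmpy's inference classes answer the same kind of query on arbitrary networks without enumerating the full joint distribution.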


Quickstart

Installation

pgmpy is available on both PyPI and conda-forge. To install from PyPI, use:

pip install pgmpy

To install from conda-forge, use:

conda install conda-forge::pgmpy

Examples

Discrete Data

from pgmpy.example_models import load_model

# Load a Discrete Bayesian Network and simulate data.
discrete_bn = load_model("bnlearn/alarm")
alarm_df = discrete_bn.simulate(n_samples=100)

# Learn a network from simulated data.
from pgmpy.estimators import PC

dag = PC(data=alarm_df).estimate(ci_test="chi_square", return_type="dag")

# Learn the parameters from the data.
from pgmpy.models import DiscreteBayesianNetwork

discrete_bn = DiscreteBayesianNetwork(dag.edges())
discrete_bn.add_nodes_from(dag.nodes())
dag_fitted = discrete_bn.fit(alarm_df)
dag_fitted.get_cpds()

# Drop a column and predict using the learned model.
evidence_df = alarm_df.drop(columns=["FIO2"])
pred_FIO2 = dag_fitted.predict(evidence_df)
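With default settings, fitting a discrete network reduces to maximum-likelihood estimation: each conditional probability is a normalized co-occurrence count. The following is a plain-Python sketch of that idea on made-up data, not pgmpy's actual estimator code.

```python
from collections import Counter

# Toy data: (parent, child) observations for a node with one parent.
rows = [
    ("on", "high"), ("on", "high"), ("on", "low"),
    ("off", "low"), ("off", "low"), ("off", "high"),
]

pair_counts = Counter(rows)                  # counts of (parent, child) pairs
parent_counts = Counter(p for p, _ in rows)  # counts of each parent value

def cpd(parent, child):
    """Maximum-likelihood estimate of P(child | parent)."""
    return pair_counts[(parent, child)] / parent_counts[parent]

# e.g. cpd("on", "high") is 2/3: "high" occurred in 2 of the 3 rows
# where the parent was "on".
```

pgmpy's estimators do this per node over all parent configurations, with optional Bayesian smoothing to avoid zero counts.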

Linear Gaussian Data

from pgmpy.example_models import load_model

# Load an example Gaussian Bayesian Network and simulate data
gaussian_bn = load_model("bnlearn/ecoli70")
ecoli_df = gaussian_bn.simulate(n_samples=100)

# Learn the network from simulated data.
from pgmpy.estimators import PC

dag = PC(data=ecoli_df).estimate(ci_test="pearsonr", return_type="dag")

# Learn the parameters from the data.
from pgmpy.models import LinearGaussianBayesianNetwork

gaussian_bn = LinearGaussianBayesianNetwork(dag.edges())
dag_fitted = gaussian_bn.fit(ecoli_df)
dag_fitted.get_cpds()

# Drop a column and predict using the learned model.
evidence_df = ecoli_df.drop(columns=["ftsJ"])
pred_ftsJ = dag_fitted.predict(evidence_df)
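In a linear Gaussian network, each node is modeled as an intercept plus a weighted sum of its parents plus Gaussian noise, so fitting one CPD amounts to least squares per node. A minimal single-parent sketch in plain Python on synthetic data (an illustration of the idea, not pgmpy's implementation):

```python
import random

# Generate data from a known linear-Gaussian relationship:
# y = 0.5 + 1.5 * x + Normal(0, 0.3) noise.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
y = [0.5 + 1.5 * xi + random.gauss(0.0, 0.3) for xi in x]

# Closed-form ordinary least squares for the single-parent case.
n = len(x)
mx, my = sum(x) / n, sum(y) / n
beta = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
    (xi - mx) ** 2 for xi in x
)
intercept = my - beta * mx
# beta recovers ~1.5 and intercept ~0.5, up to sampling noise.
```

pgmpy solves the analogous multi-parent regression for every node when you call `fit` on a `LinearGaussianBayesianNetwork`.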

Mixture Data with Arbitrary Relationships

from pgmpy.global_vars import config

config.set_backend("torch")

import pyro.distributions as dist

from pgmpy.models import FunctionalBayesianNetwork
from pgmpy.factors.hybrid import FunctionalCPD

# Create a Bayesian Network with mixture of discrete and continuous variables.
func_bn = FunctionalBayesianNetwork(
    [
        ("x1", "w"),
        ("x2", "w"),
        ("x1", "y"),
        ("x2", "y"),
        ("w", "y"),
        ("y", "z"),
        ("w", "z"),
        ("y", "c"),
        ("w", "c"),
    ]
)

# Define the Functional CPDs for each node and add them to the model.
cpd_x1 = FunctionalCPD("x1", fn=lambda _: dist.Normal(0.0, 1.0))
cpd_x2 = FunctionalCPD("x2", fn=lambda _: dist.Normal(0.5, 1.2))

# Continuous mediator: w = 0.7*x1 - 0.3*x2 + ε
cpd_w = FunctionalCPD(
    "w",
    fn=lambda parents: dist.Normal(0.7 * parents["x1"] - 0.3 * parents["x2"], 0.5),
    parents=["x1", "x2"],
)

# Bernoulli target with logistic link: y ~ Bernoulli(sigmoid(-0.7 + 1.5*x1 + 0.8*x2 + 1.2*w))
cpd_y = FunctionalCPD(
    "y",
    fn=lambda parents: dist.Bernoulli(
        logits=(-0.7 + 1.5 * parents["x1"] + 0.8 * parents["x2"] + 1.2 * parents["w"])
    ),
    parents=["x1", "x2", "w"],
)

# Downstream Bernoulli influenced by y and w
cpd_z = FunctionalCPD(
    "z",
    fn=lambda parents: dist.Bernoulli(
        logits=(-1.2 + 0.8 * parents["y"] + 0.2 * parents["w"])
    ),
    parents=["y", "w"],
)

# Continuous outcome depending on y and w: c = 0.2 + 0.5*y + 0.3*w + ε
cpd_c = FunctionalCPD(
    "c",
    fn=lambda parents: dist.Normal(0.2 + 0.5 * parents["y"] + 0.3 * parents["w"], 0.7),
    parents=["y", "w"],
)

func_bn.add_cpds(cpd_x1, cpd_x2, cpd_w, cpd_y, cpd_z, cpd_c)
func_bn.check_model()

# Simulate data from the model
df_func = func_bn.simulate(n_samples=1000, seed=123)

# For learning and inference in Functional Bayesian Networks, please refer to the example notebook: https://github.com/pgmpy/pgmpy/blob/dev/examples/Functional_Bayesian_Network_Tutorial.ipynb
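Conceptually, `simulate()` performs ancestral sampling: each node is drawn in topological order, conditioned on the values already sampled for its parents. A plain-Python mirror of the model above makes this explicit (illustration only, not pgmpy's sampler):

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def sample_one(rng):
    # Roots first, then each node given its sampled parents,
    # following the structural equations defined above.
    x1 = rng.gauss(0.0, 1.0)
    x2 = rng.gauss(0.5, 1.2)
    w = rng.gauss(0.7 * x1 - 0.3 * x2, 0.5)
    y = 1.0 if rng.random() < sigmoid(-0.7 + 1.5 * x1 + 0.8 * x2 + 1.2 * w) else 0.0
    z = 1.0 if rng.random() < sigmoid(-1.2 + 0.8 * y + 0.2 * w) else 0.0
    c = rng.gauss(0.2 + 0.5 * y + 0.3 * w, 0.7)
    return {"x1": x1, "x2": x2, "w": w, "y": y, "z": z, "c": c}

rng = random.Random(123)
samples = [sample_one(rng) for _ in range(5000)]
mean_x1 = sum(s["x1"] for s in samples) / len(samples)
# mean_x1 is close to 0, matching the Normal(0, 1) root distribution.
```

pgmpy applies the same topological-order strategy, with the node distributions supplied by the `FunctionalCPD` objects.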

Contributing

We welcome all contributions to pgmpy, not just code. Please refer to our contributing guide for more details. We also offer mentorship for new contributors and maintain a list of potential mentored projects. If you are interested in contributing to pgmpy, please join our Discord server and introduce yourself; we will be happy to help you get started.
