Added sphinx documentation for simplify #5

Closed
wants to merge 3 commits into from
Binary file added documentation/.DS_Store
Binary file not shown.
20 changes: 20 additions & 0 deletions documentation/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
35 changes: 35 additions & 0 deletions documentation/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)

if "%1" == "" goto help

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
28 changes: 28 additions & 0 deletions documentation/source/conf.py
@@ -0,0 +1,28 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = 'Causaleffectpy documentation'
author = 'Haley Hummel'
release = '0.0.1'

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

master_doc = 'index'
extensions = []

templates_path = ['_templates']
exclude_patterns = []



# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = "sphinx_rtd_theme"
html_static_path = ['_static']
11 changes: 11 additions & 0 deletions documentation/source/functions.rst
@@ -0,0 +1,11 @@
CausalEffect Functions
=======================

.. toctree::
   :maxdepth: 4
   :titlesonly:

   simplify
   join
   insert
   powerset
29 changes: 29 additions & 0 deletions documentation/source/index.rst
@@ -0,0 +1,29 @@
.. Causaleffectpy documentation master file, created by
   sphinx-quickstart on Tue Aug 13 12:31:43 2024.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

`Causaleffectpy` Documentation
==============================

This documentation provides an overview of `causaleffectpy`, which is derived from Santtu Tikka's `causaleffect` R package. It focuses on `simplify` and related functions, with the aim of integrating them into the open source `y0` (Why Not?) Python package. For further information, see Tikka & Karvanen (2017), "Simplifying Probabilistic Expressions in Causal Inference".

.. toctree::
   :maxdepth: 2

   functions


References
===============

Hoyt, C.T., Zucker, J., & Parent, M-A. (2021). Y0 “Why Not?” for Causal Inference in Python (1.0) [Python package]. doi:10.5281/zenodo.4950768. https://github.com/y0-causal-inference/y0.

Tikka, S., & Karvanen, J. (2017). Simplifying probabilistic expressions in causal inference. *Journal of Machine Learning Research*, 18(36), 1-30.


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
69 changes: 69 additions & 0 deletions documentation/source/insert.rst

Replace the capital letters with more descriptive variable names.

@@ -0,0 +1,69 @@
Insert
======

The `insert` function inserts a missing variable into a joint distribution `P(J|D)` using d-separation criteria in a given graph `G`. It is called when the expression lacks a term for one or more variables.

Parameters
----------
J : list of str
    The set of variables representing the joint distribution.
D : list of str
    The set of variables representing the conditioning set of the joint distribution.
M : str
    The variable to be inserted.
cond : list of str
    The set of conditioning variables.
S : list of str
    The current summation variable.
O : list of str
    The set of observed variables.
G_unobs : y0.Graph
    Separate graph that turns bidirected edges into explicit nodes for unobserved confounders.
G : y0.Graph
    Main graph `G`. Includes bidirected edges.
G_obs : y0.Graph
    Separate graph that does not contain bidirected edges (only contains the directed edges with observed nodes).
topo : list of str
    The topological ordering of the vertices in graph `G`.

Returns
-------
dict
    A dictionary with the following keys:

    - `J_new`: list of str. An updated set of joint distribution variables.
    - `D_new`: list of str. An updated set of conditioning variables.
    - `M`: str. The inserted variable.
    - `ds_i`: list of str. The subset from the power set used in the insertion.

If no conditions are met, `insert` returns the original `J` and `D`.


Examples
--------
Section in progress.


See Also
--------
- :func:`join`
- :func:`simplify`
- :func:`wrap_dSep`
- :func:`powerset`

Keywords
--------
models, manip, math, utilities, graphs, methods, multivariate, distribution, probability

Concepts
--------
probabilistic expressions, graph theory, joint distribution, causal inference, d-separation

References
----------
Tikka, S., & Karvanen, J. (2017). Simplifying probabilistic expressions in causal inference. *Journal of Machine Learning Research*, 18(36), 1-30.

Author
------
Haley Hummel,
Psychology PhD student at Oregon State University
75 changes: 75 additions & 0 deletions documentation/source/join.rst
@@ -0,0 +1,75 @@
Join
====

The `join` function determines whether the terms of the atomic expression actually represent a joint distribution.
It attempts to combine two terms: the joint term `P(J|D)` obtained from `simplify()` and the
term `P(V|C) := P(Vk|Ck)` of the current iteration step. The goal is to
determine if these terms can be combined based on the d-separation criteria in the graph `G`.

Parameters
----------
J : list of str
    Variables of the joint term `P(J|D)` that have already been processed and included in the joint distribution
    by a previous `simplify` iteration. May be empty initially, at the starting point of
    the joint distribution. `vari` is added to expand it if the d-separation conditions are met.
D : list of str
    Set of variables that condition the joint distribution `P(J|D)`.
    `join` checks and updates `D` as necessary to maintain the validity of the joint distribution
    when it is combined with the current term `P(V|C) := P(Vk|Ck)`.
vari : str
    Current variable being considered for inclusion in the joint distribution.
cond : list of str
    Set of variables that condition the current variable `vari`. `join` uses `cond`
    to evaluate conditional independence and determine whether `vari` can be added to `J`.
S : list of str
    Current summation variable. Not used directly in `join`.
M : list of str
    Missing variables (variables not contained within the expression).
O : list of str
    Observed variables (variables contained within the expression).
G_unobs : y0.Graph
    Separate graph that turns bidirected edges into explicit nodes for unobserved confounders.
G : y0.Graph
    Main graph `G`. Includes bidirected edges.
G_obs : y0.Graph
    Separate graph that does not contain bidirected edges (only contains the directed edges with observed nodes).
topo : list of str
    The topological ordering of the vertices in graph `G`.

Returns
-------
list of str
    The joint result, or the original result if none of the conditions for joining were met.

Dependencies
------------
This function depends on several functions from the causaleffect package, including :func:`powerset`, :func:`is_d_separated`, and :func:`insert`.

See Also
--------
- :func:`simplify`
- :func:`is_d_separated`
- :func:`insert`

Examples
--------
Section in progress.
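
Until a worked example is available, the following is a minimal sketch of the kind of d-separation check that `join` relies on. It is not the package's :func:`is_d_separated`. The helper name `d_separated_sketch`, the parent-dictionary graph representation, and the example graph are all assumptions made for illustration. The check uses the moralized ancestral graph criterion: keep the relevant nodes and their ancestors, connect co-parents, drop edge directions, delete the conditioning set, and test whether any path remains.

.. code-block:: python

    from itertools import combinations


    def d_separated_sketch(parents, xs, ys, zs):
        """Return True if xs and ys are d-separated by zs (illustrative only).

        ``parents`` maps each node of a DAG to the set of its parents.
        """
        xs, ys, zs = set(xs), set(ys), set(zs)

        # 1. Keep only the nodes in xs, ys, zs and their ancestors.
        relevant, stack = set(), list(xs | ys | zs)
        while stack:
            node = stack.pop()
            if node not in relevant:
                relevant.add(node)
                stack.extend(parents.get(node, ()))

        # 2. Moralize: link each node to its parents and link co-parents,
        #    ignoring edge directions.
        neighbours = {node: set() for node in relevant}
        for child in relevant:
            pars = list(parents.get(child, ()))
            for par in pars:
                neighbours[child].add(par)
                neighbours[par].add(child)
            for a, b in combinations(pars, 2):
                neighbours[a].add(b)
                neighbours[b].add(a)

        # 3. Delete the conditioning set, then search for a path from xs to ys.
        seen, stack = set(), [x for x in xs if x not in zs]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(n for n in neighbours[node] if n not in zs)
        return not (seen & ys)


    # Hypothetical graph: X -> Z -> Y (a chain) and X -> W <- Y (a collider).
    example_parents = {"Z": {"X"}, "Y": {"Z"}, "W": {"X", "Y"}}

    print(d_separated_sketch(example_parents, {"X"}, {"Y"}, {"Z"}))       # True
    print(d_separated_sketch(example_parents, {"X"}, {"Y"}, {"Z", "W"}))  # False

Conditioning on `Z` blocks the chain, while additionally conditioning on the collider `W` opens a path between `X` and `Y`. This is the kind of condition that decides whether `vari` may be added to `J`.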

Keywords
--------
models, manip, math, utilities

Concepts
--------
probabilistic expressions, graph theory, causal inference

References
----------
Tikka, S., & Karvanen, J. (2017). Simplifying probabilistic expressions in causal inference. *Journal of Machine Learning Research*, 18(36), 1-30.

Author
------
Haley Hummel,
Psychology PhD student at Oregon State University

44 changes: 44 additions & 0 deletions documentation/source/powerset.rst
@@ -0,0 +1,44 @@
Powerset
========

The `powerset` function generates the power set of a given set. The power set is the set of all possible subsets of the original set, including the empty set and the set itself.

Parameters
----------
set : list
    A list representing the original set for which the power set will be generated. The set can contain any type of elements (e.g., numeric, string, or boolean).

Details
-------
The function computes all possible combinations of the elements of the input set. This includes the empty subset, individual elements, and all larger subsets up to and including the full set. The number of subsets in the power set of a set of size `n` is `2^n`.

Returns
-------
list of lists
    A list of lists, where each inner list is a subset of the original input set. The list contains `2^n` subsets, where `n` is the length of the input set. If the input set is empty, the function returns a list containing only the empty set.

Examples
--------
Section in progress.
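
Until a worked example is available, the following is a minimal sketch of how a power set can be generated in Python with `itertools`. The helper name `powerset_sketch` is hypothetical and is not the package's implementation. It only illustrates the behaviour described above: `2^n` subsets, including the empty set and the full set.

.. code-block:: python

    from itertools import chain, combinations


    def powerset_sketch(items):
        """Return all subsets of ``items`` as a list of lists (illustrative only)."""
        items = list(items)
        # combinations(items, r) yields every subset of size r; chaining the
        # sizes r = 0..n covers the empty set through the full set.
        subsets = chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)
        )
        return [list(subset) for subset in subsets]


    subsets = powerset_sketch(["X", "Y", "Z"])
    print(len(subsets))             # 8, i.e. 2**3
    print(subsets[0], subsets[-1])  # [] ['X', 'Y', 'Z']

Subsets are produced in order of increasing size, so the empty set comes first and the full set last. An empty input yields a list containing only the empty list.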

See Also
--------
- `join`: for using `powerset` with conditional independence in probabilistic graphical models.

Keywords
--------
set theory, combinatorics

Concepts
--------
power set, subsets

References
----------
Tikka, S., & Karvanen, J. (2017). Simplifying probabilistic expressions in causal inference. *Journal of Machine Learning Research*, 18(36), 1-30.

Author
------
Haley Hummel,
Psychology PhD student at Oregon State University
63 changes: 63 additions & 0 deletions documentation/source/simplify.rst
@@ -0,0 +1,63 @@
Simplify
========

This function algebraically simplifies probabilistic expressions given by the ID algorithm from :func:`identify`. It always attempts maximal simplification, meaning that as many variables as possible are removed from the set being summed over. If the simplification cannot be completed for the entire set, the intermediate result with as many variables simplified as possible is returned.

Run :func:`identify` with the graph information first, then use the output of :func:`identify` as the `P` in :func:`parse_causaleffect`. Use the output from :func:`parse_causaleffect` as the `P` in :func:`simplify`.

For further information, see Tikka & Karvanen (2017) "Simplifying Probabilistic Expressions in Causal Inference" Algorithm 1.


Parameters
----------
P : `sympy` expression or `y0` `Probability` object
    The probabilistic expression that will be simplified, typically created using symbolic expressions in `sympy` or using `y0`'s Probability class.
topo : list of nodes
    The topological ordering of the vertices in graph `G`, which can be obtained using `networkx.topological_sort`.
G_unobs : networkx.DiGraph object
    A separate directed acyclic graph (DAG) that includes explicit nodes for unobserved confounders, created using `networkx.DiGraph`.
G : networkx.DiGraph object
    Main graph G, which includes bidirected edges, and is created with :func:`igraph.graph_formula`.
G_obs : networkx.DiGraph object
    A DAG that only includes directed edges, representing observed variables, created using `networkx.DiGraph`.


Returns
-------
list
    Section in progress.

Dependencies
------------
This function depends on several other functions and classes, including :func:`parents`, :func:`ancestors`, :func:`parse_causaleffect`, :func:`is_d_separated`, and :class:`probability`.


See Also
--------
- :func:`identify`
- :func:`parse_causaleffect`
- :func:`get.expression`
- :class:`probability`

Examples
--------
Section in progress.
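
Until a worked example is available, the following sketch shows one way the `G_obs`, `G_unobs` and `topo` arguments described under Parameters could be prepared with `networkx`. The example graph and the latent node name `U_XY` are assumptions for illustration. The ordering is computed on `G_obs` on the assumption that it shares its observed vertices with `G`. The main graph `G` and the call to :func:`simplify` itself are omitted, because the representation of bidirected edges and the return value are not yet documented here.

.. code-block:: python

    import networkx as nx

    # Observed directed edges only (plays the role of G_obs).
    G_obs = nx.DiGraph([("X", "Z"), ("Z", "Y")])

    # Make an unobserved confounder of X and Y explicit as a latent node
    # (plays the role of G_unobs). The node name "U_XY" is hypothetical.
    G_unobs = G_obs.copy()
    G_unobs.add_edges_from([("U_XY", "X"), ("U_XY", "Y")])

    # Both graphs must remain acyclic for a topological ordering to exist.
    assert nx.is_directed_acyclic_graph(G_obs)
    assert nx.is_directed_acyclic_graph(G_unobs)

    # Topological ordering of the observed vertices.
    topo = list(nx.topological_sort(G_obs))
    print(topo)  # ['X', 'Z', 'Y']

Because the observed graph in this sketch is a single chain, its topological ordering is unique. For graphs with several valid orderings, `networkx.topological_sort` returns one of them.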


Keywords
--------
models, manip, math, utilities

Concepts
--------
probabilistic expressions, graph theory, causal inference

References
----------
Tikka, S., & Karvanen, J. (2017). Simplifying probabilistic expressions in causal inference. *Journal of Machine Learning Research*, 18(36), 1-30.

Author
------
Haley Hummel,
Psychology PhD student at Oregon State University