3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -129,3 +129,6 @@ dmypy.json
.pyre/

.idea/

# Generated by Mac Finder
.DS_Store
20 changes: 20 additions & 0 deletions documentation/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
35 changes: 35 additions & 0 deletions documentation/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)

if "%1" == "" goto help

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
31 changes: 31 additions & 0 deletions documentation/source/conf.py
@@ -0,0 +1,31 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = 'Causaleffectpy Documentation'
copyright = '2024, Haley Hummel'
author = 'Haley Hummel'
release = '0.0.1'

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = [
    # other extensions
    'sphinx_rtd_theme',
]

templates_path = ['_templates']
exclude_patterns = []



# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = 'sphinx_rtd_theme'
html_static_path = ['_static']
11 changes: 11 additions & 0 deletions documentation/source/functions.rst
@@ -0,0 +1,11 @@
Causaleffectpy Functions
========================

.. toctree::
   :maxdepth: 4
   :titlesonly:

   simplify
   join
   insert
   powerset
30 changes: 30 additions & 0 deletions documentation/source/index.rst
@@ -0,0 +1,30 @@
.. Causaleffectpy documentation master file, created by
   sphinx-quickstart on Tue Aug 13 12:31:43 2024.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

`Causaleffectpy` Documentation
==============================

This documentation provides an overview of `causaleffectpy`, which is derived from Santtu Tikka's `causaleffect` R package. It focuses on `simplify` and its related functions, with the goal of integrating them into the open-source `y0` ("Why Not?") Python package. For further information, see Tikka & Karvanen (2017), "Simplifying Probabilistic Expressions in Causal Inference".

.. toctree::
   :maxdepth: 2

   functions


References
===============

Hoyt, C.T., Zucker, J., & Parent, M-A. (2021). Y0 “Why Not?” for Causal Inference in Python (1.0) [Python package]. 10.5281/zenodo.4950768. https://github.com/y0-causal-inference/y0.

Tikka, S. (2022). causaleffect: Deriving Expressions of Joint Interventional Distributions and Transport Formulas in Causal Models (1.3.15) [R package]. https://github.com/santikka/causaleffect/.

Tikka, S., & Karvanen, J. (2017). Simplifying probabilistic expressions in causal inference. Journal of Machine Learning Research, 18(36), 1-30.


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
73 changes: 73 additions & 0 deletions documentation/source/insert.rst
@@ -0,0 +1,73 @@
Insert
======

The `insert` function inserts a missing variable into a joint distribution :math:`P(J|D)` using d-separation criteria in a given graph `G`. It is called when the expression lacks terms for some of the variables.

Parameters
----------
joint_dist_variables : list of str
    Equivalent to `J` in Tikka's `causaleffect` R package.
    Variables of the existing joint term :math:`P(J|D)`, i.e. those already processed and included in the
    joint distribution by a previous `simplify` iteration. May be empty at the starting point of
    the joint distribution. `new_variable` is added to expand it using `insert` if the d-separation conditions are met.
joint_conditioning_set : list of str
    Equivalent to `D` in Tikka's `causaleffect` R package. Represented by the term :math:`P(V|C) := P(V_k|C_k)` in Tikka & Karvanen (2017).
    Conditioning set of the existing joint distribution :math:`P(J|D)`, used to condition the joint distribution over the set `joint_dist_variables`.
    As `join` iterates, `joint_conditioning_set` is modified to determine how the joint distribution :math:`P(J|D)` can be updated to
    include the new variable `new_variable` while preserving the required conditional independencies.
inserted_variables : str
    Equivalent to `M` in Tikka's `causaleffect` R package.
    Missing variables (variables not contained within the expression).
prob_conditioning_set : list of str
    Equivalent to `cond` in Tikka's `causaleffect` R package.
    Conditioning set of the current probabilistic term :math:`P(vari|cond)`, i.e. the set of variables that condition the current variable `new_variable`.
    `join` uses `prob_conditioning_set` to evaluate conditional independence and determine whether `new_variable` can be added to `joint_dist_variables`.
summation_variables : list of str
    Equivalent to `S` in Tikka's `causaleffect` R package.
    Current summation variable. Not used directly in `join`.
observed_variables : list of str
    Equivalent to `O` in Tikka's `causaleffect` R package.
    Observed variables (variables contained within the expression).
G_unobs : y0.Graph
    Separate graph that turns bidirected edges into explicit nodes for unobserved confounders.
G : y0.Graph
    Main graph `G`. Includes bidirected edges.
G_obs : y0.Graph
    Separate graph that does not contain bidirected edges (only contains the directed edges between observed nodes).
topo : list of str
    The topological ordering of the vertices in graph `G`.
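
The relationship between `G` and `G_unobs` can be illustrated with a minimal sketch. It assumes plain edge lists rather than the graph classes named above, and the helper name `add_latent_nodes` and the `U1`, `U2`, ... naming scheme are hypothetical; it is not taken from `causaleffectpy` or `y0`.

```python
def add_latent_nodes(directed_edges, bidirected_edges):
    """Replace each bidirected edge X <-> Y with a latent node U and edges U -> X, U -> Y."""
    edges = list(directed_edges)
    latent = []
    for i, (x, y) in enumerate(bidirected_edges, start=1):
        u = f"U{i}"  # hypothetical naming scheme for unobserved confounders
        latent.append(u)
        edges.extend([(u, x), (u, y)])
    return edges, latent


# Hypothetical example: X -> Z -> Y with a bidirected edge X <-> Y.
directed = [("X", "Z"), ("Z", "Y")]
bidirected = [("X", "Y")]
edges_unobs, latent_nodes = add_latent_nodes(directed, bidirected)
print(edges_unobs)
print(latent_nodes)
```

In this sketch, `directed` alone plays the role of `G_obs`, `directed` together with `bidirected` plays the role of `G`, and `edges_unobs` plays the role of `G_unobs`.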

Returns
-------
Section in-progress


Examples
--------
Section in progress.


See Also
--------
- :func:`join`
- :func:`simplify`
- :func:`is_d_separated`
- :func:`powerset`

Keywords
--------
models, manip, math, utilities, graphs, methods, multivariate, distribution, probability

Concepts
--------
probabilistic expressions, graph theory, joint distribution, causal inference, d-separation

References
----------
Tikka, S., & Karvanen, J. (2017). Simplifying probabilistic expressions in causal inference. *Journal of Machine Learning Research*, 18(36), 1-30.

Author
------
Haley Hummel,
Psychology PhD student at Oregon State University
91 changes: 91 additions & 0 deletions documentation/source/join.rst
@djinnome djinnome Aug 19, 2024

Hey @hmhummel Sphinx will display the mathematical formulas such as $P(J|D)$ and $P(V|C) := P(V_k |C_k)$ in latex if you put a :math: before the backquotes, for example:

:math:`P(V|C) := P(V_k | C_k)`

Note that I included underscores in Vk and Ck so that they would be subscripted.

Collaborator Author

Oh, awesome!! Love this & will implement in the next commit — thanks

@@ -0,0 +1,91 @@
Join
====

The `join` function determines whether the terms of the atomic expression actually represent a joint distribution.
It attempts to combine two terms: the joint term :math:`P(J|D)` obtained from `simplify()` and the term :math:`P(V|C) := P(V_k|C_k)`
of the current iteration step. `join` iterates over potential subsets to find a valid set where the variable `new_variable`
can be added to the joint distribution `joint_dist_variables`. During this process, `join` checks conditional
independencies using both `joint_conditioning_set` and `prob_conditioning_set`. The goal is to determine if these
terms can be combined based on the d-separation criteria in the graph `G`.
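
To make the d-separation criterion concrete, here is a minimal, self-contained sketch of one standard way to test it: moralize the ancestral graph and check connectivity. It is not the `is_d_separated` function referenced in this documentation; the name `is_d_separated_sketch` and the parent-dictionary graph representation are assumptions made for illustration.

```python
from collections import deque


def ancestors_of(nodes, parents):
    """Return ``nodes`` together with all of their ancestors."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_d_separated_sketch(parents, xs, ys, zs):
    """Check whether ``xs`` and ``ys`` are d-separated by ``zs`` in a DAG.

    ``parents`` maps each node to the set of its parents.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)
    keep = ancestors_of(xs | ys | zs, parents)
    # Moralize the ancestral graph: link each node to its parents, link
    # parents that share a child, and ignore edge directions.
    neighbours = {node: set() for node in keep}
    for child in keep:
        pars = [p for p in parents.get(child, ()) if p in keep]
        for p in pars:
            neighbours[child].add(p)
            neighbours[p].add(child)
        for i, p in enumerate(pars):
            for q in pars[i + 1:]:
                neighbours[p].add(q)
                neighbours[q].add(p)
    # Search from xs without passing through zs; reaching ys means a connection.
    queue = deque(xs - zs)
    visited = set(queue)
    while queue:
        node = queue.popleft()
        if node in ys:
            return False
        for nxt in neighbours[node] - zs - visited:
            visited.add(nxt)
            queue.append(nxt)
    return True


# Hypothetical graphs: a chain X -> Z -> Y and a collider X -> C <- Y.
chain = {"X": set(), "Z": {"X"}, "Y": {"Z"}}
collider = {"C": {"X", "Y"}}
print(is_d_separated_sketch(chain, {"X"}, {"Y"}, {"Z"}))     # True
print(is_d_separated_sketch(collider, {"X"}, {"Y"}, {"C"}))  # False
```

Conditioning on the middle node blocks the chain, whereas conditioning on a collider opens the path between its parents.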

Parameters
----------
joint_dist_variables : list of str
    Equivalent to `J` in Tikka's `causaleffect` R package.
    Variables of the existing joint term :math:`P(J|D)`, i.e. those already processed and included in the
    joint distribution by a previous `simplify` iteration. May be empty at the starting point of
    the joint distribution. `new_variable` is added to expand it using `insert` if the d-separation conditions are met.
joint_conditioning_set : list of str
    Equivalent to `D` in Tikka's `causaleffect` R package. Represented by the term :math:`P(V|C) := P(V_k|C_k)` in Tikka & Karvanen (2017).
    Conditioning set of the existing joint distribution :math:`P(J|D)`, used to condition the joint distribution over the set `joint_dist_variables`.
    As `join` iterates, `joint_conditioning_set` is modified to determine how the joint distribution :math:`P(J|D)` can be updated to
    include the new variable `new_variable` while preserving the required conditional independencies.
new_variable : str
    Equivalent to `vari` in Tikka's `causaleffect` R package.
    New variable being considered for inclusion in the joint distribution `joint_dist_variables`.
    `join` attempts to update the joint distribution by adding `new_variable` to define a new probabilistic term, provided the term still
    satisfies the required conditional independencies. `insert` adds `new_variable` to `joint_dist_variables`.
prob_conditioning_set : list of str
    Equivalent to `cond` in Tikka's `causaleffect` R package.
    Conditioning set of the current probabilistic term :math:`P(vari|cond)`, i.e. the set of variables that condition the current variable `new_variable`.
    `join` uses `prob_conditioning_set` to evaluate conditional independence and determine whether `new_variable` can be added to `joint_dist_variables`.
summation_variables : list of str
    Equivalent to `S` in Tikka's `causaleffect` R package.
    Current summation variable. Not used directly in `join`.
inserted_variables : list of str
    Equivalent to `M` in Tikka's `causaleffect` R package.
    Missing variables (variables not contained within the expression).
observed_variables : list of str
    Equivalent to `O` in Tikka's `causaleffect` R package.
    Observed variables (variables contained within the expression).
G_unobs : `networkx.DiGraph` object
    A separate directed acyclic graph (DAG) that includes explicit nodes for unobserved confounders, created using :class:`networkx.DiGraph`.
G : `networkx.DiGraph` object
    Main graph `G`, which includes bidirected edges and is created with :class:`networkx.DiGraph`.
G_obs : `networkx.DiGraph` object
    A DAG that only includes directed edges, representing observed variables, created using :class:`networkx.DiGraph`.
topo : list of nodes
    The topological ordering of the vertices in graph `G`, which can be obtained using :func:`networkx.topological_sort`.
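
As a small illustration of what `topo` contains, the sketch below computes a topological ordering for a hypothetical three-node graph. It uses the standard library's `graphlib` (Python 3.9+) instead of `networkx.topological_sort`, purely so that the example has no third-party dependencies; the graph itself is an assumption.

```python
from graphlib import TopologicalSorter

# Hypothetical example graph: X -> Z -> Y and X -> Y, written as child -> parents.
parents = {"X": set(), "Z": {"X"}, "Y": {"X", "Z"}}

# static_order() yields each node only after all of its parents.
topo = list(TopologicalSorter(parents).static_order())
print(topo)  # ['X', 'Z', 'Y']
```

In this ordering every variable appears after all of its parents.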

Returns
-------
Section in-progress

Dependencies
------------
This function depends on several other functions, including:

- :func:`powerset`
- :func:`is_d_separated`
- :func:`insert`, which adds `new_variable` to `joint_dist_variables`.

See Also
--------
- :func:`simplify`
- :func:`is_d_separated`
- :func:`insert`

Examples
--------
Section in progress.


Keywords
--------
models, manip, math, utilities

Concepts
--------
probabilistic expressions, graph theory, causal inference

References
----------
Tikka, S. (2022). `causaleffect`: Deriving Expressions of Joint Interventional Distributions and Transport Formulas in Causal Models (1.3.15) [R package]. https://github.com/santikka/causaleffect/.

Tikka, S., & Karvanen, J. (2017). Simplifying probabilistic expressions in causal inference. Journal of Machine Learning Research, 18(36), 1-30.

Tikka, S., & Karvanen, J. (2018). Identifying causal effects with the R package causaleffect. arXiv preprint arXiv:1806.07161.

Author
------
Haley Hummel,
Psychology PhD student at Oregon State University

45 changes: 45 additions & 0 deletions documentation/source/powerset.rst
@@ -0,0 +1,45 @@
Powerset
========

The `powerset` function generates the power set of a given set. The power set is the set of all possible subsets of the original set, including the empty set and the set itself.

Parameters
----------
set : list
    A list representing the original set for which the power set will be generated. The set can contain any type of elements (e.g., numeric, string, or boolean).

Details
-------
The function computes all possible combinations of the elements of the input set. This includes the empty subset, individual elements, and all larger subsets up to and including the full set. The number of subsets in the power set of a set of size `n` is `2^n`.
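
As a minimal sketch of the behaviour described above (not necessarily how `powerset` is implemented in `causaleffectpy`), the subsets can be enumerated by size with the standard library's `itertools.combinations`. The function name `powerset_sketch` is hypothetical.

```python
from itertools import chain, combinations


def powerset_sketch(items):
    """Return every subset of ``items`` as a list of lists, smallest first."""
    items = list(items)
    # combinations(items, r) yields all subsets of size r; chain them for r = 0..n.
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [list(subset) for subset in subsets]


result = powerset_sketch(["X", "Y", "Z"])
print(len(result))  # 8, i.e. 2 ** 3
print(result[0], result[-1])  # [] ['X', 'Y', 'Z']
```

Because the number of subsets doubles with each added element, enumerating the full power set is only practical for small input sets.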

Returns
-------
list of lists
    A list of lists, where each inner list is a subset of the original input set. The list contains :math:`2^n` subsets, where `n` is the length of the input set. If the input set is empty, the function returns a list containing only the empty set.

Examples
--------
Section in progress.


See Also
--------
- :func:`join`, which uses `powerset` when checking conditional independencies in probabilistic graphical models.

Keywords
--------
set theory, combinatorics

Concepts
--------
power set, subsets

References
----------
Tikka, S., & Karvanen, J. (2017). Simplifying probabilistic expressions in causal inference. Journal of Machine Learning Research, 18(36), 1-30.

Author
------
Haley Hummel,
Psychology PhD student at Oregon State University