Merged

Rx #18

2 changes: 1 addition & 1 deletion doc/_config.yml
@@ -7,7 +7,7 @@
# Book settings
title : Probabilistic Modelling # The title of the doc. Will be placed in the left navbar.
author : Tom Schierenbeck # The author of the doc
copyright : "2024" # Copyright year to be placed in the footer
copyright : "2025" # Copyright year to be placed in the footer
logo : "Logo.svg" # A path to the doc logo

# Force re-execution of notebooks on each build.
2 changes: 0 additions & 2 deletions doc/_toc.yml
@@ -20,8 +20,6 @@ parts:
- file: truncated_gaussian
- file: nyga_distribution
- file: joint_probability_trees
- file: template_modelling
- file: area_validation_metric
- caption: Miscellaneous
chapters:
- file: pendulum
6 changes: 3 additions & 3 deletions doc/architecture.md
@@ -18,7 +18,7 @@ The class inheritance diagram for parametric distributions is shown below.
For bayesian networks, the next class diagram is relevant.

```{eval-rst}
.. autoclasstree:: probabilistic_model.bayesian_network.bayesian_network probabilistic_model.bayesian_network.distributions
.. autoclasstree:: probabilistic_model.bayesian_network.bayesian_network
:zoom:
:namespace: probabilistic_model
:strict:
@@ -28,11 +28,11 @@ For bayesian networks, the next class diagram is relevant.
For networkx based probabilistic circuits the next class diagram is relevant.

```{eval-rst}
.. autoclasstree:: probabilistic_model.probabilistic_circuit.nx.probabilistic_circuit probabilistic_model.probabilistic_circuit.nx.distributions.distributions probabilistic_model.probabilistic_circuit.nx.helper
.. autoclasstree:: probabilistic_model.probabilistic_circuit.rx.probabilistic_circuit probabilistic_model.probabilistic_circuit.rx.helper
:zoom:
:namespace: probabilistic_model
:strict:
:caption: Inheritance Diagram for probabilistic circuits implemented with networkx.
:caption: Inheritance Diagram for probabilistic circuits implemented with rustworkx.
```

Finally, for jax based faster circuits with limited inference, this class diagram is relevant.
22 changes: 11 additions & 11 deletions doc/circuits.md
@@ -209,9 +209,7 @@ from random_events.interval import *

import plotly.graph_objects as go
from probabilistic_model.distributions import *
from probabilistic_model.probabilistic_circuit.nx.distributions import *
from probabilistic_model.probabilistic_circuit.nx.probabilistic_circuit import *
from probabilistic_model.probabilistic_circuit.nx.helper import leaf
from probabilistic_model.probabilistic_circuit.rx.probabilistic_circuit import *
import numpy as np

```
@@ -220,10 +218,10 @@

x = Continuous("X")

model = SumUnit()
model.add_subcircuit(leaf(GaussianDistribution(x, 0, 0.5)), np.log(0.1))
model.add_subcircuit(leaf(GaussianDistribution(x, 1, 2)), np.log(0.9))
model = model.probabilistic_circuit
model = ProbabilisticCircuit()
s1 = SumUnit(probabilistic_circuit = model)
s1.add_subcircuit(leaf(GaussianDistribution(x, 0, 0.5), model), np.log(0.1))
s1.add_subcircuit(leaf(GaussianDistribution(x, 1, 2), model), np.log(0.9))

wrong_mode, wrong_max_likelihood = model.root.subcircuits[1].distribution.mode()
wrong_max_likelihood = model.likelihood(np.array([[wrong_mode.simple_sets[0][x].simple_sets[0].lower]]))[0]
@@ -241,10 +239,12 @@ fig.show()
The next figure shows that if we truncate the children of the sum node to a disjoint support, we get the correct mode.

```{code-cell} ipython3
model = SumUnit()
model.add_subcircuit(leaf(TruncatedGaussianDistribution(x, open_closed(-np.inf, 0.5).simple_sets[0], 0, 0.5)), np.log(0.1))
model.add_subcircuit(leaf(TruncatedGaussianDistribution(x, open(0.5, np.inf).simple_sets[0], 1, 2)), np.log(0.9))
model = model.probabilistic_circuit

model = ProbabilisticCircuit()
s1 = SumUnit(probabilistic_circuit = model)
s1.add_subcircuit(leaf(TruncatedGaussianDistribution(x, open_closed(-np.inf, 0.5).simple_sets[0], 0, 0.5), model), np.log(0.1))
s1.add_subcircuit(leaf(TruncatedGaussianDistribution(x, open(0.5, np.inf).simple_sets[0], 1, 2), model), np.log(0.9))

fig = go.Figure(model.plot(), model.plotly_layout())
fig.show()
```
55 changes: 24 additions & 31 deletions doc/graphical_models.md
@@ -62,15 +62,12 @@ Let's look at an example of Bayesian Networks.

```{code-cell} ipython3
from probabilistic_model.bayesian_network.bayesian_network import *
from probabilistic_model.bayesian_network.distributions import *
from probabilistic_model.distributions import *
from probabilistic_model.probabilistic_circuit.rx.probabilistic_circuit import *
from random_events.set import *
from random_events.variable import *
from random_events.interval import *
import networkx as nx
from enum import IntEnum
from probabilistic_model.distributions import *
from probabilistic_model.probabilistic_circuit.nx.probabilistic_circuit import *
from probabilistic_model.probabilistic_circuit.nx.distributions.distributions import *

# Declare variable types and variables
class Success(IntEnum):
@@ -96,45 +93,41 @@ y = Continuous("y")
bn = BayesianNetwork()

# create root
cpd_success = RootDistribution(success, MissingDict(float, {hash(Success.FAILURE): 0.8, hash(Success.SUCCESS): 0.2}))
bn.add_node(cpd_success)
cpd_success = Root(SymbolicDistribution(success, MissingDict(float, {hash(Success.FAILURE): 0.8, hash(Success.SUCCESS): 0.2})), bayesian_network=bn)

# create P(ObjectPosition | Success)
cpd_object_position = ConditionalProbabilityTable(object_position)
cpd_object_position.conditional_probability_distributions[int(Success.FAILURE)] = SymbolicDistribution(object_position,
MissingDict(float, {hash(ObjectPosition.LEFT): 0.3,
hash(ObjectPosition.RIGHT): 0.3,
hash(ObjectPosition.CENTER): 0.4}))
cpd_object_position.conditional_probability_distributions[ int(Success.SUCCESS)] = SymbolicDistribution(object_position,
MissingDict(float, {hash(ObjectPosition.LEFT): 0.3,
hash(ObjectPosition.RIGHT): 0.3,
hash(ObjectPosition.CENTER): 0.4}))
bn.add_node(cpd_object_position)
cpd_object_position = ConditionalProbabilityTable(bayesian_network=bn)
cpd_object_position.conditional_probability_distributions[Success.FAILURE] = SymbolicDistribution(object_position,
MissingDict(float, {ObjectPosition.LEFT: 0.3,
ObjectPosition.RIGHT: 0.3,
ObjectPosition.CENTER: 0.4}))
cpd_object_position.conditional_probability_distributions[Success.SUCCESS] = SymbolicDistribution(object_position,
MissingDict(float, {ObjectPosition.LEFT: 0.3,
ObjectPosition.RIGHT: 0.3,
ObjectPosition.CENTER: 0.4}))
bn.add_edge(cpd_success, cpd_object_position)

# create P(Mood | Success)
cpd_mood = ConditionalProbabilityTable(mood)
cpd_mood.conditional_probability_distributions[hash(Success.FAILURE)] = SymbolicDistribution(mood,
MissingDict(float, {hash(Mood.HAPPY): 0.2,
hash(Mood.SAD): 0.8}))
cpd_mood.conditional_probability_distributions[hash(Success.SUCCESS)] = SymbolicDistribution(mood,
MissingDict(float, {hash(Mood.HAPPY): 0.9,
hash(Mood.SAD): 0.1}))
bn.add_node(cpd_mood)
cpd_mood = ConditionalProbabilityTable(bayesian_network=bn)
cpd_mood.conditional_probability_distributions[Success.FAILURE] = SymbolicDistribution(mood,
MissingDict(float, {Mood.HAPPY: 0.2,
Mood.SAD: 0.8}))
cpd_mood.conditional_probability_distributions[Success.SUCCESS] = SymbolicDistribution(mood,
MissingDict(float, {Mood.HAPPY: 0.9,
Mood.SAD: 0.1}))
bn.add_edge(cpd_success, cpd_mood)

# create P(X, Y | ObjectPosition)
cpd_xy = ConditionalProbabilisticCircuit([x, y])
product_unit = ProductUnit()
product_unit.add_subcircuit(UnivariateContinuousLeaf(GaussianDistribution(x, 0, 1)))
product_unit.add_subcircuit(UnivariateContinuousLeaf(GaussianDistribution(y, 0, 1)))
default_circuit = product_unit.probabilistic_circuit
cpd_xy = ConditionalProbabilisticCircuit(bayesian_network=bn)
default_circuit = ProbabilisticCircuit()
product_unit = ProductUnit(probabilistic_circuit=default_circuit)
product_unit.add_subcircuit(leaf(GaussianDistribution(x, 0, 1), default_circuit))
product_unit.add_subcircuit(leaf(GaussianDistribution(y, 0, 1), default_circuit))

cpd_xy.conditional_probability_distributions[hash(ObjectPosition.LEFT)] = default_circuit.truncated(SimpleEvent({x: closed(-np.inf, -0.5)}).as_composite_set())[0]
cpd_xy.conditional_probability_distributions[hash(ObjectPosition.RIGHT)] = default_circuit.truncated(SimpleEvent({x: open(0.5, np.inf)}).as_composite_set())[0]
cpd_xy.conditional_probability_distributions[hash(ObjectPosition.CENTER)] = default_circuit.truncated(SimpleEvent({x: open_closed(-0.5, 0.5)}).as_composite_set())[0]

bn.add_node(cpd_xy)
bn.add_edge(cpd_object_position, cpd_xy)

bn.plot()
2 changes: 1 addition & 1 deletion doc/intro.md
@@ -44,6 +44,6 @@ If you use this software for publications, please cite it as below.
author = {Schierenbeck, Tom},
title = {probabilistic_model: A Python package for probabilistic models},
url = {https://github.com/tomsch420/probabilistic_model},
version = {5.0.4},
version = {7.1.0},
}
```
4 changes: 2 additions & 2 deletions doc/joint_probability_trees.md
@@ -178,7 +178,7 @@ Next, let's define, and fit the model. Plot the decision criteria and the result
variables = infer_variables_from_dataframe(dataset, scale_continuous_types=False, min_likelihood_improvement=2.5)
model = TutorialJPT(variables, min_impurity_improvement=0.05, min_samples_leaf=500)
model.fig = fig
model.fit(dataset)
pc = model.fit(dataset)
model.fig.show()
```

@@ -189,6 +189,6 @@ They are ordered in the same way as they are induced in the distribution.
Finally, we plot the resulting leaf distributions.

```{code-cell} ipython3
figure = go.Figure(model.plot(1000, surface=True))
figure = go.Figure(pc.plot(1000, surface=True))
figure.show()
```
32 changes: 15 additions & 17 deletions doc/layered_circuit.md
@@ -19,7 +19,7 @@ While understanding the concepts of a probabilistic circuit is subject to math,
story.
This section discusses different approaches to represent circuits.

## the DAG (networkx) way
## the DAG (rustworkx) way

The easiest and naive way of implementing a circuit is using a directed acyclic graph (DAG).
The graph directly follows definition {prf:ref}`def-probabilistic-circuit`.
@@ -29,9 +29,7 @@ Let's look at an example.
```{code-cell} ipython3
import plotly
plotly.offline.init_notebook_mode()
from probabilistic_model.probabilistic_circuit.nx.helper import leaf
from probabilistic_model.probabilistic_circuit.nx.probabilistic_circuit import *
from probabilistic_model.probabilistic_circuit.nx.distributions import *
from probabilistic_model.probabilistic_circuit.rx.probabilistic_circuit import *
from probabilistic_model.distributions import *
from random_events.variable import Continuous
import networkx as nx
Expand All @@ -43,9 +41,10 @@ import equinox as eqx

x = Continuous("x")
y = Continuous("y")
sum1, sum2, sum3 = SumUnit(), SumUnit(), SumUnit()
sum4, sum5 = SumUnit(), SumUnit()
prod1, prod2 = ProductUnit(), ProductUnit()
model = ProbabilisticCircuit()
sum1, sum2, sum3 = SumUnit(probabilistic_circuit=model), SumUnit(probabilistic_circuit=model), SumUnit(probabilistic_circuit=model)
sum4, sum5 = SumUnit(probabilistic_circuit=model), SumUnit(probabilistic_circuit=model)
prod1, prod2 = ProductUnit(probabilistic_circuit=model), ProductUnit(probabilistic_circuit=model)

sum1.add_subcircuit(prod1, np.log(0.5))
sum1.add_subcircuit(prod2, np.log(0.5))
@@ -54,10 +53,10 @@ prod1.add_subcircuit(sum4)
prod2.add_subcircuit(sum3)
prod2.add_subcircuit(sum5)

d_x1 = leaf(UniformDistribution(x, SimpleInterval(0, 1)))
d_x2 = leaf(UniformDistribution(x, SimpleInterval(2, 3)))
d_y1 = leaf(UniformDistribution(y, SimpleInterval(0, 1)))
d_y2 = leaf(UniformDistribution(y, SimpleInterval(3, 4)))
d_x1 = leaf(UniformDistribution(x, SimpleInterval(0, 1)), probabilistic_circuit=model)
d_x2 = leaf(UniformDistribution(x, SimpleInterval(2, 3)), probabilistic_circuit=model)
d_y1 = leaf(UniformDistribution(y, SimpleInterval(0, 1)), probabilistic_circuit=model)
d_y2 = leaf(UniformDistribution(y, SimpleInterval(3, 4)), probabilistic_circuit=model)

sum2.add_subcircuit(d_x1, np.log(0.8))
sum2.add_subcircuit(d_x2, np.log(0.2))
@@ -69,7 +68,6 @@ sum4.add_subcircuit(d_y2, np.log(0.5))
sum5.add_subcircuit(d_y1, np.log(0.1))
sum5.add_subcircuit(d_y2, np.log(0.9))

model = sum1.probabilistic_circuit
model.plot_structure()
plt.show()
```
@@ -88,8 +86,8 @@ The Benefits of the DAG representation are:
- Great for teaching

The drawbacks are:
- Pure Python implementations are usually slow
- Improvements of machine learning packages do not affect the DAG approach (besides networkx improvements)
- Python implementations are usually slow
- Rustworkx does not benefit from SIMD instructions like JAX would
- No benefit from modern hardware acceleration


@@ -130,16 +128,16 @@

As of today, the layered approach is implemented in jax and supports all inferences that do not change the structure of
the circuit.
These are all but marginalization and conditioning.
These are all but marginalization and conditioning/truncation.

The JAX implementation uses equinox to aid with an OOP approach to the circuit.
It uses sparse matrices to represent edges between the layers and hence does not suffer from extreme memory consumption
like EinsumNetworks.
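
To make the sparse-edge idea concrete, here is a minimal sketch in plain JAX (not the package's actual layer classes; the edge-array names are illustrative assumptions). A sum layer stores its edges as COO-style index and log-weight arrays and evaluates all of its sum nodes with segment-wise log-sum-exp operations:

```python
import jax
import jax.numpy as jnp

def sum_layer_forward(child_log_likelihoods, edge_child, edge_node, edge_log_weight, num_nodes):
    # Gather the child log-likelihoods along the edges and add the edge log-weights.
    edge_values = child_log_likelihoods[edge_child] + edge_log_weight
    # Numerically stable log-sum-exp per sum node, computed segment-wise over the edge list.
    max_per_node = jax.ops.segment_max(edge_values, edge_node, num_segments=num_nodes)
    summed = jax.ops.segment_sum(jnp.exp(edge_values - max_per_node[edge_node]),
                                 edge_node, num_segments=num_nodes)
    return max_per_node + jnp.log(summed)

# Two sum nodes over three children: node 0 mixes children 0 and 1, node 1 mixes children 1 and 2.
child_ll = jnp.array([-1.2, -0.3, -2.5])
edge_child = jnp.array([0, 1, 1, 2])
edge_node = jnp.array([0, 0, 1, 1])
edge_log_weight = jnp.log(jnp.array([0.4, 0.6, 0.5, 0.5]))
print(sum_layer_forward(child_ll, edge_child, edge_node, edge_log_weight, num_nodes=2))
```

Because every edge is an explicit entry rather than a slot in a dense weight matrix, the memory cost grows with the number of edges instead of the product of the layer sizes.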

JAX layered circuits are approximately **4000** times faster than the networkx implementation in calculating the
JAX layered circuits are approximately **20** times faster than the rustworkx implementation in calculating the
log-likelihood, and hence are a great tool for doing deep learning with circuits.
For the speed-up to kick in, the JAX computational graph that describes the circuit has to be compiled.
This is expensive if done often.
This is expensive, so don't do it more than needed.
However, for a fixed circuit, the speed-up is immense.

In the scripts folder, you can reproduce these results.
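
The compile-then-reuse behaviour is easy to observe with a toy stand-in for a circuit's log-likelihood. This is a hedged sketch using plain `jax.jit` rather than the package's own classes, and the toy mixture only illustrates the qualitative effect:

```python
import time
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp

@jax.jit
def log_likelihood(log_weights, means, stds, x):
    # Toy Gaussian mixture log-likelihood standing in for a compiled circuit.
    log_components = (-0.5 * ((x[:, None] - means) / stds) ** 2
                      - jnp.log(stds) - 0.5 * jnp.log(2 * jnp.pi))
    return logsumexp(log_components + log_weights, axis=1)

log_weights = jnp.log(jnp.array([0.1, 0.9]))
means = jnp.array([0.0, 1.0])
stds = jnp.array([0.5, 2.0])
data = jnp.linspace(-3.0, 4.0, 100_000)

start = time.time()
log_likelihood(log_weights, means, stds, data).block_until_ready()  # traces and compiles
print("first call:", time.time() - start)

start = time.time()
log_likelihood(log_weights, means, stds, data).block_until_ready()  # reuses the compiled graph
print("second call:", time.time() - start)
```

As long as the circuit structure, and therefore the compiled graph, stays fixed, every subsequent batch only pays the cost of the second call.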
22 changes: 10 additions & 12 deletions doc/nyga_distribution.md
@@ -272,25 +272,23 @@ plotly.offline.init_notebook_mode()
import plotly.graph_objects as go

distribution = NygaDistribution(Continuous("x"), min_samples_per_quantile=100, min_likelihood_improvement=0.01)
distribution.fit(dataset)
distribution = distribution.fit(dataset)
fig = go.Figure(distribution.plot(), distribution.plotly_layout())
fig.show()
```

Comparing this to the gaussian distribution we sampled from, we can see that the result is very similar.
Comparing this to the Gaussian distribution we sampled from, we can see that the result is very similar.

```{code-cell} ipython3
from probabilistic_model.distributions import GaussianDistribution
from probabilistic_model.probabilistic_circuit.nx.distributions import *
from probabilistic_model.probabilistic_circuit.nx.helper import leaf
from probabilistic_model.probabilistic_circuit.nx.probabilistic_circuit import SumUnit

gaussian_1 = leaf(GaussianDistribution(Continuous("x"), 0, 1))
gaussian_2 = leaf(GaussianDistribution(Continuous("x"), 5, 0.5))
mixture = SumUnit()
mixture.add_subcircuit(gaussian_1, np.log(0.5))
mixture.add_subcircuit(gaussian_2, np.log(0.5))
mixture = mixture.probabilistic_circuit
from probabilistic_model.probabilistic_circuit.rx.probabilistic_circuit import *

mixture = ProbabilisticCircuit()
gaussian_1 = leaf(GaussianDistribution(Continuous("x"), 0, 1), mixture)
gaussian_2 = leaf(GaussianDistribution(Continuous("x"), 5, 0.5), mixture)
s1 = SumUnit(probabilistic_circuit = mixture)
s1.add_subcircuit(gaussian_1, np.log(0.5))
s1.add_subcircuit(gaussian_2, np.log(0.5))
fig.add_traces(mixture.plot())
```
