where
* and $F(\mu)$ is the **convex conjugate** of $G$.
PCA is a special case of EPCA when the data is Gaussian (see [appendix](https://sisl.github.io/ExpFamilyPCA.jl/dev/math/appendix/gaussian/)). By selecting the appropriate function $G$, EPCA can handle a wider range of data types, offering more versatility than PCA. Then

$$
x_i \approx \theta_i = g(a_i V).
$$

### Regularization

The optimum may diverge, so we introduce a regularization term with strength $\epsilon > 0$ and anchor point $\mu_0 \in \mathrm{range}(g)$ to ensure that the solution is stationary.
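
One natural form for this regularized objective, sketched here under the assumption that it extends the Bregman divergence formulation of the EPCA objective, is

$$
\hat{\theta} = \argmin_{\theta} \; B_F(x \,\|\, g(\theta)) + \epsilon B_F(\mu_0 \,\|\, g(\theta)),
$$

where $B_F$ denotes the Bregman divergence induced by $F$.
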
### Example: Gamma EPCA

Gamma EPCA is suited to positive continuous data. Below, we compare gamma EPCA and PCA reconstructions of the Old Faithful geyser dataset.
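
A minimal sketch of how this experiment might be set up, assuming a `GammaEPCA(indim, outdim)` constructor signature (only the constructor name and the `fit!`/`decompress` interface appear in this paper; the data handling is illustrative):

```julia
using ExpFamilyPCA

# Illustrative stand-in for the Old Faithful data: an n × 2 matrix of
# positive continuous observations (eruption durations and waiting times).
X = 1.0 .+ 4.0 .* rand(272, 2)

indim, outdim = size(X, 2), 1
gamma_epca = GammaEPCA(indim, outdim)  # assumed (indim, outdim) signature

A = fit!(gamma_epca, X)                # train and compress the data
X_reconstructed = decompress(gamma_epca, A)
```
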

### Example: Poisson EPCA
The Poisson EPCA objective is the generalized Kullback-Leibler (KL) divergence (see [appendix](https://sisl.github.io/ExpFamilyPCA.jl/dev/math/appendix/poisson/)), making Poisson EPCA ideal for compressing discrete distribution data.
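
For nonnegative $p$ and $q$, the generalized KL divergence is

$$
D(p \,\|\, q) = \sum_i \left[ p_i \log \frac{p_i}{q_i} - p_i + q_i \right],
$$

which reduces to the standard KL divergence when $p$ and $q$ are normalized.
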
Poisson EPCA thus offers an alternative to correspondence analysis for analyzing and compressing count data.

This is useful in applications like belief compression in reinforcement learning [@Roy], where high-dimensional belief states can be effectively reduced with minimal information loss. Below we recreate a figure from @shortRoy and observe that Poisson EPCA achieves a nearly perfect reconstruction of a $41$-dimensional belief profile using just $5$ basis components.



For a larger environment with $200$ states, PCA struggles even with $10$ basis components.


# API
## Supported Distributions
`ExpFamilyPCA.jl` includes efficient EPCA implementations for several exponential family distributions.

| Constructor | Description |
|---|---|
|`BinomialEPCA`| For count data with a fixed number of trials |
|`ContinuousBernoulliEPCA`| For modeling probabilities between $0$ and $1$ |
|`GammaEPCA`| For positive continuous data |
|`GaussianEPCA`| Standard PCA for real-valued data |
|`NegativeBinomialEPCA`| For over-dispersed count data |
|`ParetoEPCA`| For modeling heavy-tailed distributions |
|`PoissonEPCA`| For count and discrete distribution data |
|`WeibullEPCA`| For modeling life data and survival analysis |

## Custom Distributions

When working with custom distributions, certain specifications are often more convenient and computationally efficient than others. For example, inducing the gamma EPCA objective from the log-partition $G(\theta) = -\log(-\theta)$ and its derivative $g(\theta) = -1/\theta$ is much simpler than implementing the full Itakura-Saito distance [@ItakuraSaito] (see [appendix](https://sisl.github.io/ExpFamilyPCA.jl/dev/math/appendix/gamma/)):
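
A sketch of such a specification, assuming a generic `EPCA(indim, outdim, G, g, Val((:G, :g)))` constructor (the exact signature is an assumption; see the documentation linked below):

```julia
using ExpFamilyPCA

# Log-partition of the gamma family and its derivative (the link function).
G(θ) = -log(-θ)
g(θ) = -1 / θ

indim, outdim = 10, 3  # illustrative dimensions
# Assumed generic constructor: induce the gamma EPCA objective from G and g.
gamma_epca = EPCA(indim, outdim, G, g, Val((:G, :g)))
```
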
A lengthier discussion of the `EPCA` constructors and math is provided in the [documentation](https://sisl.github.io/ExpFamilyPCA.jl/dev/math/objectives/).
## Usage

Each `EPCA` object supports a three-method interface: `fit!`, `compress`, and `decompress`. `fit!` trains the model and returns the compressed training data; `compress` maps new data into the compressed space; and `decompress` reconstructs the original data from its compressed representation.
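
A sketch of the full workflow on count data, assuming a `PoissonEPCA(indim, outdim)` constructor (the three methods are as described above; the constructor signature and data are illustrative):

```julia
using ExpFamilyPCA

indim, outdim = 41, 5                      # illustrative dimensions
X = rand(0:10, 500, indim)                 # hypothetical count-valued training data
Y = rand(0:10, 100, indim)                 # hypothetical held-out data

poisson_epca = PoissonEPCA(indim, outdim)  # assumed (indim, outdim) signature

X_compressed = fit!(poisson_epca, X)       # train; returns compressed training data
Y_compressed = compress(poisson_epca, Y)   # compress new input with the fitted model
Y_reconstructed = decompress(poisson_epca, Y_compressed)
```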