diff --git a/talk.md b/talk.md
index 946ccc8..cb46f22 100644
--- a/talk.md
+++ b/talk.md
@@ -765,126 +765,30 @@ $$
.bold.center[Having access to the gradients can make the fit orders of magnitude faster than finite difference]
---
-class: focus-slide, center
# Enable new techniques with autodiff
-.huge.bold.center[Familiar (toy) example: Optimizing selection "cut" for an analysis]
-
----
-# Discriminate Signal and Background
-
-* Counting experiment for presence of signal process
-* Place discriminate selection cut on observable $x$ to maximize significance
- - Significance: $\sqrt{2 (S+B) \log(1 + \frac{S}{B})-2S}$ (for small $S/B$: significance $\to S/\sqrt{B}$)
-
-.footnote[Example inspired by Alexander Held's [example of a differentiable analysis](https://github.com/alexander-held/differentiable-analysis-example/)]
-
-.kol-1-2.center[
-
-
-
-]
-.kol-1-2.center[
-
-
-
-]
-
----
-# Traditionally: Scan across cut values
-
-- Set baseline cut at $x=0$ (accept everything)
-- Step along cut values in $x$ and calculate significance at each cut. Keep maximum.
-
-.kol-1-2.center[
-.width-100[![signal_background_stacked](figures/signal_background_stacked.png)]
-]
-.kol-1-2[
-.width-100[![significance_cut_scan](figures/significance_cut_scan.png)]
-]
-
-.center[Significance: $\sqrt{2 (S+B) \log(1 + \frac{S}{B})-2S}$]
-
----
-# Differentiable Approach
-
-.kol-1-2.large[
-- Need differentiable analogue to non-differentiable cut
-- Weight events using activation function of sigmoid
+.kol-2-3[
+* Familiar (toy) example: Optimizing a selection "cut" for an analysis.
+Place a discriminating selection cut on observable $x$ to maximize significance.
+* Traditionally, step along cut values in $x$, calculate the significance at each one, and keep the maximum.
+* Need a differentiable analogue to the non-differentiable "cut".
+Weight events using a sigmoid activation function
.center[$w=\left(1 + e^{-\alpha(x-c)}\right)^{-1}$]
-- Event far .italic[below] cut: $w \to 0$
-- Event far .italic[above] cut: $w \to 1$
-- $\alpha$ tunable parameter for steepness
- - Larger $\alpha$ more cut-like
-]
-.kol-1-2[
-
-.width-100[![sigmoid_event_weights](figures/sigmoid_event_weights.png)]
-]
-
----
-# Compare Hard Cuts vs. Differentiable
-
-.kol-1-2.large[
-- For hard cuts the significance was calculated by applying the cut and than using the remaining $S$ and $B$ events
-- But for the differentiable model there aren't cuts, so approximate cuts with the sigmoid approach and weights
-- Comparing the two methods shows good agreement
-- Can see that the approximation to the hard cuts improves with larger $\alpha$
- - But can become unstable, so tunable
-]
-.kol-1-2.center[
-
-.width-100[![significance_scan_compare](figures/significance_scan_compare.png)]
-]
-
----
-# Compare Hard Cuts vs. Differentiable
-
-.kol-1-2.large[
-- For hard cuts the significance was calculated by applying the cut and then using the remaining $S$ and $B$ events
-- But for the differentiable model there aren't cuts, so approximate cuts with the sigmoid approach and weights
-- Comparing the two methods shows good agreement
-- Can see that the approximation to the hard cuts improves with larger $\alpha$
- - But can become unstable, so tunable
-]
-.kol-1-2.center[
-
-.width-100[![significance_scan_compare_high_alpha](figures/significance_scan_compare_high_alpha.png)]
-]
-
----
-# Accessing the Gradient
+* Most importantly, with the differentiable model we have access to the gradient $\partial_{x} f(x)$
+* So we can find the maximum significance at the point where the gradient of the significance is zero: $\partial_{x} f(x) = 0$
+* With a simple gradient descent algorithm the significance optimization can easily be automated
-.kol-2-5.large[
-* Most importantly though, with the differentiable model we have access to the gradient
- - $\partial_{x} f(x)$
-* So can find the maximum significance at the point where the gradient of the significance is zero
- - $\partial_{x} f(x) = 0$
-* With the gradient in hand this cries out for automated optimization!
]
-.kol-3-5.center[
+.kol-1-3.center[
-
+
+
+
]
----
-# Automated Optimization
-
-.kol-2-5.large[
-* With a simple gradient descent algorithm can easily automate the significance optimization
-* For this toy example, obviously less efficient then cut and count scan
-* Gradient methods apply well in higher dimensional problems
-* Allows for the "cut" to become a parameter that can be differentiated through for the larger analysis
-]
-.kol-3-5.center[
-.width-100[![automated_optimization](figures/automated_optimization.png)]
-
-
-]
-
---
# New Art: Analysis as a Differentiable Program
@@ -1266,6 +1170,121 @@ $$
.center[Image credit: [Alex Held](https://indico.cern.ch/event/1076231/contributions/4560405/)]
]
+---
+# Discriminate Signal and Background
+
+* Counting experiment for presence of signal process
+* Place a discriminating selection cut on observable $x$ to maximize significance (see the sketch below)
+ - Significance: $\sqrt{2 (S+B) \log(1 + \frac{S}{B})-2S}$ (for small $S/B$: significance $\to S/\sqrt{B}$)
+
+.footnote[Example inspired by Alexander Held's [example of a differentiable analysis](https://github.com/alexander-held/differentiable-analysis-example/)]
+
+.kol-1-2.center[
+
+
+
+]
+.kol-1-2.center[
+
+
+
+]
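+
+As a minimal sketch (function name illustrative, not from the talk's repository), the significance above can be written as a small JAX function:
+
+```python
+import jax.numpy as jnp
+
+def significance(S, B):
+    """Median discovery significance: sqrt(2[(S + B) log(1 + S/B) - S])."""
+    return jnp.sqrt(2 * ((S + B) * jnp.log(1 + S / B) - S))
+
+# For small S/B this approaches S / sqrt(B)
+print(significance(10.0, 400.0))   # ~0.498
+print(10.0 / jnp.sqrt(400.0))      # 0.5
+```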
+
+---
+# Traditionally: Scan across cut values
+
+- Set baseline cut at $x=0$ (accept everything)
+- Step along cut values in $x$ and calculate the significance at each cut. Keep the maximum (sketch below).
+
+.kol-1-2.center[
+.width-100[![signal_background_stacked](figures/signal_background_stacked.png)]
+]
+.kol-1-2[
+.width-100[![significance_cut_scan](figures/significance_cut_scan.png)]
+]
+
+.center[Significance: $\sqrt{2 (S+B) \log(1 + \frac{S}{B})-2S}$]
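+
+A rough sketch of the scan, assuming toy `signal_x` / `background_x` arrays of observable values (not the talk's actual dataset):
+
+```python
+import numpy as np
+import jax.numpy as jnp
+
+# Toy observable: falling background, signal peaking at higher x (illustrative only)
+rng = np.random.default_rng(0)
+background_x = jnp.asarray(rng.exponential(scale=1.0, size=10_000))
+signal_x = jnp.asarray(rng.normal(loc=2.0, scale=0.5, size=500))
+
+def hard_cut_significance(cut):
+    S = jnp.sum(signal_x > cut)      # signal events passing x > cut
+    B = jnp.sum(background_x > cut)  # background events passing x > cut
+    return jnp.sqrt(2 * ((S + B) * jnp.log(1 + S / B) - S))
+
+# Step along cut values and keep the one that maximizes the significance
+cuts = jnp.linspace(0.0, 3.0, 61)
+scan = jnp.array([hard_cut_significance(c) for c in cuts])
+best_cut = cuts[jnp.argmax(scan)]
+```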
+
+---
+# Differentiable Approach
+
+.kol-1-2.large[
+- Need a differentiable analogue to the non-differentiable cut
+- Weight events using a sigmoid activation function (sketch below)
+
+.center[$w=\left(1 + e^{-\alpha(x-c)}\right)^{-1}$]
+
+- Event far .italic[below] cut: $w \to 0$
+- Event far .italic[above] cut: $w \to 1$
+- $\alpha$ tunable parameter for steepness
+ - Larger $\alpha$ more cut-like
+]
+.kol-1-2[
+
+.width-100[![sigmoid_event_weights](figures/sigmoid_event_weights.png)]
+]
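+
+A minimal sketch of the weight (function name illustrative):
+
+```python
+import jax.numpy as jnp
+
+def sigmoid_weight(x, cut, alpha=5.0):
+    """Smooth per-event weight w = 1 / (1 + exp(-alpha * (x - cut)))."""
+    return 1.0 / (1.0 + jnp.exp(-alpha * (x - cut)))
+
+# Far below the cut w -> 0, far above w -> 1; larger alpha sharpens the transition
+print(sigmoid_weight(jnp.array([-2.0, 0.0, 2.0]), cut=0.0))            # ~[0.00005, 0.5, 0.99995]
+print(sigmoid_weight(jnp.array([-2.0, 0.0, 2.0]), cut=0.0, alpha=20))  # even more cut-like
+```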
+
+---
+# Compare Hard Cuts vs. Differentiable
+
+.kol-1-2.large[
+- For hard cuts the significance was calculated by applying the cut and then using the remaining $S$ and $B$ events
+- But the differentiable model has no hard cuts, so the cut is approximated with the sigmoid weights (sketch below)
+- Comparing the two methods shows good agreement
+- Can see that the approximation to the hard cuts improves with larger $\alpha$
+ - But can become unstable, so tunable
+]
+.kol-1-2.center[
+
+.width-100[![significance_scan_compare](figures/significance_scan_compare.png)]
+]
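+
+A sketch of the comparison, assuming small toy arrays of observable values (purely illustrative):
+
+```python
+import jax.numpy as jnp
+
+signal_x = jnp.array([1.8, 2.1, 2.4, 1.5, 2.0])
+background_x = jnp.array([0.2, 0.7, 1.1, 0.4, 1.6, 0.9])
+
+def significance(S, B):
+    return jnp.sqrt(2 * ((S + B) * jnp.log(1 + S / B) - S))
+
+def hard_cut_significance(cut):
+    # Count the events that survive the hard selection x > cut
+    return significance(jnp.sum(signal_x > cut), jnp.sum(background_x > cut))
+
+def smooth_cut_significance(cut, alpha=10.0):
+    # Replace the hard selection with per-event sigmoid weights
+    w_s = 1.0 / (1.0 + jnp.exp(-alpha * (signal_x - cut)))
+    w_b = 1.0 / (1.0 + jnp.exp(-alpha * (background_x - cut)))
+    return significance(jnp.sum(w_s), jnp.sum(w_b))
+
+# The two agree closely, and the agreement improves with larger alpha
+print(hard_cut_significance(1.0), smooth_cut_significance(1.0, alpha=10.0))
+```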
+
+---
+# Compare Hard Cuts vs. Differentiable
+
+.kol-1-2.large[
+- For hard cuts the significance was calculated by applying the cut and then using the remaining $S$ and $B$ events
+- But the differentiable model has no hard cuts, so the cut is approximated with the sigmoid weights
+- Comparing the two methods shows good agreement
+- Can see that the approximation to the hard cuts improves with larger $\alpha$
+ - But can become unstable, so tunable
+]
+.kol-1-2.center[
+
+.width-100[![significance_scan_compare_high_alpha](figures/significance_scan_compare_high_alpha.png)]
+]
+
+---
+# Accessing the Gradient
+
+.kol-2-5.large[
+* Most importantly, with the differentiable model we have access to the gradient (sketch below)
+ - $\partial_{x} f(x)$
+* So can find the maximum significance at the point where the gradient of the significance is zero
+ - $\partial_{x} f(x) = 0$
+* With the gradient in hand this cries out for automated optimization!
+]
+.kol-3-5.center[
+
+
+
+]
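+
+With JAX, for example, the gradient of a smooth significance with respect to the cut value comes for free (toy arrays as before, purely illustrative):
+
+```python
+import jax
+import jax.numpy as jnp
+
+signal_x = jnp.array([1.8, 2.1, 2.4, 1.5, 2.0])
+background_x = jnp.array([0.2, 0.7, 1.1, 0.4, 1.6, 0.9])
+
+def smooth_significance(cut, alpha=5.0):
+    w_s = 1.0 / (1.0 + jnp.exp(-alpha * (signal_x - cut)))
+    w_b = 1.0 / (1.0 + jnp.exp(-alpha * (background_x - cut)))
+    S, B = jnp.sum(w_s), jnp.sum(w_b)
+    return jnp.sqrt(2 * ((S + B) * jnp.log(1 + S / B) - S))
+
+# Autodiff gives the exact gradient of the significance w.r.t. the cut value
+grad_significance = jax.grad(smooth_significance)
+print(grad_significance(1.0))  # positive here: raising the cut still increases the significance
+```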
+
+---
+# Automated Optimization
+
+.kol-2-5.large[
+* With a simple gradient descent algorithm the significance optimization can easily be automated (sketch below)
+* For this toy example, obviously less efficient than a cut-and-count scan
+* Gradient methods apply well in higher dimensional problems
+* Allows for the "cut" to become a parameter that can be differentiated through for the larger analysis
+]
+.kol-3-5.center[
+.width-100[![automated_optimization](figures/automated_optimization.png)]
+
+
+]
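+
+Continuing the sketch from the previous slide, a simple gradient ascent on the smooth significance (i.e. descent on its negative) might look like:
+
+```python
+import jax
+import jax.numpy as jnp
+
+signal_x = jnp.array([1.8, 2.1, 2.4, 1.5, 2.0])
+background_x = jnp.array([0.2, 0.7, 1.1, 0.4, 1.6, 0.9])
+
+def smooth_significance(cut, alpha=5.0):
+    w_s = 1.0 / (1.0 + jnp.exp(-alpha * (signal_x - cut)))
+    w_b = 1.0 / (1.0 + jnp.exp(-alpha * (background_x - cut)))
+    S, B = jnp.sum(w_s), jnp.sum(w_b)
+    return jnp.sqrt(2 * ((S + B) * jnp.log(1 + S / B) - S))
+
+grad_fn = jax.grad(smooth_significance)
+
+# Gradient ascent: step the cut uphill until the gradient (numerically) vanishes
+cut, learning_rate = 0.0, 0.1
+for _ in range(200):
+    cut = cut + learning_rate * grad_fn(cut)
+
+print(cut, smooth_significance(cut))  # cut near the maximum of the smooth significance
+```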
+
---
# What is `pyhf`?