Commit
Small documentation updates for the release
MargaretDuff committed Oct 10, 2024
1 parent ba7f09f commit cdf064f
Showing 9 changed files with 43 additions and 43 deletions.
3 changes: 0 additions & 3 deletions CHANGELOG.md
@@ -1,6 +1,3 @@
* X.X.X


* 24.2.0
- New Features:
- Added SVRG and LSVRG stochastic functions (#1625)
@@ -23,7 +23,7 @@

class OperatorCompositionFunction(Function):

""" Composition of a function with an operator as : :math:`(F \otimes A)(x) = F(Ax)`
""" Composition of a function with an operator as : :math:`(F \circ A)(x) = F(Ax)`
:parameter function: :code:`Function` F
:parameter operator: :code:`Operator` A
@@ -66,9 +66,9 @@ def __call__(self, x):

def gradient(self, x, out=None):

""" Return the gradient of F(Ax),
""" Return the gradient of :math:`F(Ax)`,
..math :: (F(Ax))' = A^{T}F'(Ax)
:math:`(F(Ax))' = A^{T}F'(Ax)`
"""

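For context, a minimal sketch of how the composed gradient documented above can be exercised. This is not part of the commit; ImageGeometry, IdentityOperator and L2NormSquared are standard CIL classes assumed here purely for illustration.

.. code-block:: python

    # Sketch: gradient of F(Ax) via the chain rule, grad = A^T F'(Ax).
    # With A the identity and F = ||.||_2^2, this is simply 2*x.
    from cil.framework import ImageGeometry
    from cil.optimisation.operators import IdentityOperator
    from cil.optimisation.functions import L2NormSquared, OperatorCompositionFunction

    ig = ImageGeometry(voxel_num_x=4, voxel_num_y=4)
    A = IdentityOperator(ig)                 # operator A
    F = L2NormSquared()                      # function F
    FA = OperatorCompositionFunction(F, A)   # (F o A)(x) = F(Ax)

    x = ig.allocate('random')
    grad = FA.gradient(x)                    # A^T F'(Ax), here equal to 2*x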
13 changes: 7 additions & 6 deletions Wrappers/Python/cil/optimisation/functions/SVRGFunction.py
@@ -33,8 +33,8 @@ class SVRGFunction(ApproximateGradientSumFunction):
r"""
The Stochastic Variance Reduced Gradient (SVRG) function calculates the approximate gradient of :math:`\sum_{i=0}^{n-1}f_i`. For this approximation, every `snapshot_update_interval` number of iterations, a full gradient calculation is made at this "snapshot" point. Intermediate gradient calculations take an index :math:`i_k`, calculate the gradient of :math:`f_{i_k}` at the current iterate and at the snapshot, and update the approximate gradient to be:
.. math ::
n*\nabla f_{i_k}(x_k) - n*\nabla f_{i_k}(\tilde{x}) + \nabla \sum_{i=0}^{n-1}f_i(\tilde{x}),
.. math ::
n*\nabla f_{i_k}(x_k) - n*\nabla f_{i_k}(\tilde{x}) + \nabla \sum_{i=0}^{n-1}f_i(\tilde{x}),
where :math:`\tilde{x}` is the latest "snapshot" point and :math:`x_k` is the value at the current iteration.
@@ -86,7 +86,7 @@ def __init__(self, functions, sampler=None, snapshot_update_interval=None, store
self.snapshot = None

def gradient(self, x, out=None):
""" Selects a random function using the `sampler` and then calls the approximate gradient at :code:`x` or calculates a full gradient depending on the update frequency
r""" Selects a random function using the `sampler` and then calls the approximate gradient at :code:`x` or calculates a full gradient depending on the update frequency
Parameters
----------
@@ -115,9 +115,10 @@ def gradient(self, x, out=None):
return self.approximate_gradient(x, self.function_num, out=out)

def approximate_gradient(self, x, function_num, out=None):
""" Calculates the stochastic gradient at the point :math:`x` by using the gradient of the selected function, indexed by :math:`i_k`, the `function_number` in {0,...,len(functions)-1}, and the full gradient at the snapshot :math:`\tilde{x}`
.. math ::
n*\nabla f_{i_k}(x_k) - n*\nabla f_{i_k}(\tilde{x}) + \nabla \sum_{i=0}^{n-1}f_i(\tilde{x})
r""" Calculates the stochastic gradient at the point :math:`x` by using the gradient of the selected function, indexed by :math:`i_k`, the `function_number` in {0,...,len(functions)-1}, and the full gradient at the snapshot :math:`\tilde{x}`
.. math ::
n*\nabla f_{i_k}(x_k) - n*\nabla f_{i_k}(\tilde{x}) + \nabla \sum_{i=0}^{n-1}f_i(\tilde{x})
Note
-----
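Purely for illustration, a plain-NumPy sketch of the estimator described in the docstrings above. This is not the CIL implementation; the names (svrg_estimate, f_grads, full_grad_at_snapshot) are hypothetical.

.. code-block:: python

    import numpy as np

    def svrg_estimate(x, i_k, f_grads, snapshot, full_grad_at_snapshot):
        """n*grad f_{i_k}(x) - n*grad f_{i_k}(snapshot) + sum_i grad f_i(snapshot)."""
        n = len(f_grads)
        return (n * f_grads[i_k](x)
                - n * f_grads[i_k](snapshot)
                + full_grad_at_snapshot)

    # Example with f_i(x) = 0.5*||x - b_i||^2, so grad f_i(x) = x - b_i
    b = [np.array([1.0, 2.0]), np.array([3.0, -1.0])]
    f_grads = [lambda x, bi=bi: x - bi for bi in b]
    snapshot = np.zeros(2)
    full_grad = sum(g(snapshot) for g in f_grads)
    print(svrg_estimate(np.ones(2), 0, f_grads, snapshot, full_grad))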
2 changes: 1 addition & 1 deletion Wrappers/Python/cil/optimisation/operators/Operator.py
@@ -609,7 +609,7 @@ class CompositionOperator(Operator):
Parameters
----------
args: `Operator`s
args: `Operator` s
Operators to be composed. As in mathematical notation, the operators will be applied right to left
"""
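A quick illustration of the right-to-left convention documented above: a sketch assuming CIL's CompositionOperator and IdentityOperator, neither of which is changed by this commit.

.. code-block:: python

    from cil.framework import ImageGeometry
    from cil.optimisation.operators import CompositionOperator, IdentityOperator

    ig = ImageGeometry(voxel_num_x=4, voxel_num_y=4)
    A = IdentityOperator(ig)
    B = IdentityOperator(ig)

    # CompositionOperator(A, B) applies B first, then A: direct(x) = A(B(x))
    C = CompositionOperator(A, B)
    x = ig.allocate('random')
    y = C.direct(x)          # with identity operators, y equals x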
13 changes: 7 additions & 6 deletions Wrappers/Python/cil/optimisation/utilities/StepSizeMethods.py
@@ -77,6 +77,13 @@ class ArmijoStepSizeRule(StepSizeRule):
The Armijo rule runs a while loop to find the appropriate step_size by starting from a very large number (`alpha`). The step_size is found by reducing the step size (by a factor `beta`) in an iterative way until a certain criterion is met. To avoid infinite loops, we add a maximum number of times (`max_iterations`) the while loop is run.
Reference
---------
- Algorithm 3.1 in Nocedal, J. and Wright, S.J. eds., 1999. Numerical optimization. New York, NY: Springer New York. https://www.math.uci.edu/~qnie/Publications/NumericalOptimization.pdf)
- https://projecteuclid.org/download/pdf_1/euclid.pjm/1102995080
Parameters
----------
alpha: float, optional, default=1e6
@@ -89,12 +96,6 @@ class ArmijoStepSizeRule(StepSizeRule):
If `warmstart = True` the initial step size at each Armijo iteration is the calculated step size from the last iteration. If `warmstart = False` at each Armijo iteration, the initial step size is reset to the original, large `alpha`.
In the case of *well-behaved* convex functions, `warmstart = True` is likely to be computationally less expensive. In the case of non-convex functions, or particularly tricky functions, setting `warmstart = False` may be beneficial.
Reference
------------
- Algorithm 3.1 in Nocedal, J. and Wright, S.J. eds., 1999. Numerical optimization. New York, NY: Springer New York. https://www.math.uci.edu/~qnie/Publications/NumericalOptimization.pdf)
- https://projecteuclid.org/download/pdf_1/euclid.pjm/1102995080
"""

def __init__(self, alpha=1e6, beta=0.5, max_iterations=None, warmstart=True):
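For readers new to the rule, an illustrative backtracking sketch in the spirit of Algorithm 3.1 of Nocedal and Wright, in plain NumPy; the sufficient-decrease constant c and all names are illustrative, and this is not the CIL implementation.

.. code-block:: python

    import numpy as np

    def armijo_step_size(f, grad_f, x, alpha=1e6, beta=0.5, c=1e-4, max_iterations=40):
        """Shrink alpha by beta until the sufficient-decrease condition holds."""
        g = grad_f(x)
        fx = f(x)
        for _ in range(max_iterations):
            # accept alpha if f(x - alpha*g) <= f(x) - c*alpha*||g||^2
            if f(x - alpha * g) <= fx - c * alpha * np.dot(g, g):
                return alpha
            alpha *= beta                  # reduce the trial step and retry
        return alpha                       # fall back after max_iterations reductions

    step = armijo_step_size(lambda x: np.dot(x, x), lambda x: 2 * x, np.array([1.0, -2.0]))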
15 changes: 7 additions & 8 deletions Wrappers/Python/cil/optimisation/utilities/sampler.py
@@ -549,14 +549,18 @@ def _herman_meyer_function(num_indices, addition_arr, repeat_length_arr, iterat

@staticmethod
def herman_meyer(num_indices):
"""
Instantiates a sampler which outputs in a Herman Meyer order.
r"""Instantiates a sampler which outputs in a Herman Meyer order.
Parameters
----------
num_indices: int
The sampler will select from a range of indices 0 to num_indices. For Herman-Meyer sampling this number should not be prime.
Returns
-------
Sampler
An instance of the Sampler class which outputs in a Herman Meyer order.
Reference
----------
With thanks to Imraj Singh and Zeljko Kereta for their help with the initial implementation of the Herman Meyer sampling. Their implementation was used in:
@@ -567,11 +571,6 @@ def herman_meyer(num_indices):
Herman GT, Meyer LB. Algebraic reconstruction techniques can be made computationally efficient. IEEE Trans Med Imaging. doi: 10.1109/42.241889.
Returns
-------
Sampler
An instance of the Sampler class which outputs in a Herman Meyer order.
Example
-------
>>> sampler=Sampler.herman_meyer(12)
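A short usage sketch of the sampler documented above, assuming the existing Sampler API; here num_indices = 12 = 2 * 2 * 3, which is not prime.

.. code-block:: python

    from cil.optimisation.utilities import Sampler

    sampler = Sampler.herman_meyer(12)
    print(sampler.get_samples(12))   # one full pass through the 12 indices in Herman-Meyer order
    idx = sampler.next()             # draw the next index in the sequence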
18 changes: 10 additions & 8 deletions Wrappers/Python/cil/processors/CofR_image_sharpness.py
@@ -56,20 +56,22 @@ class CofR_image_sharpness(Processor):
Example
-------
from cil.processors import CentreOfRotationCorrector
.. code-block :: python
from cil.processors import CentreOfRotationCorrector
processor = CentreOfRotationCorrector.image_sharpness('centre', 'tigre')
processor.set_input(data)
data_centred = processor.get_output()
processor = CentreOfRotationCorrector.image_sharpness('centre', 'tigre')
processor.set_input(data)
data_centred = processor.get_output()
Example
-------
from cil.processors import CentreOfRotationCorrector
.. code-block :: python
from cil.processors import CentreOfRotationCorrector
processor = CentreOfRotationCorrector.image_sharpness(slice_index=120, 'astra')
processor.set_input(data)
processor.get_output(out=data)
processor = CentreOfRotationCorrector.image_sharpness(slice_index=120, backend='astra')
processor.set_input(data)
processor.get_output(out=data)
Note
8 changes: 4 additions & 4 deletions Wrappers/Python/cil/processors/Padder.py
@@ -40,12 +40,12 @@ class Padder(DataProcessor):
Notes
-----
`pad_width` behaviour (number of pixels):
`pad_width` behaviour (number of pixels):
- int: Each axis will be padded with a border of this size
- tuple(int, int): Each axis will be padded with an asymmetric border i.e. (before, after)
- dict: Specified axes will be padded: e.g. {'horizontal':(8, 23), 'vertical': 10}
`pad_values` behaviour:
`pad_values` behaviour:
- float: Each border will use this value
- tuple(float, float): Each value will be used asymmetrically for each axis i.e. (before, after)
- dict: Specified axes and values: e.g. {'horizontal':(8, 23), 'channel':5}
Expand Down Expand Up @@ -106,12 +106,12 @@ def constant(pad_width=None, constant_values=0.0):
Notes
-----
`pad_width` behaviour (number of pixels):
`pad_width` behaviour (number of pixels):
- int: Each axis will be padded with a border of this size
- tuple(int, int): Each axis will be padded with an asymmetric border i.e. (before, after)
- dict: Specified axes will be padded: e.g. {'horizontal':(8, 23), 'vertical': 10}
`constant_values` behaviour (value of pixels):
`constant_values` behaviour (value of pixels):
- float: Each border will be set to this value
- tuple(float, float): Each border value will be used asymmetrically for each axis i.e. (before, after)
- dict: Specified axes and values: e.g. {'horizontal':(8, 23), 'channel':5}
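To illustrate the pad_width and constant_values options above, a sketch assuming an existing CIL data container named data; the set_input/get_output call pattern mirrors the other CIL processors.

.. code-block:: python

    from cil.processors import Padder

    # pad every axis with a symmetric 10-pixel border of zeros
    processor = Padder.constant(pad_width=10, constant_values=0.0)
    processor.set_input(data)        # `data` is a placeholder AcquisitionData/ImageData
    padded = processor.get_output()

    # asymmetric padding on one named axis only, filled with the value 5
    processor = Padder.constant(pad_width={'horizontal': (8, 23)}, constant_values=5.0)
    processor.set_input(data)
    padded = processor.get_output()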
8 changes: 4 additions & 4 deletions docs/source/optimisation.rst
@@ -239,7 +239,7 @@ Note
----
All the approximate gradients written in CIL are of a similar order of magnitude to the full gradient calculation. For example, in the :code:`SGFunction` we approximate the full gradient by :math:`n\nabla f_i` for an index :math:`i` given by the sampler.
The multiplication by :math:`n` is a choice to more easily allow comparisons between stochastic and non-stochastic methods and between stochastic methods with varying numbers of subsets.
The multiplication ensures that the (SAGA, SGD, and SVRG and LSVRG) approximate gradients are an unbiased estimator of the full gradient ie :math:`\mathbb{E}\left[\tilde\nabla f(x)\right] =\nabla f(x)``.
The multiplication ensures that the (SAGA, SGD, and SVRG and LSVRG) approximate gradients are an unbiased estimator of the full gradient ie :math:`\mathbb{E}\left[\tilde\nabla f(x)\right] =\nabla f(x)`.
This has an implication when choosing step sizes. For example, a suitable step size for GD with a SGFunction could be
:math:`\propto 1/(L_{max}*n)`, where :math:`L_{max}` is the largest Lipschitz constant of the list of functions in the SGFunction and the additional factor of :math:`n` reflects this multiplication by :math:`n` in the approximate gradient.
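A hedged sketch of the step-size heuristic described above, assuming a list functions of n Lipschitz-smooth CIL functions, each exposing its Lipschitz constant as f.L; the variable names are illustrative.

.. code-block:: python

    n = len(functions)
    L_max = max(f.L for f in functions)   # largest Lipschitz constant among the f_i
    step_size = 1.0 / (L_max * n)         # the factor n reflects the scaling in the approximate gradient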

@@ -411,14 +411,14 @@ This class allows the user to write a function which does the following:
F ( x ) = G ( Ax )
where :math:`A` is an operator. For instance the least squares function l2norm_ :code:`Norm2Sq` can
where :math:`A` is an operator. For instance the least squares function can
be expressed as

.. math::
F(x) = || Ax - b ||^2_2
F(x) = || Ax - b ||^2_2 \qquad \text{where} \qquad G(y) = || y - b ||^2_2
.. code::python
.. code-block :: python
F1 = Norm2Sq(A, b)
# or equivalently
