doc/ch_conceptual_design/hofs/observers.rst
7 additions & 5 deletions
@@ -32,17 +32,19 @@ Registration interface
32
32
^^^^^^^^^^^^^^^^^^^^^^
33
33
34
34
The following shows how the :cpp:`histogram_hits` operator in :numref:`workflow` would be registered in C++.
35
-
It uses the :cpp:`phlex::resource<histogramming>` interface to provide access to a putative histogramming resource (see :numref:`ch_conceptual_design/resources:Resources`).
35
+
It uses the :cpp:`resource<histogramming>` interface to provide access to a putative histogramming resource (see :numref:`ch_conceptual_design/resources:Resources`).
@@ -20,93 +20,138 @@ As mentioned in :numref:`ch_preliminaries/functional_programming:Sequences of Da
20
20
where the user-defined operation :math:`f` is applied repeatedly between an accumulated value (initialized by :math:`init`) and each element of the input sequence.
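As a point of reference, an unpartitioned fold maps directly onto :cpp:`std::accumulate` from the C++ standard library.
The sketch below is illustrative only: the names :cpp:`add_energy` and :cpp:`fold_all` are hypothetical and are not part of Phlex.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical fold operation f: D x C -> D (here D = C = double).
inline double add_energy(double accumulated, double element)
{
  return accumulated + element;
}

// Applies f between the accumulated value (starting at init) and each
// element of the input sequence, producing a single fold result.
inline double fold_all(std::vector<double> const& input, double init)
{
  return std::accumulate(input.begin(), input.end(), init, add_energy);
}
```

Calling :cpp:`fold_all` on an empty sequence simply returns the initial value, which is why the choice of :math:`init` matters.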
21
21
22
22
In a framework context, however, multiple fold results are often desired in the same program for the same kind of computation.
23
-
For example, consider a program that processes :math:`n` runs, each of which contains spills, identified by the tuple :math:`(R\ i, S\ j)`.
24
-
The user may wish to create one histogram per run that contains the track multiplicity per spill.
25
-
Instead of creating a single fold result, we thus use a *partitioned fold*:
23
+
Consider the workflow in :numref:`workflow`, which processes `Spill`\ s, identified by the index :math:`j` or, more specifically, the tuple :math:`(S\ j)`.
24
+
Each `Spill` is unfolded into a sequence of `APA`\ s, which are identified by the pair of indices :math:`jk` or, more specifically, the tuple :math:`(S\ j, A\ k)`.
25
+
The energies of the :cpp:`"GoodHits"` data products in :numref:`workflow` are summed across `APA`\ s per `Spill` using the :math:`\textit{fold(sum\_energy)}` node.
26
+
27
+
Instead of creating one fold result, we thus use a *partitioned fold* to create one summed energy data-product per `Spill`:
26
28
27
29
.. math::
28
30
:no-wrap:
29
31
30
32
\begin{align*}
31
-
[h_{(R\ 1)}&,\ \dots,\ h_{(R\ n)}] \\
32
-
&= \pfold{\textit{fill}}{\textit{init}}{\textit{into\_runs}}\ [m_{(R\ 1, S\ 1)},\ m_{(R\ 1, S\ 2)},\ \dots,\ m_{(R\ n, S\ 1)},\ m_{(R\ n, S\ 2)},\ \dots]
where :math:`h_{(R\ i)}` denotes the histogram for run :math:`i`, and :math:`m_{(R\ i,\ S\ j)}` is the track multiplicity for spill :math:`j` in run :math:`i`.
37
+
where :math:`E_{(S\ j)}` denotes the summed good-hits energy for `Spill` :math:`j`, and :math:`hs_{(S\ j,\ A\ k)}` is the good-hits data product :cpp:`"GoodHits"` for `APA` :math:`k` in `Spill` :math:`j`.
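Written out in full, and assuming :math:`n` denotes the number of `Spill`\ s, the `Spill`-based partitioned fold can be sketched by analogy with the run-based form.
The symbols are taken from the surrounding text (:math:`\textit{sum\_energy}`, :math:`\textit{init}`, :math:`\textit{into\_spills}`); the exact layout is an assumption:

.. math::
   :no-wrap:

   \begin{align*}
   [E_{(S\ 1)}&,\ \dots,\ E_{(S\ n)}] \\
   &= \pfold{\textit{sum\_energy}}{\textit{init}}{\textit{into\_spills}}\ [hs_{(S\ 1, A\ 1)},\ hs_{(S\ 1, A\ 2)},\ \dots,\ hs_{(S\ n, A\ 1)},\ hs_{(S\ n, A\ 2)},\ \dots]
   \end{align*}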
36
38
37
39
The above equation can be expressed more succinctly as:
Factorizing a set of data into non-overlapping subsets that collectively span the entire set is called creating a set *partition*. [Wiki-partition]_
57
+
Factorizing a set of data into non-overlapping subsets that collectively span the entire set is called creating a set *partition* [Wiki-partition]_.
56
58
Each subset of the partition is called a *cell*.
57
-
In the above example, the role of the :math:`\textit{into\_runs}` operation is to partition the input sequence into runs so that there is one fold result per run.
59
+
In the above example, the role of the :math:`\textit{into\_spills}` operation is to partition the input sequence into `Spill`\ s so that there is one fold result per `Spill`.
58
60
In general, however, the partitioning function is of the form :math:`\textit{part}: \{\iset{c}\} \rightarrow\mathbb{P}(\iset{c})`, where:
59
61
60
62
- the domain is the singleton set that contains only the index set :math:`\iset{c}` (i.e. :math:`\textit{part}` can only be invoked on :math:`\iset{c}`), and
61
-
- the codomain is the set of partitions of the index set :math:`\iset{c}`.
63
+
- the codomain is the set of partitions of :math:`\iset{c}`, denoted :math:`\mathbb{P}(\iset{c})`; note that the output index set satisfies :math:`\iset{d} \in\mathbb{P}(\iset{c})`.
62
64
63
65
The function :math:`\textit{part}` also establishes an equivalence relation on the index set :math:`\iset{c}`, where each element of the index set is mapped to a cell of the partition.
64
66
The number of elements in the output sequence :math:`d` corresponds to the number of partition cells.
65
67
68
+
As of this writing, the only partitions supported are those that correspond to the names of data-product set categories.
69
+
The partition :math:`\textit{into\_spills}` can thus be represented by the string :cpp:`"Spill"`, which denotes that there is one partition cell per `Spill`.
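The partitioning described above can be sketched in plain C++.
This is not the Phlex implementation: the type :cpp:`apa_index` and the free function :cpp:`into_spills` are hypothetical stand-ins for the index tuple :math:`(S\ j, A\ k)` and the partitioning function.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical index for one APA: the tuple (S j, A k).
struct apa_index {
  int spill;
  int apa;
};

// Sketch of into_spills: groups the input index set into one cell per Spill.
// Each index lands in exactly one cell, so the cells are non-overlapping and
// together cover the whole input index set.
inline std::map<int, std::vector<apa_index>> into_spills(
  std::vector<apa_index> const& input)
{
  std::map<int, std::vector<apa_index>> cells;
  for (auto const& index : input) {
    cells[index.spill].push_back(index);
  }
  return cells;
}
```

The number of keys in the returned map equals the number of partition cells, and therefore the number of fold results.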
70
+
66
71
Initializing the Accumulator
67
72
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
68
73
69
-
.. todo::
70
-
Change the domain type of :math:`\textit{init}`.
71
-
72
74
A crucial ingredient of the fold is the *accumulator*, which stores the fold result while it is being formed.
73
-
Each accumulator is initialized by invoking a user-defined operation :math:`\textit{init}: \one\rightarrow D`, which returns an object that has the same type :math:`D` as the fold result. [#finit]_
74
-
Instead of invoking a function, an accumulator is often initialized with a value.
75
-
However, in functional programming, a value can be represented by invoking a function that always returns the same result.
76
-
Expressing an initializer as a function thus supports value-initialization while retaining the flexibility that may occasionally be required through functions.
75
+
Each accumulator is initialized by invoking a user-defined operation :math:`\textit{init}: \opt{\iset{d}} \rightarrow D`, which returns an object that has the same type :math:`D` as the fold result [#finit]_.
76
+
The :math:`\opt{\iset{d}}` domain means that:
77
+
78
+
1. :math:`\textit{init}` can receive an argument corresponding to the identifier of a cell, which is a member of the output index set :math:`\iset{d}`.
79
+
In the example above, the relevant identifier would be that of the `Spill`, i.e. :math:`(S\ j)`.
80
+
2. :math:`\textit{init}` can be invoked with no arguments, thus producing the same value each time the accumulator is initialized.
81
+
This is equivalent to initializing the accumulator with a constant value.
82
+
83
+
The implementation of :math:`\textit{init}` for the total good-hits energy fold results is to return the constant :math:`0`.
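The two forms of :math:`\textit{init}` can be sketched in C++ with :cpp:`std::optional`.
All names below (:cpp:`spill_id`, :cpp:`init_constant`, :cpp:`init_per_spill`, :cpp:`init`) are hypothetical, and the per-`Spill` starting value is invented purely for illustration; Phlex may express this differently.

```cpp
#include <cassert>
#include <optional>

// Hypothetical identifier of a partition cell, i.e. the tuple (S j).
struct spill_id {
  int spill;
};

// Form 2: no arguments; every accumulator starts from the same constant.
inline double init_constant() { return 0.0; }

// Form 1: the cell identifier is supplied, so the starting value may vary
// per Spill. The offset used here is made up for the example.
inline double init_per_spill(spill_id id)
{
  return id.spill == 0 ? 0.0 : -0.5;
}

// One way to model the optional domain with a single callable.
inline double init(std::optional<spill_id> id = std::nullopt)
{
  return id ? init_per_spill(*id) : init_constant();
}
```

For the total good-hits energy fold, only the constant form is needed.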
77
84
78
85
Fold Operation
79
86
^^^^^^^^^^^^^^
80
87
81
88
A cell's fold result is obtained by repeatedly applying a fold operation to the cell's accumulator and each element of that cell's input sequence.
82
89
The fold operation has the signature :math:`f: D \times C \rightarrow D`, where :math:`D` represents the type of the accumulator/fold result, and :math:`C` is the type of each element of the input sequence.
83
90
84
-
In the above example, the function :math:`\textit{fill}` receives a histogram :math:`h_{(R\ i)}` as the accumulator for run :math:`i` and "combines" it with a track multiplicity object :math:`m_{(R\ i,\ S\ j)}` that belongs to spill :math:`j` in run :math:`i`.
85
-
This "combined" value is then returned by :math:`\textit{fill}` as the updated value of the accumulator.
86
-
The function :math:`\textit{fill}` is repeatedly invoked to update the accumulator with each track multiplicity value.
87
-
Once all track multiplicity values in run :math:`i` have been processed by :math:`\textit{fill}`, the accumulator's value becomes the fold result for that run.
91
+
In the above example, the function :math:`\textit{sum\_energy}` receives a floating-point number :math:`E_{(S\ j)}`, representing the accumulated good-hits energy for `Spill` :math:`j`, and "combines" it with the good-hits object :math:`hs_{(S\ j,\ A\ k)}` that belongs to `APA` :math:`k` in `Spill` :math:`j`.
92
+
This combination involves calculating the energy represented by the good-hits data product :math:`hs_{(S\ j,\ A\ k)}` and adding that to the accumulated value.
93
+
This "combined" value is then returned by :math:`\textit{sum\_energy}` as the updated value of the accumulator [#feff]_.
94
+
The function :math:`\textit{sum\_energy}` is repeatedly invoked to update the accumulator with each good-hits data product.
95
+
Once all :cpp:`"GoodHits"` data products in `Spill` :math:`j` have been processed by :math:`\textit{sum\_energy}`, the accumulator's value becomes the fold result for that `Spill`.
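The repeated application of :math:`\textit{sum\_energy}` per `Spill` can be sketched as follows.
This is a minimal, single-threaded illustration that follows the mathematical signature :math:`f: D \times C \rightarrow D` and returns the updated accumulator by value; the types :cpp:`hit` and :cpp:`good_hits` and the helper :cpp:`fold_per_spill` are assumptions, not Phlex types.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-ins for the "GoodHits" data product of one APA.
struct hit {
  double energy;
};
struct good_hits {
  int spill;  // j in (S j, A k)
  int apa;    // k in (S j, A k)
  std::vector<hit> hits;
};

// Fold operation f: D x C -> D with D = double and C = good_hits.
inline double sum_energy(double accumulated, good_hits const& hs)
{
  for (auto const& h : hs.hits) {
    accumulated += h.energy;
  }
  return accumulated;
}

// Partitioned fold: one accumulator per Spill, initialized to 0 on first use.
inline std::map<int, double> fold_per_spill(std::vector<good_hits> const& input)
{
  std::map<int, double> accumulators;
  for (auto const& hs : input) {
    auto it = accumulators.emplace(hs.spill, 0.0).first;
    it->second = sum_energy(it->second, hs);
  }
  return accumulators;
}
```

Once the input sequence is exhausted, each map entry holds the fold result for its `Spill`.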
The fold's :cpp:`result_type` must model the created data-product type described in :numref:`ch_conceptual_design/algorithms:Return Types`.
118
+
A fold algorithm may also create multiple data products by using a :cpp:`result_type` of :cpp:`std::tuple<T1, ..., Tn>` where each of the types :cpp:`T1, ..., Tn` models a created data-product type.
119
+
88
120
89
121
Registration interface
90
122
^^^^^^^^^^^^^^^^^^^^^^
91
123
92
-
**Result type**: A fold algorithm may create multiple data products through its result by specifying a :cpp:`std::tuple<T1, ..., Tn>` where each of the types :cpp:`T1, ..., Tn` models a created data-product type.
124
+
The :math:`\textit{fold(sum\_energy)}` node in :numref:`workflow` would be represented in C++ as:
"Spill", // <= Partition level (one fold result per Spill)
138
+
concurrency::unlimited // <= Allowed concurrency
139
+
)
140
+
.sequence("GoodHits"_in("APA"));
141
+
}
142
+
143
+
For the user-defined :cpp:`sum_energy` algorithm to be safely executed concurrently, protections must be in place to avoid data races when updating the :cpp:`total_hit_energy` result object from multiple threads.
144
+
Possible solutions include using :cpp:`std::atomic_ref<double>` [#fatomicref]_, placing a lock around the operation that updates :cpp:`total_hit_energy` (less desirable due to inefficiencies), or perhaps using :cpp:`std::atomic<double>` [#fatomic]_ instead of :cpp:`double` to represent the data product.
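As a minimal sketch of the :cpp:`std::atomic<double>` option, the accumulator can be updated with a compare-exchange loop, which avoids relying on floating-point :cpp:`fetch_add` support.
The helpers :cpp:`atomic_add` and :cpp:`concurrent_total` are hypothetical and use raw :cpp:`std::thread` objects only to demonstrate the idea; they do not reflect how Phlex schedules work.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Adds `value` to `total` atomically using a compare-exchange loop.
inline void atomic_add(std::atomic<double>& total, double value)
{
  double expected = total.load();
  while (!total.compare_exchange_weak(expected, expected + value)) {
    // On failure, `expected` now holds the current value; retry with it.
  }
}

// Hypothetical driver: `n_threads` threads each add `value` `n_updates` times.
inline double concurrent_total(int n_threads, int n_updates, double value)
{
  std::atomic<double> total{0.0};
  std::vector<std::thread> workers;
  for (int t = 0; t < n_threads; ++t) {
    workers.emplace_back([&total, n_updates, value] {
      for (int i = 0; i < n_updates; ++i) {
        atomic_add(total, value);
      }
    });
  }
  for (auto& w : workers) {
    w.join();
  }
  return total.load();
}
```

Note that floating-point addition is not associative, so with arbitrary energies the result may differ in the last bits depending on the order in which threads update the accumulator.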
106
145
107
146
.. rubric:: Footnotes
108
147
109
148
.. [#finit] It is acceptable for :math:`\textit{init}` to return a type that is convertible to the accumulator's type.
149
+
.. [#feff] Returning an updated accumulated value is generally not the most memory-efficient approach as it requires at least two copies of an accumulated value to be in memory at one time.
150
+
The approach adopted by Phlex is to include a reference to the accumulated value as part of the fold operator's signature.
151
+
The accumulator can then be updated in place, thus avoiding the extra copies of the data.
- The :cpp:`return_type` must model the created data-product type described in :numref:`ch_conceptual_design/algorithms:Return Types`.
34
-
An algorithm may also create multiple data products by returning a :cpp:`std::tuple<T1, ..., Tn>` where each of the types :cpp:`T1, ..., Tn` models a created data-product type.
33
+
The :cpp:`return_type` must model the created data-product type described in :numref:`ch_conceptual_design/algorithms:Return Types`.
34
+
An algorithm may also create multiple data products by returning a :cpp:`std::tuple<T1, ..., Tn>` where each of the types :cpp:`T1, ..., Tn` models a created data-product type.