forked from openvinotoolkit/openvino
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[Spec][Opset13] Multinomial specification (openvinotoolkit#19866)
* [Spec] Multinomial specification * [Spec] Apply recommended style fixes to Multinomial_13.md * [Spec] Add range mathematical formulation * [Spec] Update Multinomial input types * Update opset13.md * Update opset.md * Update docs/ops/generation/Multinomial_13.md Co-authored-by: Katarzyna Mitrus <[email protected]> * Update Multinomial_13.md * Update opset13.md * Update docs/ops/generation/Multinomial_13.md Co-authored-by: Katarzyna Mitrus <[email protected]> * Update Multinomial_13.md * Update Multinomial_13.md * Update Multinomial_13.md * Update docs/ops/generation/Multinomial_13.md Co-authored-by: Katarzyna Mitrus <[email protected]> * Update docs/ops/generation/Multinomial_13.md Co-authored-by: Katarzyna Mitrus <[email protected]> * Update docs/ops/generation/Multinomial_13.md Co-authored-by: Katarzyna Mitrus <[email protected]> * Update Multinomial_13.md * [Spec] Replace 3107670 syntax with ..math * Update Multinomial_13.md * Update Multinomial_13.md * Update docs/ops/opset13.md Co-authored-by: Mateusz Mikolajczyk <[email protected]> * [Spec] Apply changes from Ref/Core PR * Update Multinomial_13.md --------- Co-authored-by: Katarzyna Mitrus <[email protected]> Co-authored-by: Mateusz Mikolajczyk <[email protected]>
- Loading branch information
1 parent
ec45107
commit a4e1f9d
Showing
3 changed files
with
202 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,200 @@ | ||
# Multinomial {#openvino_docs_ops_generation_Multinomial_13} | ||
|
||
@sphinxdirective | ||
|
||
.. meta:: | ||
:description: Learn about Multinomial-13 - a generation operation, that creates a sequence of indices of classes sampled from the multinomial distribution. | ||
|
||
**Versioned name**: *Multinomial-13* | ||
|
||
**Category**: *Generation* | ||
|
||
**Short description**: *Multinomial* operation generates a sequence of class indices sampled from the multinomial distribution based on the input class probabilities. | ||
|
||
**Detailed description**: *Multinomial* operation generates a sequence of class indices sampled from the multinomial distribution. In this context, the *probs* values represent the probabilities associated with each class within the multinomial distribution. When executing this operation, it randomly selects a class based on these probabilities. Subsequently, the index of the chosen class in the *probs* array is appended to the *output* sequence in the corresponding batch. | ||
|
||
**Algorithm formulation**: | ||
|
||
Note: The following notation | ||
|
||
.. math:: | ||
|
||
[a, b] => { x \in \mathbb{R}, a <= x <= b } | ||
|
||
denotes a range of real numbers between a and b. | ||
|
||
Given a list of probabilities x1, x2, ..., xn: | ||
|
||
* If *log_probs* is true: | ||
* For each probability x, replace it with a value e^x | ||
* Create an array - discrete CDF ([Cumulative Distribution Function](https://en.wikipedia.org/wiki/Cumulative_distribution_function)) - the cumulative sum of those probabilities, ie. create an array of values where the ith value is the sum of the probabilities x1, ..., xi | ||
* Divide the created array by its maximum value to normalize the cumulative probabilities between the real values in the range [0, 1]. This array is, by definition of CDF, sorted in ascending order, hence the maximum value is the last value of the array. | ||
* Randomly generate a sequence of double-precision floating point numbers in the range [0, 1]. | ||
* For each generated number, assign the class with the lowest index for which the cumulative probability is less or equal to the generated value. | ||
* If *with_replacement* is False (sampling without replacement): | ||
|
||
* Assume a class with index i has been selected - then every CDF value starting at i-th index should be lowered by the original probability of the selected class. This effectively sets the probability of sampling the given class to 0. | ||
* Afterwards, divide the CDF by its last (maximum) value to normalize the cumulative probabilities between the real values in the range [0, 1] | ||
|
||
* Convert the output indices to *convert_type* | ||
* Return output indices | ||
|
||
**Example computations**: | ||
Example 1 - 1D tensor | ||
|
||
* Let ``probs`` = ``[0.1, 0.5, 0.4]``, ``num_samples`` = 5, ``log_probs`` = false, ``with_replacement`` = true | ||
* CDF of ``probs`` = ``[0.1, 0.1 + 0.5, 0.1 + 0.5 + 0.4]`` = ``[0.1, 0.6, 1]`` | ||
* Randomly generated floats = ``[0.2, 0.4, 0.6, 0.8, 1]`` | ||
* Assigned classes = ``[1, 1, 1, 2, 2]`` | ||
|
||
Example 2 - 2D tensor, log probabilities | ||
|
||
* Let ``probs`` = ``[[-1, 1, 2], [50, 1, 21]]``, ``num_samples`` = 10, ``log_probs`` = true, ``with_replacement`` = true | ||
* Exponentiated ``probs`` = ``[[0.36, 2.71, 7.38], [5184705528587072464087.45, 2.71, 1318815734.48]]`` | ||
* CDF of ``probs``, per batch = ``[[0.36, 3.07, 10.45], [5184705528587072464087.45, 5184705528587072464090.16, 5184705528588391279824.64]]`` | ||
* Normalized CDF = ``[[0.03, 0.29, 1], [1.0, 1.0, 1.0]]`` | ||
* Randomly generated floats = ``[[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1], [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]]`` | ||
* Assigned classes = ``[[1, 1, 2, 2, 2, 2, 2, 2, 2, 2], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]`` | ||
|
||
Example 3 - 1D tensor, without replacement | ||
|
||
* Let ``probs`` = ``[0.1, 0.5, 0.4]``, ``num_samples`` = 2, ``log_probs`` = false, ``with_replacement`` = false | ||
* CDF of ``probs`` = ``[0.1, 0.6, 1]`` | ||
* Randomly generated floats = ``[0.3, 0.2]`` | ||
* In a loop: | ||
|
||
* For a value of 0.3, a class with idx ``1`` is selected | ||
* Therefore, in CDF, for every class starting with idx ``1`` subtract the probability of class at idx ``1`` = ``probs[1]`` = 0.5 | ||
* CDF = ``[0.1, 0.6 - 0.5, 1.0 - 0.5]`` = ``[0.1, 0.1, 0.5]`` | ||
* Normalize CDF by dividing by last value: CDF = ``[0.2, 0.2, 1.0]`` | ||
* Take the next randomly generated float, here 0.2, and repeat until all random samples have assigned classes. Notice that for ``sampled values`` <= 0.2, only the class wih idx ``0`` can be selected, since the search stops at the index with the first value satisfying ``sample value`` <= ``CDF probability`` | ||
|
||
* Assigned classes = ``[1, 2]`` | ||
|
||
|
||
**Attributes**: | ||
|
||
* ``convert_type`` | ||
|
||
* **Description**: the type of the output. Determines generation algorithm and affects resulting values. Output numbers generated for different values of *convert_type* may not be equal. | ||
* **Range of values**: "i32", "i64". | ||
* **Type**: string | ||
* **Required**: *Yes* | ||
|
||
* ``with_replacement`` | ||
|
||
* **Description**: controls whether to sample with replacement (classes can be sampled multiple times). | ||
* **Range of values**: `true`, `false` | ||
|
||
* ``true`` - class indices can be sampled multiple times. | ||
* ``false`` - class indices will not repeat in the output and the size of ``probs``' ``class_size`` dimension is required to be larger or equal to *num_samples* value. Might affect performance. | ||
|
||
* **Type**: `bool` | ||
* **Required**: *Yes* | ||
|
||
* ``log_probs`` | ||
|
||
* **Description**: allows to control whether *inputs* should be treated as probabilities or unnormalized log probabilities. | ||
* **Range of values**: `true`, `false` | ||
|
||
* ``true`` - set values in *inputs* are unnormalized log probabilities that can be any real number. | ||
* ``false`` - probabilities in *inputs* are expected to be non-negative, finite and have a non-zero-sum. | ||
|
||
* **Type**: `bool` | ||
* **Required**: *Yes* | ||
|
||
* ``global_seed`` | ||
|
||
* **Description**: global seed value. | ||
* **Range of values**: non-negative integers | ||
* **Type**: `unsigned int 64-bit` | ||
* **Default value**: 0 | ||
* **Required**: *No* | ||
|
||
* ``op_seed`` | ||
|
||
* **Description**: operational seed value. | ||
* **Range of values**: non-negative integers | ||
* **Type**: `unsigned int 64-bit` | ||
* **Default value**: 0 | ||
* **Required**: *No* | ||
|
||
**Inputs**: | ||
|
||
* **1**: ``probs`` - A 1D or 2D tensor of type `T_IN` and shape `[class_size]` or `[batch_size, class_size]` with probabilities. Allowed values depend on the *log_probs* attribute. The values are internally normalized to have values in the range of `[0, 1]` with the sum of all probabilities in the given batch equal to 1. **Required.** | ||
|
||
* **2**: ``num_samples`` - A scalar or 1D tensor with a single element of type `T_SAMPLES` specifying the number of samples to draw from the multinomial distribution. **Required.** | ||
|
||
**Outputs**: | ||
|
||
* **1**: ``output``- A tensor with type specified by the attribute *convert_type* and shape depending on the rank of *probs*, either ``[num_samples]`` for one-dimensional *probs* or ``[batch_size, num_samples]`` for the two-dimensional one. | ||
|
||
**Types** | ||
|
||
* **T_IN**: any supported floating-point type. | ||
* **T_SAMPLES**: 32-bit or 64-bit integers. | ||
|
||
|
||
*Example 1: 1D input tensor.* | ||
|
||
.. code-block:: xml | ||
:force: | ||
|
||
<layer ... name="Multinomial" type="Multinomial"> | ||
<data convert_type="f32", with_replacement="true", log_probs="false", global_seed="234", op_seed="148"/> | ||
<input> | ||
<port id="0" precision="FP32"> < !-- probs value: [0.1, 0.5, 0.4] --> | ||
<dim>3</dim> | ||
</port> | ||
<port id="1" precision="I32"/> < !-- num_samples value: 5 --> | ||
</input> | ||
<output> | ||
<port id="3" precision="FP32" names="Multinomial:0"> | ||
<dim>5</dim> | ||
</port> | ||
</output> | ||
</layer> | ||
|
||
*Example 2: 2D input tensor.* | ||
|
||
.. code-block:: xml | ||
:force: | ||
|
||
<layer ... name="Multinomial" type="Multinomial"> | ||
<data convert_type="f32", with_replacement="true", log_probs="true", global_seed="234", op_seed="148"/> | ||
<input> | ||
<port id="0" precision="FP32"> < !-- probs value: [[-1, 1, 2], [50, 1, 21]] --> | ||
<dim>2</dim> < !-- batch size of 2 --> | ||
<dim>3</dim> | ||
</port> | ||
<port id="1" precision="I32"/> < !-- num_samples value: 10 --> | ||
</input> | ||
<output> | ||
<port id="3" precision="FP32" names="Multinomial:0"> | ||
<dim>2</dim> < !--dimension depends on input batch size --> | ||
<dim>10</dim> < !--dimension depends on num_samples --> | ||
</port> | ||
</output> | ||
</layer> | ||
|
||
*Example 3: 1D input tensor without replacement.* | ||
|
||
.. code-block:: xml | ||
:force: | ||
|
||
<layer ... name="Multinomial" type="Multinomial"> | ||
<data convert_type="f32", with_replacement="false", log_probs="false", global_seed="234", op_seed="148"/> | ||
<input> | ||
<port id="0" precision="FP32"> < !-- probs value: [0.1, 0.5, 0.4] --> | ||
<dim>3</dim> | ||
</port> | ||
<port id="1" precision="I32"/> < !-- num_samples value: 2 --> | ||
</input> | ||
<output> | ||
<port id="3" precision="FP32" names="Multinomial:0"> | ||
<dim>2</dim> < !-- 2 unique samples of classes --> | ||
</port> | ||
</output> | ||
</layer> | ||
|
||
@endsphinxdirective |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters