CIP | Title | Authors | Comments-URI | Status | Type | Created | License |
---|---|---|---|---|---|---|---|
2 |
Coin Selection Algorithms for Cardano |
Jonathan Knowles <[email protected]> |
Draft |
Informational |
2020-05-04 |
CC-BY-4.0 |
This article describes, in human-readable terms, the coin selection algorithms used by Cardano Wallet and other parts of the Cardano ecosystem.
In the context of this article, coin selection refers to the process of choosing unspent coins from a wallet (or UTxO set) in order to pay money to one or more recipients.
This document was written to help people gain an understanding of the coin selection algorithms used in Cardano without having to read through and understand Cardano source code.
We aim to provide descriptions of algorithms that:
- don't require prior experience with any particular programming language;
- are understandable for people who are unfamiliar with coin selection;
- are precise enough for software engineers to be able to reimplement these algorithms in their preferred programming languages.
Where appropriate, we also provide mathematical descriptions, for added clarity.
Coin selection is a large, complex topic, with many difficult problems to solve. However, all software that performs coin selection must ultimately deal with at least the following pair of problems:
- How to generate a coin selection, by choosing unspent coins from a wallet (or UTxO set) in order to pay money to one or more recipients.
- How to adjust a coin selection in order to pay for a network fee, so that the coin selection can be published as a transaction on the blockchain.
This article concerns itself with the former problem of how to generate a coin selection.
Problems relating to network fees, and how to adjust coin selections to pay for such fees, are outside the scope of this article.
The Background section introduces the fundamental concepts behind coin selection, provides a discussion of why coin selection is a non-trivial problem, and describes the goals of coin selection algorithms.
The Interface section gives a description of the common interface unifying all coin selection algorithms used within Cardano Wallet, the standard parameter types, result types, and error types used by this interface, and a description of the properties that all conforming implementations are expected to satisfy.
The Algorithms section gives detailed descriptions of each of the individual coin selection algorithms used in Cardano Wallet, along with step-by-step descriptions of the computations involved.
The Reference Implementations section provides links to reference implementations of each algorithm in various languages.
- Abstract
- Motivation
- Scope
- Structure
- Background
- Definitions
- Interface
- Algorithms
- Reference Implementations
- External Resources
- Copyright
This section introduces the fundamental concepts behind coin selection, provides a discussion of why coin selection is a non-trivial problem, and describes important goals of coin selection algorithms.
Coin selection is the process of choosing unspent coins from a wallet in order to pay money to one or more recipients.
In the familiar world of physical money, our wallets hold value in the form of coins and banknotes.
When making a payment to someone, we typically select a combination of coins and banknotes from a wallet that, when added together, have enough value to cover the amount required.
Ideally, we'd always be able to select just enough to cover the exact amount. However, given that coins and banknotes have fixed values (and cannot be subdivided without destroying their value), it's often impossible to select the exact amount required. In such cases, we typically give the recipient more than the required amount, and then receive the excess value back as change.
💡 Example
Alice wishes to pay for her lunch.
The total price comes to €2.50 (2 euros and 50 cents). In her wallet, she has five one-euro coins, and one ten-euro note.
Note that there is no combination of coins (or notes) in her wallet that when added together give a total of €2.50, but there are several possible combinations that exceed the total.
To solve this problem, Alice selects one of these combinations: three one-euro coins. She uses the coins to make the payment, and then receives one 50-cent coin as change.
Similarly to how a physical wallet holds value in the form of unspent coins and banknotes, a Cardano wallet holds value in the form of unspent transaction outputs. An unspent transaction output is the result of a previous transaction that transferred money to the wallet, where the value has not yet been spent by another transaction. Each unspent transaction output has an associated coin value, and the total value of a wallet is the sum of these coin values. Collectively, the set of unspent transaction outputs is known as the UTxO set.
When using a Cardano wallet to make a payment, the wallet software must select a combination of unspent outputs from the wallet's UTxO set, so that the total value of selected outputs is enough to cover the target amount.
Just as with physical coins and notes, unspent outputs from the UTxO set cannot be subdivided, and must either be spent completely in a given transaction, or not be spent at all. Similarly to a transaction with physical money, the wallet software must select a combination of unspent outputs whose total value is greater than the target amount, and then arrange that change is paid back to the wallet.
Coin selection refers to the process of selecting a combination of unspent outputs from a wallet's UTxO set in order to make one or more payments, and computing the set of change to be paid back to the wallet.
There are a number of issues which make the problem of coin selection more complicated than would initially appear.
Each transaction has a maximum size, as defined by the protocol. The size of a transaction increases as we add more inputs or outputs.
Therefore, there's a practical limit on the number of coins we can select for any given transaction.
One simple strategy for selecting coins might be to mimic what we do when making a payment with coins and banknotes in the physical world. By giving the recipient an amount that's as close as possible to the amount they're expecting, we can minimize the amount of change they need to return to us.
However, trying to replicate this strategy with a UTxO-based wallet has an undesirable effect: minimizing the total value of selected coins will also minimize the value of change returned to the wallet. When repeated over time, this strategy will tend to cause an accumulation of tiny outputs in the wallet's UTxO set known as dust.
Dust outputs are a problem, because even if the total value of dust in a wallet is more than enough to cover a given target amount, if we attempt to include that dust in a given transaction, we may run out of space (by reaching the transaction size limit) before we can cover the target amount.
For more information on dust avoidance, see Self Organisation in Coin Selection.
One simple strategy for generating change might be to mimic what a shop assistant does when accepting a payment in the real world, which is to minimize the number of coins and banknotes that they return to the customer. This is beneficial for the shop, as it reduces the chances of them running out of change, and beneficial for the customer, as it reduces the amount of change that they have to carry around in their wallet.
Analogously, when generating change for a UTxO-based wallet, we might be tempted to use the simple strategy of just creating a single change output with the exact excess value.
However, using this strategy has an undesirable effect: the repeated act of minimizing the number of change outputs will tend (over time) to reduce the number of entries in a wallet's UTxO set. This is bad for two reasons:
-
Having a small UTxO set limits the number of future payments that we can make in parallel.
-
The approach of coalescing all change into a single output is widely considered to have negative privacy implications.
In light of the issues described above, we'd ideally like for our coin selection algorithms to be able to:
-
limit, over the course of time, the amount of dust that accumulates in the UTxO set.
-
maintain, over the course of time, a UTxO set with useful outputs: that is, outputs that allow us to process future payments with a reasonably small number of inputs.
This section defines common terms that are used throughout this document.
An address is a unique identifier that represents a payment recipient, a destination for a payment.
Addresses are typically owned (and generated) by individual wallets.
In general, coin selection algorithms are agnostic to the type of addresses used to identify payment recipients. Any address type may be used, so long as the set of possible addresses is totally-ordered.
A coin value is a non-negative integer value that represents a number of Lovelace.
One Ada is exactly equal to one million Lovelace.
In a UTxO-based blockchain, a transaction is a binding between inputs and outputs.
input #1 >---+ +---> output #1
\ /
input #2 >-----+------+
/ \
input #3 >---+ +---> output #2
A transaction input is a unique reference to a single output from a previous transaction.
In general, coin selection algorithms are agnostic to the type of references used to identify outputs from previous transactions. Any type may be used, so long as the set of possible references is totally-ordered, and so long as it is possible to determine the coin value associated with any given reference.
In the case of Cardano and other UTxO-based blockchains, this reference generally consists of a pair of values (h, n), where:
- h is a unique identifier for an existing transaction t;
- n is a 0-based integer index into the output list of transaction t.
A transaction output consists of a pair of values (a, v), where:
- a is the address of a recipient.
- v is the coin value to pay to the recipient.
A spent transaction output is an output from an existing transaction that has already been referenced as an input within a later transaction on the blockchain.
In effect, the coin value associated with that transaction output has been spent, and cannot be reused.
An unspent transaction output is an output from an existing transaction that has not yet been referenced as an input within a later transaction.
In effect, the coin value associated with that transaction output has not yet been spent, and is still available.
A UTxO set is a set of unspent transaction outputs.
This term is commonly used in two ways:
-
To describe the complete set of all unspent transaction outputs within a blockchain.
-
To describe the subset of unspent transaction outputs associated with a wallet. The UTxO set of a wallet represents the total unspent value associated with that wallet.
From the point of view of a coin selection algorithm, each member of a UTxO set can be represented as a pair of the form (u, v), where:
- u is a unique reference to an unspent output from a previous transaction.
- v is the coin value associated with u.
In general, coin selection algorithms are agnostic to the type of references used to identify unspent outputs from previous transactions. Any type may be used, so long as the set of possible references is totally-ordered.
In practice however, the type of each unique reference u is equivalent to the type of a transaction input, as transaction inputs are simply references to unspent outputs from previous transactions.
In the context of a wallet, a change output is a transaction output that transfers value back to the wallet, rather than to an external payment recipient. The address associated with a change output is generated by the wallet, and belongs to the wallet.
Change ouputs are necessary in a UTxO-based blockchain, as the value associated with any given transaction input must be spent entirely by the transaction that includes it.
When selecting entries from a UTxO set to include as inputs in a transaction, a coin selection algorithm will generally not be able to select inputs that precisely match the total value of all payments to external recipients, and will therefore need to select more than is strictly required. To avoid the destruction of value, selection algorithms create change outputs to return the excess value back to the wallet.
A dust output is a transaction output with an associated coin value that is:
- small in comparison to payments typically made by the user of the wallet;
- small in comparison to the marginal fee associated with including it in a transaction.
Dust outputs are a problem, because even if the total value of dust in a wallet is more than enough to cover a given payment amount, if we attempt to include that dust in a given transaction, we may run out of space (by reaching the transaction size limit) before we can cover the target amount.
All coin selection algorithms used by Cardano Wallet implement a common interface.
At the most fundamental level, a coin selection algorithm is a mathematical function that when applied to a requested output set and an initial UTxO set, will produce a coin selection: the basis for a transaction in a UTxO-based blockchain.
This section describes:
- the parameters accepted by all coin selection algorithms;
- the results they produce when successful;
- the error conditions that may occur on failure;
- the properties that apply to all coin selection algorithms: mathematical laws governing the relationships between parameters and results.
In this section, the terms coin selection algorithm and coin selection function will be used interchangeably.
All coin selection functions accept the following parameters:
-
A list of payments to be made to recipient addresses, encoded as a list of transaction outputs.
-
A UTxO set from which the coin selection algorithm can select entries, to cover payments listed in the requested output set.
In the context of a wallet, this parameter would normally be assigned with the wallet's complete UTxO set, giving the coin selection algorithm access to the total value associated with that wallet.
-
An upper bound on the number of UTxO entries that the coin selection algorithm is permitted to select from the initial UTxO set.
This parameter is necessary for blockchains that impose an upper limit on the size of transactions.
All coin selection functions produce the following result values:
-
A coin selection is the basis for a transaction in a UTxO-based blockchain.
It is a record with three fields:
-
A set of inputs, equivalent to a subset of the initial UTxO set.
From the point of view of a wallet, this represents the value that has been selected from the wallet in order to cover the total payment value.
-
A set of outputs (see transaction output).
Represents the set of payments to be made to recipient addresses.
-
A set of change values (see change output), where each change value is simply a coin value.
From the point of view of a wallet, this represents the change to be returned to the wallet.
-
-
The remaining UTxO set is a subset of the initial UTxO set.
It represents the set of values that remain after the coin selection algorithm has removed values to pay for entries in the requested output set.
In the context of a wallet, if a coin selection algorithm is applied to the wallet's complete UTxO set, then the remaining UTxO set represents the updated UTxO set of that wallet.
All coin selection algorithms satisfy a common set of properties: general rules that govern the relationship between the parameters supplied to coin selection functions and the results they are allowed to produce.
This property states that the total value of inputs in the resulting coin selection result is sufficient to cover the total value of the requested output set.
In particular:
- vselected ≥ vrequested
Where:
-
vrequested
is the total value of the requested output set
-
vselected
is the total value of the inputs field of the coin selection result.
This property states that the correct amount of change was generated.
In particular:
- vselected = vrequested + vchange
Where:
-
vchange
is the total value of the change field of the coin selection result.
-
vrequested
is the total value of the requested output set
-
vselected
is the total value of the inputs field of the coin selection result.
This property states that every entry in the initial UTxO set is included in either the inputs set of the generated coin selection, or in the remaining UTxO set, but not both.
-
If a UTxO entry is selected by the coin selection algorithm, it is included in the coin selection inputs set.
-
If a UTxO entry is not selected by the coin selection algorithm, it is included in the remaining UTxO set.
The following laws hold:
- Uinitial ⊃ Uremaining
- Uinitial ⊇ Uselected
And:
- Uremaining ∩ Uselected = ∅
- Uremaining ⋃ Uselected = Uinitial
Where:
-
Uinitial
is the initial UTxO set.
-
Uremaining
is the remaining UTxO set.
-
Uselected
is the value of the inputs field of the coin selection result.
This property states that the requested output set is conserved in the coin selection result.
In particular, the outputs field of the coin selection result should be equal to the requested output set.
There are a number of ways in which a coin selection algorithm can fail:
-
This failure occurs when the total value of the entries within the initial UTxO set (the amount of money available) is less than the the total value of all entries in the requested output set (the amount of money required).
-
This failure occurs when the number of entries in the initial UTxO set is smaller than the number of entries in the requested output set, for algorithms that impose the restriction that a single UTxO entry can only be used to pay for at most one output.
-
This failure occurs if the algorithm depletes all entries from the initial UTxO set before it is able to pay for all outputs in the requested output set.
This can happen even if the total value of entries within the initial UTxO set is greater than the total value of all entries in the requested output set, due to various restrictions that coin selection algorithms impose on themselves when selecting UTxO entries.
-
This failure occurs when another input must be selected by the algorithm in order to continue making progress, but doing so will increase the size of the resulting selection beyond an acceptable limit, specified by the maximum input count parameter.
This section describes the coin selection algorithms used by Cardano Wallet, along with step-by-step descriptions of the computations involved.
All algorithms implement a common interface, as described in the Interface section.
There are two main algorithms used by Cardano Wallet:
In general, Cardano Wallet gives priority to the Random-Improve algorithm, as experimental evidence shows that it performs better at minimising dust and maintaining a UTxO set with useful outputs. (See Self Organisation in Coin Selection for more details.)
However, in rare cases, the Random-Improve algorithm may fail to produce a result. In such cases, Cardano Wallet will fall back to the Largest-First algorithm.
The Largest-First coin selection algorithm considers UTxO set entries in descending order of value, from largest to smallest.
When applied to a set of requested outputs, the algorithm repeatedly selects entries from the initial UTxO set until the total value of selected entries is greater than or equal to the total value of requested outputs.
The name of the algorithm is taken from the idea that the largest UTxO entry is always selected first. Specifically:
A given UTxO entry u1 with value v1 can be selected if and only if there is no other unselected entry u2 with value v2 where v2 > v1.
At all stages of processing, the algorithm maintains the following pieces of state:
-
This is initially equal to the initial UTxO set, sorted into descending order of coin value.
The head of the list is always the remaining UTxO entry with the largest coin value.
Entries are incrementally removed from the head of the list as the algorithm proceeds, until enough value has been selected.
-
This is initially equal to the empty set.
The algorithm proceeds according to the following sequence of steps:
-
Step 1
If the available UTxO list is empty:
- Terminate with a UTxO Balance Insufficient error.
If the available UTxO list is not empty:
- Remove an UTxO entry from the head of the available UTxO list and add it to the selected UTxO set.
-
Step 2
Compare the total size nselected of the selected UTxO set with the maximum input count nmax.
-
If nselected > nmax then:
- Terminate with a Maximum Input Count Exceeded error.
-
If nselected ≤ nmax then:
- Go to step 3.
-
-
Step 3
Compare the total value vselected of the selected UTxO set to the total value vrequested of the requested output set:
- If vselected < vrequested then go to step 1.
- If vselected ≥ vrequested then go to step 4.
-
Step 4
Return a coin selection result where:
-
The inputs set is equal to the selected UTxO set.
-
The outputs set is equal to the requested output set.
-
If vselected > vrequested then:
- The change set contains just a single coin of value (vselected − vrequested).
-
If vselected = vrequested then:
- The change set is empty.
-
The Random-Improve coin selection algorithm works in two phases:
-
In the first phase, the algorithm iterates through each of the requested outputs in descending order of coin value, from largest to smallest. For each output, the algorithm repeatedly selects entries at random from the initial UTxO set, until each requested output has been associated with a set of UTxO entries whose total value is enough to pay for that ouput.
-
In the second phase, the algorithm attempts to expand each existing UTxO selection with additional values taken at random from the initial UTxO set, to the point where the total value of each selection is as close as possible to twice the value of its associated output.
After the above phases are complete, for each output of value voutput and accompanying UTxO selection of value vselection, the algorithm generates a single change output of value vchange, where:
vchange = vselection − voutput
Since the goal of the second phase was to expand each selection to the point where its total value is approximately twice the value of its associated output, this corresponds to a change output whose target value is approximately equal to the value of the output itself:
vchange = vselection − voutput
vchange ≈ 2voutput − voutput
vchange ≈ voutput
There are several motivating principles behind the design of the algorithm.
The probability that random selection will choose dust entries from a UTxO set increases with the proportion of dust in the set.
Therefore, for a UTxO set with a large amount of dust, there's a high probability that a random subset will include a large amount of dust.
Over time, selecting entries randomly in this way will tend to limit the amount of dust that accumulates in the UTxO set.
As mentioned in the Goals section, it is desirable that coin selection algorithms, over time, are able to create UTxO sets that have useful outputs: outputs that will allow us to process future payments with a reasonably small number of inputs.
If for each payment request of value v we create a change output of roughly the same value v, then we will end up with a distribution of change values that matches the typical value distribution of payment requests.
💡 Example
Alice often buys bread and other similar items that cost around €1.00 each.
When she instructs her wallet software to make a payment for around €1.00, the software attempts to select a set of unspent transaction outputs with a total value of around €2.00.
As she frequently makes payments for similar amounts, transactions created by her wallet will also frequently produce change coins of around €1.00 in value.
Over time, her wallet will self-organize to contain multiple coins of around €1.00, which are useful for the kinds of payments that Alice frequently makes.
Searching the UTxO set for additional entries to improve our change outputs is only useful if the UTxO set contains entries that are sufficiently small enough. But it is precisely when the UTxO set contains many small entries that it is less likely for a randomly-chosen UTxO entry to push the total above the upper bound.
The Random-Improve algorithm imposes the following cardinality restriction:
- Each entry from the initial UTxO set is used to pay for at most one output from the requested output set.
As a result of this restriction, the algorithm will fail with a UTxO Not Fragmented Enough error if the number of entries in the initial UTxO set is smaller than the number of entries in the requested output set.
At all stages of processing, the algorithm maintains the following pieces of state:
-
This is initially equal to the initial UTxO set.
-
The accumulated coin selection is a coin selection where all fields are initially equal to the empty set.
The algorithm proceeds in two phases.
In this phase, the algorithm iterates through each of the requested outputs in descending order of coin value, from largest to smallest.
For each output of value v, the algorithm repeatedly selects entries at random from the available UTxO set, until the total value of selected entries is greater than or equal to v. The selected entries are then associated with that output, and removed from the available UTxO set.
This phase ends when every output has been associated with a selection of UTxO entries.
In this phase, the algorithm attempts to improve upon each of the UTxO selections made in the previous phase, by conservatively expanding the selection made for each output in order to generate improved change values.
During this phase, the algorithm:
-
processes outputs in ascending order of coin value.
-
continues to select values from the available UTxO set.
-
incrementally populates the accumulated coin selection.
For each output of value v, the algorithm:
-
Calculates a target range for the total value of inputs used to pay for that output, defined by the triplet:
(minimum, ideal, maximum) = (v, 2v, 3v)
-
Attempts to improve upon the existing UTxO selection for that output, by repeatedly selecting additional entries at random from the available UTxO set, stopping when the selection can be improved upon no further.
A selection with value v1 is considered to be an improvement over a selection with value v0 if all of the following conditions are satisfied:
-
Condition 1: we have moved closer to the ideal value:
abs (ideal − v1) < abs (ideal − v0)
-
Condition 2: we have not exceeded the maximum value:
v1 ≤ maximum
-
Condition 3: when counting cumulatively across all outputs considered so far, we have not selected more than the maximum number of UTxO entries specified by Maximum Input Count.
-
-
Creates a change value for the output, equal to the total value of the improved UTxO selection for that output minus the value v of that output.
-
Updates the accumulated coin selection:
- Adds the output to the outputs field;
- Adds the improved UTxO selection to the inputs field;
- Adds the change value to the change values field.
This phase ends when every output has been processed, or when the available UTxO set has been exhausted, whichever occurs sooner.
When both phases are complete, the algorithm terminates.
The accumulated coin selection is returned to the caller as the coin selection result.
The available UTxO set is returned to the caller as the remaining UTxO set result.
Reference implementations of the Largest-First algorithm are available in the following languages:
Language | Documentation | Source |
---|---|---|
Haskell | Documentation | Source |
Reference implementations of the Random-Improve algorithm are available in the following languages:
Language | Documentation | Source |
---|---|---|
Haskell | Documentation | Source |
Title Self Organisation in Coin Selection Author Edsko de Vries Year 2018 Location https://cardanofoundation.org/en/news/self-organisation-in-coin-selection/ This article introduces the Random-Improve coin selection algorithm, invented by Edsko de Vries.
It describes the three principles of self-organisation that inform the algorithm's design, and provides experimental evidence to demonstrate the algorithm's effectiveness at maintaining healthy UTxO sets over time.
This CIP is licensed under CC-BY-4.0.