Skip to content

Commit

Permalink
[Governance] Refining FLIP for Dynamic Inclusion Fees
Browse files Browse the repository at this point in the history
  • Loading branch information
Tony Z committed Nov 2, 2022
1 parent 8ed17bf commit 30ad3cb
Show file tree
Hide file tree
Showing 3 changed files with 115 additions and 62 deletions.
177 changes: 115 additions & 62 deletions governance/20221005-dynamic-inclusion-fee.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,16 @@ The goal of this proposal is to come up with a formula incentivizing developers

> 💬 We are not looking for perfectly precise math models this time, instead we are looking for a simple equation to calculate inclusion fees to reflect relative potential inclusion effort of a given transaction, including network transmission, memory usage etc.
#### Proposal

In this FLIP we propose to replace the existing consistent inclusion fee charge into a dynamic charging pattern, in which total size of a transaction and its number of authoriers(signatures) will be factored in:

<img src="https://render.githubusercontent.com/render/math?math=F_I"> = 287.1 * (6.39e-7 * **TxByteSize** + 3.29e-5 * **NumOfSignature** + 0.0023) * 1e-6 FLOW

Details on how we collect metrics, how the final equation is derived and validated will be elaborated in the rest part of this FLIP.

> 💬 Mathematical terms in this FLIP are from [this parent FLIP](https://github.com/onflow/flow/blob/master/flips/20211007-transaction-fees.md#terminology).
#### How Transactions are Created in Load Test

To get performance metrics with a batch of fixed-sized transactions, we need to compose such transactions that we can control each of their:
Expand All @@ -29,13 +39,13 @@ To get performance metrics with a batch of fixed-sized transactions, we need to


* Total transaction size
* Number of authorizers
* Number of signatures
* Number of keys of payer
* Argument size
* Comment size


In [PR 2816](https://github.com/onflow/flow-go/pull/2816), we added this feature in the loader app. We first add all indicated number of authorizers and keys of payer, compose transaction argument string, and at last leave all remaining bytes as comments. When these transactions are ready, the loader app will submit them to a given access cluster with give TPS.
In [PR 2816](https://github.com/onflow/flow-go/pull/2816), we added this feature in the loader app. We first add all indicated number of signatures and keys of payer, compose transaction argument string, and at last leave all remaining bytes as comments. When these transactions are ready, the loader app will submit them to a given access cluster with give TPS.

#### How Transaction Size is Calculated

Expand All @@ -45,7 +55,7 @@ According to our [code](https://github.com/onflow/flow-go/blob/master/engine/acc


* Byte size of transaction code
* Number of authorizers
* Number of Signatures
* Authorizer/payer keys

The above fields are the only available information of one transaction which we can use to calculate inclusion fees, because we have to calculate it without executing one transaction.
Expand All @@ -65,61 +75,74 @@ We checked localnet, benchnet and canary to collect metrics. Except that benchne
<i>chart for metrics from canary</i>
</p>

#### Data Modeling
Number of Signatures(one key for each authorizer) also demonstated a significant impact to TPS:

Assuming that with given configuration, relative inclusion fee is simply the inverse of stable peak TPS, now we have below linear equation with transaction size as X and the resulting Inclusion Fee as Y:
<p align="center">
<img src="20221005-dynamic-inclusion-fee/num-auth-tps-localnet.png" alt="chart for metrics from localnet">
<br>
<i>chart for metrics from localnet</i>
<img src="20221005-dynamic-inclusion-fee/num-auth-tps-canary.png" alt="chart for metrics from canary">
<br>
<i>chart for metrics from canary</i>
</p>

Y = C1 * X + C2
#### Data Modeling

We used some existing math tool ([https://mycurvefit.com/](https://mycurvefit.com/) in our case), plugging in the metrics data we collected (simply deriving the inclusion fee by 1.0/TPS) from canary and we got the equation below. The reason we use canary data to derive the equation is that its hardware\network configuration is closer to mainnet compared with localnet.
Assuming that with given configuration, relative inclusion fee is simply the inverse of stable peak TPS, and for the number of signatures, we want to charge it linearly by that number, now we have below linear equation, in which C1, C2, C3 are the constant coefficients in the equation:

<img src="https://render.githubusercontent.com/render/math?math=I"> = <span style="color:blue"> C1 * TxByteSize + C2 * NumOfSignature + C3
</span>

Inclusion Fee Factor = 6.370641e-7 * TxByteSize + 2.585227e-3
We used some existing math tool ([LINEST from Google Sheet](https://support.google.com/docs/answer/3094249?hl=en) in our case), plugging in the metrics data we collected (simply deriving the inclusion fee by 1.0/TPS) from canary and we got the equation below. The reason we use canary data to derive the equation is that its hardware\network configuration is closer to mainnet compared with localnet.

We’d like to charge average sized transactions by 1e-6 FLOW, and according to [this](https://console.cloud.google.com/bigquery?sq=1039464903934:01f22b6020e1403690eeb67822d56c87) BigQuery, we can tell that the average size of production transactions are around 1800B, plugging this into the equation above and now we have the final equation as below:
<img src="https://render.githubusercontent.com/render/math?math=I"> = <span style="color:blue"> 6.39e-7 * TxByteSize + 3.29e-5 * NumOfSignature + 0.0023
</span>

Inclusion Fee Factor = 267.956977406 * (6.370641e-7 * TxByteSize + 2.585227e-3)
We’d like to charge average sized transactions by 1.0 factor (1e-6 FLOW), which means we are trying to keep inclusion fees of average sized transactions the same as before. According to [this](https://console.cloud.google.com/bigquery?sq=1039464903934:01f22b6020e1403690eeb67822d56c87) BigQuery, we can tell that the average size of production transactions are around 1800B, and according to [this](https://console.cloud.google.com/bigquery?sq=1039464903934:8ce5d7d1677a4c5d99aa3341b46375cd) BigQuery, majority of transaction only have 1 authorizers, unfortunately on GCP we don't record number of keys per authorizer, so we are assuming that there's only one key for each authorizer. After plugging these data into the equation above and now we have the final equation as below:

<img src="https://render.githubusercontent.com/render/math?math=I"> = <span style="color:blue"> 287.1 * (6.39e-7 * TxByteSize + 3.29e-5 * NumOfSignature + 0.0023)
</span>
#### Model Validation

To make sure our mathematical model above works as expected with real mainnet transactions, we ran the equation against sampled transactions in BigQuery where our recent production transactions are stored.

* <span style="color: #000099;">New Inclusion Fee calculation in BigQuery</span>


From [this](https://console.cloud.google.com/bigquery?sq=1039464903934:9f7c25b503cf4fcdaf2301d2e51f539b) BigQuery where we calculate new inclusion fees for sampled transactions with different sizes, we can see that the new inclusion fee factor of transactions around 1800B are around 1.0 (or 1e-6 FLOW), and the smaller one tx is, the less the inclusion fee is, and vice versa. Listing several pieces of the result as below:
From [this](https://console.cloud.google.com/bigquery?sq=1039464903934:9f7c25b503cf4fcdaf2301d2e51f539b) BigQuery where we calculate new inclusion fees for sampled transactions with different sizes, we can see that the new inclusion effort of transactions around 1800B are around 1.0 (or 1e-6 FLOW), and the smaller one tx is, the less the inclusion fee is, and vice versa. Listing several pieces of the result as below. Note that there's no number of keys per authorizer info on BigTable.

<div align="center">

Transaction with size less than average

| Tx Size | New Inclusion Fee Factor |
|:----------:|:-------------:|
| 793 | 0.828099289 |
| 809 | 0.8308305813 |
| 810 | 0.8310012871 |
| 811 | 0.8311719928 |
| 821 | 0.8328790505 |
| Tx Size(Bytes) | Num of Auth | New Inclusion Effort |
|:----------:|:-------------:|:-------------:|
| 742 | 1 | 0.8059 |
| 757 | 1 | 0.8086 |
| 763 | 1 | 0.8097 |
| 766 | 1 | 0.8103 |
| 767 | 1 | 0.8104 |


Transaction with size around average

| Tx Size | New Inclusion Fee Factor |
|:----------:|:-------------:|
| 1755 | 0.9923182403 |
| 1786 | 0.9976101192 |
| 1804 | 1.000682823 |
| 1843 | 1.007340348 |
| 1865 | 1.011095875 |
| Tx Size(Bytes) | Num of Auth | New Inclusion Effort |
|:----------:|:-------------:|:-------------:|
| 1756 | 1 | 0.9919 |
| 1786 | 1 | 0.9974 |
| 1804 | 3 | 1.0196 |
| 1816 | 1 | 1.0029 |
| 1843 | 2 | 1.0173 |

Transaction with size above average

| Tx Size | New Inclusion Fee Factor |
|:----------:|:-------------:|
| 4538 | 1.4673924 |
| 4801 | 1.512288018 |
| 5184 | 1.577668328 |
| 5823 | 1.686749315 |
| 7880 | 2.037891086 |
| Tx Size(Bytes) | Num of Auth | New Inclusion Effort |
|:----------:|:-------------:|:-------------:|
| 3386 | 1 | 1.2909 |
| 3387 | 1 | 1.2911 |
| 3412 | 2 | 1.3051 |
| 3412 | 1 | 1.2957 |
| 3418 | 1 | 1.2968 |
</div>

* <span style="color: #000099;">Min\Max charges</span>
Expand All @@ -128,58 +151,58 @@ Transaction with size above average
<ol><ol>
<li><span style="color: #000099;">Inclusion Fee of Empty Transaction</span></li>

Inclusion Fee Factor(empty tx)
= 267.956977406 * (6.370641e-7 * **0** + 2.585227e-3)
= 0.69272961282
After applying this new inclusion fee factor onto the existing inclusion effort charge(1e-6 FLOW), we get 1e-6 * 0.69272961282 as the new charge.
Inclusion Effort(empty tx)
= 287.1 * (6.39e-7 * 0 + 3.29e-5 * 1 + 0.0023)
= 0.6697
After applying this new inclusion effort onto the existing inclusion effort charge(1e-6 FLOW), we get 1e-6 * 0.6697 as the new charge.
<li><span style="color: #000099;">Inclusion Fee of Maximal Transaction</span></li>

According to the [code](https://github.com/onflow/flow-go/blob/master/model/flow/constants.go#L32), the max possible byte size of one transaction is 1500000B, now we have:
According to the [code](https://github.com/onflow/flow-go/blob/master/model/flow/constants.go#L32), the max possible byte size of one transaction is 1500000B, and if we simply set the number of signatures as the currently max value of authorizers, 14, in our BigQuery, now we have:

Inclusion Fee Factor(largest tx possible)
= 267.956977406 * (6.370641e-7*1500000 + 2.585227e-3)
= 256.751385588
Inclusion Effort(largest tx possible)
= 287.1 * (6.39e-7 * 1500000 + 3.29e-5 * 14 + 0.0023)
= 275.9779

After applying this new inclusion fee factor onto the existing inclusion effort charge(1e-6 FLOW), we get 1e-6 * 256.751385588 as the new charge.
After applying this new inclusion effort onto the existing inclusion effort charge(1e-6 FLOW), we get 1e-6 * 275.9779 as the new charge.
</ol></ol>

* <span style="color: #000099;">New inclusion fees for different types of transactions</span>

It will also be useful to see inclusion fee changes with the new equation, for some typical types of transactions we have. We are assuming there is only one authorizer in these transactions which is the most common scenario.
It will also be useful to see inclusion fee changes with the new equation, for some typical types of transactions we have. We are assuming there is only one signature in these transactions which is the most common scenario. Assuming all calculation below has only 1 signature.
<ol>
<ol>
<li>Create Account transaction</li>

Cadence script we use for the calculation: [link](https://github.com/onflow/flow-go/blob/master/integration/benchmark/scripts/createAccountsTransaction.cdc) - size: 984B

New inclusion fee factor
= 267.956977406 * (6.370641e-7*984 + 2.585227e-3)
= 0.86070409114
New inclusion effort
= 287.1 * (6.39e-7 * 984 + 3.29e-5 * 1 + 0.0023)
= 0.8503

The new fee is roughly **14% less** than the existing inclusion charge (1e-6 FLOW)
The new fee is roughly **15% less** than the existing inclusion charge (1e-6 FLOW)


<li>Token transfer transaction</li>

Cadence script we use for the calculation: [link](https://github.com/onflow/flow-go/blob/master/integration/benchmark/scripts/tokenTransferTransaction.cdc) - size: 718B


New inclusion fee factor
= 267.956977406 * (6.370641e-7*718 + 2.585227e-3)
= 0.81529635615
New inclusion effort
= 287.1 * (6.39e-7 * 718 + 3.29e-5 * 1 + 0.0023)
= 0.8015

The new fee is roughly **18% less** than the existing inclusion charge (1e-6 FLOW)
The new fee is roughly **20% less** than the existing inclusion charge (1e-6 FLOW)

<li>NFT transfer transaction</li>

Cadence script we use for the calculation: [link](https://github.com/onflow/flow-go/blob/master/fvm/fvm_bench_test.go) - size 818B


New inclusion fee factor
= 267.956977406 * (6.370641e-7*818 + 2.585227e-3)
= 0.83236693322
New inclusion effort
= 287.1 * (6.39e-7 * 818 + 3.29e-5 * 1 + 0.0023)
= 0.8198

The new fee is roughly **16% less** than the existing inclusion charge (1e-6 FLOW)
The new fee is roughly **18% less** than the existing inclusion charge (1e-6 FLOW)

</ol>
</ol>
Expand All @@ -192,30 +215,60 @@ The new fee is roughly **16% less** than the existing inclusion charge (1e-6 FLO
![mainnet tx size histogram ](20221005-dynamic-inclusion-fee/tx-size-histogram-gcp.png)

So most transactions are with sizes between 500 - 2000 bytes, and there are much fewer number of transactions with sizes above 4000B. The average transaction size from our mainnet is 1800B, which is marked on the diagram too. We used this value to calibrate our inclusion fees as mentioned above.
<br>

* <span style="color: #000099;">Distribution of mainnet number of authorizers</span>

From [BigQuery](https://console.cloud.google.com/bigquery?sq=1039464903934:8ce5d7d1677a4c5d99aa3341b46375cd&project=dapperlabs-data) querying all our mainnet transactions, we can see the distribution of number of authorizers in below table:

<div align="center">

| Number of Authorizers | Count of Transactions |
|:----------:|:-------------:|
| 1 | 216049901 |
| 2 | 41611744 |
| 3 | 7548509 |
| 4 | 1843914 |
| 5 | 869991 |
| 6 | 528975 |
| 7 | 442313 |
| 8 | 403601 |
| 9 | 272371 |
| 10 | 119965 |
| 11 | 181782 |
| 14 | 53 |

</div>

#### Limitations

* <span style="color: #000099;">Using metrics from canary to estimate mainnet fees</span>
* <span style="color: #000099;">Using metrics from canary to estimate mainnet fees.</span>
Since we were using the loader app to stress test certain environments (localnet, canary), it is impractical to run a similar stress test to mainnet which will impact production traffic. Thus, we are only able to apply the mathematical equation derived from those environments to mainnet directly. It is fine in our use cases because we are not looking for precise mathematical calculation to new inclusion fees, and we are looking for a better calculation than the current constant inclusion charge: we should charge less for smaller transactions and more for bigger transactions. So as long as the new equation demonstrates this characteristic on production transactions, we will treat it as a working math model.

* <span style="color: #000099;">Benchnet didn’t give stable peak TPS metrics</span>
* <span style="color: #000099;">Benchnet didn’t give stable peak TPS metrics.</span>
We also tried the loader app on benchnet configurations. However the existing benchnet didn’t give us stable peak TPS and it made it unusable to collect performance related metrics. The reason is still under investigation, but since we were still able to get metrics from canary, it is not a blocking issue for us.

* <span style="color: #000099;">Other factors impacting stable peak TPS</span>
In our benchmark testing on both localnet and canary environments, we actually also found that another two factors can also impact the stable peak TPS: number of authorizers and number of keys per payer. However we didn’t include these two factors into our final equation, because they don’t seem to be major factors impacting performance, thus including them will make our equation unnecessarily complex.
* <span style="color: #000099;">Other factors impacting stable peak TPS.</span>
In our benchmark testing on both localnet and canary environments, we actually also found that number of keys can also impact the stable peak TPS. However the actual impact measured from data collection was not major enough for us to include it into the equation.

* <span style="color: #000099;">More descriptive Cadence code will be discouraged.</span>
For most cases, Cadence code takes a big portion of one transaction. With transaction size as one factor to charge for inclusion fees, it may discourage Cadence authors to write less readable and descriptive smart contract code on Flow.

**Future Work**

* <span style="color: #000099;">Collecting more data points for a more accurate model</span>
* <span style="color: #000099;">Collecting more data points for a more accurate model.</span>
Due to the limitation mentioned, it was quite a cumbersome process to collect even one single performance metrics data point, so we were only able to collect a limited amount of data for modeling purposes. We don’t need a super precise model for now, but if in the future when higher precision is required, we will need to collect more data to support the precision we want from a model.

* <span style="color: #000099;">Self-adaptive Inclusion Fee model</span>
* <span style="color: #000099;">Self-adaptive Inclusion Fee model.</span>
In this FLIP we propose a fixed-coefficient linear equation, however the hardware configuration, network capabilities and even Flow itself may change. In these cases, the coefficients we have now may not give optimal inclusion fee calculations accordingly. It will be ideal if we can have a self-adaptive mechanism, to automatically adjust parameters of the inclusion fee model, to continuously produce appropriate inclusion fee calculation.

* <span style="color: #000099;">Checking balance of payer to see if Inclusion Fees can be covered in access\collection nodes</span>
* <span style="color: #000099;">Checking balance of payer to see if Inclusion Fees can be covered in access\collection nodes.</span>
When we have a dynamic inclusion fee, we will have to check if a certain payer’s account balance can cover this new inclusion fee.

* <span style="color: #000099;">More infrastructure setup is nice-to-have for easier metrics collection in future</span>
* <span style="color: #000099;">Non-linear Inclusion Fees Calculation for extra large transactions.</span>
We might want to charge extra more, like quadratically for super large transactions, whose total size exceeds a certain cutoff size, to encourage people to keep their transaction size within a reasonable limit. Transaction smart contract code compress may be something worth investigation.

* <span style="color: #000099;">More infrastructure setup is nice-to-have for easier metrics collection in future.</span>
As previously mentioned, for almost all existing testing environments(localnet, benchnet, canary), we had to do a trial-and-error method to collect one stable peak TPS with a given configuration. This process is a very cumbersome manual process and it does not scale with the number of metrics to collect. So in future when we need to collect similar metrics, we may want to have an easier or even automated process. It will enable us to collect relatively comprehensive metrics data for our future data related work. At the time of writing, the automation team has been actively working on this already.

#### References
Expand Down
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.

0 comments on commit 30ad3cb

Please sign in to comment.