Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reference/security: add encryption-at-rest doc #2410

Merged
merged 39 commits into from
May 29, 2020
Merged

reference/security: add encryption-at-rest doc #2410

merged 39 commits into from
May 29, 2020

Conversation

yiwu-arbug
Copy link
Contributor

@yiwu-arbug yiwu-arbug commented Apr 27, 2020

Signed-off-by: Yi Wu [email protected]

What is changed, added or deleted? (Required)

Add user document for TiKV Encryption-At-Rest (also refer to as Transparent Data Encryption, or TDE), and BR server-side encryption when backup to S3.

Which TiDB version(s) do your changes apply to? (Required)

  • master (the latest development version)
  • v4.0 (TiDB 4.0 versions)
  • v3.1 (TiDB 3.1 versions)
  • v3.0 (TiDB 3.0 versions)
  • v2.1 (TiDB 2.1 versions)

If you select two or more versions from above, to trigger the bot to cherry-pick this PR to your desired release version branch(es), you must add corresponding labels such as needs-cherry-pick-4.0, needs-cherry-pick-3.1, needs-cherry-pick-3.0, and needs-cherry-pick-2.1.

What is the related PR or file link(s)?

TiKV encryption-at-rest design docs: https://docs.google.com/document/d/14u0YKHXxjE0a9wAd3tsu0Hqjmknjr0bYU_EMzZSbzkA/edit?usp=sharing

  • This PR is translated from:
  • Other reference link(s):

* TiKV currently does not exclude encryption keys and user data from core dump. It is highly adviced to disable core dump for TiKV process when using encryption-at-rest. This is not currently handled by TiKV itself.
* TiKV keeps track of encryption key and method used for each data files, using absolute path of the files. As a result, once encryption is turned on for a TiKV node, user should not change data file paths config such as `storage.data-dir`, `raftstore.raftdb-path`, `rocksdb.wal-dir` and `raftdb.wal-dir`.

## BR S3 server-side encryption
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: copy this section to BR document.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@overvenus do you think this is needed, and if so, which file should I modify?

@yiwu-arbug yiwu-arbug requested a review from overvenus May 12, 2020 05:23
@ran-huang ran-huang changed the title encryption-at-rest docs reference/security: add encryption-at-rest doc May 12, 2020
@ran-huang ran-huang self-requested a review May 12, 2020 05:29
@yiwu-arbug yiwu-arbug requested a review from gregwebs May 12, 2020 18:23

The same master key can be shared by multiple instances of TiKV. The recommended way to provide a master key in production is via AWS KMS. User would need to create a CMK through AWS KMS, and then provide the CMK key id to TiKV in config file. The TiKV process would need access to the KMS CMK while it is running, which can be done by using [IAM](https://aws.amazon.com/iam/). If TiKV fail to get access to the KMS CMK, it will failed to start or restart. If TiKV loss access to the KMS CMK while it is running, data key rotation will be temporarily disabled. Please refer to AWS documentation for KMS and IAM usage.

Alternatively, if using custom key is desired, supplying master key via file is also supported. The file need to contain a 256 bits (or 32 bytes) key encoded as hex string. The file should end with a newline (i.e. "\n") and contain nothing else. Persistenting the key on disk, however, leaks the key, so the key file is only suitable to be stored on tempfs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Alternatively, if using custom key is desired, supplying master key via file is also supported. The file need to contain a 256 bits (or 32 bytes) key encoded as hex string. The file should end with a newline (i.e. "\n") and contain nothing else. Persistenting the key on disk, however, leaks the key, so the key file is only suitable to be stored on tempfs.
Alternatively, if using custom key is desired, supplying the master key via file is also supported. The file needs to contain a 256 bits (or 32 bytes) key encoded as hex string. The file should end with a newline (i.e. "\n") and contain nothing else. Persisting the key on disk, however, leaks the key, so the key file is only suitable to be stored on tempfs in RAM.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we require a newline? It seems like we could chomp new lines if they exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to make the file format very strict to make sure user aware of the required key length. I would like to do without the newline, but text editors (at least vim) tend to add a extra newline at line end, which is not a bad idea to indicate end of the key.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I had to choose one, I would probably not have a newline. I generally agree that being strict about inputs is a much better approach, but chomping newlines is the one thin I don't see a problem with. I don't think it will cause any confusion about the file format and they can potentially avoid some deployment delays.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can make the newline optional, but it is too late for 4.0 GA branch cut.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file key are not for production use anyway.

@gregwebs
Copy link
Contributor

I think we should mention the caveats section in the introduction

@ran-huang ran-huang added the translation/doing This PR's assignee is translating this PR. label May 13, 2020
@ran-huang ran-huang self-assigned this May 13, 2020
@ran-huang ran-huang added the status/PTAL This PR is ready for reviewing. label May 13, 2020

The current version of TiKV encryption has some drawbacks that we are working actively to address in the future versions.

* When deploying a TiDB cluster, the majority of a user's data is stored in TiKV nodes, and that data will be encrypted when encryption is enabled. However there is a small amount of user data stored in PD nodes as metadata (e.g secondary index keys used as TiKV region boundaries). As of v4.0.0, PD doesn't support encryption-at-rest. We recommend using storage level encryption (e.g. file system encryption) to help protect sensitive data stored in PD.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the only source of data leakage? Or it is the biggest one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We currently believe the secondary index is the only case, but not confirmed.

Yi Wu added 2 commits May 13, 2020 11:45
Signed-off-by: Yi Wu <[email protected]>
@sre-bot sre-bot merged commit 7c1e784 into pingcap:master May 29, 2020
@yikeke
Copy link
Contributor

yikeke commented May 29, 2020

/run-cherry-picker

sre-bot pushed a commit to sre-bot/docs that referenced this pull request May 29, 2020
@sre-bot
Copy link
Contributor

sre-bot commented May 29, 2020

cherry pick to release-4.0 in PR #2635

sre-bot added a commit that referenced this pull request May 29, 2020
@yiwu-arbug yiwu-arbug deleted the encryption branch June 12, 2020 21:22
@yikeke yikeke added the priority/P1 The issue has P1 priority. label Jun 13, 2020
@yikeke yikeke added size/large Changes of a large size. and removed priority/P1 The issue has P1 priority. labels Jul 7, 2020
@TomShawn TomShawn self-assigned this Jul 9, 2020
@yikeke yikeke added translation/done This PR has been translated from English into Chinese and updated to pingcap/docs-cn in a PR. and removed translation/doing This PR's assignee is translating this PR. labels Jul 9, 2020
@qiancai qiancai mentioned this pull request Feb 8, 2024
18 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/large Changes of a large size. status/can-merge Indicates a PR has been approved by a committer. translation/done This PR has been translated from English into Chinese and updated to pingcap/docs-cn in a PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants