Introduce markdownlint (#19)
Use markdownlint
breezewish authored Dec 19, 2018
1 parent 14ea9d6 commit d6939c1
Showing 10 changed files with 134 additions and 101 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
node_modules/
package-lock.json
7 changes: 7 additions & 0 deletions .markdownlint.json
@@ -0,0 +1,7 @@
{
  "default": true,
  "MD013": {
    "code_blocks": false
  },
  "MD034": false
}
14 changes: 5 additions & 9 deletions .travis.yml
@@ -1,10 +1,6 @@
language: minimal

env:
- IGNORE_PAT="^\[.*\]:"

language: node_js
node_js:
- 10
cache: npm
script:
- for f in text/*; do
grep -vE $IGNORE_PAT $f | fold -s | diff -I $IGNORE_PAT - $f ||
( echo "Please wrap RFCs at 80 characters" && exit 1 );
done
- npm run lint
20 changes: 14 additions & 6 deletions README.md
@@ -10,29 +10,37 @@
The "RFC" (request for comments) process is intended to provide a consistent
and controlled path for new features to enter the project, so that all
stakeholders can be confident about the direction the project is evolving in.

### How to submit an RFC
## How to submit an RFC

1. Copy `template.md` into `text/YYYY-MM-DD-my-feature.md`.
2. Write the document and fill in the blanks.
3. Submit a pull request.

### Timeline of an RFC
## Timeline of an RFC

1. An RFC is submitted as a PR.
2. Discussion takes place, and the text is revised in response.
3. The PR is merged or closed when at least two project maintainers reach
consensus.

### Style of an RFC
## Style of an RFC

We should wrap lines at 80 characters.
We follow the lint rules listed in
[markdownlint](https://github.com/DavidAnson/markdownlint/blob/master/doc/Rules.md).

### License
Run lints (you must have [Node.js](https://nodejs.org) installed):

```bash
# Install linters: npm install
npm run lint
```

## License

This content is licensed under the Apache License, Version 2.0,
([LICENSE](LICENSE) or http://www.apache.org/licenses/LICENSE-2.0)

### Contributions
## Contributions

Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in the work by you, as defined in the Apache-2.0 license, shall
10 changes: 10 additions & 0 deletions package.json
@@ -0,0 +1,10 @@
{
  "name": "rfcs",
  "version": "0.0.0",
  "devDependencies": {
    "markdownlint-cli": "^0.13.0"
  },
  "scripts": {
    "lint": "markdownlint text/*.md"
  }
}
14 changes: 7 additions & 7 deletions template.md
@@ -1,33 +1,33 @@
<!-- 80 characters per line is preferred -->
# Title

# Summary
## Summary

One para explanation of the proposal.

# Motivation
## Motivation

Why are we doing this? What use cases does it support? What is the expected
outcome?

# Detailed design
## Detailed design

This is the bulk of the RFC. Explain the design in enough detail that:

- It is reasonably clear how the feature would be implemented.
- Corner cases are dissected by example.
- It is clear how the feature is used.

# Drawbacks
## Drawbacks

Why should we not do this?

# Alternatives
## Alternatives

- Why is this design the best in the space of possible designs?
- What other designs have been considered and what is the rationale for not
choosing them?
- What is the impact of not doing this?

# Unresolved questions
## Unresolved questions

What parts of the design are still to be determined?
18 changes: 10 additions & 8 deletions text/2017-12-22-read-pool.md
@@ -1,10 +1,12 @@
# Summary
# Read Pool

## Summary

This RFC proposes a new thread pool that supports async operations, which is
called `ReadPool`. It is designed to be specifically used for all sorts of read
operations like Kv Get, Kv Scan and Coprocessor Read to improve performance.
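
For intuition, here is a tiny self-contained sketch of the idea, built directly
on the `futures-cpupool` crate's `CpuPool` (which this RFC mentions utilizing).
The pool size, thread-name prefix, and the fake read closure are illustrative
assumptions, not the RFC's actual `ReadPool` API.

```rust
// Sketch only: a pool dedicated to read operations, separate from the pool
// that schedules writes. Not the RFC's `ReadPool`/`FuturePool` implementation.
extern crate futures;
extern crate futures_cpupool;

use futures::Future;
use futures_cpupool::{Builder, CpuPool};

fn main() {
    // A thread pool reserved for reads (size and name are arbitrary here).
    let read_pool: CpuPool = Builder::new()
        .pool_size(4)
        .name_prefix("readpool-")
        .create();

    // A read request (e.g. a KV Get) is submitted as a future and runs on the
    // dedicated pool, so heavy reads do not occupy the write threads.
    let value = read_pool
        .spawn_fn(|| -> Result<&'static str, ()> { Ok("value-for-key") })
        .wait()
        .unwrap();

    println!("got: {}", value);
}
```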

# Motivation
## Motivation

Currently, all KV operations that stay inside `Storage` are executed on the
same thread pool. This leads to bad operation isolation, i.e. heavy
@@ -24,7 +26,7 @@
In this RFC, there will be a ReadPool for KV operations and another ReadPool
for Coprocessor operations, so that they won't block each other. In the future,
it may be possible to merge them together.

# Detailed design
## Detailed design

The ReadPool provides these features:

@@ -42,7 +44,7 @@
supports features 1 and 3. In this way, `FuturePool` becomes a common facility
that can be reused in the future to run other kinds of async operations that
need contexts, not just limited to the scenario listed in this RFC.

## FuturePool
### FuturePool

To support periodical ticking, this RFC defines:

@@ -129,7 +131,7 @@
impl<T: Context + 'static> FuturePool<T> {
}
```

## ReadPool
### ReadPool

`ReadPool` is implemented over `FuturePool` as follows:

@@ -181,13 +183,13 @@
impl<T: futurepool::Context + 'static> ReadPool<T> {
}
```

# Drawbacks
## Drawbacks

The `on_tick` implementation in `FuturePool` is not perfect. It is driven by
task completion, which means it will not be executed when there is no task.
This is acceptable since we are going to use `on_tick` to report metrics.
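
As a concrete illustration of this limitation, here is a minimal sketch (the
names and structure are assumed, not the RFC's `FuturePool` code): the elapsed
time is only checked when a task finishes, so if no task ever completes,
`on_tick` is never invoked.

```rust
use std::time::{Duration, Instant};

// Sketch of a tick check that is driven purely by task completion.
struct TickChecker {
    interval: Duration,
    last_tick: Instant,
}

impl TickChecker {
    fn new(interval: Duration) -> TickChecker {
        TickChecker {
            interval,
            last_tick: Instant::now(),
        }
    }

    // Called after each task completes; fires `on_tick` at most once per
    // interval. With no completed tasks, this is never reached.
    fn on_task_finished<F: FnMut()>(&mut self, mut on_tick: F) {
        if self.last_tick.elapsed() >= self.interval {
            self.last_tick = Instant::now();
            on_tick();
        }
    }
}

fn main() {
    let mut checker = TickChecker::new(Duration::from_millis(100));
    for _ in 0..3 {
        std::thread::sleep(Duration::from_millis(60)); // simulate a task
        checker.on_task_finished(|| println!("tick: report metrics here"));
    }
}
```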

# Alternatives
## Alternatives

We can implement our own future pool instead of utilizing `CpuPool`. In this
way, we can have a true `on_tick`. However, this is complex and is not a
@@ -196,6 +198,6 @@
necessary feature we need.
We can implement our own async model instead of utilizing `Future`. However,
there are no benefits compared to the current solution.

# Unresolved questions
## Unresolved questions

None.
88 changes: 45 additions & 43 deletions text/2018-08-29-unsafe-destroy-range.md
@@ -1,4 +1,6 @@
# Summary
# Unsafe Destroy Range

## Summary

Support RPC `UnsafeDestroyRange`. This call is on the whole TiKV rather than a
certain Region. When it is invoked, TiKV will use `delete_files_in_range` to
@@ -11,7 +13,7 @@
say, the range is permanently scrapped.
This interface is only designed for TiDB. It's used to clean up data after
dropping/truncating a huge table/index.
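
As a rough, self-contained illustration of this "drop the files first, then
clean up whatever remains" flow (described in more detail below), here is a
sketch written against a hypothetical minimal engine trait; the `Engine` trait,
`MemEngine`, and the batch constant are made-up stand-ins, not TiKV's real
RocksDB bindings.

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;

// Hypothetical minimal engine interface (NOT TiKV's real API).
trait Engine {
    // Phase 1: drop whole files covering [start, end) to free space quickly.
    fn delete_files_in_range(&self, start: &[u8], end: &[u8]);
    // Phase 2 helpers: scan up to `limit` remaining keys, then delete a batch.
    fn scan_remaining_keys(&self, start: &[u8], end: &[u8], limit: usize) -> Vec<Vec<u8>>;
    fn delete_keys(&self, keys: &[Vec<u8>]);
}

const MAX_BATCH_KEYS: usize = 256; // illustrative batch bound

// The two-phase flow: drop files first, then scan-and-delete the leftovers.
fn unsafe_destroy_range<E: Engine>(engine: &E, start: &[u8], end: &[u8]) {
    engine.delete_files_in_range(start, end);
    loop {
        let keys = engine.scan_remaining_keys(start, end, MAX_BATCH_KEYS);
        if keys.is_empty() {
            break;
        }
        engine.delete_keys(&keys);
    }
}

// Toy in-memory "engine" so the sketch runs end to end.
struct MemEngine(RefCell<BTreeMap<Vec<u8>, Vec<u8>>>);

impl Engine for MemEngine {
    fn delete_files_in_range(&self, _start: &[u8], _end: &[u8]) {
        // A real engine would drop whole SST files here; the toy engine leaves
        // everything for the scan-and-delete phase.
    }
    fn scan_remaining_keys(&self, start: &[u8], end: &[u8], limit: usize) -> Vec<Vec<u8>> {
        self.0
            .borrow()
            .range(start.to_vec()..end.to_vec())
            .map(|(k, _)| k.clone())
            .take(limit)
            .collect()
    }
    fn delete_keys(&self, keys: &[Vec<u8>]) {
        let mut map = self.0.borrow_mut();
        for k in keys {
            map.remove(k);
        }
    }
}

fn main() {
    let engine = MemEngine(RefCell::new(
        (0u8..100).map(|i| (vec![i], vec![i])).collect(),
    ));
    unsafe_destroy_range(&engine, &[10], &[90]);
    println!("remaining keys: {}", engine.0.borrow().len()); // prints 20
}
```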

# Motivation
## Motivation

Currently, when TiDB drops/truncates a table/index, after the GC lifetime it
will invoke `DeleteRange` on each affected Region. Though TiKV will
@@ -27,7 +29,7 @@
In our tests, this can finish deleting data of a hundreds-of-GB table in
seconds, while the old implementation may cost hours to finish. Also the disk
space is released very fast.

# Detailed design
## Detailed design

There are several steps to implement it.

@@ -49,18 +51,18 @@
currently, it's possible that there will be more parameters added to `Context`,
which is also possibly needed by `UnsafeDestroyRange`.
2. When TiKV receives `UnsafeDestroyRangeRequest`, it immediately runs
`delete_files_in_range` on the whole range on RocksDB, bypassing the Raft
layer. Then it will scan and delete all remaining keys in the range.
* Because `UnsafeDestroyRange` only helps TiDB's GC, we put the logic
of handling the `UnsafeDestroyRange` request in `storage/gc_worker`.
* `GCWorker` needs a reference to raftstore's underlying RocksDB now.
However, it's a component of `Storage` that knows nothing about the
implementation of the `Engine`. Nor can we specialize it for `RaftKv` to
get the RocksDB, since it's just a router. The simplest way is to pass the
RocksDB's Arc pointer explicitly in `tikv-server.rs`.
* We regard `UnsafeDestroyRange` as a special type of GCTask, which will be
executed in TiKV's GCWorker's thread:
`delete_files_in_range` on the whole range on RocksDB, bypassing the Raft
layer. Then it will scan and delete all remaining keys in the range.
* Because `UnsafeDestroyRange` only helps TiDB's GC, we put the logic
of handling the `UnsafeDestroyRange` request in `storage/gc_worker`.
* `GCWorker` needs a reference to raftstore's underlying RocksDB now.
However, it's a component of `Storage` that knows nothing about the
implementation of the `Engine`. Nor can we specialize it for
`RaftKv` to get the RocksDB, since it's just a router. The simplest way is
to pass the RocksDB's Arc pointer explicitly in `tikv-server.rs`.
* We regard `UnsafeDestroyRange` as a special type of GCTask, which will be
executed in TiKV's GCWorker's thread:
```rust
enum GCTask {
GC {
@@ -76,10 +78,10 @@
},
},
}
```
* The logic of `UnsafeDestroyRange` is quite similar to `DeleteRange` in
apply worker: First, call `delete_files_in_range` to quickly drop most of the
data and also quickly release the space; Then, scan and delete all remaining
keys in the range.
* The logic of `UnsafeDestroyRange` is quite similar to `DeleteRange` in
apply worker: First, call `delete_files_in_range` to quickly drop most of
the data and also quickly release the space; Then, scan and delete all
remaining keys in the range.
We should choose a proper batch size for the second step (scan and
delete). Currently raftstore uses 4MB as the max batch size when doing
@@ -92,42 +94,42 @@
not affected any more.
let's continue to see what we will do on TiDB:
3. In TiDB, when you execute `TRUNCATE/DROP TABLE/INDEX`, the meta data of the
related table will be modified at first, and then the range covered by the
truncated/dropped table/index will be recorded in the system table
`gc_delete_range` with the timestamp when the `TRUNCATE/DROP` query is executed.
TiDB GC worker will check this table every time it does GC. In the current
implementation, if there are records in `gc_delete_range` with timestamps
smaller than the safe point, TiDB will delete ranges indicated by these records
by invoking `DeleteRange`, and also move these records from `gc_delete_range`
to `gc_delete_range_done`. So in the new implementation:
related table will be modified at first, and then the range covered by the
truncated/dropped table/index will be recorded in the system table
`gc_delete_range` with the timestamp when the `TRUNCATE/DROP` query is executed.
TiDB GC worker will check this table every time it does GC. In the current
implementation, if there are records in `gc_delete_range` with timestamps
smaller than the safe point, TiDB will delete ranges indicated by these
records by invoking `DeleteRange`, and also move these records from
`gc_delete_range` to `gc_delete_range_done`. So in the new implementation:
* When doing `DeleteRanges` (the second step of GC), invoke the new
`UnsafeDestroyRange` on each TiKV, instead of running the old `DeleteRange` on
each Region. Also we need to add an interface to PD so that TiDB can get a list
of all TiKVs.
`UnsafeDestroyRange` on each TiKV, instead of running the old
`DeleteRange` on each Region. Also we need to add an interface to PD so
that TiDB can get a list of all TiKVs.
* Check the record again from `gc_delete_range_done` after 24 hours (24
hours since it was moved to table `gc_delete_range_done`), and invoke
`UnsafeDestroyRange` again. Then this record can be removed from table
`gc_delete_range_done`.
hours since it was moved to table `gc_delete_range_done`), and invoke
`UnsafeDestroyRange` again. Then this record can be removed from table
`gc_delete_range_done`.
And why do we need to check it one more time after 24 hours? After deleting
the range the first time, if coincidentally PD is trying to move a Region or
something, some data may still appear in the range. So checking it one more
time after a proper interval greatly reduces this possibility.
# Drawbacks
## Drawbacks
* It breaks the consistency among Raft replicas.
* But fortunately, in TiDB, when `DeleteRanges` happens, the deleted range
will never be accessed again.
* But fortunately, in TiDB, when `DeleteRanges` happens, the deleted range
will never be accessed again.
* To bypass the Raft layer, the code might violate the layered architecture of
TiKV.
TiKV.
* Sending requests to each TiKV server has never occurred in TiDB before. This
pattern is introduced to TiDB's code for the first time.
pattern is introduced to TiDB's code for the first time.
* After deleting all files in a very large range, it may trigger RocksDB's
compaction (even if it is not really needed) and even cause stalling.
compaction (even if it is not really needed) and even cause stalling.
# Alternatives
## Alternatives
We've considered several different ideas. But in order to solve the problem
quickly, we prefer an easier approach.
@@ -137,8 +139,8 @@
scenario where TiKV is only used by TiDB, deleted range will never be accessed
again. So in fact, if we add an interface that is specific to TiDB, we just
needn't keep this kind of consistency. It's safe enough.
# Unresolved questions
## Unresolved questions
* A huge number of empty Regions will be left after invoking
`UnsafeDestroyRange`. It may take a long time to merge them all. If possible,
they should be merged faster.
`UnsafeDestroyRange`. It may take a long time to merge them all. If possible,
they should be merged faster.