Commit

discuss normal locks
Signed-off-by: ekexium <[email protected]>
ekexium committed Aug 23, 2024
1 parent 93f799e commit 7938582
Showing 1 changed file with 70 additions and 6 deletions.
76 changes: 70 additions & 6 deletions text/0114-resolved-ts-for-large-transactions.md
@@ -14,9 +14,9 @@ In current TiKV(v8.3), large transactions can block resolve-ts from advancing, b

## Goals

Do not let **large pipelined transactions** block the advance of resolved-ts.
In the current phase, our primary goal is to keep **large pipelined transactions** from blocking the advance of resolved-ts. We focus on large pipelined transactions here, but the approach could be adapted for general "large" transactions.

We focus on large pipelined transactions here. It could be adapted for general "large" transactions.
Our ultimate goal is an unblocked resolved-ts progression. Besides long transactions and their locks, other factors can block the advance of resolved-ts. We discuss them in the last part of the proposal.

## Assumptions

@@ -28,11 +28,11 @@ This constraint is not a strict limit, but rather serves to manage resource util

The key idea is using `lock.min_commit_ts` to calculate resolved-ts instead of `lock.start_ts`.

A resolved-ts guarantees that all historical events prior to this timestamp are finalized and observable. 'Historical events' in this context refer specifically to write records and rollback records, but explicitly exclude locks. It's important to note that the absence of locks with earlier timestamps is not a requirement for a valid resolved-ts, as long as the status of their corresponding transactions is definitively determined.
A resolved timestamp (resolved-ts) ensures that all historical events before this point are finalized and observable. In this context, 'historical events' specifically mean write and rollback records, excluding locks in the LOCK CF. Importantly, a valid resolved-ts doesn't require the absence of earlier locks, as long as their transactions' status is determined.

### Maintenance of resolved-ts

Key objective: Maximize TiKV nodes' awareness of large pipelined transactions, including:
Key objective: Maximize all TiKV nodes' awareness of large pipelined transactions during their lifetime, i.e. from their first writes until all locks are committed. The following information is necessary:

1. start_ts
2. Recent min_commit_ts
@@ -44,6 +44,12 @@ For a large pipelined transaction, its TTL manager is responsible for fetching a

Atomic variables or locks may be needed for synchronization between the TTL manager and the committer.

#### Scaling out TiKVs

When a new TiKV instance joins the cluster in the middle of a large transaction, the transaction's TTL manager must start broadcasting to it promptly. The TTL manager gets the list of stores from the region cache, so if the region cache is unaware of a newly started TiKV, the TTL manager may miss it.

To mitigate this, we propose implementing an optional routine in the region cache to periodically fetch all stores.
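
A minimal sketch of such a routine, under stated assumptions: `StoreTracker` and `fetch_all_stores` are hypothetical names, with `fetch_all_stores` standing in for whatever call returns the full store list. The scheduling of the periodic call (timer, interval) is left to the caller.

```python
from typing import Callable, Iterable, Set


class StoreTracker:
    """Tracks the store IDs a TTL manager should broadcast to (hypothetical helper)."""

    def __init__(self, fetch_all_stores: Callable[[], Iterable[int]]):
        # fetch_all_stores is an assumed stand-in for a "list all stores" request.
        self._fetch_all_stores = fetch_all_stores
        self._known: Set[int] = set()

    def refresh(self) -> Set[int]:
        """Reload the full store list and return the stores not seen before."""
        current = set(self._fetch_all_stores())
        newly_added = current - self._known
        self._known = current
        return newly_added

    def broadcast_targets(self) -> Set[int]:
        """Return a copy of the stores known after the latest refresh."""
        return set(self._known)
```

Calling `refresh()` on every tick means a store added mid-transaction appears in `broadcast_targets()` after at most one interval, even if no region on it has been accessed yet.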

#### TiKV scheduler - heartbeat

Besides updating TTL, it can also update min_commit_ts of the PK.
@@ -69,13 +75,19 @@ After successfully committing all secondary locks of a large transaction, the

Resolver tracks normal locks as usual, but handles locks belonging to large pipelined transactions in a different way. The locks can be identified via the "generation" field.

For a lock belonging to a large pipelined transaction, the resolve only tracks its start_ts. When calculating resolved-ts, the resolver first tries to map the start_ts to its min_commit_ts by querying the txn_status_cache. If not found in cache, fallback to calculate using start_ts.
For locks in large pipelined transactions, the resolver only tracks the start_ts. When calculating resolved-ts, it first attempts to map start_ts to min_commit_ts via the txn_status_cache. To maintain semantics, resolved-ts must be at most min_commit_ts - 1, because the transaction may still commit at min_commit_ts. If the cache lookup fails, it falls back to using start_ts for the calculation.

Upon observing a LOCK DELETION, the resolver ceases tracking the corresponding start_ts for large pipelined transactions. This is justified as lock deletion only occurs once a transaction's final state is determined.
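
The calculation can be sketched as follows. This is an illustration under stated assumptions, not the actual resolver code: the function names are hypothetical, the txn_status_cache is modelled as a plain dict from start_ts to min_commit_ts, and normal locks are assumed to bound resolved-ts by their start_ts. The `min_commit_ts - 1` bound follows the "block at min_commit_ts - 1" rule used later in this proposal.

```python
from typing import Dict, Iterable, Optional


def lock_bound(start_ts: int, txn_status_cache: Dict[int, int]) -> int:
    """Largest resolved-ts that a tracked large-transaction lock still permits."""
    min_commit_ts: Optional[int] = txn_status_cache.get(start_ts)
    if min_commit_ts is None:
        # Cache miss: fall back to start_ts.
        return start_ts
    # The transaction can only commit at or after min_commit_ts,
    # so resolved-ts may advance up to min_commit_ts - 1.
    return max(start_ts, min_commit_ts - 1)


def resolved_ts(
    candidate_ts: int,
    normal_lock_start_ts: Iterable[int],
    large_txn_start_ts: Iterable[int],
    txn_status_cache: Dict[int, int],
) -> int:
    """Take the minimum over the candidate ts and every lock's bound."""
    bounds = [candidate_ts]
    bounds.extend(normal_lock_start_ts)  # assumption: normal locks bound by start_ts
    bounds.extend(lock_bound(ts, txn_status_cache) for ts in large_txn_start_ts)
    return min(bounds)
```

For example, a large transaction with start_ts 100 and a cached min_commit_ts of 500 limits resolved-ts to 499 instead of 100.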

### Benefits in resolving locks

Across all lock resolution scenarios—including normal reads, stale reads, flashbacks, and potentially write conflicts—a preliminary txn_status_cache lookup can significantly reduce unnecessary computational overhead introduced by large transactions.
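
As an illustration of such a lookup on the read path, here is a minimal sketch. `can_skip_lock` is a hypothetical helper and the cache is modelled as a dict from start_ts to min_commit_ts. It relies on the rule that a read at `read_ts` is unaffected by a transaction that can only commit after `read_ts`.

```python
from typing import Dict


def can_skip_lock(read_ts: int, lock_start_ts: int,
                  txn_status_cache: Dict[int, int]) -> bool:
    """Return True if the cache alone shows the lock cannot affect this read."""
    min_commit_ts = txn_status_cache.get(lock_start_ts)
    # A miss tells us nothing; the caller falls back to normal lock resolution.
    return min_commit_ts is not None and min_commit_ts > read_ts
```

A `False` result does not mean the read is blocked, only that the cheap path did not apply and the usual resolution logic has to run.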

### Compatibility

The key difference is that services can now observe locks. They need to handle the locks.
The key difference is that services can now observe many more locks.

Note that the current implementation still allows encountering locks with timestamps smaller than the resolved timestamp. This proposal doesn't change this behavior, so we don't anticipate correctness issues with this change. The main challenges will be related to performance and availability.

#### Stale read

@@ -85,10 +97,16 @@ When it meets a lock, first query the txn_status_cache. When not found in the ca

*TBD*

1. Compatibility with CDC: Flashback writes a lock to block resolved-ts during its execution. It does not use a pipelined transaction, so this lock will be treated as a normal lock.

2. The current and previous (up to v8.3) implementations of Flashback in TiKV rely on an incorrect assumption about resolved-ts guarantees. This misconception can lead to critical issues, such as the potential violation of transaction atomicity, as documented in https://github.com/tikv/tikv/issues/17415.

#### EBS snapshot backups

*TBD*

It depends on Flashback.

#### CDC

Already well documented in [Large Transactions Don't Block Watermark](https://github.com/pingcap/tiflow/blob/master/docs/design/2024-01-22-ticdc-large-txn-not-block-wm.md). Briefly, a refactoring work is needed.
@@ -103,3 +121,49 @@ RPCs: each large transaction sends N more RPCs per second, where N is the number

CPU: the mechanism may consume more CPU, but the overhead should be negligible.



## Possible future improvements

#### Tolerate lagging non-pipelined transactions

To get closer to our ultimate goal of minimizing the blocking of resolved-ts, we can further consider the case where resolved-ts is blocked by normal transaction locks. Typical causes could be:

- Memory locks from async commit and 1PC. Normal locks are region-partitioned and will not block the resolved-ts of other regions. But the concurrency manager is a node-level instance, so memory locks can block every (leader) region in the same TiKV.
- Slow transactions which take too much time committing their locks
- Long-running transactions that may not be large.
- Node failures



Resolved-ts must continuously progress. However, it can't advance autonomously while ignoring locks. Such advancement would require the commit PK operation to either complete before the resolved-ts reaches a certain point or fail. This guarantee is not feasible.

The remaining feasible approach to prevent resolved-ts from being blocked by normal transactions is to actively push their min_commit_ts, similar to what is done for large transactions.

However, locks using async commit cannot be pushed.

To sum up, when a resolver meets a lock whose min_commit_ts still blocks its resolved-ts candidate (R_TS candidate), it proceeds as follows:

- Check the cache
  - Found, if T.min_commit_ts >= R_TS candidate -> skip the lock
  - Else, fall through
- 2PC locks: check_txn_status and try to push its min_commit_ts
  - Committed -> return its commit_ts
    - commit_ts > R_TS candidate -> skip the lock
    - commit_ts < R_TS candidate -> block at commit_ts - 1
  - min_commit_ts pushed, or min_commit_ts > R_TS candidate -> skip the lock
  - Rolled back -> skip the lock
  - Else -> block at min_commit_ts - 1
- Async commit locks: check its status
  - Committed -> same as 2PC locks
  - Rolled back -> skip the lock
  - Else -> block at min_commit_ts - 1

Locks belonging to the same transaction can be consolidated.

To mitigate uncontrollable overhead and metastability risks, we limit our check to the oldest K transactions per region, given that the total number of transactions could be substantial.
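
The per-lock decision listed above can be sketched as a single function. This is a hedged illustration only. All names here (`Lock`, `Committed`, `RolledBack`, `Pending`, `blocking_ts`) are hypothetical, and `check_txn_status` is an assumed callable taking `(lock, try_push)` and returning one of the three status objects. It is not the real RPC signature. The list does not say what happens when commit_ts equals the candidate, so this sketch blocks conservatively in that case.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Union


@dataclass
class Committed:
    commit_ts: int


@dataclass
class RolledBack:
    pass


@dataclass
class Pending:
    min_commit_ts: int
    pushed: bool = False  # True if the status check pushed min_commit_ts


Status = Union[Committed, RolledBack, Pending]


@dataclass
class Lock:
    start_ts: int
    min_commit_ts: int
    async_commit: bool = False


def blocking_ts(
    lock: Lock,
    candidate: int,
    cache: Dict[int, int],
    check_txn_status: Callable[[Lock, bool], Status],
) -> Optional[int]:
    """Return None to skip the lock, or the ts at which resolved-ts must stop."""
    cached = cache.get(lock.start_ts)
    if cached is not None and cached >= candidate:
        return None  # cache hit: the lock no longer blocks the candidate
    # Only 2PC locks can be pushed; async commit locks are checked without a push.
    status = check_txn_status(lock, not lock.async_commit)
    if isinstance(status, RolledBack):
        return None
    if isinstance(status, Committed):
        if status.commit_ts > candidate:
            return None
        return status.commit_ts - 1  # also covers commit_ts == candidate (assumption)
    if not lock.async_commit and (status.pushed or status.min_commit_ts > candidate):
        return None
    return status.min_commit_ts - 1
```

A caller would take the minimum of the candidate and every non-`None` result, after consolidating locks of the same transaction and restricting the check to the oldest K transactions per region.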

#### Reduce write-read conflicts

Read requests typically require a check_txn_status to advance the min_commit_ts. We propose allowing large transactions to set their min_commit_ts to a higher value, potentially exceeding the current TSO. These min_commit_ts values, stored in the txn_status_cache, would enable read requests encountering locks to bypass them via a cache lookup. Large transactions would cease this special min_commit_ts setting once ready for prewrite.
