txn: make the unique key lock behaviour consistent for all executors #22837

cfzjywxk · 2021-02-20T02:29:41Z

Development Task

After #21483, we'll try to lock all the related unique keys for pessimistic transactions if the executor is pointGet or batchPointGet. For other coprocessor related executors, we should also implement this for them, and make the lock behaviours consistent.
Tasks are:

Try to lock all unique keys for existing data for all executors, make them behave the same.
Make a detailed and improved document about the pessimistic lock behaviours of TiDB.

Another possible solution is to push down the pessimistic lock operations into the kv requests doing data read, the difference behaviours could be resolved, check rfc for more details.

The text was updated successfully, but these errors were encountered:

zhaoxugang · 2021-04-19T16:01:15Z

I want to join to resolve this issue, what can i do for this issue?@cfzjywxk

cfzjywxk · 2021-04-20T01:09:24Z

I want to join to resolve this issue, what can i do for this issue?@cfzjywxk

@zhaoxugang
I think maybe we need to to find out a way to lock the existing unique index keys for pessimistic transaction in selectLockExec executor, currently only row keys are locked.

zhaoxugang · 2021-05-06T16:28:52Z

the query by unique index keys will be executed by PointGetExec ot BatchPointGetExec, so i don't know what can i do for resolve the issue@you06,@cfzjywxk

you06 · 2021-05-08T17:27:06Z

the query by unique index keys will be executed by PointGetExec ot BatchPointGetExec, so i don't know what can i do for resolve the issue@you06,@cfzjywxk

In a short word, current PointGetExec and BatchPointGetExec will lock exist/unexist keys, however other executors will lock exist keys only. With the executor choice in planner module, it would be confusing for users.

eg.

-- id is a unique key, val is not
select * from t where id = 1 and val = 1; -- sql1, lock unexist key(id  = 1)
select * from t where val = 1; -- sql2, does not lock any keys if val = 1 not exist

Notice that sql2 takes a looser condition, it should cover the effects which sql1 makes. However, it does not lock any keys actually. Besides, the push-down requests scan need-lock keys in coprocessor in TiKV, there will be a great performance impact if we easily lock the query range for non-unique keys.

In a short word, this issue doesn't have a well design yet.

zhaoxugang · 2021-05-10T15:48:22Z

got it,thx

zhaoxugang · 2021-05-30T02:12:13Z

I have an idea , It is i choose which index to lock dependency access path of data source . If there not exists index that was choosing , i will lock the table .
@you06

you06 · 2021-06-08T03:54:22Z

I have an idea , It is i choose which index to lock dependency access path of data source . If there not exists index that was choosing , i will lock the table .
@you06

Lock table sounds terrible, that will introduce a great performance impact.

dwangxxx · 2022-01-11T04:48:45Z

the query by unique index keys will be executed by PointGetExec ot BatchPointGetExec, so i don't know what can i do for resolve the issue@you06,@cfzjywxk

In a short word, current PointGetExec and BatchPointGetExec will lock exist/unexist keys, however other executors will lock exist keys only. With the executor choice in planner module, it would be confusing for users.

eg.
-- id is a unique key, val is not
select * from t where id = 1 and val = 1; -- sql1, lock unexist key(id  = 1)
select * from t where val = 1; -- sql2, does not lock any keys if val = 1 not exist
Notice that sql2 takes a looser condition, it should cover the effects which sql1 makes. However, it does not lock any keys actually. Besides, the push-down requests scan need-lock keys in coprocessor in TiKV, there will be a great performance impact if we easily lock the query range for non-unique keys.

In a short word, this issue doesn't have a well design yet.

@you06
The execution plan of select * from t where id = 1 and v = 1 and val = 30 for update; is like this:

It will not lock any unexist keys. So in this situation, the task of this issue is to lock the unexist key(id = 1 and v = 1) in SelectLockExec or IndexLookUpExec to block other txn to update or insert at key(id = 1 and v = 1)?

you06 · 2022-01-11T05:38:57Z

@dwangxxx It seems your case talks about the table like create table t(id int, v int, val int, unique index u(id ,v));.

I don't mean it needs to lock the unexist key when with predicate id = 1 and v = 1 and val = 30 with empty result. Because id = 1 and v = 1 is wider than id = 1 and v = 1 and val = 30, so id = 1 and v = 1 locks the unexist key and id = 1 and v = 1 and val = 30 does not lock the unexist key is ok to me(just my personal opinion).
The ideal model for me is supporting range lock, so that when using predicate v = 1 will lock more keys than id = 1 and v = 1.

dwangxxx · 2022-01-11T06:19:19Z

@dwangxxx It seems your case talks about the table like create table t(id int, v int, val int, unique index u(id ,v));.

I don't mean it needs to lock the unexist key when with predicate id = 1 and v = 1 and val = 30 with empty result. Because id = 1 and v = 1 is wider than id = 1 and v = 1 and val = 30, so id = 1 and v = 1 locks the unexist key and id = 1 and v = 1 and val = 30 does not lock the unexist key is ok to me(just my personal opinion). The ideal model for me is supporting range lock, so that when using predicate v = 1 will lock more keys than id = 1 and v = 1.

@you06
So the ideal model you mentioned is like this?

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 for update;  -- this will lock all keys id = 1
/* t2 */ insert into t values(1, 2, 3); -- this will be blocked until t1 commit

you06 · 2022-01-11T06:26:11Z

@you06 So the ideal model you mentioned is like this?

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 for update;  -- this will lock all keys id = 1
/* t2 */ insert into t values(1, 2, 3); -- this will be blocked until t1 commit

Yes.

dwangxxx · 2022-01-11T06:43:10Z

@you06
I'm a little confused. How about the non-unique key? I list some situations, could you help figure out which should be blocked or not? Suppose table t is empty. Thanks~

test1

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 for update; 
/* t2 */ insert into t values(1, x, x);

test2

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 and v = 1 for update; 
/* t2 */ insert into t values(1, 1, x);

test3

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 and v = 1 and val = 3 for update; 
/* t2 */ insert into t values(1, 1, x);

test4

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where val = 3 for update; 
/* t2 */ insert into t values(x, x, 3);

test5

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 and val = 3 for update; 
/* t2 */ insert into t values(1, x, x);

you06 · 2022-01-11T12:01:46Z

@you06 I'm a little confused. How about the non-unique key? I list some situations, could you help figure out which should be blocked or not? Suppose table t is empty. Thanks~

Thanks for the cases. Suppose the only unique/primary key id (id, v) as you described before. There are some statements
request for cursors by predicate query and some request for item query(lookup one record by primary/unique index). Notice this is out of the isolation level discussion, the cursor is an assistant tool to avoid some anomalies.

~~You convince me to give up my standpoint.~~

The hypothesis from my personal perspective: If condition1 is stricter than condition2, it should take no more cursors(aka. locks) than condition2.

Here we go to the analysis:

test1

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 for update; 
/* t2 */ insert into t values(1, x, x);

Block.

Because id = 1 and v = 1 takes the cursor on (1, 1), id = 1 certainly takes the cursor on (1, x). Unless it may lead to the lost update:

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 for update; -- 0 rows
/* t2 */ insert into t values(1, 1, 1); -- success without block
/* t2 */ commit;
/* t1 */ update t set v = v + 1 where id = 1; -- 1 row affected
/* t1 */ commit;

test2

Block.

It should block by the cursor(lock) on the non-exist key.

test3

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 and v = 1 and val = 3 for update; 
/* t2 */ insert into t values(1, 1, x);

Not block.

According to the hypothesis, the for-update statement in t1 can obtain fewer cursors than it in test2, so it's ok for this case not to lock (1, 1, x), but need to lock (1, 1, 3) to avoid lost update issue. We can also not lock (1, 1, 3) and claim that avoiding cursor lost update only works in item-type queries, don't discuss it here.

test4

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where val = 3 for update; 
/* t2 */ insert into t values(x, x, 3);

Block.

test4 to test3 is similar with test1 to test2, val = 3 covers id = x and v = x and val = 3, which should fetch all cursors of (x, x, 3). Here the problem comes, it's almost impossible to implement it in a lock-based concurrency control system.

test5

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 and val = 3 for update; 
/* t2 */ insert into t values(1, x, x);

Block.

If we take this model:

The cursor only works in index-related columns, i.e. there should be a unique index that is used to perform lock.

create table t(id int, v int, val int, unique index u(id ,v)); -- empty table
select * from t where id = 1 for update; -- lock index u with (1, x)
select * from t where v = 1 for update; -- lock index u with (x, 1)
select * from t where val = 1 for update; -- do not lock any index keys

test1

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 for update; 
/* t2 */ insert into t values(1, x, x);

Block, id = 1 will lock (1, x);

test2

Block, same as TiDB's current behavior.

test3

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 and v = 1 and val = 3 for update; 
/* t2 */ insert into t values(1, 1, x);

Block or not are both reasonable, same as you describe before, it can lock less data than id = 1 and v = 1.

test4

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where val = 3 for update; 
/* t2 */ insert into t values(x, x, 3);

Not block, no index matches val = 3.

test5

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 and val = 3 for update; 
/* t2 */ insert into t values(1, x, x);

Block, id = 1 can lock (1, x).

The difficulty of this strategy will be the implementation of range lock in the distributed systems.

Back to test3 which we discussed before, adding a lock for id = 1 and v = 1 will avoid this kind of lost update, will it fix any cases besides this?

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 and v = 1 and val = 3 for update; -- empty
/* t2 */ insert into t values(1, 1, 1); -- success without block
/* t2 */ commit;
/* t1 */ update t set v = 10 where id = 1 and v = 1; -- 1 row affected
/* t1 */ commit;

dwangxxx · 2022-01-11T12:46:01Z

@you06
Acctually, I can not think of any other situations we can avoid about test3 if we lock id = 1 and v = 1. Just in my opinion, if we block test5 in id = 1, we should also block test3. I think this way may make the lock behaviors more consistent.

And about the difficulty, it is because the need-lock keys may be distributed in different regions?

you06 · 2022-01-11T13:12:44Z

@dwangxxx It's not need-lock keys, it's need-lock range. A range may be distributed in different regions which makes things complex and expensive.

dwangxxx · 2022-01-11T13:41:16Z

@you06
Ok, I see.
But I have another question. If we lock range like we discussed before, I think the transaction concurrence may be affected. If txn1 lock a wide range(like id = 1), but it just needs to update one record, this may cause the bad affect that other transactions are blocked(suppose there are many transactions need to update about id = 1, like id = 1 and v = x, different transaction has different x).

you06 · 2022-01-12T03:20:49Z

@dwangxxx You're right, 2 phase lock harms performance a lot. Do you mean we may not lock any keys or ranges with id = 1?

dwangxxx · 2022-01-13T02:09:17Z

@dwangxxx You're right, 2 phase lock harms performance a lot. Do you mean we may not lock any keys or ranges with id = 1?

@you06
Maybe we should.
I run the above five tests on MySQL. All tests are blocked. And the following tests:

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 for update; -- empty
/* t2 */ insert into t values(2, 2, 1); -- blocked

/* t1 */ begin pessimistic;
/* t2 */ begin pessimistic;
/* t1 */ select * from t where id = 1 and v = 1 for update; -- empty
/* t2 */ insert into t values(2, 2, 1); -- blocked

They are also blocked. So I don't know if we should be like MySQL?

cfzjywxk added type/enhancement The issue or PR belongs to an enhancement. sig/transaction SIG:Transaction labels Feb 20, 2021

cfzjywxk assigned you06 Feb 20, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

txn: make the unique key lock behaviour consistent for all executors #22837

txn: make the unique key lock behaviour consistent for all executors #22837

cfzjywxk commented Feb 20, 2021 •

edited

Loading

zhaoxugang commented Apr 19, 2021

cfzjywxk commented Apr 20, 2021

zhaoxugang commented May 6, 2021

you06 commented May 8, 2021

zhaoxugang commented May 10, 2021

zhaoxugang commented May 30, 2021

you06 commented Jun 8, 2021

dwangxxx commented Jan 11, 2022

you06 commented Jan 11, 2022

dwangxxx commented Jan 11, 2022

you06 commented Jan 11, 2022

dwangxxx commented Jan 11, 2022

you06 commented Jan 11, 2022

dwangxxx commented Jan 11, 2022

you06 commented Jan 11, 2022

dwangxxx commented Jan 11, 2022 •

edited

Loading

you06 commented Jan 12, 2022

dwangxxx commented Jan 13, 2022

txn: make the unique key lock behaviour consistent for all executors #22837

txn: make the unique key lock behaviour consistent for all executors #22837

Comments

cfzjywxk commented Feb 20, 2021 • edited Loading

Development Task

zhaoxugang commented Apr 19, 2021

cfzjywxk commented Apr 20, 2021

zhaoxugang commented May 6, 2021

you06 commented May 8, 2021

zhaoxugang commented May 10, 2021

zhaoxugang commented May 30, 2021

you06 commented Jun 8, 2021

dwangxxx commented Jan 11, 2022

you06 commented Jan 11, 2022

dwangxxx commented Jan 11, 2022

you06 commented Jan 11, 2022

dwangxxx commented Jan 11, 2022

you06 commented Jan 11, 2022

dwangxxx commented Jan 11, 2022

you06 commented Jan 11, 2022

dwangxxx commented Jan 11, 2022 • edited Loading

you06 commented Jan 12, 2022

dwangxxx commented Jan 13, 2022

cfzjywxk commented Feb 20, 2021 •

edited

Loading

dwangxxx commented Jan 11, 2022 •

edited

Loading