Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition #32

Open
G-M-twostay opened this issue Jan 6, 2023 · 6 comments
Open

Race condition #32

G-M-twostay opened this issue Jan 6, 2023 · 6 comments

Comments

@G-M-twostay
Copy link

G-M-twostay commented Jan 6, 2023

Hi, I saw that the sorted linked list is based on Harris's linked list, but according to my understanding, it's not correctly written. Harris's linked list is based on 2 assumptions: 1. Atomic operation on node.next can see changes to node.delete. 2. node.next on a deleted node is still valid and can be used to track to the original next. In this implementation, I see that: 1. node.delete and node.next can't be dealt with in a single atomic operation. This is problematic, consider: node.delete can change immediately before(after the if checks) or during a CAS operation on node.next, and during this process, a successful physical deletion can happen before the CAS operation completes/starts, therefore, the new node is linked onto a deleted node. This is my understanding, correct me if I'm wrong.

I encountered the above problem in my initial attempts to implement such a hashmap using Harris's linked list.

With this in mind, I designed a few cases that can reflect the above problem. However, I'm not sure whether the failures in the below cases are solely caused by the above reason or is/are caused by other problems. It appears to me that at least on my end Case1 has some other problem because a given key is guaranteed to fail. Anyway, let's see these cases.
Case 1:

func BenchmarkHaxMap_Case1(b *testing.B) {
	b.StopTimer()
	wg := sync.WaitGroup{}
	for i := 0; i < b.N; i++ {
		M := haxmap.New[int, int]()
		b.StartTimer()
		for k := 0; k < iter0; k++ {
			wg.Add(1)
			go func(l, h int) {
				for j := l; j < h; j++ {
					M.Set(j, j)
				}
				for j := l; j < h; j++ {
					_, a := M.Get(j)
					if !a {
						b.Error("key doesn't exist", j)
					}
				}
				for j := l; j < h; j++ {
					x, _ := M.Get(j)
					if x != j {
						b.Error("incorrect value", j, x)
					}
				}
				wg.Done()
			}(k*elementNum0, (k+1)*elementNum0)
		}
		wg.Wait()
		b.StopTimer()
	}
}

Case 2:

func BenchmarkHaxMap_Case3(b *testing.B) {
	b.StopTimer()
	wg := &sync.WaitGroup{}
	for a := 0; a < b.N; a++ {
		M := haxmap.New[int, int]()
		b.StartTimer()
		for j := 0; j < iter0; j++ {
			wg.Add(1)
			go func(l, h int) {
				defer wg.Done()
				for i := l; i < h; i++ {
					M.Set(i, i)
				}

				for i := l; i < h; i++ {
					_, x := M.Get(i)
					if !x {
						b.Errorf("not put: %v\n", i)
					}
				}
				for i := l; i < h; i++ {
					M.Del(i)

				}
				for i := l; i < h; i++ {
					_, x := M.Get(i)
					if x {
						b.Errorf("not removed: %v\n", i)
					}
				}

			}(j*elementNum0, (j+1)*elementNum0)
		}
		wg.Wait()
		b.StopTimer()
	}

}
const (
	iter0       = 1 << 3
	elementNum0 = 1 << 10
)

If you increase the data size, this problem becomes more severe. You can delete all the benchmark and timing things.

Modifying these tests to sync.Map or an ordinary map with Mutex will show that no failures happen. In addition, cornelk's hashmap also fails at these tests.

@alphadose
Copy link
Owner

@G-M-twostay understood, I will take a look at this. Might be related to #23

G-M-twostay added a commit to G-M-twostay/Go-Utils that referenced this issue Jan 6, 2023
…m/alphadose/haxmap.

Details:
1. haxmap is incorrect, see alphadose/haxmap#32.
2. Slightly changed README.md
TODO:
1. Make all the helper functions private.
@G-M-twostay
Copy link
Author

If I were correct about the race condition, then to solve it, simply enforce consistency between node.next and node.delete. One such way is to make use an additional struct to combine these 2 fields and directly operate on the immutable struct.

@alphadose
Copy link
Owner

@G-M-twostay I couldn't understand your solution completely. Can you give me an example code snippet ?

@G-M-twostay
Copy link
Author

@alphadose https://secondboyet.com/Articles/LockFreeLinkedList3.html is a better explanation. https://github.com/G-M-twostay/Go-Utils/blob/master/Maps/ChainMap/Node.go.
Harris's linked list originally used pointer tagging in C++, but unfortunately that's impossible in Go in my knowledge due to garbage collection.

@ItalyPaleAle
Copy link

ItalyPaleAle commented Sep 22, 2023

I am looking at haxmap but am a bit concerned by this issue. After playing with it a bit, it appears it may be related to a race condition while resizing.

I can repro the error easily with @G-M-twostay's code, even as a Test and not a Benchmark. But, if I allocate the haxmap with a larger initial capacity, the error does not appear.

// Old
M := haxmap.New[int, int]()
// New
M := haxmap.New[int, int](iter0 * elementNum0)

-- EDIT

It looks like this is a known "issue". I just saw this in the docs:

If a resizing operation is happening concurrently while calling Set() then the item might show up in the map only after the resize operation is finished

phuslu added a commit to phuslu/shardmap that referenced this issue Sep 24, 2023
phuslu added a commit to phuslu/shardmap that referenced this issue Sep 24, 2023
phuslu added a commit to phuslu/shardmap that referenced this issue Sep 24, 2023
phuslu added a commit to phuslu/shardmap that referenced this issue Sep 24, 2023
@G-M-twostay
Copy link
Author

I am looking at haxmap but am a bit concerned by this issue. After playing with it a bit, it appears it may be related to a race condition while resizing.

I can repro the error easily with @G-M-twostay's code, even as a Test and not a Benchmark. But, if I allocate the haxmap with a larger initial capacity, the error does not appear.

// Old
M := haxmap.New[int, int]()
// New
M := haxmap.New[int, int](iter0 * elementNum0)

-- EDIT

It looks like this is a known "issue". I just saw this in the docs:

If a resizing operation is happening concurrently while calling Set() then the item might show up in the map only after the resize operation is finished

I believe it's related to deletion.

@ItalyPaleAle ItalyPaleAle mentioned this issue May 20, 2024
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants