Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[leader election] master should't give up it's leadership easily #17

Open
BlueBlue-Lee opened this issue Jun 19, 2017 · 1 comment
Open

Comments

@BlueBlue-Lee
Copy link

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
feature request

What happened:
There is leader election in doorman server, which is achieved by etcd. Set a key with delay ttl and continually refresh it every 1/3 delay interval.

when leader is down, this etcd key will expire. And then new leader is elected.

see the source code:

	go func() {
		for {
			log.V(2).Infof("trying to become master at %v", e.lock)
			if _, err := kapi.Set(ctx, e.lock, id, &client.SetOptions{
				TTL:       e.delay,
				PrevExist: client.PrevNoExist,
			}); err != nil {
				log.V(2).Infof("failed becoming the master, retrying in %v: %v", e.delay, err)
				time.Sleep(e.delay)
				continue
			}
			e.isMaster <- true
			log.V(2).Info("Became master at %v as %v.", e.lock, id)
			for {
				time.Sleep(e.delay / 3)
				log.V(2).Infof("Renewing mastership lease at %v as %v", e.lock, id)
				_, err := kapi.Set(ctx, e.lock, id, &client.SetOptions{
					TTL:       e.delay,
					PrevExist: client.PrevExist,
					PrevValue: id,
				})

				if err != nil {
					log.V(2).Info("lost mastership")
					e.isMaster <- false
					break
				}
			}
		}
	}()

when master fail to renew lease because some temp reasons, for example network jitter, it just loses leadership easily. But actually, if the master try again, it will renew lease successfully.

This problem will resulting in unnecessary learning mode and it takes time to converge.

What you expected to happen or what your proposal is:

I think we shold add retry mechanism when renew lease. If it fails twice or other retry-counts, then lose its leadership.

@BlueBlue-Lee
Copy link
Author

@ryszard @josvisser

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant