Race condition results in EIP association failing #24
Comments
Add functionality to check that an EIP was actually associated with the desired instance. This is important because the EC2 API is eventually consistent and receiving an HTTP 200 response from the association API call is not a guarantee that the association will actually happen. This verification is enabled by default but can be disabled with --skip-verify. By default, it will retry up to 10 times, sleeping 0.1 seconds after each attempt, in order to give the EC2 API plenty of time to achieve a result. This allows aws-ec2-assign-elastic-ip to exit nonzero when there is a race condition that causes EIP association to fail. Fixes: sebdah#24
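The verification loop described above could be sketched roughly as follows. This is not the code from the PR. The function name `verify_association` and its parameters are hypothetical, and `describe_addresses` is assumed to be any callable that behaves like the boto3 EC2 client method of the same name (returning a dict with an `Addresses` list). The defaults mirror the 10 attempts and 0.1 second sleep mentioned above.

```python
import time


def verify_association(describe_addresses, public_ip, instance_id,
                       retries=10, delay=0.1, sleep=time.sleep):
    """Return True once public_ip is reported as associated with instance_id.

    describe_addresses is assumed to accept PublicIps=[...] and return a dict
    shaped like {"Addresses": [{"PublicIp": ..., "InstanceId": ...}]}.
    """
    for _ in range(retries):
        response = describe_addresses(PublicIps=[public_ip])
        for address in response.get("Addresses", []):
            if address.get("InstanceId") == instance_id:
                return True
        # Not visible yet (or another instance won the race); wait and re-check.
        sleep(delay)
    return False
```

A caller would exit nonzero when this returns `False`, which is how a lost race becomes visible to whatever scheduled the run.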
Following up here with discussion from the PR: This issue is actually more like a bug/wart in boto2, which doesn't set the AllowReassociation parameter and doesn't provide any way to explicitly set it to false. When the parameter is omitted, AWS defaults to true, allowing an EIP to be reassociated even if it is currently associated with another instance. ab/boto@239013b is a start at fixing this for boto 2.
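For comparison, boto3 does expose the parameter. A minimal sketch, assuming a VPC Elastic IP identified by its allocation ID; the wrapper name `associate_without_reassociation` is hypothetical, and `ec2_client` is assumed to be a boto3 EC2 client (for example `boto3.client("ec2")`):

```python
def associate_without_reassociation(ec2_client, allocation_id, instance_id):
    """Associate an Elastic IP, refusing to take it from another instance.

    With AllowReassociation=False the API call is expected to fail if the
    address is already associated elsewhere, instead of silently moving it.
    Any error raised by the client is left to propagate to the caller.
    """
    return ec2_client.associate_address(
        AllocationId=allocation_id,
        InstanceId=instance_id,
        AllowReassociation=False,
    )
```

Because a request that loses the race then fails outright, the losing instance can move on to another free address rather than believing it succeeded.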
I think the billing issue I said could theoretically exist in the PR thread does already exist. Imagine two instances fire up due to a scaling event and both have a cron job that checks every minute that they have an Elastic IP. Once booted, they will both race for the same Elastic IP, as shown in @brodygov's log.

With an EC2-Classic account (pre-2014) only one instance will manage to associate and the other will error. This will incur a $0.10 charge. In the second minute the second instance will associate and incur another $0.10 charge. Total charges are $0.20.

With an EC2-VPC-only account (2014+) both instances could successfully associate in the first minute and incur $0.20 of charges. In the second minute the instance that was stolen from will associate again and incur another $0.10 charge. Total charges are $0.30.

Not a big difference, but it escalates quickly. If the bad luck was repeated with 10 instances, it would take 10 minutes to properly associate every instance, and the difference by then is $1.00 vs $5.50.

I would definitely upgrade to boto 3 and explicitly set AllowReassociation to False for everyone. If requested features that would require reassociation (e.g. #13 or #19) are ever implemented, they should be implemented carefully.
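The arithmetic above can be reproduced with a small worst-case model. It assumes, as in the comment, a $0.10 charge per successful association and that every instance still without an address races for the same Elastic IP each minute, with exactly one of them settling per minute. The function name is hypothetical, and amounts are kept in integer cents to avoid floating-point rounding.

```python
CHARGE_CENTS = 10  # per-association charge assumed in the comment above


def total_charge_cents(instances, allow_reassociation):
    """Worst-case total charge when all instances race for the same EIPs.

    Each minute, every instance still lacking an address makes one attempt.
    Without reassociation only one attempt succeeds; with reassociation every
    attempt succeeds (and is charged), but only one instance keeps an address.
    """
    total = 0
    for minute in range(instances):
        unsettled = instances - minute
        successful_attempts = unsettled if allow_reassociation else 1
        total += successful_attempts * CHARGE_CENTS
    return total


print(total_charge_cents(2, allow_reassociation=False))   # 20
print(total_charge_cents(2, allow_reassociation=True))    # 30
print(total_charge_cents(10, allow_reassociation=False))  # 100
print(total_charge_cents(10, allow_reassociation=True))   # 550
```

The reassociation case grows as 10 + 9 + ... + 1 successful associations for 10 instances, which is where the $5.50 figure comes from.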
The above PR is merged and released with 0.8.0. Please reopen this issue if needed. Thanks @wilerson for the PR.
The EC2 API for associating EIPs appears to be eventually consistent. So if two servers attempt to grab the same EIP at the same time, one of them will succeed and the other will fail but think it succeeded.
Example logs from this happening, with the IP and instance IDs redacted.
In reality, i-8d7ed1 won the race and has the EIP associated, while i-a77fbf thinks it got the EIP but has no EIP associated.

One way to fix this would be to loop doing the describe-addresses call until the association with the current instance ID appears. Would you be interested in a PR that does this?