Description
SDK version
v1.13.1-1
Relevant provider source code
Taken from: https://github.com/terraform-providers/terraform-provider-aws/blob/master/aws/resource_aws_iam_role.go#L152
var createResp *iam.CreateRoleOutput
err := resource.Retry(30*time.Second, func() *resource.RetryError {
	var err error
	// CreateRole has its own internal retry loop and can block for more than 30 seconds.
	createResp, err = iamconn.CreateRole(request)
	// IAM users (referenced in Principal field of assume policy)
	// can take ~30 seconds to propagate in AWS
	if isAWSErr(err, "MalformedPolicyDocument", "Invalid principal in policy") {
		return resource.RetryableError(err)
	}
	return resource.NonRetryableError(err)
})
// The goroutine started in Retry (WaitForState) can still be running at this point.
if isResourceTimeoutError(err) {
	// Issues another blocking CreateRole while the first may still be in flight.
	createResp, err = iamconn.CreateRole(request)
}
Debug Output
[DEBUG] [aws-sdk-go] DEBUG: Request iam/CreateRole Details:
[DEBUG] [aws-sdk-go] DEBUG: Send Request iam/CreateRole failed, attempt 0/25, error RequestError: send request failed
[DEBUG] [aws-sdk-go] DEBUG: Retrying Request iam/CreateRole, attempt 1
[DEBUG] [aws-sdk-go] DEBUG: Request iam/CreateRole Details:
[WARN] WaitForState timeout after 30s
[WARN] WaitForState starting 30s refresh grace period
[DEBUG] [aws-sdk-go] DEBUG: Send Request iam/CreateRole failed, attempt 1/25, error RequestError: send request failed
[DEBUG] [aws-sdk-go] DEBUG: Retrying Request iam/CreateRole, attempt 2
[DEBUG] [aws-sdk-go] DEBUG: Request iam/CreateRole Details:
[ERROR] WaitForState exceeded refresh grace period
[DEBUG] [aws-sdk-go] DEBUG: Request iam/CreateRole Details:
Expected Behavior
resource.Retry() blocks until the callback finishes, even on timeout.
Actual Behavior
The callback is still running when the timeout fires. For example, the AWS provider then issues another CreateRole() in parallel. In many cases this results in a double creation attempt and, eventually, a failure in the plugin:
Error: Error creating IAM Role hello-world-ssm_role: EntityAlreadyExists: Role with name hello-world-ssm_role already exists.
status code: 409, request id: removed
on main.tf line 18, in resource "aws_iam_role" "ssm_role":
18: resource "aws_iam_role" "ssm_role"
References
We have been running a crude patch (#529) in production for a few weeks with good results.