Fixed updater failing to run using CFN stack as a cron job #51

srgothi92 · 2021-05-12T17:21:22Z

Issue number:
partially #42

Description of changes:
To run the updater using the stack, three changes were made:

1. Added the minimal required permissions to the stack to run the updater

2. Refactored logs

* Previously, the complete HTTP body of aws-sdk API calls was logged. This change disables
the aws-sdk logging.
* Existing logs were hard to interpret and were formatted incorrectly. This
change fixes the logs that were found to be incorrect during the stack permission update.

3. Fixed a bug where instances are stuck in the drain state

Previously, a failure while waiting for an instance to become OK after an update would return an error and the updater would exit.
However, the instance remained in the drained state even if it would have become OK after some time.
This change sets the instance state back to active irrespective of the success or failure of the wait.

Testing done:

Ran the updater using the CFN template bottlerocket-ecs-updater.yaml and verified that all instances were updated.
Ran the updater locally and verified that all instances were updated.

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

WilboMo · 2021-05-14T22:37:06Z

updater/aws.go

 	}
 	if len(resp.Failures) != 0 {
-		return fmt.Errorf("Container instance %s failed to activate: %#v", aws.StringValue(containerInstance), resp.Failures)
+		return fmt.Errorf("api failures while activating: %#v", resp.Failures)


nit: it looks like in your previous changes to error messages you're dropping "#" from the formatting. Looks like this one got missed.

Thanks for pointing that out.
I will just add the reason why I removed "#": adding "#" was making the error split across multiple lines in CloudWatch, which made it hard to read. By removing "#" we only lose the field names, which did not add much value in my opinion.

By default, the awslogs driver will split log events on newlines. We can change this to split based on a regular expression with the awslogs-multiline-pattern log option or a strftime pattern with the awslogs-datetime-format log option.

What would you recommend, "%v" or "%#v" with awslogs-multiline-pattern? I have changed error wrapping to use "%w", but while printing I am still using "%v".

srgothi92 · 2021-05-14T23:05:58Z

srgothi92 force-pushed the cron-updater branch from 34ffa8f to 6d25292

Addressed @WilboMo comments.

samuelkarp · 2021-05-15T20:30:38Z

stacks/bottlerocket-ecs-updater.yaml

              - Effect: Allow
                Action:
                  - 'ecs:ListContainerInstances'
                Resource:
                  - !Sub 'arn:${AWS::Partition}:ecs:${AWS::Region}:${AWS::AccountId}:cluster/${ClusterName}'
+              # Allows describe container instances to get ec2 instance ID


nit: this also allows us to get attributes, which is how we determine that a given container instance is running Bottlerocket

Added more details.

samuelkarp · 2021-05-15T20:38:04Z

stacks/bottlerocket-ecs-updater.yaml

+                Resource:
+                  - !Sub "arn:aws:ecs:${AWS::Region}:${AWS::AccountId}:container-instance/*"


Since the resource that DescribeContainerInstances operates on is the container instance, and since ECS does not enable cross-account actions, the only effect of this resource condition is to constrain the permission to the stack's region. A better approach would be to use the ecs:cluster condition key and specify the cluster ARN, so that the only instances that can be described are those in the expected cluster.

Nice. Changed as suggested.

I'd recommend removing this Resource since it provides no additional value over the Condition.

srgothi92

Addressed Sam's comments.

samuelkarp · 2021-05-18T01:59:32Z

stacks/bottlerocket-ecs-updater.yaml

+                Resource:
+                  - !Sub "arn:aws:ecs:${AWS::Region}:${AWS::AccountId}:container-instance/*"


I'd recommend removing this Resource since it provides no additional value over the Condition.

samuelkarp · 2021-05-18T02:09:56Z

stacks/bottlerocket-ecs-updater.yaml

+                Resource:
+                  - !Sub "arn:aws:ecs:${AWS::Region}:${AWS::AccountId}:task/${ClusterName}/*"


It looks to me like this Resource doesn't provide any additional constraint over the Condition, and that the Condition is the same one as used for the previous statement. I'd recommend combining this permission into that previous statement and removing the Resource.

srgothi92 · 2021-05-18T16:37:55Z

srgothi92 force-pushed the cron-updater branch from f920185 to e3e5213

Addressed Sam's review comments and re-tested by deploying new stack.

srgothi92 · 2021-05-18T18:18:05Z

stacks/bottlerocket-ecs-updater.yaml

@@ -47,11 +47,38 @@ Resources:
          PolicyDocument:
            Version: 2012-10-17
            Statement:
+              # Allows listing all container instances in a cluster
              - Effect: Allow
                Action:
                  - 'ecs:ListContainerInstances'
                Resource:


Just realized we can combine this with the action below as well. Will try removing it and test.

Based on the documentation, we cannot use the cluster condition key with ListContainerInstances. Therefore the existing changes look good.

srgothi92 force-pushed the data-structure-instance branch from 59e357a to 918d804 on May 12, 2021 23:06

srgothi92 changed the base branch from data-structure-instance to develop on May 12, 2021 23:11

srgothi92 force-pushed the cron-updater branch 2 times, most recently from b985344 to c18de65 on May 12, 2021 23:51

srgothi92 requested review from samuelkarp and WilboMo on May 13, 2021 22:28

srgothi92 force-pushed the cron-updater branch from c18de65 to 34ffa8f on May 14, 2021 22:17

WilboMo approved these changes on May 14, 2021

srgothi92 force-pushed the cron-updater branch from 34ffa8f to 6d25292 on May 14, 2021 23:05

samuelkarp reviewed on May 15, 2021

srgothi92 force-pushed the cron-updater branch 3 times, most recently from 06f240d to f920185 on May 17, 2021 16:42

srgothi92 commented on May 17, 2021

samuelkarp reviewed on May 18, 2021

srgothi92 added 3 commits on May 18, 2021 10:51

Added minimal required permission to stack to run updater

8632196

Refactored logs

02058a1

* Previously, complete HTTP body of aws-sdk API calls was logged. This change disables the aws-sdk logging.
* Existing logs were hard to interpret and were formatted incorrectly. This change fixes the logs that were found to be incorrect during the stack permission update.

srgothi92 force-pushed the cron-updater branch from f920185 to e3e5213 on May 18, 2021 16:36

srgothi92 commented on May 18, 2021

samuelkarp approved these changes on May 18, 2021

srgothi92 merged commit c0c9a98 into develop on May 18, 2021

srgothi92 deleted the cron-updater branch on May 19, 2021 00:20
