-
Notifications
You must be signed in to change notification settings - Fork 168
BM-1950: Update log lambda to query alarm tags for debugging via chatbot #1338
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ncocchiaro
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments to help reviewers
| return 'l-prod-11155111-bonsai-prover-11155111'; | ||
| case 'monitor': | ||
| return 'l-prod-11155111-monitor-11155111-monitor'; | ||
| return 'l-prod-11155111-monitor-11155111-mon-monitor'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was renamed in the wild, so needs to be renamed here
| // Note we have to escape two backslashes in the query string. In | ||
| // CW it should look like: regexp_replace(log, '\\x1b\\[[0-9;]*[mK]', '') AS msg | ||
| return ` | ||
| function proverQuery(logGroup: string): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same query as before, just refactored into a function
| export class ManagerMetricAlarmComponent extends MetricAlarmComponent { | ||
|
|
||
| constructor(config: ManagerMetricAlarmConfig) { | ||
| constructor(config: MetricAlarmConfig) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This no longer needs a specific subclass since chainId is moving to the base MetricAlarmConfig interface
| ServiceName: config.serviceName, | ||
| LogGroupName: config.logGroupName, | ||
| StackName: config.stackName, | ||
| ChainId: config.chainId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These new tags on the alarms allow the new lambda logic to work
| 'arn:aws:iam::245178712747:role/alarmListTagsRole', | ||
| 'arn:aws:iam::632745187633:role/alarmListTagsRole' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: import the account numbers from accountConstants.ts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I missed that. Fixed in 3eb768f
| // Create a role that the log fetcher Lambda can assume to list tags on Cloudwatch alarms in the current account | ||
| const alarmTagsRole = new aws.iam.Role(`${prefix}alarmListTagsRole`, { | ||
| name: `${prefix}alarmListTagsRole`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we just grant the lambda execution role the permissions to read tags, rather than creating a new role and assuming it in the lambda?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that work cross-account though? I had the same thought initially but from experience didn't think it would, I double checked in testing anyway but that confirmed that I couldn't get it to access the resources without assuming a role in the other accounts. Maybe you have a detailed suggestion?
This PR adds an additional approach for the log lambda to find the information it needs either to output debug data via chatbot or to create a log query. If an incoming alarm has the relevant information in tags then the lambda won't need to rely on parsing the alarm names, offering an option for debug configuration that can be more robust and easier to expand on in the future.
In order to achieve this, the lambda needs to assume a role in the target account that allows listing tags on the alarm resources. This is set up here via new roles in
bootstrapand a change to the lambda execution role, plus new lambda code.The original name parsing logic remains and is still used in all cases in which alarms are not tagged (currently all cases except for prover-cluster alarms, which add tags in this PR as well).