Maintenance: improve continuous code quality automation #2123
@aws-powertools/lambda & @aws-powertools/lambda-typescript-core, I would like to hear your opinion before settling on an option.
I agree with running e2e tests as early and as frequently as possible. I would opt for an approach that runs those tests before merging to main, but the time the tests take to run will create slow developer feedback, while running them after merge will catch issues late in the development cycle. I think the 4th option with merge queues might be a good approach (I have not used it before). Another benefit I see in the docs is that it "... does not require a pull request author to update their pull request branch and wait for status checks to finish before trying to merge.", which is also really good.
Security was the main concern for running e2e tests on a PR, which is why we did not enable it in the past. Reading about merge queues, they seem like exactly the right tool to solve this. Alternatively, we could merge into a staging branch first and run separate checks there, which seems to be essentially how merge queues work. Another topic is how we deal with broken e2e tests in a PR from external contributors. We can't expect them to fix the tests without a way to iterate locally (they'd need an AWS account). This would mean we need to take over the PR at the end and make the changes ourselves. Not sure if there are other options.
Over in @aws-powertools/lambda-java-core we are running E2E tests on every PR for 4 different runtimes. We also run them on main once the PR is merged, which is somewhat redundant. That is, we are using (2) and (3) from your options. PR runs are gated so they only run for PRs cut from branches on the primary repository itself, not on PRs from forks, where we have to actively trigger the run after reviewing the changeset. We have a fairly large E2E suite and it is at times flaky: often when we catch failures, they are incidental (e.g., some timeout the suite hasn't run into before) and not related to the underlying PR at hand. We invest effort in improving the robustness of the E2E suite, but accept that it will inherently always have some element of "randomness" that actual unit tests should lack. Having said all this, I think this is clearly better than not having the suite. When we catch errors, they are errors we would not have otherwise noticed, and errors that should block a merge. We've not used merge queues either and I am interested in seeing how you go with them!
I have finally gotten around to testing the feature. I created a new workflow with what I believe is the correct trigger and put in it the same steps we have in the regular pre-merge CI. In doing this, I realized that, unless I am misunderstanding the setup (I might be), having two separate workflows won't work: before a PR can be added to the queue, all the required checks must already pass. Like I said, I might have misunderstood the feature or its setup, but as of now it seems that you need the same set of checks in both the pre-queue and queue phases, which means running the same tests twice. If that's the case, then what's the point? A potential solution to this would be to run:
It's still unclear to me how to handle e2e tests in this context, since they require access to secrets & environments. I need to investigate further and will try again after the new year.
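One pattern worth evaluating here: a single workflow can be registered for both the pull_request and merge_group events, so the same required check names are satisfied in both phases, and the e2e job can pull its credentials from a protected GitHub environment. This is a minimal sketch only; the job names, the `e2e-tests` environment, the secret name, and the npm scripts are assumptions, not the project's actual setup.

```yaml
# Sketch: one "Code Quality" workflow covering both the PR phase and the
# merge queue phase. Job names, environment name, and npm scripts are
# assumptions, not the project's actual configuration.
name: Code Quality
on:
  pull_request:   # runs when a PR is opened or updated
  merge_group:    # runs again when the PR enters the merge queue
jobs:
  quality-checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run lint
      - run: npm test
  e2e:
    # Scoping secrets to a protected environment means only approved runs
    # (e.g. those coming from the merge queue) can read them.
    if: github.event_name == 'merge_group'
    needs: quality-checks
    runs-on: ubuntu-latest
    environment: e2e-tests          # hypothetical protected environment
    steps:
      - uses: actions/checkout@v4
      - run: npm run test:e2e
        env:
          AWS_REGION: ${{ secrets.AWS_REGION }}   # hypothetical secret
```

Because both events run the same `quality-checks` job, the required status check is reported in the pre-queue and in-queue phases without maintaining two workflow files.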
Summary
Currently we run only limited code quality checks before merging a PR (linting and unit tests), while integration tests are still run manually by a maintainer, either as part of a PR review or before a release.
This is error-prone and increases the risk of issues slipping into the codebase.
This issue came out of a discussion under #2072.
Why is this needed?
So that we can have code quality checks on the code merged on main and increase confidence in what's in it.
Which area does this relate to?
Automation, Tests
Solution
We should create a new workflow called "Code Quality", with a trigger still to be defined, consisting of several steps:
The linting and unit tests complete within ~2 minutes, while the integration tests take 8-9 minutes to complete.
We are already parallelizing via a matrix of architectures, runtime versions, and packages. This results in 35+ test jobs and ~80+ CloudFormation stacks being deployed, which in some cases seems to trigger some queuing in CloudFormation.
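As a rough illustration of the kind of matrix in play (the axis values below are assumptions for the sketch, not the project's exact configuration):

```yaml
# Sketch: a test matrix fanning out across architectures, runtime
# versions, and packages. Values are illustrative assumptions.
jobs:
  integration-tests:
    strategy:
      fail-fast: false
      matrix:
        architecture: [x86_64, arm64]
        version: [18, 20]
        package: [logger, metrics, tracer, parameters]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run test:e2e -- --package ${{ matrix.package }}
```

Each combination deploys its own CloudFormation stack, which is where the queuing pressure on CloudFormation comes from.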
Regarding the trigger, I think we have four options:
1. On main, with issues.
2. On every PR, before it is merged to main.
3. On main, as a result of a merged PR. Depending on how we set up the CI this would mean a cooldown of ~10 minutes between merged PRs. On the other hand this would ensure that issues are caught as early as possible.
4. Before merging to main, using merge queues. This would provide similar benefits to the previous two options, but with two added values: 1/ maintainers don't have to wait for the longer checks to finish before merging one or more PRs, 2/ issues are detected before they are merged to main, thus giving the maintainer an opportunity to fix them without having to create a dedicated issue & additional PR.

At least based on my current understanding of merge queues, option 4 seems the most promising. With that said, this is a relatively new feature and I don't think any of us has had the chance to experiment with it yet.
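For reference, the post-merge and merge-queue approaches map onto different GitHub Actions triggers. The event names below are real GitHub Actions events; the cron expression and comments are illustrative only.

```yaml
# Sketch: candidate triggers for the "Code Quality" workflow.
on:
  schedule:
    - cron: '0 6 * * *'   # periodic run against main, failures reported via issues
  push:
    branches: [main]      # run after a PR is merged to main
  merge_group:            # run while the PR sits in the merge queue, before merge
```

The merge_group event is what a repository's merge queue dispatches when it builds a candidate merge commit, so checks run against the exact code that would land on main.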
Acknowledgment
Future readers
Please react with 👍 and your use case to help us understand customer demand.