x-pack/filebeat/input/awss3: allow a grace time on shutdown #43369

Merged
merged 6 commits into from
Mar 27, 2025

Conversation

Contributor

@efd6 efd6 commented Mar 20, 2025

Proposed commit message

x-pack/filebeat/input/awss3: allow a grace time on shutdown

We may have SQS message requests in flight that are being serviced when the
input's context is cancelled. This allows the user to specify a time to wait
after the context is cancelled before the processing and publication work is
terminated.

We finish as soon as the set of pending requests has been received and
published, or are cut short after the grace time.
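
As a minimal sketch of the behaviour described above, the following program uses the `cancelWithGrace` helper quoted later in this review thread to show that the derived context outlives the parent's cancellation by the grace time. The `graceDemo` function and the 50ms grace value are assumptions made for illustration; they are not part of the PR.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// cancelWithGrace is the helper as quoted in this PR's review thread. The
// returned context is cancelled only after timeout has elapsed following
// the parent's cancellation, or when the returned CancelFunc is called.
func cancelWithGrace(parent context.Context, timeout time.Duration) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithCancel(context.WithoutCancel(parent))
	stop := context.AfterFunc(parent, func() {
		time.AfterFunc(timeout, cancel)
	})
	return ctx, func() {
		stop()
		cancel()
	}
}

// graceDemo is a hypothetical helper for this sketch. It reports whether the
// grace context was still live right after the parent was cancelled, and
// whether it was done once the grace time had passed.
func graceDemo(grace time.Duration) (aliveAtCancel, doneAfterGrace bool) {
	parent, cancelParent := context.WithCancel(context.Background())
	graceCtx, cancelGrace := cancelWithGrace(parent, grace)
	defer cancelGrace()

	cancelParent()
	<-parent.Done()
	aliveAtCancel = graceCtx.Err() == nil

	select {
	case <-graceCtx.Done():
		doneAfterGrace = true
	case <-time.After(10 * grace): // generous upper bound for the sketch
	}
	return aliveAtCancel, doneAfterGrace
}

func main() {
	alive, done := graceDemo(50 * time.Millisecond)
	fmt.Println("alive at parent cancel:", alive)
	fmt.Println("done after grace:", done)
}
```

Work that should be allowed to finish during shutdown would be given the grace context, while work that should stop immediately keeps using the parent.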

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@efd6 efd6 added the enhancement, Filebeat, Team:obs-ds-hosted-services, Team:Security-Service Integrations, backport-8.17, backport-8.18 and backport-9.0 labels Mar 20, 2025
@efd6 efd6 self-assigned this Mar 20, 2025
@botelastic botelastic bot added and removed the needs_team label Mar 20, 2025
@efd6 efd6 force-pushed the i13062-awss3 branch 2 times, most recently from 944fa21 to 6e73b10 Compare March 20, 2025 02:09
@efd6 efd6 added the backport-8.x label Mar 20, 2025
@efd6 efd6 marked this pull request as ready for review March 20, 2025 04:02
@efd6 efd6 requested review from a team as code owners March 20, 2025 04:02
@elasticmachine
Collaborator

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

@efd6 efd6 requested review from AndersonQ and khushijain21 March 20, 2025 04:02
@elasticmachine
Collaborator

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

@efd6 efd6 requested review from andrewkroh and kcreddy March 20, 2025 04:02
Contributor

@kcreddy kcreddy left a comment

LGTM.
Suggest to merge after someone more familiar with input approves.

@pierrehilbert
Collaborator

@bturquet could we have someone from your team review this one, please?
@faec will only be back on Monday.

@bturquet
Contributor

@Kavindu-Dodan could you help with the review, please?

@@ -226,19 +270,23 @@ func (w *sqsWorker) processMessage(ctx context.Context, msg types.Message) {
w.client.Publish(e)
publishCount++
Contributor Author

Really, this var should not be needed since the implementation already returns the sum of to-publish events in result.eventCount, but it seems that this is not implemented in testing due to the approach taken in mock testing.

Comment on lines +173 to +182
func cancelWithGrace(parent context.Context, timeout time.Duration) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithCancel(context.WithoutCancel(parent))
	stop := context.AfterFunc(parent, func() {
		time.AfterFunc(timeout, cancel)
	})
	return ctx, func() {
		stop()
		cancel()
	}
}
Member

I think it'd be good to have a unit test for this function.

Contributor Author

Added.

Contributor

@Kavindu-Dodan Kavindu-Dodan left a comment

Looks good to me :)

@efd6 efd6 requested a review from AndersonQ March 26, 2025 10:58
Comment on lines +304 to +307
case <-time.After(tooLong):
	t.Fatal("parent context failed to cancel within timeout")
case <-parentCtx.Done():
	parentCancelled = time.Now()
Member

@AndersonQ AndersonQ Mar 26, 2025

[NIT]
parent is already cancelled, you don't need the wait here. If <-parentCtx.Done() does not get selected, you could fail the test immediately.

Contributor Author

Yeah, it's here for symmetry.
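
For reference, the immediate check suggested in the nit could look roughly like the sketch below: a `select` with a `default` case tests whether a context is already cancelled without waiting on a timer. The names `alreadyDone` and `doneBeforeAndAfter` are hypothetical and do not come from the PR's test code.

```go
package main

import (
	"context"
	"fmt"
)

// alreadyDone reports whether ctx has been cancelled, without blocking.
func alreadyDone(ctx context.Context) bool {
	select {
	case <-ctx.Done():
		return true
	default:
		return false
	}
}

// doneBeforeAndAfter is a hypothetical helper that checks a context before
// and after its cancel function is called.
func doneBeforeAndAfter() (before, after bool) {
	ctx, cancel := context.WithCancel(context.Background())
	before = alreadyDone(ctx)
	cancel() // Done is closed by the time cancel returns.
	after = alreadyDone(ctx)
	return before, after
}

func main() {
	before, after := doneBeforeAndAfter()
	fmt.Println(before, after) // prints "false true"
}
```

In a test, a false result from such a check after the parent has been cancelled could be reported with `t.Fatal` straight away, at the cost of the symmetry the author mentions.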

Contributor

mergify bot commented Mar 26, 2025

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b i13062-awss3 upstream/i13062-awss3
git merge upstream/main
git push upstream i13062-awss3

efd6 added 6 commits March 27, 2025 11:55
We may have SQS message requests in flight that are being serviced when
the input's context is cancelled. This allows the user to specify a time
to wait after the context is cancelled before the processing and
publication work is terminated.

The approach is a little blunt; it merely waits for the grace period
after the requester has been stopped to allow for incoming messages to
be processed and published. Ideally, we would finish as soon as the set
of pending requests had been received and published, cut short after the
grace time. However, the current structure of data flow does not lend
itself to sharing that information between parts of the input, so a minimal
change is made.

This fixes handling of the case where graceCtx has not yet been cancelled, but
ctx has. This would leave the first select in a blocked state and so starve
the second select, resulting in failure to work through the pending messages.
This also adds an opportunity to exit the loop early when all the pending
messages have been handled.
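
The loop shape described in the last commit message can be sketched as follows. This is not the input's actual code: `drain`, `drainAll`, `drainStalled` and the use of a string channel for pending messages are assumptions made for illustration. The sketch shows a single `select` that keeps handling pending messages, exits early when none remain, and is cut short when the grace context is done.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// drain is a hypothetical loop that handles pending messages until the
// channel is closed (all pending work done) or graceCtx is cancelled.
func drain(graceCtx context.Context, pending <-chan string, handle func(string)) (handled int, cutShort bool) {
	for {
		select {
		case <-graceCtx.Done():
			return handled, true // grace time expired
		case msg, ok := <-pending:
			if !ok {
				return handled, false // nothing left: exit early
			}
			handle(msg)
			handled++
		}
	}
}

// drainAll drains three queued messages well within the grace time.
func drainAll() (int, bool) {
	pending := make(chan string, 3)
	for _, m := range []string{"a", "b", "c"} {
		pending <- m
	}
	close(pending)
	graceCtx, cancel := context.WithTimeout(context.Background(), time.Minute)
	defer cancel()
	return drain(graceCtx, pending, func(string) {})
}

// drainStalled waits on a channel that never delivers, so the grace
// context ends the loop.
func drainStalled() (int, bool) {
	pending := make(chan string)
	graceCtx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	return drain(graceCtx, pending, func(string) {})
}

func main() {
	fmt.Println(drainAll())
	fmt.Println(drainStalled())
}
```

Putting both conditions in one `select` avoids the situation where one blocked `select` prevents a later one from ever running.
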
@efd6 efd6 merged commit c0d439c into elastic:main Mar 27, 2025
28 checks passed
mergify bot pushed a commit that referenced this pull request Mar 27, 2025
We may have SQS message requests in flight that are being serviced when the
input's context is cancelled. This allows the user to specify a time to wait
after the context is cancelled before the processing and publication work is
terminated.

We finish as soon as the set of pending requests had been received and
published, cut short after the grace time.

(cherry picked from commit c0d439c)

# Conflicts:
#	docs/reference/filebeat/filebeat-input-aws-s3.md
mergify bot pushed a commit that referenced this pull request Mar 27, 2025
We may have SQS message requests in flight that are being serviced when the
input's context is cancelled. This allows the user to specify a time to wait
after the context is cancelled before the processing and publication work is
terminated.

We finish as soon as the set of pending requests had been received and
published, cut short after the grace time.

(cherry picked from commit c0d439c)

# Conflicts:
#	docs/reference/filebeat/filebeat-input-aws-s3.md
mergify bot pushed a commit that referenced this pull request Mar 27, 2025
We may have SQS message requests in flight that are being serviced when the
input's context is cancelled. This allows the user to specify a time to wait
after the context is cancelled before the processing and publication work is
terminated.

We finish as soon as the set of pending requests had been received and
published, cut short after the grace time.

(cherry picked from commit c0d439c)

# Conflicts:
#	docs/reference/filebeat/filebeat-input-aws-s3.md
mergify bot pushed a commit that referenced this pull request Mar 27, 2025
We may have SQS message requests in flight that are being serviced when the
input's context is cancelled. This allows the user to specify a time to wait
after the context is cancelled before the processing and publication work is
terminated.

We finish as soon as the set of pending requests had been received and
published, cut short after the grace time.

(cherry picked from commit c0d439c)
efd6 added a commit that referenced this pull request Mar 27, 2025
…e on shutdown (#43529)

* x-pack/filebeat/input/awss3: allow a grace time on shutdown (#43369)

We may have SQS message requests in flight that are being serviced when the
input's context is cancelled. This allows the user to specify a time to wait
after the context is cancelled before the processing and publication work is
terminated.

We finish as soon as the set of pending requests had been received and
published, cut short after the grace time.

(cherry picked from commit c0d439c)

# Conflicts:
#	docs/reference/filebeat/filebeat-input-aws-s3.md

* resolve conflicts

* remove irrelevant changelog entry
* apply doc changes to asciidoc and remove markdown addition

---------

Co-authored-by: Dan Kortschak <[email protected]>
efd6 added a commit that referenced this pull request Mar 27, 2025
…me on shutdown (#43530)

* x-pack/filebeat/input/awss3: allow a grace time on shutdown (#43369)

We may have SQS message requests in flight that are being serviced when the
input's context is cancelled. This allows the user to specify a time to wait
after the context is cancelled before the processing and publication work is
terminated.

We finish as soon as the set of pending requests had been received and
published, cut short after the grace time.

(cherry picked from commit c0d439c)

# Conflicts:
#	docs/reference/filebeat/filebeat-input-aws-s3.md

* resolve conflicts

* apply doc changes to asciidoc and remove markdown addition

---------

Co-authored-by: Dan Kortschak <[email protected]>
efd6 added a commit that referenced this pull request Mar 27, 2025
…me on shutdown (#43531)

* x-pack/filebeat/input/awss3: allow a grace time on shutdown (#43369)

We may have SQS message requests in flight that are being serviced when the
input's context is cancelled. This allows the user to specify a time to wait
after the context is cancelled before the processing and publication work is
terminated.

We finish as soon as the set of pending requests had been received and
published, cut short after the grace time.

(cherry picked from commit c0d439c)

# Conflicts:
#	docs/reference/filebeat/filebeat-input-aws-s3.md

* resolve conflicts

* remove irrelevant changelog entry
* apply doc changes to asciidoc and remove markdown addition

---------

Co-authored-by: Dan Kortschak <[email protected]>
efd6 added a commit that referenced this pull request Apr 15, 2025
We may have SQS message requests in flight that are being serviced when the
input's context is cancelled. This allows the user to specify a time to wait
after the context is cancelled before the processing and publication work is
terminated.

We finish as soon as the set of pending requests had been received and
published, cut short after the grace time.

(cherry picked from commit c0d439c)
efd6 added a commit that referenced this pull request Apr 16, 2025
…e on shutdown (#43532)

* x-pack/filebeat/input/awss3: allow a grace time on shutdown (#43369)

We may have SQS message requests in flight that are being serviced when the
input's context is cancelled. This allows the user to specify a time to wait
after the context is cancelled before the processing and publication work is
terminated.

We finish as soon as the set of pending requests had been received and
published, cut short after the grace time.

(cherry picked from commit c0d439c)

* remove irrelevant changelog entries

---------

Co-authored-by: Dan Kortschak <[email protected]>
Development

Successfully merging this pull request may close these issues.

[Crowdstrike FDR] Handle deduplication
8 participants