mito: add remaining_executions status variable #111

Merged
efd6 merged 1 commit into dev from execution_budget
Sep 14, 2025

Conversation

@efd6
Collaborator

@efd6 efd6 commented Aug 25, 2025

Please take a look.

@efd6 efd6 self-assigned this Aug 25, 2025
@efd6 efd6 marked this pull request as ready for review August 25, 2025 21:54
@efd6 efd6 requested a review from a team August 31, 2025 20:54
Contributor

@chrisberkhout chrisberkhout left a comment


I'm concerned this just gives us a way to manually work around a problem we should be able to avoid or fix in a better way.

In general I think carefully exposing more contextual information to CEL allows more flexibility and is good.

The CEL input documentation says about max_executions:

This is used to ensure that accidental infinite loops do not halt processing. When the execution budget is exceeded, execution will be restarted at the next interval and a warning will be written into the logs.

I'm imagining something like getting stuck in a loop fetching the same page, and if it's cut off the cursor will advance after the interval. But I think usually our logic is the same in either case and want_more only decides when it will happen.
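To make the scenario concrete, here is a minimal Go sketch of an execution budget, not the actual input or mito code. The `step` type and `runInterval` function are hypothetical names; `step` stands in for one evaluation of a CEL program and its boolean result plays the role of want_more. The sketch assumes the loop is simply cut off once the budget is spent and left to resume at the next interval.

```go
package main

import "fmt"

// step is a hypothetical stand-in for one evaluation of a CEL program.
// It returns the updated cursor and whether the program wants another
// evaluation straight away (the want_more signal).
type step func(cursor int) (next int, wantMore bool)

// runInterval evaluates s repeatedly until it stops asking for more work
// or maxExecutions evaluations have been made. It returns the final
// cursor, the number of evaluations made, and whether the budget cut the
// loop short.
func runInterval(s step, cursor, maxExecutions int) (final, n int, exceeded bool) {
	wantMore := true
	for wantMore {
		if n >= maxExecutions {
			// Budget spent while the program still wants more work.
			return cursor, n, true
		}
		cursor, wantMore = s(cursor)
		n++
	}
	return cursor, n, false
}

func main() {
	// A well-behaved program: three pages, then done.
	paged := func(c int) (int, bool) { return c + 1, c+1 < 3 }
	// A buggy program: asks for the same page forever.
	stuck := func(c int) (int, bool) { return c, true }

	for _, s := range []step{paged, stuck} {
		cursor, n, exceeded := runInterval(s, 0, 1000)
		if exceeded {
			fmt.Printf("warning: execution budget exceeded after %d evaluations\n", n)
		}
		fmt.Printf("cursor=%d evaluations=%d\n", cursor, n)
	}
}
```

In this sketch the stuck program never advances its cursor, so the budget is the only thing that ends the loop; the paged program finishes on its own and never touches the limit.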

Or is it about tight loops starving other processing of resources?

If we can confidently set a maximum number of loops for normal operation, then it makes sense to log an error or warning after that.

If that's difficult or impossible, it might still make sense to slow it down at some point, or under certain conditions.

Is the purpose of remaining_executions to let us manually slow down before max_executions generates an error when we don't want it to? Maybe there's a more direct way to get what we want.

About the implementation...

Why put this outside of state when it has everything else, including things that don't change like state.url?

The budget variable could be called remainingExecutions or replaced with *maxExecutions - n - 2.

@chrisberkhout
Contributor

I just realized most of what I said should have been on elastic/beats#46210

@efd6
Collaborator Author

efd6 commented Sep 11, 2025

I think this is reading too much into the feature; it's a pretty standard runtime feature for programs to be able to be aware of the resources that are available to them. This is just that, but for CEL programs provided by the CEL runtime.

Is the purpose of remaining_executions to let us manually slow down before max_executions generates an error when we don't want it to? Maybe there's a more direct way to get what we want.

I don't want to guess how it could be used. This is the most general approach, chosen so that its use in the future for yet-to-be-discovered purposes is not limited.

It's not in the state for the same reason that now is not in state. It's not under the control of the program and is a magic value that is owned by the runtime. Despite some of the behaviour of state.url in the filebeat input, it's not actually special.
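As a rough illustration of that split, the Go sketch below keeps program-controlled data under `state` and places runtime-owned values beside it at the top level. `evalInput` is a hypothetical helper, and showing `now` as a map entry is an assumption made for illustration only; how the runtime actually supplies `now` is not shown in this thread.

```go
package main

import (
	"fmt"
	"time"
)

// evalInput builds the top-level variables for one evaluation.
// state is whatever the program and user configuration control.
// The other entries are set by the runtime and cannot be changed by
// the program. This is a hypothetical helper, not mito's API.
func evalInput(state map[string]any, remaining int, limited bool) map[string]any {
	in := map[string]any{
		"state": state,            // owned by the program and its configuration
		"now":   time.Now().UTC(), // owned by the runtime (illustrative placement)
	}
	if limited {
		// Only present when an execution limit has been set.
		in["remaining_executions"] = remaining
	}
	return in
}

func main() {
	state := map[string]any{"url": "https://example.com/api"}
	in := evalInput(state, 7, true)

	_, inState := state["remaining_executions"]
	fmt.Println("remaining_executions:", in["remaining_executions"])
	fmt.Println("present inside state:", inState)
}
```

Keeping the runtime-owned values out of `state` means a program that rewrites or returns `state` has no way to overwrite them.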

The name budget more closely follows idiomatic Go label naming (fewer syllables and simpler construction). Obtaining the value directly loses semantic identification. This could be retained by

		if *maxExecutions > 0 {
			budget--
			input = map[string]any{
				"state":                val,
				"remaining_executions": *maxExecutions - n - 2,
			}
		} else {
…

but I think this gives less clarity. It also doesn't match the implementation in the input.
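For readers comparing the two spellings, the following Go sketch checks that a decrementing counter and the closed-form expression report the same values. It rests on two assumptions that the snippet above does not confirm: the counter starts at `maxExecutions - 1`, and `n` counts loop iterations from zero. Both function names are hypothetical.

```go
package main

import "fmt"

// remainingByCounter returns the values reported on the first evals loop
// iterations when a running budget is decremented before each use.
// Assumption: the budget starts at maxExecutions - 1.
func remainingByCounter(maxExecutions, evals int) []int {
	budget := maxExecutions - 1
	out := make([]int, 0, evals)
	for n := 0; n < evals; n++ {
		budget--
		out = append(out, budget)
	}
	return out
}

// remainingByFormula returns the same sequence computed directly from
// maxExecutions and the zero-based iteration index n.
func remainingByFormula(maxExecutions, evals int) []int {
	out := make([]int, 0, evals)
	for n := 0; n < evals; n++ {
		out = append(out, maxExecutions-n-2)
	}
	return out
}

func main() {
	fmt.Println(remainingByCounter(5, 3)) // [3 2 1]
	fmt.Println(remainingByFormula(5, 3)) // [3 2 1]
}
```

Under those assumptions the two forms are interchangeable in value, so the choice between them is about naming and readability rather than behaviour.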

Contributor

@chrisberkhout chrisberkhout left a comment


I think this is reading too much into the feature; it's a pretty standard runtime feature for programs to be able to be aware of the resources that are available to them. This is just that, but for CEL programs provided by the CEL runtime.

...

I don't want to guess how it could be used. This is the most general approach, chosen so that its use in the future for yet-to-be-discovered purposes is not limited.

Okay, fair enough 👍

It's not in the state for the same reason that now is not in state. It's not under the control of the program and is a magic value that is owned by the runtime. Despite some of the behaviour of state.url in the filebeat input, it's not actually special.

Yeah, the value of state.url comes from the user, now and remaining_executions come from the environment.

The name budget ...

Okay. This is nitpicking... I was thinking there could be different kinds of budgets. And these various conditions and values always depend on just maxExecutions and n:

		if *maxExecutions > 0 {
			// Only provide remaining_executions if we have set a limit.
			...
				"remaining_executions": *maxExecutions - 1,

	budget := *maxExecutions - 1
		if budget > 0 {
			budget--
			...
				"remaining_executions": budget,
				

But it's not that spread out or hard to understand. And matching the implementation in the input is helpful. 👍

@efd6 efd6 merged commit d40e0e9 into dev Sep 14, 2025
4 checks passed
