scm: truncate long command output in logs (Bug 2036740)#1123

Open
cgsheeh wants to merge 4 commits into mozilla-conduit:main from cgsheeh:hg-logging

Conversation

Member

@cgsheeh cgsheeh commented May 4, 2026

No description provided.

@cgsheeh cgsheeh requested a review from a team as a code owner May 4, 2026 12:39

github-actions Bot commented May 4, 2026

View this pull request in Lando to land it once approved.

@cgsheeh cgsheeh changed the title from "scm: truncate long command output in logs (Bug 2036413)" to "scm: truncate long command output in logs (Bug 2036740)" May 4, 2026
Contributor

@zzzeid zzzeid left a comment

Curious what problems this has already presented in the logs? Seems like something we can handle when parsing logs or via the logs UI / query. We could be losing some info here, and setting an arbitrary value for length doesn't seem like an improvement.

Member Author

cgsheeh commented May 4, 2026

> Curious what problems this has already presented in the logs? Seems like something we can handle when parsing logs or via the logs UI / query. We could be losing some info here, and setting an arbitrary value for length doesn't seem like an improvement.

@zzzeid we're logging the entire patch content for each commit when running hg export. This is making the logs excessively long for larger patches/stacks, and makes logs hard to parse in ArgoCD view (which while imperfect is still convenient and useful). There's no reason we need to view this content in the logs, and if we need to debug we can easily reproduce the command in the container shell.

Alternatively we could add a flag to control the behaviour. We could either enable/disable the truncation, or enable/disable logging the command output altogether where appropriate. I figured this approach was a reasonable middle ground for now.
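
As an illustration of the middle ground described above, here is a minimal Python sketch of per-call truncation with an opt-out flag. The helper name `shorten_for_log`, the 2000-character default, and the `enabled` parameter are assumptions made for this example; they are not taken from the PR's diff.

```python
# Hypothetical helper: the name, the default limit, and the `enabled` flag
# are assumptions for illustration, not the code in this PR.
DEFAULT_MAX_LOG_CHARS = 2000  # arbitrary illustrative limit


def shorten_for_log(
    output: str,
    max_chars: int = DEFAULT_MAX_LOG_CHARS,
    enabled: bool = True,
) -> str:
    """Return `output` unchanged if it is short or truncation is disabled.

    Otherwise keep the first `max_chars` characters and append a marker
    stating how many characters were dropped.
    """
    if not enabled or len(output) <= max_chars:
        return output
    omitted = len(output) - max_chars
    return f"{output[:max_chars]}... [truncated {omitted} characters]"
```

A call site that wants the full output would pass `enabled=False`; everything else keeps the default.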

Contributor

zzzeid commented May 4, 2026

> > Curious what problems this has already presented in the logs? Seems like something we can handle when parsing logs or via the logs UI / query. We could be losing some info here, and setting an arbitrary value for length doesn't seem like an improvement.
>
> @zzzeid we're logging the entire patch content for each commit when running hg export. This is making the logs excessively long for larger patches/stacks, and makes logs hard to parse in ArgoCD view (which while imperfect is still convenient and useful). There's no reason we need to view this content in the logs, and if we need to debug we can easily reproduce the command in the container shell.
>
> Alternatively we could add a flag to control the behaviour. We could either enable/disable the truncation, or enable/disable logging the command output altogether where appropriate. I figured this approach was a reasonable middle ground for now.

It sounds like hg export is the main command that is causing this excessive output? Is there a way to change the logging behaviour for that particular command (e.g., reduce verbosity or change the output via flags), instead of applying this blanket truncation to everything? As a last resort, maybe we can implement a filter for that particular command's output to exclude patch content.

Lastly if there's an error with that command, we should probably still log the output as is since even the patch content may reveal something important.
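
The suggestion to keep the full output on errors could look roughly like the following. This is a sketch using only the standard library; `run_and_log` and `MAX_LOG_CHARS` are hypothetical names, and Lando's actual command runners may be structured differently.

```python
import logging
import subprocess

logger = logging.getLogger(__name__)

MAX_LOG_CHARS = 2000  # arbitrary illustrative limit (assumption)


def run_and_log(args: list[str], max_chars: int = MAX_LOG_CHARS) -> str:
    """Run a command; log truncated output on success, full output on failure.

    Hypothetical helper, not Lando's actual hg or git runner.
    """
    result = subprocess.run(args, capture_output=True, text=True)

    if result.returncode != 0:
        # On failure nothing is truncated: even patch content may matter.
        logger.error(
            "Command %s failed with exit code %d\nstdout: %s\nstderr: %s",
            args,
            result.returncode,
            result.stdout,
            result.stderr,
        )
        raise subprocess.CalledProcessError(
            result.returncode, args, output=result.stdout, stderr=result.stderr
        )

    out = result.stdout
    if len(out) > max_chars:
        shown = f"{out[:max_chars]}... [truncated {len(out) - max_chars} characters]"
    else:
        shown = out
    logger.info("Command %s output: %s", args, shown)
    # The caller still receives the complete output; only the log is shortened.
    return out
```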

@cgsheeh cgsheeh requested review from a team and zzzeid May 4, 2026 21:02
Contributor

@zzzeid zzzeid left a comment

Having looked at this implementation, it would actually be more suitable as something that we define in the logging framework. Maybe adding a filter or special formatter in main.logging and making it applicable only to certain modules/commands that way, and only enabling it for say remote logging (or even prod) in settings.

This would avoid making any changes to scm.git or scm.hg.

Comment thread src/lando/main/scm/git.py
Contributor

In your original description you mentioned this issue is with the hg export command. Do we need this to apply to git commands as well?

Member Author

Yes, git diff content doesn't need to be emitted in full to the logs, hence why I added truncate_log_output below.

Contributor

Hmm, for both the git cases, I think we could simply remove --stdout or use --output to prevent the unnecessary output in the first place (which may be a legacy implementation anyway). I wonder if this is something we can also do in hg commands? I think if we can avoid truncating output it would be best.

Member

In a way, if the output somehow ends up being incorrect, it's probably best to have some of it in the logs, for quicker identification, than none at all. Truncating would allow us to retain that.

Member

That said, yes, not using --stdout may be a good idea (likely with --output-directory and --numbered-files to get predictable names).

Member Author

cgsheeh commented May 5, 2026

> Having looked at this implementation, it would actually be more suitable as something that we define in the logging framework. Maybe adding a filter or special formatter in main.logging and making it applicable only to certain modules/commands that way, and only enabling it for say remote logging (or even prod) in settings.
>
> This would avoid making any changes to scm.git or scm.hg.

How would you suggest we define this in the logging framework? We need to truncate the log output on a per-callsite basis, not a per-module or app-wide basis. We would either need to match commands by string in the log message (brittle), or have the call site add flags to opt them into the truncated logging (via extra= or otherwise), at which point we're still updating {git,hg}.py anyways.
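
The `extra=` opt-in mentioned above might be sketched as a `logging.Filter` like this. The attribute name `truncate_output`, the class name, and the limit are assumptions for illustration; nothing here is taken from Lando's `main.logging`.

```python
import logging


class TruncateOutputFilter(logging.Filter):
    """Shorten string log arguments on records that opt in via `extra=`.

    Hypothetical sketch: `truncate_output` and the default limit are assumptions.
    """

    def __init__(self, max_chars: int = 2000):
        super().__init__()
        self.max_chars = max_chars

    def filter(self, record: logging.LogRecord) -> bool:
        # Only rewrite records whose call site passed extra={"truncate_output": True}.
        if getattr(record, "truncate_output", False) and isinstance(record.args, tuple):
            record.args = tuple(self._shorten(arg) for arg in record.args)
        return True  # never drop the record, only rewrite its arguments

    def _shorten(self, value):
        if isinstance(value, str) and len(value) > self.max_chars:
            omitted = len(value) - self.max_chars
            return f"{value[:self.max_chars]}... [truncated {omitted} characters]"
        return value
```

A call site would opt in with something like `logger.info("output: %s", output, extra={"truncate_output": True})`, which illustrates the point made above: the scm modules still need to change, even though the truncation itself lives in the logging layer.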

Contributor

zzzeid commented May 5, 2026

> How would you suggest we define this in the logging framework? We need to truncate the log output on a per-callsite basis, not a per-module or app-wide basis. We would either need to match commands by string in the log message (brittle), or have the call site add flags to opt them into the truncated logging (via extra= or otherwise), at which point we're still updating {git,hg}.py anyways.

I guess with our existing implementation there wouldn't be a way to do it without adding something to the scm modules (e.g., an extra parameter, so that funcName can be used as below), but the actual truncation would be better placed in the formatter IMO; that way you can easily disable it if needed. You would use something like module and funcName somewhere in the formatter to do this, e.g.:

if record.module == "hg" and record.funcName == "_hg_run":
    ...

But if possible we should solve this problem by not using stdout in the first place (as I mentioned in the other comment).
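
A fuller version of the module/funcName check above, as a hedged sketch: a `logging.Formatter` subclass that truncates only records from chosen `(module, funcName)` pairs and can be switched off. The class name, the `targets` default, the limit, and the `enabled` switch are assumptions, not existing Lando code.

```python
import logging


class TruncatingFormatter(logging.Formatter):
    """Truncate formatted output for selected (module, funcName) call sites.

    Hypothetical sketch: targets, limit, and the `enabled` switch are assumptions.
    """

    def __init__(
        self,
        fmt=None,
        targets=(("hg", "_hg_run"),),
        max_chars=2000,
        enabled=True,
    ):
        super().__init__(fmt)
        self.targets = set(targets)
        self.max_chars = max_chars
        self.enabled = enabled

    def format(self, record: logging.LogRecord) -> str:
        text = super().format(record)
        is_target = (record.module, record.funcName) in self.targets
        if self.enabled and is_target and len(text) > self.max_chars:
            omitted = len(text) - self.max_chars
            # Note: the limit applies to the whole formatted line, prefix included.
            text = f"{text[:self.max_chars]}... [truncated {omitted} characters]"
        return text
```

`enabled` could be driven from settings so that truncation applies only to remote or production logging, as suggested earlier in the thread. Note that `funcName` is the function containing the logging call, so this only matches if the log statement is emitted inside the targeted function.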

Contributor

@zzzeid zzzeid left a comment

Oops my other comment did not get submitted, here it is.

Comment thread src/lando/main/scm/git.py
Contributor

Hmm, for both the git cases, I think we could simply remove --stdout or use --output to prevent the unnecessary output in the first place (which may be a legacy implementation anyway). I wonder if this is something we can also do in hg commands? I think if we can avoid truncating output it would be best.

3 participants