WP-CLI Out Of Memory on search-replace actions with large serialized objects - Buffer? #64

JanThiel · 2024-05-27T09:23:45Z

Describe the bug
Hey @Pierre-Lannoy hope you are doing well :-)
This is quite an "early" bug report while we have not yet 100% identified the root cause but having some quite reasonable ideas of it. Hoping for some highly valued input of yours.

The issue:
We face(d) massive memory spikes when using WP-CLIs search-replace on tables with many large (~100kb each, several 100s of them) serialized objects. These serialized objects leads WP-CLI to use PHP for the search-replace on these fields.
And as such using the WP PHP API (update_... functions) to de-serialize, search-replace and then serialize and update the data again. Performing this on a table of ~80MB size lead to a memory consumption of >20GB ending with Out of Memory killer being triggered.
Running the same wp search-replace ... with --skip-plugins=decalog mitigated this issue. The search-replace went through in seconds. No memory spikes, no mentionable memory consumption.

We assume - but again have not verified it yet (sorry!) - the buffer option of Decalog to be the root of this. If this applies to WP-CLI calls as well, it might explain the seen behaviour as the log calls with all the data in it are stored in memory and would only be logged once the CLI "request" ends?! Does that makes sense?

If you agree on this as a possible source, could you suggest a "hotfix" we could try out to ignore the buffer option in CLI sessions? We could try this out and give you feedback on it as we have this issue reproducible ;-)

Notable: We use a redis object cache (Object Cache Pro) which is inflicted as well. Yet when disabling Decalog that component does not seem to cause any issues.
And we still use Datadog as target log system.

To Reproduce
Steps to reproduce the behavior (not verified yet!):

Fill the wp_postmeta table with many large, serialized and nested objects
Make sure the buffer option in Decalog is enabled
Run wp search-replace old-value new-value wp_postmeta --dry-run

Memory should spike and the operation should take a long time.

Expected behavior
No memory spike and only some seconds execution time.
Maybe ignore the buffer option in CLI sessions and always flush the log calls directly? This might increase the CLI runtimes slightly but will prevent scaling effects. If these are the case here.

Environment (please complete the following information):

PHP version: 8.1.28
WordPress version and flavour: 6.4.2 Multisite / CLI
Other info: Object Cache / Redis

Thank you very much as always :-)

The text was updated successfully, but these errors were encountered:

Pierre-Lannoy · 2024-05-28T06:12:08Z

Hello @JanThiel !
Hope you're doing well too…
Thank you so much for this comprehensive report, which will - once again - really help me 😍

I will try to reproduce it and see if it's the buffering mechanism that produces this "effect". My preliminary analysis is that your assumption seems correct…
I'm more concerned (if that's the case) about finding a way to solve this problem. My intuition tells me that omitting the buffer is likely to make processing time incredibly long (one API call for each event). Maybe processing bulk-calls (like if the buffer has a "limit" before sending) would be better. Don't know yet.

Anyway, let's start by confirming your analysis.

Can you just tell me if you can see if such a wpcli command produces many events?

Pierre-Lannoy · 2024-05-28T06:17:27Z

Oh, and by the way, if you really need to execute this command now, the best approach for me seems to pause the logger before the command and reactivate it after. You will be "blind" and I agree it's a dirty workaround. But it is better than other methods I can think of…

JanThiel · 2024-05-28T21:32:33Z

Hey @Pierre-Lannoy,

very well over here in Germany except of the common spreading stupidity. But let's not jump into politics :-) Thank you very much!

Using the WP-CLI flag --skip-plugins=decalog works fine :-) As it skips loading decalog within the WP bootstrapping it's an easy workaround with no need of manual deactivation of the plugin. So it's an easy and acceptable workaround we added to our standard procedures internally.

Can you just tell me if you can see if such a wpcli command produces many events?

In general thre aren't really any entries within Datadog about these operations as we have the log level reduced in our Prod env.
We investigated further today and can confirm that WP-CLI does not use the WP PHP API to update the DB values. It's only happening via SQL and a custom PHP script for serialized data ( https://github.com/wp-cli/search-replace-command/blob/main/src/Search_Replace_Command.php#L539 ).

That does make me wonder where Decalog enters the call stack.

What did cross my mind was the question whether it might be helpful to be able to get the Decalog "outputs" printed out within a CLI session when they happen. Like the --debug flat of WP-CLI. That would allow us to actually identify what it might be that happen in Decalog while the search-replace command runs. Would be interesting for manual app debugging as well in general.

Regarding your thoughts on the buffer and a potential limit: I highly agree that no buffer is no solution either. Especially when HTTP calls are invoked. Kind of a buffer limit (time or memory based?) with forced flushes seems smart to me too on second thought.
In addition to the current issues we commonly hit situations in the past where the log output of Decalog was lost whenever a script hit the PHP memory limits and died with OOM as the buffer was unable to be flushed at all. That might be solved by such an approach as well. At least partly. Currently we were kind of blind in these situations. Only the local log files contained the information about the OOM event.

Have a great night,

Jan

Pierre-Lannoy · 2024-06-14T08:05:04Z

Hello @JanThiel !
I haven't forgotten you ;)
So far, I haven't been able to reproduce your problem. I have, however, discovered a memory management problem that could lead to what you're experiencing.
I'm still working on it, but I'll be releasing 4.2.0 version in some hours/days (which adds a few new features for those who use Datadog on custom endpoints - not your case) but that won't solve this issue.
I plan to fix this issue just as soon the 4.2.0 is released.
Sorry for this delay.

What did cross my mind was the question whether it might be helpful to be able to get the Decalog "outputs" printed out within a CLI session when they happen.

There is already a way to do (a sort of) this (but, yes, it requires shmop php extension to be enable): just launch a second terminal and do a wp log tail --level=info. But I understand it is not really what you ask for. Let me think about such a feature…

Have a nice day :)

JanThiel added the bug Something isn't working label May 27, 2024

JanThiel assigned Pierre-Lannoy May 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WP-CLI Out Of Memory on search-replace actions with large serialized objects - Buffer? #64

WP-CLI Out Of Memory on search-replace actions with large serialized objects - Buffer? #64

JanThiel commented May 27, 2024

Pierre-Lannoy commented May 28, 2024

Pierre-Lannoy commented May 28, 2024

JanThiel commented May 28, 2024 •

edited

Loading

Pierre-Lannoy commented Jun 14, 2024

WP-CLI Out Of Memory on search-replace actions with large serialized objects - Buffer? #64

WP-CLI Out Of Memory on search-replace actions with large serialized objects - Buffer? #64

Comments

JanThiel commented May 27, 2024

Pierre-Lannoy commented May 28, 2024

Pierre-Lannoy commented May 28, 2024

JanThiel commented May 28, 2024 • edited Loading

Pierre-Lannoy commented Jun 14, 2024

JanThiel commented May 28, 2024 •

edited

Loading