
Feature request: Periodic flushing / auto-batching #2302

Open
mtheos opened this issue Jan 22, 2023 · 5 comments
Labels
status: waiting-for-feedback We need additional information before we can continue type: feature A new feature
Milestone
Icebox

Comments


mtheos commented Jan 22, 2023

Feature Request (branch)

Hi team, have you considered periodic-flushing/auto-batching as a middle ground between auto and manual flushing?

I'm aware of the @Batch annotation for the command interface, but I don't believe there is any way to avoid command timeouts if the batch size isn't reached.

I've been looking at the case where many threads each make a small number of requests (~10-20) to Redis before flushing. In this case, there is still significant traffic to Redis in aggregate, but a connection per thread would introduce a lot of overhead, and I believe even pooling is suboptimal because of the small batch sizes.

In a test application with 50 threads, I see batch sizes of 0-30 when manually flushing across multiple threads, compared to 300-500 when flushing periodically at 1 ms intervals.

Is your feature request related to a problem? Please describe

This is related to the previous issue I opened (#2289) regarding Redis CPU usage with auto-flush enabled.

In our case, we have a large request-handling pool, but each thread only issues ~10 requests to Redis before flushing. We found that with auto-flush enabled, each command is written to the socket individually, and our Redis node was seeing very high CPU usage, which we presume comes from reading commands.

Our current solution is to use a shared connection and manually flush in each thread, which returns CPU usage to what we expect in our node.
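
For reference, the manual-flushing workaround looks roughly like this with the existing API (a minimal sketch; the URI, key names, and timeout are placeholders):

    import io.lettuce.core.LettuceFutures;
    import io.lettuce.core.RedisClient;
    import io.lettuce.core.RedisFuture;
    import io.lettuce.core.api.StatefulRedisConnection;
    import io.lettuce.core.api.async.RedisAsyncCommands;

    import java.time.Duration;
    import java.util.ArrayList;
    import java.util.List;

    RedisClient client = RedisClient.create("redis://localhost:6379");
    StatefulRedisConnection<String, String> connection = client.connect();

    // Disable auto-flush: commands accumulate in the buffer until flushCommands().
    connection.setAutoFlushCommands(false);
    RedisAsyncCommands<String, String> async = connection.async();

    List<RedisFuture<String>> futures = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
        futures.add(async.get("key:" + i));
    }

    // One flush writes all ~10 commands to the socket as a single batch.
    connection.flushCommands();
    LettuceFutures.awaitAll(Duration.ofSeconds(1), futures.toArray(new RedisFuture[0]));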

Describe the solution you'd like

  • Commands are written to the command buffer
  • A task running on the event loop periodically flushes the buffer at a configurable interval
  • A maximum size can be configured that immediately flushes the buffer when reached (see the sketch below).
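
A rough sketch of the intended behavior (the PeriodicFlusher helper below is hypothetical, not existing Lettuce API; it assumes a shared connection and access to an event loop):

    import io.lettuce.core.api.StatefulRedisConnection;
    import io.netty.channel.EventLoop;

    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicInteger;

    // Hypothetical helper illustrating the proposal; not existing Lettuce API.
    class PeriodicFlusher {

        private final StatefulRedisConnection<?, ?> connection;
        private final AtomicInteger buffered = new AtomicInteger();
        private final int maxBatchSize;

        PeriodicFlusher(StatefulRedisConnection<?, ?> connection, EventLoop eventLoop,
                long intervalMillis, int maxBatchSize) {
            this.connection = connection;
            this.maxBatchSize = maxBatchSize;
            // Flush at a fixed interval so the added latency is bounded by the interval.
            eventLoop.scheduleAtFixedRate(this::flush, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        }

        // Called by writers after each buffered command; flushes early once the
        // configured maximum batch size is reached.
        void onCommandBuffered() {
            if (buffered.incrementAndGet() >= maxBatchSize) {
                flush();
            }
        }

        private void flush() {
            buffered.set(0);
            connection.flushCommands();
        }
    }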

Advantages:

  • Can increase the batch size significantly compared to manually flushing with a shared connection.
  • CPU usage is noticeably lower in Redis

Drawbacks:

  • Inherently adds latency
  • The optimal flush interval is tightly constrained. In my testing, intervals of <=5 ms (realistically 1-5 ms) proved optimal, with QPS between auto and manual flushing.

Describe alternatives you've considered

Manually flushing a shared connection remains an effective option in the absence of this feature.
My expected use case is one where each thread issues only a few commands between flushes, but many threads share the connection.

Teachability, Documentation, Adoption, Migration Strategy

If you can, explain how users will be able to use this and possibly write out a version for the docs.
Maybe a screenshot or design?

A possible interface:

    /**
     * Disable or enable auto-batch behavior. Default is {@code false}. If autoBatchCommands is enabled,
     * multiple commands can be buffered before being automatically flushed to the transport.
     * Commands are buffered until the configured batch size or maximum delay is reached.
     *
     * @param autoBatch state of autoBatch.
     */
    void setAutoBatchCommands(boolean autoBatch);

    /**
     * Sets the auto-batch delay. Default is {@code 5 milliseconds}. If autoBatchCommands is enabled,
     * this is the maximum amount of time that will pass before buffered commands are flushed to the transport.
     *
     * @param delay maximum delay to auto-batch.
     */
    void setAutoBatchDelay(Duration delay);

    /**
     * Sets the auto-batch size. Default is {@code 50}. If autoBatchCommands is enabled,
     * a flush will be triggered when this number of commands have been buffered,
     * regardless of how long it has been since the last flush.
     *
     * @param size maximum size of a batch.
     */
    void setAutoBatchSize(int size);
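
Hypothetical usage of the proposed API (none of these setters exist in Lettuce today):

    import io.lettuce.core.RedisClient;
    import io.lettuce.core.api.StatefulRedisConnection;
    import java.time.Duration;

    RedisClient client = RedisClient.create("redis://localhost:6379");
    StatefulRedisConnection<String, String> connection = client.connect();

    connection.setAutoBatchCommands(true);               // enable auto-batching
    connection.setAutoBatchDelay(Duration.ofMillis(1));  // flush at most 1 ms after the first buffered command
    connection.setAutoBatchSize(100);                    // or immediately once 100 commands are buffered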
@mp911de mp911de added the type: feature A new feature label Jan 23, 2023

mp911de commented Jan 23, 2023

I'm torn on this one because such a mechanism can easily lead to command timeouts with the synchronous API. Also, we would introduce a lot of API that 99% of users never use.

I wonder, instead, whether it would make sense to introduce a buffering outbound channel handler that can be optionally installed into the netty channel pipeline that buffers the outgoing write requests. Such an approach could be activated via ClientResources and NettyCustomizer, and it would not touch the existing API bits.

It could also be a path forward to refactor our manual command flushing.
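
Such a customizer could look roughly like this (a sketch using netty's existing FlushConsolidationHandler; the parameter values are illustrative, and this consolidates flushes rather than buffering on a timer):

    import io.lettuce.core.RedisClient;
    import io.lettuce.core.resource.ClientResources;
    import io.lettuce.core.resource.NettyCustomizer;
    import io.netty.channel.Channel;
    import io.netty.handler.flush.FlushConsolidationHandler;

    ClientResources resources = ClientResources.builder()
            .nettyCustomizer(new NettyCustomizer() {

                @Override
                public void afterChannelInitialized(Channel channel) {
                    // Consolidate up to 256 flushes, also when no read is in progress.
                    channel.pipeline().addFirst(new FlushConsolidationHandler(256, true));
                }
            })
            .build();

    RedisClient client = RedisClient.create(resources, "redis://localhost:6379");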


mtheos commented Jan 23, 2023

> such a mechanism can easily lead to command timeouts using the synchronous API

The performance would be terrible, but unless your timeouts are extremely aggressive relative to the flush period, it should be functional with the sync API. As I understand it, the @Batch annotation and manual flushing should not be used with the sync API either.

> I wonder, instead, whether it would make sense to introduce a buffering outbound channel handler that can be optionally installed into the netty channel pipeline that buffers the outgoing write requests.

That sounds interesting. I'll see if I can put something together for it. I'm new to netty, so it may take some time.

If it's created by the ClientResources, would that mean that it applies to all connections on the client, or would you have an API to enable/disable buffering on individual connections?


mp911de commented Jan 23, 2023

I'm not sure what the control mechanism would be to enable or disable (or force-write) buffers.

@tishun tishun added this to the Icebox milestone Jun 28, 2024
@tishun tishun added the status: waiting-for-feedback We need additional information before we can continue label Jun 28, 2024

tishun commented Jun 28, 2024

Seems like a very corner-case scenario right now.

Leaving this in the Icebox; unless we see interest from the community, we might have to drop it.

If anyone is willing to contribute a solution, this would increase the odds of integrating it (the team still has to decide whether it makes sense, though).


okg-cxf commented Aug 17, 2024

@mtheos @tishun @mp911de

I did an evaluation of the solution I propose (not periodic flushing, but more like a group commit, so there is no fixed delay) against FlushConsolidationHandler.

Here are some interesting results.

See PR: #2950

Results:
https://github.com/okg-cxf/lettuce-core/blob/test/auto-batch-flush/bench-c5n-2xlarge-exists-b64.bench
https://github.com/okg-cxf/lettuce-core/blob/test/auto-batch-flush/bench-c5n-2xlarge-exists-b8.bench

You can see there is still an advantage when QPS is high.

Test code: https://github.com/okg-cxf/lettuce-core/blob/test/auto-batch-flush/src/main/java/io/lettuce/bench/MultiThreadSyncExists.java
