Create connections concurrently in separate threads #2317
base: dev
Conversation
```diff
 @SuppressWarnings("NullableProblems")
 public Thread newThread(Runnable r) {
-    var thread = new Thread(r, threadName);
+    var thread = new Thread(r, threadName + "_" + counter.incrementAndGet());
```
P.S. If the overall idea is good, gotta change this part so that thread naming doesn't interfere with other existing executors.
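A per-factory counter suffix along the lines of the diff above could look like the following sketch. The class name, daemon flag, and constructor are assumptions for illustration, not taken from the PR:

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: suffix each thread name with a per-factory counter
// so the threads created by this factory get unique, identifiable names.
public class NamedThreadFactory implements ThreadFactory {
    private final String threadName;
    private final AtomicInteger counter = new AtomicInteger();

    public NamedThreadFactory(String threadName) {
        this.threadName = threadName;
    }

    @Override
    public Thread newThread(Runnable r) {
        // counter starts at 0, so the first thread is "<prefix>_1"
        var thread = new Thread(r, threadName + "_" + counter.incrementAndGet());
        thread.setDaemon(true); // assumed: pool housekeeping threads are daemons
        return thread;
    }
}
```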
```diff
 var backoffMs = 10L;
+long jitterMs = ThreadLocalRandom.current().nextLong(Math.max(10, config.getConnectionTimeout() / 10));
+if (connectionsInProgress.get() >= Runtime.getRuntime().availableProcessors()) {
+    quietlySleep(jitterMs);
```
Added sleeping for some jittered time to avoid potential CPU throttling in CPU-limited environments like Kubernetes, when multiple connections are created at once.
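Factored out, the jittered pause from the diff could be sketched as below. This is a minimal illustration assuming a connection timeout in milliseconds; the class and method names are hypothetical:

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch: when many creator threads are active at once, each
// sleeps a random slice (up to a tenth of the connection timeout, with a
// 10 ms floor) to spread CPU load in CPU-limited environments.
public class JitteredPause {
    public static long jitterMs(long connectionTimeoutMs) {
        // nextLong(bound) returns a value in [0, bound); the floor keeps the
        // bound positive even for very small timeouts
        return ThreadLocalRandom.current().nextLong(Math.max(10, connectionTimeoutMs / 10));
    }

    public static void quietlySleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
    }
}
```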
```diff
 final int maxPoolSize = config.getMaximumPoolSize();
-this.addConnectionExecutor = createThreadPoolExecutor(maxPoolSize, poolName + ":connection-adder", threadFactory, new CustomDiscardPolicy());
+this.addConnectionExecutor = createCreatorThreadPoolExecutor(
+    Math.min(Runtime.getRuntime().availableProcessors() * 2, config.getMinimumIdle()),
```
Also, the core pool size will have to be the same as the executor's max pool size (not Hikari's), so that the executor doesn't hold tasks in the queue but executes them immediately. Something to fix.
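A sketch of what this comment proposes, with `corePoolSize` equal to `maximumPoolSize` so submitted creator tasks start immediately instead of waiting in the queue. The factory class and method names are hypothetical; the size cap mirrors the `Math.min` expression in the change:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: with an unbounded queue, a ThreadPoolExecutor never
// grows past corePoolSize, so core == max is required for tasks to run
// concurrently rather than queue behind a single thread.
public class CreatorExecutorFactory {
    public static ThreadPoolExecutor create(int minimumIdle) {
        int size = Math.max(1, Math.min(Runtime.getRuntime().availableProcessors() * 2, minimumIdle));
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
            size,                        // corePoolSize == maximumPoolSize,
            size,                        // so tasks never sit behind one core thread
            5, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>(),
            new ThreadPoolExecutor.DiscardPolicy()); // stand-in for CustomDiscardPolicy
        executor.allowCoreThreadTimeOut(true);       // let idle creator threads die off
        return executor;
    }
}
```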
Force-pushed from d7a9b2f to e56fa46
The problem
When the cost of creating a connection is high enough (e.g. longer than `connection-timeout`, like a few seconds or more), the pool fails to reach a sufficient number of connections to satisfy demand (the rate of requests). Even when `com.zaxxer.hikari.blockUntilFilled` is set to `true` to make the app wait at startup until the pool is filled with minimum idle connections, once connections reach their max-lifetime they are closed faster than the refill rate can keep up. This leads to a high number of waiting threads and a small number of active and idle connections (fewer than `minimum-idle`), either periodically or, in the worst case, as a new norm.

In addition, since `PoolEntryCreator` creates and adds new connections sequentially, `initialization-fail-timeout` must be large enough to guarantee that the minimum idle connections are ready, which can quickly grow to several minutes, unnecessarily so had the connections been created concurrently. Moreover, when we prefer the smallest max-lifetime (which is 30s), closing of connections can start in the middle of `blockUntilFilled` at startup, so by the end of the prefill we have fewer than the desired number of idle connections.

Another point: while a creator task is already running, the `refillPool()` method simply submits another creator task, which ends up in the queue because the executor's `corePoolSize` = `maximumPoolSize` = 1. Since the creator task adds connections sequentially in a loop, by the time it finishes not much changes when the next creator task starts its own loop, because `shouldContinueCreating()` has just returned false; and even if the condition holds again, it barely helps by adding a few more connections. At this point the queue becomes nothing more than an indicator of demand (based on its size), rather than truly preserving refill requests and helping to meet the demand.

Discussion
Some might ask why establishing a connection would take longer than a few seconds, and argue that this underlying problem should be fixed first. In most cases that would be fair enough, but it is not always possible, and in fact I believe that is the whole point of a connection pool: to take the burden/cost of creating connections on itself and keep warm connections available regardless of how long a connection takes to create.
Another question some might ask: why would `connection-timeout` be less than the time it takes to establish a connection? That might be a legitimate argument if requests can afford to run that long, which could be the case for apps with relatively low traffic. However, in comparatively low-latency systems with a large number of requests, we have to make sure that when the DB is not available we don't wait longer than a few hundred milliseconds, so as not to saturate threads and to avoid cascading failures. In that case we rely on the connection pool to maintain connections on the side and have them ready whenever needed.
I think for these scenarios the strategy of maintaining connections could be discussed further and perhaps taken to the next level. In the short term, though, creating new connections concurrently should be a good enough solution to meet the aforementioned requirements.
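As a rough illustration of the concurrent-creation idea (not the PR's actual code; all names here are hypothetical), submitting one creator task per missing connection lets slow connection setup overlap across threads instead of running sequentially in one loop:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one task per missing connection, so N slow
// connection setups take roughly one setup's wall-clock time (given
// enough executor threads), rather than N times as long.
public class ConcurrentRefill {
    public static int refill(ExecutorService executor, int missing, Runnable createConnection)
            throws InterruptedException {
        CountDownLatch done = new CountDownLatch(missing);
        AtomicInteger created = new AtomicInteger();
        for (int i = 0; i < missing; i++) {
            executor.submit(() -> {
                try {
                    createConnection.run();   // stands in for the real (slow) connection setup
                    created.incrementAndGet();
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();                         // wait here only for demonstration purposes
        return created.get();
    }
}
```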
Overall, if there is a better way of tackling the problem, feel free to comment on what the code change should look like. I would appreciate it if you reviewed this PR and let me know what you think. Cheers!