-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Description
The connect
function of a Pool
(at least for postgres) behaves differently from what I would expect.
Scenario
At the start of my application I initialize a connection Pool and then run a query against it:
let pool = PoolOptions<Postgres>::new()
.max_connections(big_number)
.connect_timeout(std::time::Duration::from_secs(10))
.connect(&std::env::var("DATABASE_URL")?)
.await?,
let ret = sqlx::query!("SELECT * FROM character").fetch(&pool).await?;
I mistakenly specified the wrong port in DATABASE_URL
.
What happens
On connect
the application tries to establish a connection against the server and gets its tcp connection denied (RST) because of the wrong port. However, it does not return an error but returns the Pool
without any error. Then on fetch
the connection is tried to be re-established in a tight loop (saturating multiple CPU cores) receiving RST each time until the connect timeout is reached which leads to fetch
returning an error.
What I expect to happen
I expect that connect
(in contrast to connect_lazy
) tries to establish at least one connection to the server before returning. If that connection fails an Err
should be returned. Otherwise it makes it difficult differentiating a misconfiguration from a temporary congestion very difficult.
Why the current behaviour is a problem for me
The code I posted is a vast simplification of my real code. I changed the following two things:
- I spawn a huge amount of
async_std::task
that then each try to get a connection from the pool and use that for the remainder of their life time (which is usually rather short). - I set
connection_timeout
to a basically unlimited big number in order to have those tasks to block until they eventually get a connection.
I do that because my concurrency is is bottle necked the amount of database connections I want to spawn. Therefore I do not limit the amount of tasks spawned but let the PgPool
limit how many can run in parallel. I imagine that this is the case for other applications, too.
Currently, there is no distinction between a connection_timeout
and an acquire_timeout
in a Pool
. The problem that arises from that is that even if the connect
of a pool would try to check if at least one connection is possible it would not be helpful because it would never return because the connection_timeout
is basically unbound. But what I really want to be unbound is only the acquire_timeout
. Without this distinction it is not possible to differentiate database problems (connect_timeout) from an intended concurrency limit (acquire_timeout).