Supavisor FAQ #21566
TheOtherBrian1
announced in
Troubleshooting
Supavisor FAQ
#21566
Replies: 1 comment
-
Question: Is JDBC compatible with Supavisor in transaction mode? If yes, is there any reference for the connection parameters required to use it in transaction mode? |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
What Problems Do Poolers Solve?
PostgreSQL stands out from other databases by opting to create a new process, not a new thread, for each direct connection. While this design choice brings numerous benefits, it introduces a startup penalty for new connections. Moreover, connections are more memory-intensive and can strain Postgres's internal schedulers, limiting the sustainable number that can be formed. Resultingly, developers must be mindful of how they allocate the resource.
When a client (backend server) connects to PostgreSQL, the connection is stateful and enduring, allowing clients to greedily hold onto connections without any obligation to give them up. Typically, applications do not continuously send queries to a database, so this pattern underutilizes finite connections that could have serviced other clients.
PostgreSQL's shortcomings are particularly evident when handling transient servers, like edge functions. They not only hoard connections for brief queries but also aggressively open and close connections, straining the database.
How Do Poolers Solve the Problem?
A pooler is ultimately a load balancer for database connections. It maintains several hot connections that it triages to clients. This reduces the startup cost of creating a new process on Postgres. The pooler can also more efficiently manage a database's finite connections by only allowing clients to access them when they need to execute a query (A.K.A. transaction mode).
Are Poolers Necessary?
All database connection libraries, such as Prisma, SQLAlchemy, and Postgres.js have built-in poolers. These are known as application-side poolers and they are fundamental for sustainable connection management. Most libraries have default pool sizes that may need to be changed for specific workloads. As an example, most edge/serverless functions are called to service a single user's request. They usually require significantly fewer connections (often 1 is optimal) than a dedicated application server.
Note, that like databases, servers can only maintain a certain amount of connections themselves. If you were to increase the number aggressively, the server may not be able gracefully orchestrate them. For instance, Prisma's default pool size is automatically set to one more than twice the server's CPU count (1 + 2 * num_of_CPUs). The Prisma Team chose this value because it is generally performant with ORM's internal architecture.
When deploying to static architecture, such as long-standing containers or VMs, application-side poolers are satisfactory on their own.
When connecting to your application from serverless/edge functions, horizontally auto-scaling servers, or in cases where you simply need more connections than what the database can manage, it's best to complement application-side poolers with a serverside one. Supabase provides Supavisor as an option, but you could use alternatives, such as Prisma's Accelerate or Cloudflare's Hyperdrive.
They sit between the database and your client servers. They are solely optimized for sustaining high numbers of client connections and queuing and triaging queries to the database. Although they add network complexity, they are necessary when managing auto-scaling servers that can hypothetically form an infinite amount of connections.
Where are the connection strings
Supabase provides 3 database connection strings that can be used simultaneously if necessary. All can be viewed in the connection string section of the Database Settings

Direct connections:
# Example connection string postgresql://postgres:[YOUR-PASSWORD]@db.ajrbwkcuthywfihaarmflo.supabase.co:5432/postgres
Supavisor in transaction mode (port 6543)
# Example transaction string postgresql://postgres.ajrbwkcuthywddfihrmflo:[YOUR-PASSWORD]@aws-0-us-east-1.pooler.supabase.com:6543/postgres
Supavisor in session mode (port 5432)
# Example session string postgresql://postgres.ajrbwkcuthywfddihrmflo:[YOUR-PASSWORD]@aws-0-us-east-1.pooler.supabase.com:5432/postgres
Supavisor: Transaction Mode vs. Session Mode?
When a client forms a direct connection with PostgreSQL, it usually will make a few queries, but may not utilize the connection the entire time. In transaction mode, a client is allowed to make a single query before being sent back to the figurative "waiting room". This prevents greedy or sedentary clients from hoarding connections. In most cases, this increases query throughput and is optimal.
In session mode, once the pooler assigns a direct connection, it stays with that client until voluntarily surrendered.
This behavior mirrors a direct connection, allowing greedy clients to monopolize the pool. This raises a question: what is the purpose of session mode?
Depending on your application's configurations, having the pooler manage a queue of patient clients is preferable to the alternative of constantly polling the database to check for an available connection. Session mode can queue clients for up to a minute. If this isn't particularly relevant to your application design, then the primary benefit is that it is IPv4 compatible. Also, unlike transaction mode, it supports prepared statements.
What Happens when a Client Library, such as Prisma, Connects through Supavisor?
When clients connect to either PostgreSQL or Supavisor, they do so with the PostgreSQL Wire Protocol. Because of this, clients treat connections with pooler as if they were directly connected to PostgreSQL. The pooler then smoothly acts as a messenger between the database and the client.
What are "Client Connections"?
TL;DR: They have nothing to do with front-end clients. They are the amount of backend-server connections that can connect to a serverside pooler.
Imagine a chess tournament with 60 boards. Each board represents a connection in a database. When a player sits down at a board, it's like a client connecting to the database. They can take their time with their moves or just sit there, not making any.
But when the tournament fills up and all the boards are taken, new players are turned away, and told to check back at a later time to see if a table becomes available.
Now, imagine the tournament organizers decide to expand the venue to house 200 people without adding more tables. Even when all boards are occupied, players don't have to leave. They can wait in the wings, and the moment a board opens up, someone from the waiting area can take their place. Likewise, if someone ends their game, but wants to play again, they can just go back to the waiting area. The waiting area represents the "Max Client Connections". Ultimately, the additional capacity provided by the pooler ensures fewer people are turned away from the "tournament".
In the Context of Supavisor, what Does "pool size" Mean?
"Pool size" refers to the maximum number of direct connections the pooler can maintain per unique user, database, and mode combination. You can adjust it in the Database Settings to strike a balance between efficient resource utilization and accommodating peak traffic.
What is the "user+db+mode" Combination?
PostgreSQL is not actually a database. It is a Relational Database Management System (RDMS). Within it, you can spawn PostgreSQL databases. In Supabase, it is a common pattern to just use the default database called postgres, but you could create more:
Similarly, a database can have many database users, but most people just rely on the default user "postgres".
The modes are transaction (port 6543) and session (port 5432) mode.
The "user+database+mode" combinations are formed from the above variables and are used within the connection string:
When a distinct combination connects to your database, a new direct connection pool will be created. That means if you have two combinations connecting and the "Pool Size" is set to 120, each combination will have permission to form 120 connections. This can become problematic if collectively they exhaust all available direct connections.
Does Supavisor Immediately Establish the Max Pool Size?
No, it doesn't. Let's break it down:
Where Can I Change My Pool Size?
In the Dashboard's Database Settings, you can configure Supavisor's "Pool Size":

You can also change the pool size for PostgREST's (DB API) internal pooler at the bottom of the application settings.
Do all services on Supabase use Supavisor?
Supabase Storage uses Supavisor internally. The other servers that communicate with Postgres (PostgREST, Realtime, and Auth) all rely on internal application poolers.
Supavisor is primarily intended for users who do not want to rely on the Supase Client libraries and instead prefer to work with external ORMs, such as Prisma, Drizzle, and Psycopg.
Should I Change Supavisor's Pool Size?
In an ideal scenario, PostgreSQL would support an unlimited number of direct connections, but there's a limit to how many it can handle. If Supavisor uses most of the available connections, you risk depriving other servers, such as Auth, from accessing your database. Still, as much as possible, you want to give your pooler freedom to grow its pool as needed to service demand.
It's important to note that the Storage server also uses Supavisor as a unique "user+db+mode" combination, so the pool size you set for your general application will apply to it, too.
As a rule of thumb, if you're using the DB REST API or multiple app-based "user+db+mode" combinations, try to keep the pooler's usage under 40% of available connections. Otherwise, you can cautiously increase usage to around 80%. These percentages are flexible and depend on your application's usage and setup. Monitor connection usage to determine the optimal allocation without depriving other servers of necessary connections.
How can I Monitor Connections?
Connection usage can be monitored with a Supabase Grafana Dashboard. It provides realtime visibility of over 200 database metrics, such as graphs of CPU, EBS, and active direct/pooler connections. It can be extremely useful for monitoring and debugging instances.
You can check our GitHub repo for setup instructions for local deployments or free cloud deployments on Fly.io. I recommend referring to our concise documentation to learn more about the metrics endpoint.
Can Supavisor really Support a Million Connections?
It depends.
In the article "Supavisor: Scaling Postgres to 1 Million Connections" we gave Supavisor the capacity to connect to a million clients simultaneously. The pooler triaged the clients' queries to 400 database connections.
What is important is that the database processing these queries had 64vCPUs and 256GB of memory. It could process the queries fast enough to ensure that there was never a noticeable bottleneck. The clients never waited long enough to timeout. If the clients were asking to run inefficient queries, like the one below, a severe backlog of pending client queries would have formed:
A backlog could also happen if there are not enough direct connections available. In our setup, 400 connections were enough to handle a million clients, ensuring a smooth flow of requests. But if we had only 1 direct connection, the queue would've moved too slowly and clients would expire their requests.
Scaling to this level only works when the pooler is operating in transaction mode. In Session mode, hypothetically, the 400 clients that first accessed the direct connections could keep them, even if they are no longer executing queries. The pending 999,600 clients waiting for a turn would then be starved of direct connections.
Beta Was this translation helpful? Give feedback.
All reactions