Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(electric): Update SSL logic after upgrading to OTP 27.0 #1396

Merged
merged 4 commits into from
Jun 24, 2024

Conversation

alco
Copy link
Member

@alco alco commented Jun 21, 2024

When we upgraded the sync service to OTP 27.0, we missed the changed default from verify_none to verify_peer in SSL connections.

We now explicitly set verify to verify_none because it's currently the only way to ensure encrypted connections work even when a faulty certificate chain is presented by the PG host. This behaviour matches that of psql <DATABASE_URL>?sslmode=require.

Here's an example of connecting to DigitalOcean's Managed PostgreSQL to illustrate the point:

$ psql 'postgresql://...?sslmode=require'
Null display is "∅".
Output format is aligned.
Expanded display is used automatically.
Line style is unicode.
Border style is 1.
psql (16.1, server 16.3)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

[db-postgresql-do-user-13160360-0] doadmin:defaultdb=> \q
$ psql 'postgresql://...?sslmode=verify-full'
psql: error: connection to server at "***.db.ondigitalocean.com" (167.99.250.38), port 25060 failed: root certificate file "/home/alco/.postgresql/root.crt" does not exist
Either provide the file, use the system's trusted roots with sslrootcert=system, or change sslmode to disable server certificate verification.

$ psql 'sslrootcert=system sslmode=verify-full host=***.db.ondigitalocean.com ...'
psql: error: connection to server at "***.db.ondigitalocean.com" (167.99.250.38), port 25060 failed: SSL error: certificate verify failed
$ openssl s_client -starttls postgres -showcerts -connect ***.db.ondigitalocean.com:25060 -CApath /etc/ssl/certs/
[...]
SSL handshake has read 3990 bytes and written 885 bytes
Verification error: self-signed certificate in certificate chain

Fix #1395.

@alco
Copy link
Member Author

alco commented Jun 21, 2024

Found another issue with DO's managed Postgres while testing this fix:

14:56:49.189 pid=<0.3377.0> [error] GenServer #PID<0.3377.0> terminating
** (stop) exited in: :options.incompatible({:verify, :verify_peer}, {:cacerts, :undefined})
    ** (EXIT) :ssl_negotiation_failed
Last message (from #PID<0.3370.0>): {:command, :epgsql_cmd_connect, %{...}}
▓ ┌────────────────────┐
▓ │  CONNECTION ERROR  │
▓ ┕━━━━━━━━━━━━━━━━━━━━┙
▓ 
▓ Failed to initialize Postgres state:
▓   {:error, {:ssl_negotiation_failed, {:options, :incompatible, [verify: :verify_peer, cacerts: :undefined]}}}
▓ 
▓ Double-check the value of DATABASE_URL and make sure your database
▓ is running and can be reached using the connection URL in DATABASE_URL.
14:56:49.190 pid=<0.3370.0> origin=postgres_1 [error] Initialization of Postgres state failed with reason: {:error, {:ssl_negotiation_failed, {:options, :incompatible, [verify: :verify_peer, cacerts: :undefined]}}}.

Since OTP 26, the default for SSL connections has changed from
verify_none to verify_peer. We cannot use the latter because in practice
some popular hosts appear to have certificate chain problems.

Not that connecting with psql using sslmode=require is equivalent to
using verify_none with Erlang SSL.

Fix #1395.
@alco alco force-pushed the alco/vax-1987-cacerts_load branch from 1b0d6ab to 59d89fb Compare June 21, 2024 23:41
@alco alco changed the title fix(electric): Handle :undefined returned from :public_key.cacerts_load() fix(electric): Update SSL logic after upgrading to OTP 27.0 Jun 21, 2024
@alco alco force-pushed the alco/vax-1987-cacerts_load branch from 59d89fb to 2bead1b Compare June 22, 2024 00:10
@alco alco merged commit 6002f07 into main Jun 24, 2024
7 checks passed
@alco alco deleted the alco/vax-1987-cacerts_load branch June 24, 2024 15:35
alco added a commit that referenced this pull request Jun 27, 2024
Merging #1370 into `main`
inadvertently broke the fix that had been introduced in
#1396. That latter PR's
description goes into detail about why we are not ready to validate
server certificates.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sync service 12.0.1 fails to connect to the database with {:case_clause, :undefined} error
2 participants