
Notices in pg_repack 1.4.5 logs for AWS Aurora PostgreSQL 12.8 indicate waiting for transactions to finish with specific PIDs unrelated to the table being repacked. #381

Open
Cdkhanna opened this issue Feb 22, 2024 · 13 comments


@Cdkhanna

Hi,
In the pg_repack 1.4.5 logs for AWS Aurora PostgreSQL 12.8, there are notices indicating a wait for transactions to finish, mentioning specific PIDs that are not associated with the table undergoing repacking. This inconsistency needs to be addressed for clarity and efficiency in the repacking process.

@Cdkhanna
Author

Cdkhanna commented Feb 22, 2024

In pg_stat_activity, it's observed that the pg_repack session is executing the query "LOCK TABLE table IN ACCESS SHARE MODE" and remains in an "idle in transaction" state.

@za-arthur
Collaborator

This is expected behavior. pg_repack opens an additional connection which holds an ACCESS SHARE lock. Here is a quote from the documentation:

You will not be able to perform DDL commands of the target table(s) except VACUUM or ANALYZE while pg_repack is working. pg_repack will hold an ACCESS SHARE lock on the target table during a full-table repack, to enforce this restriction.

and

pg_repack will only hold an ACCESS EXCLUSIVE lock for a short period during initial setup (steps 1 and 2 above) and during the final swap-and-drop phase (steps 6 and 7). For the rest of its time, pg_repack only needs to hold an ACCESS SHARE lock on the original table, meaning INSERTs, UPDATEs, and DELETEs may proceed as usual.
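The extra connection described above can be observed directly. A sketch of a query (assuming a placeholder target table named `mytable`) that lists the locks pg_repack's sessions hold on it:

```sql
-- While pg_repack is running, show which sessions hold locks on the
-- target table and in which mode ('mytable' is a placeholder name):
SELECT a.pid, a.state, l.mode, l.granted
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'mytable'::regclass;
```

During the bulk of a full-table repack you would expect to see a granted `AccessShareLock` here, matching the documentation quoted above.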

@andreasscherbaum
Collaborator

This inconsistency needs to be addressed for clarity and efficiency in the repacking process.

Which tables are seen in the log?

@Cdkhanna
Author

Cdkhanna commented Feb 22, 2024

Thanks for the response, @andreasscherbaum & @za-arthur

It seems I didn't explain the situation thoroughly. The PID that pg_repack is waiting for isn't interacting with the target table at all. Each time I run pg_repack, it gets stuck waiting for a completely unrelated session; in this particular case, that session is executing a different SQL statement on a different table.

This target table is newly created solely to reproduce the issue and isn't used anywhere else in the database.

Regarding my comment on the pg_repack session in pg_stat_activity: it's executing "LOCK TABLE table IN ACCESS SHARE MODE" and remains in an "idle in transaction" state.
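For reference, a sketch of the pg_stat_activity check described here (the `ILIKE` filter is just one way to spot the lock-holding session):

```sql
-- Find the session holding the repack-time table lock and inspect
-- its state and what (if anything) it is waiting on:
SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE query ILIKE '%ACCESS SHARE MODE%';
```

An `idle in transaction` state for this session is normal: pg_repack deliberately keeps that transaction open to retain the lock.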

@andreasscherbaum
Collaborator

Can you create a case where you can reproduce this on vanilla PostgreSQL?

@Cdkhanna
Author

Quoting @schmiddy from another issue:

So that "Waiting for x transactions to finish" message comes from here, and as you can see there is no respect paid to the -T / --wait-timeout option there.

I think the code has been like that forever, and I guess the reasoning goes something like: it's ok to kill other backends which are directly holding locks on our target table, because they really should know better than to try to hold locks on the target table during a repack. But it may be less obvious that other long-running transactions which don't touch the target table at all are holding up the repack, and that they deserve to be killed.

We have already gotten some complaints about pg_repack's default behavior of killing conflicting backends (see e.g. #76 and #90). So I'm pretty hesitant to introduce any more backend killing by default. We might think about adding a non-default option to enable killing of old transactions, though I'm also leery of over-burdening the client with dozens of options.

@za-arthur
Collaborator

This might be related to this issue #86.

@Cdkhanna
Author

Cdkhanna commented Mar 6, 2024

@za-arthur I am sorry, I couldn't find any solution in issue #86.

My pg_repack runs keep waiting like this. I did use -T 30 so that these sessions would be terminated for pg_repack.
What else can I do in pg_repack to make sure repacking is not waiting for these locks?

NOTICE: Waiting for 5 transactions to finish. First PID: 9677
NOTICE: Waiting for 5 transactions to finish. First PID: 9677
NOTICE: Waiting for 5 transactions to finish. First PID: 9677
NOTICE: Waiting for 5 transactions to finish. First PID: 9677
NOTICE: Waiting for 5 transactions to finish. First PID: 9677
NOTICE: Waiting for 5 transactions to finish. First PID: 9677
NOTICE: Waiting for 5 transactions to finish. First PID: 9677
NOTICE: Waiting for 5 transactions to finish. First PID: 9677
NOTICE: Waiting for 5 transactions to finish. First PID: 9677
NOTICE: Waiting for 5 transactions to finish. First PID: 9677
NOTICE: Waiting for 5 transactions to finish. First PID: 9677
NOTICE: Waiting for 5 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
NOTICE: Waiting for 4 transactions to finish. First PID: 9677
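When the log loops like this, a sketch of a query to identify the open transactions being waited on (the PID in the NOTICE should appear among the oldest entries):

```sql
-- List the oldest open transactions on the instance; pg_repack waits
-- for all transactions older than its own, on any table:
SELECT pid, state, xact_start,
       now() - xact_start AS xact_age,
       left(query, 60)    AS current_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 10;
```

On a self-managed instance a superuser could end such a session with `SELECT pg_terminate_backend(pid);` on managed services like Aurora, your ability to terminate other users' sessions depends on the privileges the service grants you.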

@MaximBoguk

MaximBoguk commented Mar 6, 2024

There's not much you can do. You do not have (and never will have) superuser-level control over Amazon RDS or Aurora instances, which is required to cancel other users' transactions (including Amazon-internal ones).
This request should actually be addressed to Amazon support.
The issue is not with pg_repack, but with the limited control and permissions you have over database-as-a-service instances in general.
I am actually very surprised that pg_repack works on Aurora at all, given how seriously Aurora internals differ from vanilla PostgreSQL.

@haggrip

haggrip commented Mar 7, 2024

Can you create a case where you can reproduce this on vanilla PostgreSQL?

This can be reproduced with Bitnami's PostgreSQL 15 with pg_repack:

  1. Create a new database
  2. Create a table
  3. Execute any moderately long transaction; for example, a transaction running pg_sleep prevents pg_repack from proceeding, with the same message @Cdkhanna mentioned.
  4. Attempt to repack any table. pg_repack won't proceed until the transaction is done.
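The steps above can be sketched as follows (assuming a scratch database `test` and an unrelated table `other`; the table names are placeholders):

```sql
-- Session 1: open a long transaction touching only an unrelated table.
BEGIN;
SELECT count(*) FROM other;
SELECT pg_sleep(600);  -- keep the transaction open

-- Session 2 (shell): repack a different table in the same database.
-- pg_repack prints, repeatedly, something like:
--   NOTICE: Waiting for 1 transactions to finish. First PID: <session 1 pid>
-- until session 1 commits or rolls back:
--   pg_repack --table=t test
```

Note that the blocking transaction never touches the repack target, which is exactly the behavior reported in this issue.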

Maybe we are missing something from the docs, but our database always has running transactions. Is pg_repack impossible to execute under these conditions?
bitnami-pg-repack.Dockerfile.txt

@MaximBoguk

It's expected behavior. Just let it wait until the old transactions have finished.
PS: long-running transactions are very detrimental to PostgreSQL performance in general.

@haggrip

haggrip commented Mar 8, 2024

It's expected behavior. Just let it wait until the old transactions have finished.
PS: long-running transactions are very detrimental to PostgreSQL performance in general.

Thank you @MaximBoguk. We may need to address the long-running transactions. I still don't quite get this behavior, though. Could anybody give more details? Are all previously running transactions considered potential interference for pg_repack?

@andreasscherbaum
Collaborator

In general, PostgreSQL can't let go of old data which may still be accessed by older transactions. It's possible that a long-running transaction will also access this table, even though that has not happened, or has not happened yet. The database can't know that.
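One way to see this from the system catalogs: each session's `backend_xmin` in pg_stat_activity marks the oldest transaction whose effects that session might still need to see, regardless of which tables it has touched. A sketch of a query listing the sessions whose snapshots pin old row versions:

```sql
-- Sessions with a non-null backend_xmin hold a snapshot that prevents
-- old row versions from being discarded, on any table:
SELECT pid, backend_xmin, xact_start, state
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;
```

Any session appearing here with an old enough `backend_xmin` is a candidate for what pg_repack is waiting on, even if its queries never mention the repack target.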
