Skip to content

Commit

Permalink
Added tracer for row level locks
Browse files Browse the repository at this point in the history
Closes: #117
  • Loading branch information
jnidzwetzki committed Jan 1, 2024
1 parent 105eab7 commit 24e6153
Show file tree
Hide file tree
Showing 5 changed files with 570 additions and 5 deletions.
5 changes: 5 additions & 0 deletions .github/workflows/integration_tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,11 @@ jobs:
pg_lw_lock_tracer --version
sudo pg_lw_lock_tracer -p $(pidof postgres) --dry-run
- name: Check that pg_row_lock_tracer can be executed
run: |
pg_row_lock_tracer --version
sudo pg_row_lock_tracer -x ~/postgresql/bin/${{ matrix.psql_version }}/bin/postgres -v -p $(pidof postgres) --statistics --dry-run
- name: Check that animate_lock_graph can be executed
run: |
animate_lock_graph --version
Expand Down
76 changes: 71 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,11 +8,12 @@

This project provides tools that allow you to gain deep insights into PostgreSQL's locking activities and troubleshoot locking-related issues (e.g., performance problems or deadlocks).

* `pg_lock_tracer` - is a lock tracer for PostgreSQL.
* `pg_lw_lock_tracer` - is a tracer for PostgreSQL lightweight locks (LWLocks).
* `pg_lock_tracer` - is a lock tracer for PostgreSQL (traces table level locks).
* `pg_lw_lock_tracer` - is a tracer for PostgreSQL lightweight locks (traces LWLocks).
* `pg_row_lock_tracer` - is a tracer for PostgreSQL row locks (traces Row-Level locks).
* `animate_lock_graph` - creates animated locks graphs based on the `pg_lock_tracer` output.

__Note:__ Most of these tools employ the [BPF / eBPF](https://ebpf.io/) (_Extended Berkeley Packet Filter_) technology. At the moment, PostgreSQL 12, 13, 14, 15, and 16 are supported (see additional information below).
__Note:__ These tools employ the [eBPF](https://ebpf.io/) (_Extended Berkeley Packet Filter_) technology. At the moment, PostgreSQL 12, 13, 14, 15, and 16 are supported (see additional information below).

# pg_lock_tracer
`pg_lock_tracer` observes the locking activity of a running PostgreSQL process (using _eBPF_ and _UProbes_). In contrast to the information that is present in the table `pg_locks` (which provides information about which locks are _currently_ requested), `pg_lock_tracer` gives you a continuous view of the locking activity and collects statistics and timings.
Expand Down Expand Up @@ -664,7 +665,7 @@ pg_lock_tracer -x /home/jan/postgresql-sandbox/bin/REL_14_2_DEBUG/bin/postgres -
### Animated Lock Graphs
See the content of the [examples](examples/) directory for examples.

# pg_lw_lock_trace
# pg_lw_lock_tracer

`pg_lw_lock_trace` allows to trace lightweight locks ([LWLocks](https://github.com/postgres/postgres/blob/c8e1ba736b2b9e8c98d37a5b77c4ed31baf94147/src/backend/storage/lmgr/lwlock.c)) in a PostgreSQL process via _Userland Statically Defined Tracing_ (USDT).

Expand Down Expand Up @@ -783,8 +784,73 @@ Locks per type
+--------------+----------+
```

# Additional Information
# pg_row_lock_tracer

`pg_row_lock_tracer` allows to trace row locks (see the PostgreSQL [documentation](https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-ROWS)) of a PostgreSQL process using _eBPF_ and _UProbes_

## Usage Examples
```
# Trace the row locks of the given PostgreSQL binary
pg_row_lock_tracer -x /home/jan/postgresql-sandbox/bin/REL_14_9_DEBUG/bin/postgres
# Trace the row locks of the PID 1234
pg_row_lock_tracer -p 1234 -x /home/jan/postgresql-sandbox/bin/REL_14_9_DEBUG/bin/postgres
# Trace the row locks of the PID 1234 and 5678
pg_row_lock_tracer -p 1234 -p 5678 -x /home/jan/postgresql-sandbox/bin/REL_14_9_DEBUG/bin/postgres
# Trace the row locks of the PID 1234 and be verbose
pg_row_lock_tracer -p 1234 -x /home/jan/postgresql-sandbox/bin/REL_14_9_DEBUG/bin/postgres -v
# Trace the row locks and show statistics
pg_row_lock_tracer -x /home/jan/postgresql-sandbox/bin/REL_14_9_DEBUG/bin/postgres --statistics
```

## Example output

SQL Query: `SELECT * FROM temperature FOR UPDATE;`

CLI: `sudo pg_row_lock_tracer -x /home/jan/postgresql-sandbox/bin/REL_14_9_DEBUG/bin/postgres --statistics`


Tracer output:

```
[...]
2783502701862408 [Pid 2604491] LOCK_TUPLE_END TM_OK in 13100 ns
2783502701877081 [Pid 2604491] LOCK_TUPLE (Tablespace 1663 database 305234 relation 313419) - (Block and offset 7 143) - LOCK_TUPLE_EXCLUSIVE LOCK_WAIT_BLOCK
2783502701972367 [Pid 2604491] LOCK_TUPLE_END TM_OK in 95286 ns
2783502701988387 [Pid 2604491] LOCK_TUPLE (Tablespace 1663 database 305234 relation 313419) - (Block and offset 7 144) - LOCK_TUPLE_EXCLUSIVE LOCK_WAIT_BLOCK
2783502702001690 [Pid 2604491] LOCK_TUPLE_END TM_OK in 13303 ns
2783502702016387 [Pid 2604491] LOCK_TUPLE (Tablespace 1663 database 305234 relation 313419) - (Block and offset 7 145) - LOCK_TUPLE_EXCLUSIVE LOCK_WAIT_BLOCK
2783502702029375 [Pid 2604491] LOCK_TUPLE_END TM_OK in 12988 ns
^C
Lock statistics:
================
Used wait policies:
+---------+-----------------+----------------+-----------------+
| PID | LOCK_WAIT_BLOCK | LOCK_WAIT_SKIP | LOCK_WAIT_ERROR |
+---------+-----------------+----------------+-----------------+
| 2604491 | 1440 | 0 | 0 |
+---------+-----------------+----------------+-----------------+
Lock modes:
+---------+---------------------+------------------+---------------------------+----------------------+
| PID | LOCK_TUPLE_KEYSHARE | LOCK_TUPLE_SHARE | LOCK_TUPLE_NOKEYEXCLUSIVE | LOCK_TUPLE_EXCLUSIVE |
+---------+---------------------+------------------+---------------------------+----------------------+
| 2604491 | 0 | 0 | 0 | 1440 |
+---------+---------------------+------------------+---------------------------+----------------------+
Lock results:
+---------+-------+--------------+-----------------+------------+------------+------------------+---------------+
| PID | TM_OK | TM_INVISIBLE | TM_SELFMODIFIED | TM_UPDATED | TM_DELETED | TM_BEINGMODIFIED | TM_WOULDBLOCK |
+---------+-------+--------------+-----------------+------------+------------+------------------+---------------+
| 2604491 | 1440 | 0 | 0 | 0 | 0 | 0 | 0 |
+---------+-------+--------------+-----------------+------------+------------+------------------+---------------+
```

# Additional Information

## Installation

Expand Down
1 change: 1 addition & 0 deletions setup.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ pg_lock_tracer.bpf =
console_scripts =
pg_lock_tracer = pg_lock_tracer.pg_lock_tracer:main
pg_lw_lock_tracer = pg_lock_tracer.pg_lw_lock_tracer:main
pg_row_lock_tracer = pg_lock_tracer.pg_row_lock_tracer:main
animate_lock_graph = pg_lock_tracer.animate_lock_graph:main


121 changes: 121 additions & 0 deletions src/pg_lock_tracer/bpf/pg_row_lock_tracer.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
#include <uapi/linux/ptrace.h>

/* Placeholder for auto generated defines */
__DEFINES__

typedef struct RowLockEvent_t {
u32 pid;
u64 timestamp;
u32 event_type;

/* See RelFileNode - Oid is u32 */
u32 tablespace;
u32 database;
u32 relation;

/* LockTupleMode */
u8 locktuplemode;

/* LockWaitPolicy */
u8 lockwaitpolicy;

/* Locked tuple */
u32 blockid;
u16 offset;

/* TM_Result */
int lockresult;
} RowLockEvent;

BPF_PERF_OUTPUT(lockevents);

static void fill_and_submit(struct pt_regs *ctx, RowLockEvent *event) {
event->pid = bpf_get_current_pid_tgid();
event->timestamp = bpf_ktime_get_ns();

// sudo cat /sys/kernel/debug/tracing/trace_pipe
// bpf_trace_printk("LW lock event for trance: %s\\n", tranche);

lockevents.perf_submit(ctx, event, sizeof(RowLockEvent));
}

/*
* Acquire a tuple lock
*
* Arguments:
* 1. Relation relation (1st member RelFileNode)
* 2. ItemPointer tid
* 3. Snapshot snapshot,
* 4. TupleTableSlot *slot,
* 5. CommandId cid,
* 6. LockTupleMode mode,
* 7. LockWaitPolicy wait_policy,
* 8. uint8 flags,
* 9. TM_FailureData *tmfd
*
*/
int heapam_tuple_lock(struct pt_regs *ctx) {
RowLockEvent event = {.event_type = EVENT_LOCK_TUPLE};

/*
* (gdb) ptype /o RelFileNode
* 0 | 4 Oid spcNode;
* 4 | 4 Oid dbNode;
* 8 | 4 Oid relNode;
*/

char buffer_relation[12];
bpf_probe_read_user(buffer_relation, sizeof(buffer_relation),
(void *)PT_REGS_PARM1(ctx));
bpf_probe_read_kernel(&(event.tablespace), sizeof(event.tablespace),
&(buffer_relation[0]));
bpf_probe_read_kernel(&(event.database), sizeof(event.database),
&(buffer_relation[4]));
bpf_probe_read_kernel(&(event.relation), sizeof(event.relation),
&(buffer_relation[8]));

/* Locked tuple */
char buffer_item_pointer[6];
u16 bi_hi;
u16 bi_lo;

bpf_probe_read_user(buffer_item_pointer, sizeof(buffer_item_pointer),
(void *)PT_REGS_PARM2(ctx));
bpf_probe_read_kernel(&(bi_hi), sizeof(bi_hi), &(buffer_item_pointer[0]));
bpf_probe_read_kernel(&(bi_lo), sizeof(bi_lo), &(buffer_item_pointer[2]));
bpf_probe_read_kernel(&(event.offset), sizeof(event.offset),
&(buffer_item_pointer[4]));

/* See #define BlockIdGetBlockNumber(blockId) */
event.blockid = (bi_hi) << 16 | bi_lo;

/* Locking options */
bpf_probe_read_kernel(&(event.locktuplemode), sizeof(event.locktuplemode),
&(PT_REGS_PARM6(ctx)));

/* Only the first six function parameters are passed via register. All
* remaining parameters are stored on the stack.
*
* See: System V Application Binary Interface—AMD64 Architecture Processor
* Supplement.
*/
void *ptr = 0;
bpf_probe_read(&ptr, sizeof(ptr), (void *)(PT_REGS_SP(ctx) + (1 * 8)));
bpf_probe_read_kernel(&(event.lockwaitpolicy), sizeof(event.lockwaitpolicy),
&ptr);

fill_and_submit(ctx, &event);
return 0;
}

/*
* Acquire a tuple lock - Function done
*/
int heapam_tuple_lock_end(struct pt_regs *ctx) {
RowLockEvent event = {.event_type = EVENT_LOCK_TUPLE_END};

event.lockresult = PT_REGS_RC(ctx);

fill_and_submit(ctx, &event);
return 0;
}
Loading

0 comments on commit 24e6153

Please sign in to comment.