High disk writes #14757

totoCZ · 2024-12-29T02:03:57Z

root@server ~# /usr/share/bcc/tools/xfsslower
Tracing XFS operations slower than 10 ms
TIME     COMM           PID    T BYTES   OFF_KB   LAT(ms) FILENAME
01:59:43 beacon-chain   1679030 W 230199296 558740     74.39 beaconchain.db
01:59:43 beacon-chain   1679030 S 0       0         165.40 beaconchain.db
01:59:45 beacon-chain   1679030 W 230440960 783544     68.87 beaconchain.db
01:59:45 beacon-chain   1679030 S 0       0         160.96 beaconchain.db
01:59:47 beacon-chain   1679030 W 230678528 334172     72.04 beaconchain.db
01:59:47 beacon-chain   1679030 S 0       0         171.51 beaconchain.db
01:59:49 beacon-chain   1679030 W 230920192 559444     71.97 beaconchain.db
01:59:49 geth           1464717 R 3507    23578      47.96 057257.sst
01:59:49 geth           1464717 R 13113   27992      20.45 202679.sst
01:59:49 beacon-chain   1679030 S 0       0         160.16 beaconchain.db
01:59:52 beacon-chain   1679030 W 231157760 784952     69.83 beaconchain.db
01:59:52 beacon-chain   1679030 S 0       0         166.01 beaconchain.db
01:59:54 beacon-chain   1679030 W 231399424 334172     73.27 beaconchain.db
01:59:54 beacon-chain   1679030 S 0       0         164.71 beaconchain.db
01:59:56 beacon-chain   1679030 W 231641088 560148     71.87 beaconchain.db
01:59:57 beacon-chain   1679030 S 0       0         165.27 beaconchain.db
01:59:58 beacon-chain   1679030 W 231878656 786360     73.35 beaconchain.db
01:59:58 beacon-chain   1679030 S 0       0         164.68 beaconchain.db
02:00:00 beacon-chain   1679030 W 232120320 334172     75.96 beaconchain.db
02:00:00 beacon-chain   1679030 S 0       0         162.19 beaconchain.db

As far as I understand, Prysm shouldn't be downloading historical blocks by default.
Why is it trying to save 200MB every 2 seconds?

totoCZ · 2024-12-29T02:07:50Z

totoCZ · 2024-12-29T02:11:23Z

And it keeps writing more and more bytes until the disk goes to hell.

yorickdowne · 2024-12-29T08:26:33Z

This isn't a Prysm issue so much as it's a hardware issue. You want a decently fast drive for Prysm. A good NVMe is the recommended storage, which can also hold the execution client. See https://gist.github.com/yorickdowne/f3a3e79a573bf35767cd002cc977b038

totoCZ · 2024-12-29T08:27:49Z

The charts show a comparison with Teku. Definitely poor client design.

8x MX500 behind a PERC in this case. I don't think we should tell people to invest thousands of dollars in hardware and throw away perfectly working servers, promoting poorly written code instead, especially when other teams have no issues doing it correctly.

Surprisingly, this 16TB setup can also hold the execution client, and it performs well with Teku (~99% head votes).

While people in the Ethereum space often like to talk about NVMe without logic, it's usually also those (like you) who have zero understanding of Linux I/O.

The same goes for that list of yours, which is based on completely meaningless numbers. Comparing IOPS without knowing the latency is like comparing the speed of a car with 1 wheel.

And in this case MX500 latency goes to hell when you try to write 200MB every 2 seconds!

Do you think the Ethereum blockchain produces 8640000 MB of data every day? Then why are you trying to write it?

Sincerely,
SRE
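
For reference, the 8640000 MB figure follows from the first trace: roughly 200 MB every 2 seconds is 100 MB/s, and a day has 86400 seconds. A minimal sketch of that arithmetic, assuming the burst size and interval stay constant all day (the helper name `mb_per_day` is made up for illustration):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def mb_per_day(mb_written: float, interval_s: float) -> float:
    """Extrapolate a burst of `mb_written` MB every `interval_s` seconds to one day."""
    rate_mb_s = mb_written / interval_s  # sustained rate in MB/s
    return rate_mb_s * SECONDS_PER_DAY


if __name__ == "__main__":
    # About 200 MB per write, one write every 2 seconds (from the xfsslower trace).
    print(mb_per_day(200, 2))  # 8640000.0
```

This is only an extrapolation of what the trace shows being submitted to the filesystem, not a measurement of what reaches the disks over a full day.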

nisdas · 2024-12-29T11:22:30Z

@totoCZ Can you provide the flags you are running with? It looks like you have enabled running Prysm as an archive node. Could you also provide the logs here?

totoCZ · 2024-12-29T11:25:35Z

Exec=--accept-terms-of-use \
--datadir=/prysm \
--execution-endpoint=http://systemd-geth:8551 \
--http-mev-relay=http://systemd-mev-boost:18550 \
--local-block-value-boost=0 \
--suggested-fee-recipient=xxx \
--rpc-host=:: \
--http-host=:: \
--monitoring-host=[::] \
--p2p-udp-port=7750 \
--p2p-tcp-port=7751 \
--p2p-quic-port=7752 \
--p2p-static-id \
--p2p-local-ip=0.0.0.0 \
--p2p-host-ip=149.7.x.x \
--jwt-secret=/jwt/jwt.hex \
--checkpoint-sync-url=https://beaconstate.ethstaker.cc \
--genesis-beacon-api-url=https://beaconstate.ethstaker.cc \

nisdas · 2024-12-29T11:43:56Z

Those flags look normal, so your case is even more bizarre. Not sure why it's constantly trying to write data unless it's stuck. Can you share the beacon node logs over a period of 10 minutes?

totoCZ · 2024-12-29T12:04:23Z

https://termbin.com/8a3v
Enjoy.
If you can't repro, then close it.
I have another client working.

Assuming I'm not completely stupid (which is possible), you're writing more bytes with every write (kind of like a disk memleak), because it doesn't seem to me that BYTES is a cumulative counter. And it doesn't match the chart. (More bytes written at once = higher ms.)

root@server ~# pidstat -d 10
Linux 5.14.0-503.19.1.el9_5.x86_64 (server)  12/29/24        _x86_64_        (56 CPU)

11:51:19      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s iodelay  Command
11:51:30    101001    725895      0.00      1.59      0.00       0  java
11:51:30    165534    839518      0.00      1.99      0.00       0  prometheus
11:51:30    100000    872424      0.00      0.40      0.00       0  python3
11:51:30    100000   1464717    954.18     30.68      0.00       0  geth
11:51:30    101000   2241665      0.00     21.91      0.00       0  nimbus_beacon_n
11:51:30    100000   2311882      0.00  58352.19      0.00       0  beacon-chain

11:51:30      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s iodelay  Command
11:51:40        0      2086      0.00      2.40      0.00       0  auditd
11:51:40    100000      3788      0.00      0.40      0.00       0  ripe-atlas
11:51:40    100070      4588      0.00      0.80      0.00       0  postgres
11:51:40    101001    725895      0.00      3.20      0.00       0  java
11:51:40    165534    839518      0.00      1.60      0.00       0  prometheus
11:51:40    100000   1464717    302.00    196.40     51.60       0  geth
11:51:40    101000   2241665      0.00    100.40      0.00       0  nimbus_beacon_n
11:51:40    100070   2308669      0.00      1.60      0.00       0  postgres
11:51:40    100000   2311882      0.00  59913.60      0.00       0  beacon-chain
^C

root@server ~# /usr/share/bcc/tools/fileslower
cannot attach kprobe, probe entry may not exist
Current kernel does not have __vfs_read, try vfs_read instead
cannot attach kprobe, probe entry may not exist
Current kernel does not have __vfs_write, try vfs_write instead
Tracing sync read/writes slower than 10 ms
TIME(s)  COMM           TID    D BYTES   LAT(ms) FILENAME
0.087    beacon-chain   2316918 W 42999808   17.30 beaconchain.db
1.184    beacon-chain   2311988 W 43237376   15.96 beaconchain.db
2.153    beacon-chain   2312054 W 43479040   20.68 beaconchain.db
3.424    beacon-chain   2312035 W 43720704   19.09 beaconchain.db
4.441    beacon-chain   2311899 W 43958272   16.96 beaconchain.db
5.467    beacon-chain   2324165 W 44199936   17.53 beaconchain.db
6.850    beacon-chain   2324219 W 44437504   17.84 beaconchain.db
8.180    systemd        1      W 8         16.66 cgroup.procs
8.180    (time-dir)     2324485 W 8         16.25 cgroup.procs
8.252    beacon-chain   2311971 W 44679168   18.09 beaconchain.db
9.037    beacon-chain   2312043 W 44920832   18.23 beaconchain.db
[...]
283.271  beacon-chain   2312043 W 79241216   29.93 beaconchain.db
285.517  beacon-chain   2311882 W 79478784   29.09 beaconchain.db
287.745  beacon-chain   2324705 W 79720448   31.05 beaconchain.db
289.262  beacon-chain   2312034 W 79958016   27.86 beaconchain.db
290.491  beacon-chain   2312043 W 80199680   25.76 beaconchain.db
291.132  beacon-chain   2312043 W 80437248   30.22 beaconchain.db
291.758  beacon-chain   2312043 W 80678912   28.05 beaconchain.db
292.377  beacon-chain   2324218 W 80920576   24.50 beaconchain.db
293.223  beacon-chain   2324114 W 81158144   30.02 beaconchain.db
294.216  beacon-chain   2312056 W 81399808   28.00 beaconchain.db
295.694  beacon-chain   2311900 W 81637376   25.42 beaconchain.db
296.609  beacon-chain   2312043 W 81879040   31.52 beaconchain.db
296.983  (time-dir)     2325112 W 8         16.57 cgroup.procs
296.983  systemd        1      W 8         16.94 cgroup.procs
297.195  systemd        2325113 W 8         16.83 cgroup.subtree_control
297.352  systemd        1      W 8         10.36 cgroup.procs
297.990  systemd        1      W 8         23.69 cgroup.procs
298.046  systemd        1      W 8         18.80 cgroup.procs
298.553  beacon-chain   2324114 W 82120704   30.45 beaconchain.db
298.920  geth           1470469 R 30722     12.08 receipts.0052.cdat
300.139  beacon-chain   2312038 W 82358272   25.42 beaconchain.db
301.646  beacon-chain   2312055 W 82599936   31.30 beaconchain.db
302.851  beacon-chain   2312033 W 82837504   28.91 beaconchain.db
303.536  beacon-chain   2312043 W 83079168   25.38 beaconchain.db
304.993  beacon-chain   2316917 W 83320832   31.12 beaconchain.db
305.880  beacon-chain   2312055 W 83558400   29.02 beaconchain.db
308.393  beacon-chain   2324165 W 83800064   26.03 beaconchain.db
310.633  beacon-chain   2316917 W 84037632   31.75 beaconchain.db

root@server ~# /usr/share/bcc/tools/fileslower
cannot attach kprobe, probe entry may not exist
Current kernel does not have __vfs_read, try vfs_read instead
cannot attach kprobe, probe entry may not exist
Current kernel does not have __vfs_write, try vfs_write instead
Tracing sync read/writes slower than 10 ms
TIME(s)  COMM           TID    D BYTES   LAT(ms) FILENAME
1.150    beacon-chain   2312035 W 139960320   52.47 beaconchain.db

root@server ~# /usr/share/bcc/tools/fileslower
cannot attach kprobe, probe entry may not exist
Current kernel does not have __vfs_read, try vfs_read instead
cannot attach kprobe, probe entry may not exist
Current kernel does not have __vfs_write, try vfs_write instead
Tracing sync read/writes slower than 10 ms
TIME(s)  COMM           TID    D BYTES   LAT(ms) FILENAME
0.511    beacon-chain   2312038 W 159399936   56.97 beaconchain.db
2.181    beacon-chain   2311971 W 159637504   48.85 beaconchain.db
2.228    geth           1464877 R 3291      28.45 211533.sst
2.244    geth           1465940 R 2677      36.84 095284.sst
2.257    geth           1502686 R 4078      16.25 289344.sst
2.257    geth           1464889 R 4000      29.16 183425.sst
2.259    geth           1464875 R 3515      18.20 184989.sst
2.260    geth           1502683 R 3386      54.90 110356.sst
2.260    geth           1470474 R 4021      42.33 289179.sst
2.269    geth           1464841 R 8236      22.80 268293.sst
2.285    geth           1504961 R 4051      26.99 183425.sst
2.285    geth           1502683 R 3376      24.43 280625.sst

Compare the 159399936 bytes beacon-chain wrote in one call with geth, which just read 3291 bytes.
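
To check whether BYTES behaves like a counter or like a per-call size that keeps growing, the differences between consecutive writes can be computed from the trace. The sketch below is a rough illustration, not part of bcc: the regex and the helper names `write_sizes` and `deltas` are assumptions made up for this example, and the sample lines are copied from the first fileslower run above.

```python
import re

# Write lines copied from the first fileslower run above.
SAMPLE = """\
0.087    beacon-chain   2316918 W 42999808   17.30 beaconchain.db
1.184    beacon-chain   2311988 W 43237376   15.96 beaconchain.db
2.153    beacon-chain   2312054 W 43479040   20.68 beaconchain.db
3.424    beacon-chain   2312035 W 43720704   19.09 beaconchain.db
4.441    beacon-chain   2311899 W 43958272   16.96 beaconchain.db
"""

# Columns: TIME(s) COMM TID D BYTES LAT(ms) FILENAME
LINE = re.compile(
    r"^(?P<t>[\d.]+)\s+(?P<comm>\S+)\s+(?P<tid>\d+)\s+(?P<dir>[RW])\s+"
    r"(?P<bytes>\d+)\s+(?P<lat>[\d.]+)\s+(?P<file>\S+)$"
)


def write_sizes(text, comm="beacon-chain", filename="beaconchain.db"):
    """Return the BYTES value of every write by `comm` to `filename`, in order."""
    sizes = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m and m["dir"] == "W" and m["comm"] == comm and m["file"] == filename:
            sizes.append(int(m["bytes"]))
    return sizes


def deltas(sizes):
    """Difference between each write size and the one before it."""
    return [b - a for a, b in zip(sizes, sizes[1:])]


if __name__ == "__main__":
    growth = deltas(write_sizes(SAMPLE))
    print(growth)                       # [237568, 241664, 241664, 237568]
    print([d // 4096 for d in growth])  # [58, 59, 59, 58]
```

In this sample every write is about 58 to 59 4 KiB pages larger than the previous one. fileslower reports BYTES as the size of each individual I/O, so this pattern is consistent with each call writing a slightly larger buffer rather than with a running total.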

nisdas · 2024-12-30T07:36:50Z

Thanks for sharing all your logs. Unfortunately, they don't say much, as these are info logs and I can't determine the issue from them. If you are still running Prysm, using the --verbosity=debug flag would be helpful. If not, it's fine, but something definitely seems off in the data you have posted.

muratogat · 2025-02-01T14:36:40Z

Did you find a solution? I seem to have the same problem.
