[improve][log] Print ZK path if write to ZK fails due to data being too large to persist #23652

Open
wants to merge 5 commits into
base: master

poorbarcode
Contributor

@poorbarcode poorbarcode commented Nov 28, 2024

Motivation & Modifications

Improvement 1

java.io.IOException: Len error. A message from /127.0.0.6:43889 with advertised length of 17890775 is either a malformed message or too large to process (length is greater than jute.maxbuffer=10485760)
     at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:567) ~[org.apache.zookeeper-zookeeper-3.9.2.jar:3.9.2]
     at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:352) ~[org.apache.zookeeper-zookeeper-3.9.2.jar:3.9.2]
     at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:508) ~[org.apache.zookeeper-zookeeper-3.9.2.jar:3.9.2]
     at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:153) ~[org.apache.zookeeper-zookeeper-3.9.2.jar:3.9.2]
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
     at java.lang.Thread.run(Thread.java:840) ~[?:?]

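When this error occurs, it is hard to tell which ZK node the oversized write targeted, because the exception reports only the client address and the message length, not the path.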
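As a rough illustration of the idea (not the code in this PR), a client-side check could compare the payload size with the configured limit and name the path in the message. The class name, method name, and example path below are hypothetical; the two numbers are the ones from the stack trace above. Note that the server checks the length of the whole request, which is somewhat larger than the node data alone, so a check like this is only approximate.

```java
import java.util.Optional;

// Hypothetical helper for illustration only; it is not part of Pulsar or ZooKeeper.
class ZkWriteSizeCheck {

    /**
     * Returns a diagnostic message that names the ZK path when the payload
     * is larger than the limit, or an empty Optional when the payload fits.
     */
    public static Optional<String> describeOversizedWrite(String path, int dataLength, int maxBuffer) {
        if (dataLength <= maxBuffer) {
            return Optional.empty();
        }
        return Optional.of(String.format(
                "Failed to write ZK node %s: data size %d bytes exceeds jute.maxbuffer=%d",
                path, dataLength, maxBuffer));
    }

    public static void main(String[] args) {
        // Sizes are taken from the stack trace; the path is a made-up example.
        describeOversizedWrite("/example/hypothetical/path", 17890775, 10485760)
                .ifPresent(System.err::println);
    }
}
```

With a message like this, the offending node can be located directly from the client log instead of being inferred from the server-side `Len error`.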

Improvement 2

There is a mechanism that persists cursor ack info into ZK when both of the following conditions are met:

  • the cursor ledger cannot be created successfully.
  • all messages have been acknowledged.

However, the broker prints Error while using MetaStore, try to persist the position... even when it will not write to ZK because there are still messages to consume.
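The intended behavior can be sketched as a small decision helper: the "try to persist" message is produced only when the fallback write will really happen. This is a simplified sketch under stated assumptions, not the actual `ManagedCursorImpl` logic; the class, method, and parameter names are hypothetical, and the log text is the truncated one quoted above.

```java
import java.util.Optional;

// Hypothetical sketch for illustration only; names do not come from the Pulsar code base.
class CursorFallbackDecision {

    /** The fallback write to ZK happens only when both conditions hold. */
    public static boolean shouldPersistToMetadataStore(boolean cursorLedgerCreateFailed,
                                                       boolean allMessagesAcknowledged) {
        return cursorLedgerCreateFailed && allMessagesAcknowledged;
    }

    /** Returns the log line only when the fallback write will actually be attempted. */
    public static Optional<String> fallbackLogLine(boolean cursorLedgerCreateFailed,
                                                   boolean allMessagesAcknowledged) {
        if (!shouldPersistToMetadataStore(cursorLedgerCreateFailed, allMessagesAcknowledged)) {
            return Optional.empty();
        }
        return Optional.of("Error while using MetaStore, try to persist the position...");
    }

    public static void main(String[] args) {
        // Ledger creation failed but messages remain: no ZK write, so nothing is logged.
        System.out.println(fallbackLogLine(true, false).isPresent());
        // Ledger creation failed and everything is acknowledged: the write is attempted.
        System.out.println(fallbackLogLine(true, true).isPresent());
    }
}
```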

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: x

@poorbarcode poorbarcode added this to the 4.1.0 milestone Nov 28, 2024
@poorbarcode poorbarcode self-assigned this Nov 28, 2024
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Nov 28, 2024
@lhotari lhotari changed the title [improve] [log] Print zk path if failed write ZK due to data is too large to persist [improve][log] Print zk path if failed write ZK due to data is too large to persist Nov 28, 2024
@lhotari
Member

lhotari commented Nov 28, 2024

@poorbarcode One small detail about the PR titles. Please don't add a space between the 2 prefixes. Instead of [improve] [log], it should be [improve][log].

@lhotari lhotari changed the title [improve][log] Print zk path if failed write ZK due to data is too large to persist [improve][log] Print ZK path if write to ZK fails due to data being too large to persis Nov 28, 2024
@lhotari lhotari changed the title [improve][log] Print ZK path if write to ZK fails due to data being too large to persis [improve][log] Print ZK path if write to ZK fails due to data being too large to persist Nov 28, 2024
@poorbarcode poorbarcode requested a review from lhotari November 28, 2024 14:41
@poorbarcode
Contributor Author

/pulsarbot rerun-failure-checks

@codecov-commenter

Codecov Report

Attention: Patch coverage is 85.71429% with 1 line in your changes missing coverage. Please review.

Project coverage is 74.39%. Comparing base (bbc6224) to head (cafadbd).
Report is 812 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...g/apache/pulsar/metadata/impl/ZKMetadataStore.java | 75.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #23652      +/-   ##
============================================
+ Coverage     73.57%   74.39%   +0.82%     
- Complexity    32624    35044    +2420     
============================================
  Files          1877     1944      +67     
  Lines        139502   147336    +7834     
  Branches      15299    16258     +959     
============================================
+ Hits         102638   109617    +6979     
- Misses        28908    29259     +351     
- Partials       7956     8460     +504     

| Flag | Coverage Δ |
|---|---|
| inttests | 27.33% <42.85%> (+2.75%) ⬆️ |
| systests | 24.37% <0.00%> (+0.05%) ⬆️ |
| unittests | 73.79% <85.71%> (+0.94%) ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
|---|---|
| ...che/bookkeeper/mledger/impl/ManagedCursorImpl.java | 80.15% <100.00%> (+0.85%) ⬆️ |
| ...ache/pulsar/metadata/impl/batching/MetadataOp.java | 100.00% <ø> (ø) |
| ...g/apache/pulsar/metadata/impl/ZKMetadataStore.java | 85.53% <75.00%> (+0.28%) ⬆️ |

... and 668 files with indirect coverage changes


5 participants