[Bug]: Several different errors returned during chaos test (killing one dn pod continuously at 10-minute intervals) #20898

Open
1 task done
aressu1985 opened this issue Dec 24, 2024 · 1 comment
Labels
kind/bug Something isn't working severity/s0 Extreme impact: Cause the application to break down and seriously affect the use

@aressu1985
Contributor

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Branch Name

2.0-dev

Commit ID

29838b5

Other Environment Information

- Hardware parameters:
3*CN: 7C 28G
1*DN: 7C 28G
3*PROXY: 2C 5G
3*LOG: 1C 7G
- OS type:
- Others:

Actual Behavior

[test load]
Run tpcc 10-10.
Insert data into a table with 2 threads.
During the test, the chaos tool continuously killed one log pod at 10-minute intervals.

[issue]
During the chaos test, several different errors were returned to the client:

  1. ErrorMessage : context deadline exceeded internal error: commitUnsafe
  2. lock table bind changed
  3. write tcp4 10.10.65.137:49158->10.10.34.57:41010: i/o timeout
  4. ExpectedEOB
  5. no such table sbtest.sbtest2

For more information, see the mo-load tool log: mo-load.log

The first error, "context deadline exceeded internal error: commitUnsafe", seems reasonable, but the others are unclear.
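
To see how often each of the errors listed above occurs across kill cycles, the messages in the mo-load log could be tallied by category. This is a minimal sketch, assuming the log is plain text with one message per line (the real log format is not confirmed here). The category names and the `classify` and `tally` helpers are hypothetical and only for illustration; the matched substrings are copied from the errors in this report.

```python
import re
from collections import Counter

# Patterns built from the five error messages listed in this report.
# The category names (dict keys) are hypothetical labels, not MatrixOne terms.
ERROR_PATTERNS = {
    "commit_timeout": re.compile(
        r"context deadline exceeded internal error: commitUnsafe"
    ),
    "lock_table_bind_changed": re.compile(r"lock table bind changed"),
    "io_timeout": re.compile(r"write tcp4 \S+->\S+: i/o timeout"),
    "expected_eob": re.compile(r"ExpectedEOB"),
    "no_such_table": re.compile(r"no such table \S+"),
}


def classify(line):
    """Return the first matching category name, or None if nothing matches."""
    for name, pattern in ERROR_PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def tally(lines):
    """Count how many lines fall into each known error category."""
    counts = Counter()
    for line in lines:
        category = classify(line)
        if category is not None:
            counts[category] += 1
    return counts


if __name__ == "__main__":
    # Sample lines assembled from the messages in this report (assumed layout).
    sample = [
        "ErrorMessage : context deadline exceeded internal error: commitUnsafe",
        "lock table bind changed",
        "write tcp4 10.10.65.137:49158->10.10.34.57:41010: i/o timeout",
        "ExpectedEOB",
        "no such table sbtest.sbtest2",
    ]
    for name, count in sorted(tally(sample).items()):
        print(name, count)
```

Grouping the counts by time window around each pod kill might help separate errors that are expected during failover from those that are not.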

mo-log:
https://shanghai.idc.matrixorigin.cn:30001/explore?panes=%7B%22yZZ%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-chaos-29838b5-202412232305%5C%22%7D%20%7C%3D%20%60commitUnsafe%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221734969900000%22,%22to%22:%221734978659000%22%7D%7D%7D&schemaVersion=1&orgId=1

Expected Behavior

No response

Steps to Reproduce

[test load]
Run tpcc 10-10.
Insert data into a table with 2 threads.
During the test, the chaos tool continuously killed one log pod at 10-minute intervals.

Additional information

No response

aressu1985 added the kind/bug, needs-triage, and severity/s0 labels Dec 24, 2024
aressu1985 added this to the 2.0.2 milestone Dec 24, 2024
XuPeng-SH assigned volgariver6 and unassigned XuPeng-SH Dec 25, 2024
@XuPeng-SH
Contributor

@volgariver6
