
Hard LOCKUP issue #107

Description

@wufan618223

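I hit the following problem on a few machines. The stack contains diagnose_save_stack_trace frames, so I would like to ask the experts here to take a look and discuss it. So far I have not found a problem in the load_monitor code. I also have a crash file, but I cannot provide it here.
My own suspicion is that when the load is very high, load_monitor keeps interrupts disabled for too long.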
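To make the suspicion above easier to test, here is a rough back-of-envelope sketch rather than a measurement. It assumes the timer callback unwinds one kernel stack per inspected task with interrupts off, and that the hard lockup detector fires after roughly `kernel.watchdog_thresh` seconds without timer interrupts (10 s is the usual default, but please check the affected host). The function names and every number passed in are hypothetical placeholders.

```python
# Back-of-envelope model of "interrupts disabled for too long".
# All inputs are placeholders to be replaced with values measured on the host.

# Usual default of kernel.watchdog_thresh, in seconds (assumption: not changed
# on the affected machine; verify with `sysctl kernel.watchdog_thresh`).
WATCHDOG_THRESH_S = 10.0


def handler_time_s(num_tasks: int, unwind_cost_us: float) -> float:
    """Estimated time one timer callback spends unwinding stacks, in seconds."""
    return num_tasks * unwind_cost_us / 1_000_000


def exceeds_hard_lockup_window(num_tasks: int, unwind_cost_us: float,
                               thresh_s: float = WATCHDOG_THRESH_S) -> bool:
    """True if the estimated callback time reaches the watchdog window."""
    return handler_time_s(num_tasks, unwind_cost_us) >= thresh_s


if __name__ == "__main__":
    # Hypothetical example values, not taken from the crash.
    for tasks, cost_us in [(1_000, 50.0), (200_000, 50.0)]:
        t = handler_time_s(tasks, cost_us)
        print(tasks, cost_us, t, exceeds_hard_lockup_window(tasks, cost_us))
```

If realistic task counts and per-stack costs stay far below the window, plain load alone would not explain the lockup, and something like lock contention or a stuck unwind would be worth checking instead.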

[46223316.649832] Kernel panic - not syncing: Hard LOCKUP
[46223316.652645] CPU: 0 PID: 282859 Comm: dockerd-current Kdump: loaded Tainted: G W OE K------------ 3.10.0-862.mt20190308.130.el7.x86_64 #1
[46223316.658281] Hardware name: Sugon I620-G20/60G16-US, BIOS 006 09/15/2018
[46223316.661106] Call Trace:
[46223316.663898] [] dump_stack+0x19/0x1b
[46223316.666675] [] panic+0xe8/0x20d
[46223316.669507] [] nmi_panic+0x3f/0x40
[46223316.672227] [] watchdog_overflow_callback+0x121/0x140
[46223316.674965] [] __perf_event_overflow+0x51/0xf0
[46223316.677697] [] perf_event_overflow+0x14/0x20
[46223316.680307] [] intel_pmu_handle_irq+0x220/0x500
[46223316.682979] [] perf_event_nmi_handler+0x2b/0x50
[46223316.685534] [] nmi_handle.isra.0+0x87/0x140
[46223316.688038] [] do_nmi+0x15d/0x450
[46223316.690452] [] end_repeat_nmi+0x1e/0x81
[46223316.692890] [] ? unwind_next_frame.part.6+0x39/0xd0
[46223316.695291] [] ? unwind_next_frame.part.6+0x39/0xd0
[46223316.697624] [] ? unwind_next_frame.part.6+0x39/0xd0
[46223316.699902] [] __unwind_start+0xc8/0x180
[46223316.702161] [] __save_stack_trace+0x5e/0x100
[46223316.704366] [] save_stack_trace_tsk+0x2c/0x40
[46223316.706487] [] diagnose_save_stack_trace+0x9a/0x100 [diagnose]
[46223316.708576] [] diag_task_kern_stack+0xe/0x10 [diagnose]
[46223316.710708] [] diag_load_timer+0x4be/0x580 [diagnose]
[46223316.712782] [] ? diag_pupil_exit+0x30/0x30 [diagnose]
[46223316.714790] [] hrtimer_handler+0xdc/0x110 [diagnose]
[46223316.716780] [] __hrtimer_run_queues+0xf1/0x260
[46223316.718688] [] hrtimer_interrupt+0xaf/0x1d0
[46223316.720565] [] local_apic_timer_interrupt+0x35/0x60
[46223316.722430] [] smp_apic_timer_interrupt+0x3d/0x50
[46223316.724313] [] apic_timer_interrupt+0x162/0x170
[46223316.726153] [] ? down_read_trylock+0x1a/0x50
[46223316.727932] [] __do_page_fault+0x10f/0x4f0
[46223316.729653] [] do_page_fault+0x35/0x90
[46223316.731383] [] page_fault+0x28/0x30
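
For anyone comparing several such panics, the small sketch below pulls the frames that belong to one module out of a call trace like the one above. It is only an illustration: `parse_frames`, `module_frames`, and `SAMPLE` are names made up here, and the sample lines are copied from this trace. Frames printed with a leading `?` are ones the kernel prints as unverified, so the helper skips them.

```python
import re

# One call-trace frame: optional "? " marker, symbol+offset/size, optional [module].
FRAME_RE = re.compile(
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return (symbol, module_or_None, is_unreliable) for each frame line."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((m.group("symbol"),
                           m.group("module"),
                           m.group("unreliable") is not None))
    return frames


def module_frames(log_text, module):
    """Symbols of reliable frames that belong to the given module."""
    return [sym for sym, mod, unreliable in parse_frames(log_text)
            if mod == module and not unreliable]


# Lines copied from the trace in this issue.
SAMPLE = """\
[46223316.704366] [] save_stack_trace_tsk+0x2c/0x40
[46223316.706487] [] diagnose_save_stack_trace+0x9a/0x100 [diagnose]
[46223316.708576] [] diag_task_kern_stack+0xe/0x10 [diagnose]
[46223316.710708] [] diag_load_timer+0x4be/0x580 [diagnose]
[46223316.712782] [] ? diag_pupil_exit+0x30/0x30 [diagnose]
[46223316.714790] [] hrtimer_handler+0xdc/0x110 [diagnose]
[46223316.716780] [] __hrtimer_run_queues+0xf1/0x260
"""

if __name__ == "__main__":
    for symbol in module_frames(SAMPLE, "diagnose"):
        print(symbol)
```

On the sample this lists the path from `hrtimer_handler` through `diag_load_timer` to `diagnose_save_stack_trace`, which is the part of the trace that runs inside the timer interrupt.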
