
X18 and X24 disks frequently reset with SAS3008 HBAs under heavy write load #162

Open
putnam opened this issue Oct 16, 2024 · 15 comments

@putnam

putnam commented Oct 16, 2024

I have a bunch (11 each) of ST24000NM000C and ST16000NM001G drives that cause major issues with my SAS3008-based HBA (the onboard HBA on the Supermicro H12SSL-CT, but also a regular 9300-8i). Specifically, the HBA hits some failure mode under heavy write loads to these new X24s, and the driver triggers a whole-HBA reset. Heavy reads seem unaffected.

The default EPC settings differ between the X18s and the X24s: the X18s have Idle_A set to 1 and Idle_B set to 1200, while the X24 firmware only has Idle_A set to 1. The first time I saw this occur, I disabled EPC on the new X24s with --EPCfeature disable and thought it was resolved, but the next time I had a sustained write load it happened again.
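For anyone following along, the EPC state can be inspected and disabled per drive with openSeaChest_PowerControl; the `<handle>` below is a placeholder (e.g. /dev/sg4):

```shell
# Show the current EPC timer settings (Idle_A, Idle_B, etc.) for one drive
openSeaChest_PowerControl -d <handle> --showEPCSettings

# Disable the EPC feature entirely on that drive
openSeaChest_PowerControl -d <handle> --EPCfeature disable
```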

I didn't have this issue when it was purely the X18 disks on this adapter. It was only once the X24s were added to the mix that I saw this occur. It also does not occur with HGST/WD disks.

All X18 disks are on firmware SN02, except one RMA-refurbished ST16000NM000J on SN04.
All X24 disks are on SN02.
The SAS3008 HBA is on firmware 16.00.14.00. It is actively cooled, its temperature is monitored, and it is not overheating.
Disks are all attached via a Supermicro 846 SAS3 backplane/LSI expander on 66.16.11.00.
Kernel is 6.10.11-amd64, current Debian testing/trixie.
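The version details above can be collected with standard Linux tools (commands are illustrative; adjust the device node for your system):

```shell
# Kernel version
uname -r

# mpt3sas driver version
modinfo mpt3sas | grep -i '^version'

# HBA firmware version, as logged by the driver at probe time
dmesg | grep -i 'FWVersion'

# Per-drive firmware revision (requires smartmontools)
smartctl -i /dev/sda | grep -i firmware
```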

Here's dmesg during a heavy write load triggering the problem:

[Wed Oct 16 01:13:02 2024] mpt3sas_cm0 fault info from func: mpt3sas_base_make_ioc_ready
[Wed Oct 16 01:13:02 2024] mpt3sas_cm0: fault_state(0x5854)!
[Wed Oct 16 01:13:02 2024] mpt3sas_cm0: sending diag reset !!
[Wed Oct 16 01:13:03 2024] mpt3sas_cm0: diag reset: SUCCESS
[Wed Oct 16 01:13:03 2024] mpt3sas_cm0: In func: _ctl_do_mpt_command
[Wed Oct 16 01:13:03 2024] mpt3sas_cm0: Command terminated due to Host Reset
[Wed Oct 16 01:13:03 2024] mf:

[Wed Oct 16 01:13:03 2024] 0000000b
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000018
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000008
[Wed Oct 16 01:13:03 2024]

[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 0000000a
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 02000000
[Wed Oct 16 01:13:03 2024]

[Wed Oct 16 01:13:03 2024] 00000025
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000
[Wed Oct 16 01:13:03 2024] 00000000

[Wed Oct 16 01:13:03 2024] mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[Wed Oct 16 01:13:03 2024] mpt3sas_cm0: _base_display_fwpkg_version: complete
[Wed Oct 16 01:13:03 2024] mpt3sas_cm0: overriding NVDATA EEDPTagMode setting
[Wed Oct 16 01:13:03 2024] mpt3sas_cm0: LSISAS3008: FWVersion(16.00.14.00), ChipRevision(0x02)
[Wed Oct 16 01:13:03 2024] mpt3sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[Wed Oct 16 01:13:03 2024] mpt3sas_cm0: sending port enable !!
[Wed Oct 16 01:13:10 2024] mpt3sas_cm0: port enable: SUCCESS
[Wed Oct 16 01:13:10 2024] mpt3sas_cm0: search for end-devices: start
[Wed Oct 16 01:13:10 2024] scsi target0:0:0: handle(0x000a), sas_addr(0x5003048017ab9940)
[Wed Oct 16 01:13:10 2024] scsi target0:0:0: enclosure logical id(0x5003048017ab997f), slot(0)
[Wed Oct 16 01:13:10 2024] scsi target0:0:1: handle(0x000b), sas_addr(0x5003048017ab9941)
[Wed Oct 16 01:13:10 2024] scsi target0:0:1: enclosure logical id(0x5003048017ab997f), slot(1)
[Wed Oct 16 01:13:10 2024] scsi target0:0:2: handle(0x000c), sas_addr(0x5003048017ab9942)
[Wed Oct 16 01:13:10 2024] scsi target0:0:2: enclosure logical id(0x5003048017ab997f), slot(2)
[Wed Oct 16 01:13:10 2024] scsi target0:0:3: handle(0x000d), sas_addr(0x5003048017ab9943)
[Wed Oct 16 01:13:10 2024] scsi target0:0:3: enclosure logical id(0x5003048017ab997f), slot(3)
[Wed Oct 16 01:13:10 2024] scsi target0:0:4: handle(0x000e), sas_addr(0x5003048017ab9944)
[Wed Oct 16 01:13:10 2024] scsi target0:0:4: enclosure logical id(0x5003048017ab997f), slot(4)
[Wed Oct 16 01:13:10 2024] scsi target0:0:5: handle(0x000f), sas_addr(0x5003048017ab9945)
[Wed Oct 16 01:13:10 2024] scsi target0:0:5: enclosure logical id(0x5003048017ab997f), slot(5)
[Wed Oct 16 01:13:10 2024] scsi target0:0:6: handle(0x0010), sas_addr(0x5003048017ab9946)
[Wed Oct 16 01:13:10 2024] scsi target0:0:6: enclosure logical id(0x5003048017ab997f), slot(6)
[Wed Oct 16 01:13:10 2024] scsi target0:0:7: handle(0x0011), sas_addr(0x5003048017ab9947)
[Wed Oct 16 01:13:10 2024] scsi target0:0:7: enclosure logical id(0x5003048017ab997f), slot(7)
[Wed Oct 16 01:13:10 2024] scsi target0:0:8: handle(0x0012), sas_addr(0x5003048017ab9948)
[Wed Oct 16 01:13:10 2024] scsi target0:0:8: enclosure logical id(0x5003048017ab997f), slot(8)
[Wed Oct 16 01:13:10 2024] scsi target0:0:9: handle(0x0013), sas_addr(0x5003048017ab9949)
[Wed Oct 16 01:13:10 2024] scsi target0:0:9: enclosure logical id(0x5003048017ab997f), slot(9)
[Wed Oct 16 01:13:10 2024] scsi target0:0:10: handle(0x0014), sas_addr(0x5003048017ab994a)
[Wed Oct 16 01:13:10 2024] scsi target0:0:10: enclosure logical id(0x5003048017ab997f), slot(10)
[Wed Oct 16 01:13:10 2024] scsi target0:0:11: handle(0x0015), sas_addr(0x5003048017ab994b)
[Wed Oct 16 01:13:10 2024] scsi target0:0:11: enclosure logical id(0x5003048017ab997f), slot(11)
[Wed Oct 16 01:13:10 2024] scsi target0:0:12: handle(0x0016), sas_addr(0x5003048017ab995c)
[Wed Oct 16 01:13:10 2024] scsi target0:0:12: enclosure logical id(0x5003048017ab997f), slot(12)
[Wed Oct 16 01:13:10 2024] scsi target0:0:13: handle(0x0017), sas_addr(0x5003048017ab995d)
[Wed Oct 16 01:13:10 2024] scsi target0:0:13: enclosure logical id(0x5003048017ab997f), slot(13)
[Wed Oct 16 01:13:11 2024] scsi target0:0:14: handle(0x0018), sas_addr(0x5003048017ab995e)
[Wed Oct 16 01:13:11 2024] scsi target0:0:14: enclosure logical id(0x5003048017ab997f), slot(14)
[Wed Oct 16 01:13:11 2024] scsi target0:0:15: handle(0x0019), sas_addr(0x5003048017ab995f)
[Wed Oct 16 01:13:11 2024] scsi target0:0:15: enclosure logical id(0x5003048017ab997f), slot(15)
[Wed Oct 16 01:13:11 2024] scsi target0:0:16: handle(0x001a), sas_addr(0x5003048017ab9960)
[Wed Oct 16 01:13:11 2024] scsi target0:0:16: enclosure logical id(0x5003048017ab997f), slot(16)
[Wed Oct 16 01:13:11 2024] scsi target0:0:17: handle(0x001b), sas_addr(0x5003048017ab9961)
[Wed Oct 16 01:13:11 2024] scsi target0:0:17: enclosure logical id(0x5003048017ab997f), slot(17)
[Wed Oct 16 01:13:11 2024] scsi target0:0:18: handle(0x001c), sas_addr(0x5003048017ab9963)
[Wed Oct 16 01:13:11 2024] scsi target0:0:18: enclosure logical id(0x5003048017ab997f), slot(19)
[Wed Oct 16 01:13:11 2024] scsi target0:0:19: handle(0x001d), sas_addr(0x5003048017ab9964)
[Wed Oct 16 01:13:11 2024] scsi target0:0:19: enclosure logical id(0x5003048017ab997f), slot(20)
[Wed Oct 16 01:13:11 2024] scsi target0:0:20: handle(0x001e), sas_addr(0x5003048017ab9966)
[Wed Oct 16 01:13:11 2024] scsi target0:0:20: enclosure logical id(0x5003048017ab997f), slot(22)
[Wed Oct 16 01:13:11 2024] scsi target0:0:21: handle(0x001f), sas_addr(0x5003048017ab9967)
[Wed Oct 16 01:13:11 2024] scsi target0:0:21: enclosure logical id(0x5003048017ab997f), slot(23)
[Wed Oct 16 01:13:11 2024] scsi target0:0:22: handle(0x0020), sas_addr(0x5003048017ab997d)
[Wed Oct 16 01:13:11 2024] scsi target0:0:22: enclosure logical id(0x5003048017ab997f), slot(24)
[Wed Oct 16 01:13:11 2024] mpt3sas_cm0: search for end-devices: complete
[Wed Oct 16 01:13:11 2024] mpt3sas_cm0: search for end-devices: start
[Wed Oct 16 01:13:11 2024] mpt3sas_cm0: search for PCIe end-devices: complete
[Wed Oct 16 01:13:11 2024] mpt3sas_cm0: search for expanders: start
[Wed Oct 16 01:13:11 2024]      expander present: handle(0x0009), sas_addr(0x5003048017ab997f), port:255
[Wed Oct 16 01:13:11 2024] mpt3sas_cm0: search for expanders: complete
[Wed Oct 16 01:13:11 2024] mpt3sas_cm0: mpt3sas_base_hard_reset_handler: SUCCESS
[Wed Oct 16 01:13:11 2024] mpt3sas_cm0: _base_fault_reset_work: hard reset: success
[Wed Oct 16 01:13:11 2024] sd 0:0:0:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:4:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:9:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:1:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:11:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:3:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:17:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:6:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:7:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:8:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:10:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:12:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:13:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:14:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:15:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:16:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:18:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:19:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:20:0: Power-on or device reset occurred
[Wed Oct 16 01:13:11 2024] sd 0:0:2:0: device_block, handle(0x000c)
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0: removing unresponding devices: start
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0: removing unresponding devices: end-devices
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0: Removing unresponding devices: pcie end-devices
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0: removing unresponding devices: expanders
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0: removing unresponding devices: complete
[Wed Oct 16 01:13:12 2024] sd 0:0:2:0: device_unblock and setting to running, handle(0x000c)
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0: scan devices: start
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0:         scan devices: expanders start
[Wed Oct 16 01:13:12 2024] sd 0:0:5:0: attempting task abort!scmd(0x00000000bfccca11), outstanding for 2948 ms & timeout 1000 ms
[Wed Oct 16 01:13:12 2024] sd 0:0:5:0: [sde] tag#187 CDB: ATA command pass through(16) 85 08 0e 00 d5 00 01 00 e0 00 4f 00 c2 00 b0 00
[Wed Oct 16 01:13:12 2024] scsi target0:0:5: handle(0x000f), sas_address(0x5003048017ab9945), phy(5)
[Wed Oct 16 01:13:12 2024] scsi target0:0:5: enclosure logical id(0x5003048017ab997f), slot(5)
[Wed Oct 16 01:13:12 2024] scsi target0:0:5: enclosure level(0x0000), connector name(     )
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0:         break from expander scan: ioc_status(0x0022), loginfo(0x310f0400)
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0:         scan devices: expanders complete
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0:         scan devices: end devices start
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0:         break from end device scan: ioc_status(0x0022), loginfo(0x310f0400)
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0:         scan devices: end devices complete
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0:         scan devices: pcie end devices start
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0: log_info(0x3003011d): originator(IOP), code(0x03), sub_code(0x011d)
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0: log_info(0x3003011d): originator(IOP), code(0x03), sub_code(0x011d)
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0:         break from pcie end device scan: ioc_status(0x0021), loginfo(0x3003011d)
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0:         pcie devices: pcie end devices complete
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0: scan devices: complete
[Wed Oct 16 01:13:12 2024] mpt3sas_cm0: device is not present handle(0x000c), flags!!!
[Wed Oct 16 01:13:12 2024] sd 0:0:5:0: task abort: SUCCESS scmd(0x00000000bfccca11)
[Wed Oct 16 01:13:20 2024] sd 0:0:2:0: Power-on or device reset occurred
[Wed Oct 16 01:13:20 2024] sd 0:0:5:0: Power-on or device reset occurred
[Wed Oct 16 01:13:20 2024] sd 0:0:21:0: Power-on or device reset occurred

I contacted Seagate support and, uh, they told me to install some Windows-only software to monitor for firmware updates; they didn't know how to respond to anything technical at all. So I hope that, through you all, this info is useful.

@putnam putnam changed the title X24 disks frequently reset with SAS3008 HBAs under load X16 and X24 disks frequently reset with SAS3008 HBAs under load Oct 16, 2024
@putnam putnam changed the title X16 and X24 disks frequently reset with SAS3008 HBAs under load X16 and X24 disks frequently reset with SAS3008 HBAs under heavy write load Oct 16, 2024
@vonericsen
Contributor

Hi @putnam,

Sorry you are having issues with your system.
To make sure I am following your issue, this is what you have seen happening:

  • Installed new drives, started seeing resets during heavy loads
  • Disabled EPC and seemed to resolve it
  • Resets are back

Is this correct?

Per the standards, disabling EPC should persist across resets and power cycles.
Are you seeing the EPC feature re-enabled even though you have not sent the enable?

As for firmware updates, sometimes those can help (on both the HBA side and the drive side). On the Seagate support site there is a firmware update finder where you can enter a serial number to check for new firmware. You don't need the other, Windows-only tool (it basically scans and opens that webpage for you with the SN already loaded).
If you scroll to the bottom of this page, you can enter a drive's serial number to check for newer firmware.
I don't know if that would resolve the issue, but it is worth a try.

I am asking around to see if any of the customer support engineers have run into this as well, but I have not heard anything yet.

@putnam
Author

putnam commented Oct 17, 2024

Thanks so much for the response. I edited my original ticket a lot, so I think you're responding to the initial version. I realized, looking at bash history and the state of the disks, that:

  1. --EPCfeature disable did actually persist. You're right.
  2. Disabling EPC didn't resolve the issues with the X24 disks after all, because I hadn't truly loaded them with writes. Once they were under heavy write load again, the same behavior came back.

I'm sure this is now outside the scope of this repo, but you guys have been so useful in the past when reporting possible firmware bugs. Maybe it's useful to have shared it here anyway. I'm not an enterprise customer, just an end user, so it's hard to get a line to someone with inside engineering connections.

I can repro more consistently now by just copying a lot of data to the disks. I have found very little info on these particular 24TB models, since I understand they're technically binned/refurbed X24 HAMR disks. It may well be an issue with the LSI/Broadcom firmware or even mpt3sas, but again, it doesn't repro on my 60+ HGST/WD disks or the X16's on their own.
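For anyone trying to reproduce this, a sustained sequential write load can be generated with fio (an assumption on my part; any sufficiently heavy, sustained write workload should do). The target path, size, and runtime are placeholders to tune for your pool:

```shell
# Sustained sequential writes to a file on the affected pool.
# Note: --direct=1 may need to be dropped on ZFS datasets without O_DIRECT support.
fio --name=seqwrite \
    --filename=/pool/fio-test \
    --rw=write \
    --bs=1M \
    --size=100G \
    --ioengine=libaio \
    --iodepth=16 \
    --direct=1 \
    --runtime=1800 \
    --time_based
```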

Since we're almost certainly outside the scope of openSeaChest here, feel free to close, but if this is something you're open to pursuing with more debug data and info, I could share it here or privately over email.

Regarding firmware: there's no update available for these yet on the end-user portal.

@vonericsen
Contributor

@putnam,

I did pass this issue along to some people internally to see if they've seen similar problems before with these drives and hardware, but I have not heard anything yet.

If you dump the SATA phy event counters, are you seeing those increase at all?
openSeaChest_Info -d <handle> --showPhyEvents

If these are increasing (not just the reset counter, but the others as well), it can point towards a cabling issue.
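To make "are they increasing?" easy to check, a small helper (not part of openSeaChest; a sketch against the table format the tool prints, as seen in the dumps later in this thread) can parse two `--showPhyEvents` snapshots and report which counters grew between them:

```python
import re

def parse_phy_events(text):
    """Parse a --showPhyEvents counter table into {description: value}.

    Rows look like:
        10                   16 H2D FISes sent due to COMRESET
    i.e. an ID column, a value column, then a free-text description.
    """
    counters = {}
    for line in text.splitlines():
        m = re.match(r"\s*\d+\s+(\d+)\s+(\S.*\S)\s*$", line)
        if m:
            counters[m.group(2)] = int(m.group(1))
    return counters

def diff_counters(before, after):
    """Return only the counters that increased between two snapshots."""
    return {k: after[k] - before[k]
            for k in after if k in before and after[k] > before[k]}
```

Running it against a dump taken before and after a heavy write load shows exactly which error counters moved, separating genuine link errors from the reset counter churn.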

I'll see if there is anything else I can think of trying that might also help debug this.

@putnam
Author

putnam commented Oct 17, 2024

Thanks for the reply! OK, here are the PHY counters from openSeaChest_Info --showPhyEvents for the various Seagate disks hanging off this backplane/controller. Do you know whether these are a rolling window or lifetime counts?

Back in September, when I first got these disks, I replaced the internal SAS cables due to CRC errors during the initial ZFS resilvering. Changing firmware, cables, and workload is always a lot of variable juggling, and I don't want to get it wrong here, but after I changed the internal cables, the resilver continued without any issues or drops, and the CRC errors went away. I also have multiple brand-new 3M and Amphenol cables on the shelf and can swap them in to eliminate the cable variable one more time, if you like. It wouldn't be the first, second, or third time that cabling came up; in 10+ years of dealing with SAS2/SAS3, cables are an evergreen issue that everyone faces.

Anyway, the resets I see now occur specifically when ZFS is copying a large amount of data to the pool and lighting up the vdevs made of Seagate devices for a sustained period. Eventually, you see the same message about the HBA resetting, with the same fault code in mpt3sas. I did some digging in the mpt3sas driver hoping to find bitflags or something to identify the fault code, but it appears to be internal/proprietary to Broadcom/LSI.

24TB X24 Disks (Newer)
==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160234 User: root
==========================================================================================

 - ST24000NM000C-3WD103 - ZXA04RL6 - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   16 H2D FISes sent due to COMRESET
     1                    2 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    2 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS
    11                    2 CRC errors withing H2D FIS
    13                    0 Non-CRC errors within H2D FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160234 User: root
==========================================================================================

 - ST24000NM000C-3WD103 - ZXA09AWD - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   11 H2D FISes sent due to COMRESET
     1                    2 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    2 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS
    11                    2 CRC errors withing H2D FIS
    13                    0 Non-CRC errors within H2D FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160234 User: root
==========================================================================================

 - ST24000NM000C-3WD103 - ZXA09KJB - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   12 H2D FISes sent due to COMRESET
     1                    3 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    3 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS
    11                    3 CRC errors withing H2D FIS
    13                    0 Non-CRC errors within H2D FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160234 User: root
==========================================================================================

 - ST24000NM000C-3WD103 - ZXA09QQH - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   13 H2D FISes sent due to COMRESET
     1                    0 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    0 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS
    11                    0 CRC errors withing H2D FIS
    13                    0 Non-CRC errors within H2D FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160234 User: root
==========================================================================================

 - ST24000NM000C-3WD103 - ZXA09RSQ - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   16 H2D FISes sent due to COMRESET
     1                    3 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    3 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS
    11                    3 CRC errors withing H2D FIS
    13                    0 Non-CRC errors within H2D FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160234 User: root
==========================================================================================

 - ST24000NM000C-3WD103 - ZXA0BKFL - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   10 H2D FISes sent due to COMRESET
     1                    2 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    2 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS
    11                    2 CRC errors withing H2D FIS
    13                    0 Non-CRC errors within H2D FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160234 User: root
==========================================================================================

 - ST24000NM000C-3WD103 - ZXA0C241 - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   17 H2D FISes sent due to COMRESET
     1                    2 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    2 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS
    11                    2 CRC errors withing H2D FIS
    13                    0 Non-CRC errors within H2D FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160235 User: root
==========================================================================================

 - ST24000NM000C-3WD103 - ZXA0CWPX - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   13 H2D FISes sent due to COMRESET
     1                    4 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    4 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS
    11                    4 CRC errors withing H2D FIS
    13                    0 Non-CRC errors within H2D FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160235 User: root
==========================================================================================

 - ST24000NM000C-3WD103 - ZXA0D2EL - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   12 H2D FISes sent due to COMRESET
     1                    0 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    0 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS
    11                    0 CRC errors withing H2D FIS
    13                    0 Non-CRC errors within H2D FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160235 User: root
==========================================================================================

 - ST24000NM000C-3WD103 - ZXA0EWXY - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   13 H2D FISes sent due to COMRESET
     1                    5 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    5 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS
    11                    5 CRC errors withing H2D FIS
    13                    0 Non-CRC errors within H2D FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160235 User: root
==========================================================================================

 - ST24000NM000C-3WD103 - ZXA0GJGN - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   19 H2D FISes sent due to COMRESET
     1                    1 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    1 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    1 R_ERR response for H2D non-data FIS
    11                    2 CRC errors withing H2D FIS
    13                    0 Non-CRC errors within H2D FIS

16TB X18 Disks (Older, pre-existing without resets)
==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160554 User: root
==========================================================================================

 - ST16000NM000J-2TW103 - ZR5ECA55 - SN04 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                    3 H2D FISes sent due to COMRESET
     1                    1 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    1 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160554 User: root
==========================================================================================

 - ST16000NM001G-2KK103 - ZL20CAJ9 - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                    4 H2D FISes sent due to COMRESET
     1                    0 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    0 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160554 User: root
==========================================================================================

 - ST16000NM001G-2KK103 - ZL20D3TL - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                    7 H2D FISes sent due to COMRESET
     1                    1 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    1 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160554 User: root
==========================================================================================

 - ST16000NM001G-2KK103 - ZL20YT4M - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                    5 H2D FISes sent due to COMRESET
     1                    1 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    1 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160554 User: root
==========================================================================================

 - ST16000NM001G-2KK103 - ZL213QN0 - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                    4 H2D FISes sent due to COMRESET
     1                    0 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    0 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160555 User: root
==========================================================================================

 - ST16000NM001G-2KK103 - ZL21909L - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                    7 H2D FISes sent due to COMRESET
     1                    2 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    2 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160555 User: root
==========================================================================================

 - ST16000NM001G-2KK103 - ZL21AHY7 - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   12 H2D FISes sent due to COMRESET
     1                    1 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    1 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    1 R_ERR response for H2D non-data FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160555 User: root
==========================================================================================

 - ST16000NM001G-2KK103 - ZL21L84X - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                   16 H2D FISes sent due to COMRESET
     1                    2 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    2 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160555 User: root
==========================================================================================

 - ST16000NM001G-2KK103 - ZL21L97Y - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                    4 H2D FISes sent due to COMRESET
     1                    1 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    1 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160555 User: root
==========================================================================================

 - ST16000NM001G-2KK103 - ZL21LGDW - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                    4 H2D FISes sent due to COMRESET
     1                    1 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    1 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS

==========================================================================================
 openSeaChest_Info - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2024 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_Info Version: 2.7.0-8_0_1 X86_64
 Build Date: Sep 25 2024
 Today: 20241017T160555 User: root
==========================================================================================

 - ST16000NM001G-2KK103 - ZL21LYZC - SN02 - ATA

====SATA Phy Event Counters====
V = Vendor Unique event tracker
M = Counter maximum value reached
D2H = Device to Host
H2D = Host to Device
    ID                Value Description
    10                    6 H2D FISes sent due to COMRESET
     1                    1 Command failed with iCRC error
     3                    0 R_ERR response for D2H data FIS
     4                    1 R_ERR response for H2D data FIS
     6                    0 R_ERR response for D2H non-data FIS
     7                    0 R_ERR response for H2D non-data FIS

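For anyone comparing these dumps later, the per-drive iCRC counts can be tabulated straight from saved openSeaChest_Info output like the captures above (just a sketch; `flag_icrc` is a hypothetical helper name and the parsing assumes the exact layout shown):

```shell
# flag_icrc: read saved openSeaChest_Info phy-counter output (like the dumps
# above) and print each drive's serial number with its iCRC error count.
flag_icrc() {
  awk '
    /^ - /       { split($0, f, " - "); sn = f[3] }  # " - MODEL - SERIAL - FW - ATA"
    /iCRC error/ { print sn, $2 }                    # counter ID 1: value is column 2
  ' "$@"
}
```

Usage would be `flag_icrc saved_output.txt` (or pipe the output in), giving one `SERIAL COUNT` line per drive.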
@putnam putnam changed the title X16 and X24 disks frequently reset with SAS3008 HBAs under heavy write load X18 and X24 disks frequently reset with SAS3008 HBAs under heavy write load Oct 17, 2024
@vonericsen
Contributor

> Do you know whether this is a rolling window or lifetime?

For this page it continues counting until you reset the counters on the page. I don't remember if we put that in as an option in openSeaChest yet. I will have to review the code.

The reason I mentioned the CRC errors is due to some of my own past experience trying to troubleshoot some issues other customers have seen.

I have also had some long conversations with one of the Seagate engineers who works at the phy level, with the goal of figuring out a way to write a test for detecting a bad cable. It's not an easy task 😆 but we did come up with some ideas, including using these logs. I have not had time to implement it yet, but it will be an expanded version of the openSeaChest_GenericTests --bufferTest routine I already have. Sometimes that will detect an error, but it runs for far too short a time to be reliable right now.

One thing I learned from him was that the faster the interface is running (6Gb/s vs 3Gb/s), the sooner you notice signaling issues. The most common symptom is the CRC counters increasing. That is often due to a cabling problem. Not always, but in your case I suspect it is, since it's happening on multiple different drives, even drives that were not previously having an issue. It's possible that these new drives have slightly different phy behavior that managed to bring this out.
There are a couple of different issues that can happen on the bus that HBAs and drives both try to mitigate (such as signal reflections), but sometimes that only goes so far before it's no longer correctable. There are also limits on how many signal-level issues can be worked around; with these new drives, maybe some existing problem that used to be manageable no longer is (just guessing here).

Another thing that can happen (and I have experienced this myself) is similar issues appearing as the backplane connectors wear out from plugging and unplugging drives. Eventually all connectors fail, but as you approach the insertion-count limit you can start to see these kinds of issues too.

I don't know if any of these will solve the issue, but you can try these things:

  1. Unplug the drives and plug them back in (sometimes this reseats the connector better and may mitigate this issue)
  2. If you have backplanes and can replace them easily, maybe give it a try
  3. Replace the cables in the system.

openSeaChest_Configure also has an option to set the phy speed lower, which you can try, though it may limit your maximum sequential read/write speed on more modern drives.
DO NOT go below 3.0Gb/s though. I found out that some modern SAS/SATA controllers no longer support 1.5Gb/s and once the drive is set to that you will have to track down another HBA that does support that low speed to restore it to a higher speed. I found this in the HBA documentation, so you can also check that to see what it supports first.
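
If you do experiment with lowering the phy speed, a dry-run wrapper keeps the commands visible before anything is changed. This is only a sketch: confirm the `--phySpeed` option and its value mapping against your build's `openSeaChest_Configure --help` first (in the builds I have used, 2 selects 3.0 Gb/s):

```shell
# Print (do not execute) the commands that would cap each listed drive at
# 3.0 Gb/s. Drop the echo only after confirming the option via --help.
phy_3g_dry_run() {
  for dev in "$@"; do
    echo openSeaChest_Configure -d "$dev" --phySpeed 2
  done
}
```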

One last thing I want to mention: checking for updates to the HBA firmware may also help. I have seen firmware fixes resolve odd behavior before, including some past Broadcom HBAs resolving strange phy issues, but I don't know whether that is a factor in this specific case.

Let me know if this helps. I'll see if I can talk to that signal engineer I mentioned about this to see if he has any other ideas.

@putnam
Author

putnam commented Oct 17, 2024

Thanks. Will go over and try. Regarding the HBA, it's a pretty common SAS3008 HBA and on latest firmware (16.00.14.00). The backplane hasn't had a ton of insertion cycles, but reseating can't hurt. I will swap to a new-in-bag Amphenol cable set + reseat disks and see if I can repro again and report back.

@vonericsen
Contributor

@putnam,

Did swapping cables make a difference in your case?

Another idea is to see if the HBA's BIOS/UEFI settings allow disabling link power management. I am not sure whether your HBA supports that, but I once had an issue reported to me that rings a lot of very similar bells, and in that case disabling link power management in the BIOS/UEFI for the AHCI card stopped the resets.
I'm not sure if that will be the solution here, but something else you can check.

@putnam
Author

putnam commented Nov 22, 2024

No, unfortunately it has not, after cooking a while. I changed it out and left town -- still out of town at the moment until next week -- but I'm still seeing the same behavior under only a little bit of write load. And right now it only seems to affect the newer X24 disks.

Link power management (ASPM) is disabled I think. You can see the state of it in lspci -vv:

---snip---
41:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
        DeviceName: LSI 3008 SAS
        Subsystem: Super Micro Computer Inc AOC-S3008L-L8e
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 353
        IOMMU group: 26
        Region 0: I/O ports at 7000 [size=256]
        Region 1: Memory at b1840000 (64-bit, non-prefetchable) [size=64K]
        Region 3: Memory at b1800000 (64-bit, non-prefetchable) [size=256K]
        Expansion ROM at b1700000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] Express (v2) Endpoint, IntMsgNum 0
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0W TEE-IO-
                DevCtl: CorrErr- NonFatalErr- FatalErr+ UnsupReq-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 512 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ NonFatalErr+ FatalErr- UnsupReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s, Width x8
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range BC, TimeoutDis+ NROPrPrP- LTR-
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
                         AtomicOpsCtl: ReqEn-
                         IDOReq- IDOCompl- LTR- EmergencyPowerReductionReq-
                         10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
                LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
---snip---

Note under LnkCtl it says ASPM Disabled.

As I write this I'm taking a minute on vacation to prop up the array before it goes totally offline. This has happened before, but one X24 disk got knocked offline so hard it hasn't come back. It will need a physical unplug/replug or a full server power cycle to bring it back up.

As far as drive power, I reconfirmed their EPC settings across the board:

root@dwight:~/seagate/openseachest_exes# ./openSeaChest_PowerControl --showEPCSettings -d /dev/sdb
==========================================================================================
 openSeaChest_PowerControl - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2021 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_PowerControl Version: 3.0.2-2_2_3 X86_64
 Build Date: Jun 21 2021
 Today: Fri Nov 22 08:13:27 2024	User: root
==========================================================================================

/dev/sg1 - ST24000NM000C-3WD103 - XXXXXXXX - ATA
.

===EPC Settings===
	* = timer is enabled
	C column = Changeable
	S column = Savable
	All times are in 100 milliseconds

Name       Current Timer Default Timer Saved Timer   Recovery Time C S
Idle A      0            *1            *1            1             Y Y
Idle B      0             1200          1200         4             Y Y
Idle C      0             6000          6000         20            Y Y
Standby Z   0             9000          9000         110           Y Y

Kind of at a loss as to what to do with it right now besides swap in another vendor. There must be something going on between the firmware and the controller but I don't know where else to look.

@lwfitzgerald
Copy link

@putnam We're asking about "SATA Link power management" (putting the SATA Phy connection to sleep), rather than ASPM (putting the PCI-e link to sleep).

I think you can see if this is enabled by running:

openSeaChest_SMART -d /dev/sdb --SATInfo

and looking for whether SATA Device Initiated Power Management has [Enabled] at the end of the same line.

@putnam
Author

putnam commented Nov 23, 2024

Ah, sorry. Here is the output on a sample X24 disk. It doesn't have [Enabled] at the end:

==========================================================================================
 openSeaChest_SMART - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2021 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_SMART Version: 2.0.1-2_2_3 X86_64
 Build Date: Jun 21 2021
 Today: Sat Nov 23 06:50:48 2024	User: root
==========================================================================================

/dev/sg0 - ST24000NM000C-3WD103 - ZXXXXXXX - ATA
SCSI Translator Reported Information:
	Vendor ID: ATA     
	Model Number: ST24000NM000C-3W
	Serial Number: ZXXXXXXX
	Firmware Revision: SN02
	SAT Vendor ID: LSI     
	SAT Product ID: LSI SATL        
	SAT Product Rev: 0008
	World Wide Name: XXXXXXX
	Drive Capacity (TB/TiB): 24.00/21.83
	Temperature Data:
		Current Temperature (C): 34
		Highest Temperature (C): Not Reported
		Lowest Temperature (C): Not Reported
	Power On Time:  80 days 3 hours 
	Power On Hours: 1923.00
	MaxLBA: 46875541503
	Native MaxLBA: Not Reported
	Logical Sector Size (B): 512
	Physical Sector Size (B): 4096
	Sector Alignment: 0
	Rotation Rate (RPM): 7200
	Form Factor: 3.5"
	Last DST information:
		DST has never been run
	Long Drive Self Test Time:  18 hours 13 minutes 
	Interface speed:
		Not Reported
	Annualized Workload Rate (TB/yr): Not Reported
	Total Bytes Read (B): Not Reported
	Total Bytes Written (B): Not Reported
	Encryption Support: Not Supported
	Cache Size (MiB): Not Reported
	Read Look-Ahead: Enabled
	Write Cache: Enabled
	SMART Status: Good
	ATA Security Information: Supported
	Firmware Download Support: Full, Segmented
	Specifications Supported:
		SPC-4
		SAM-4
		SAT-3
		SPC-4
		SBC-3
		SAS
		ATA8-ACS
		ZBC
	Features Supported:
		SAT
		ATA Security
		Self Test
		Automatic Write Reassignment [Enabled]
		EPC [Enabled]
		Informational Exceptions [Mode 6]
	Adapter Information:
		Vendor ID: 1000h
		Product ID: 0097h
		Revision: 0002h
ATA Reported Information:
	Model Number: ST24000NM000C-3WD103
	Serial Number: ZXXXXXX
	Firmware Revision: SN02
	World Wide Name: XXXXXXXX
	Drive Capacity (TB/TiB): 24.00/21.83
	Native Drive Capacity (TB/TiB): 24.00/21.83
	Temperature Data:
		Current Temperature (C): 34
		Highest Temperature (C): 51
		Lowest Temperature (C): 29
	Power On Time:  80 days 3 hours 
	Power On Hours: 1923.00
	MaxLBA: 46875541503
	Native MaxLBA: 46875541503
	Logical Sector Size (B): 512
	Physical Sector Size (B): 4096
	Sector Alignment: 0
	Rotation Rate (RPM): 7200
	Form Factor: 3.5"
	Last DST information:
		DST has never been run
	Long Drive Self Test Time:  1 day 15 hours 1 minute 
	Interface speed:
		Max Speed (Gb/s): 6.0
		Negotiated Speed (Gb/s): 6.0
	Annualized Workload Rate (TB/yr): 309.50
	Total Bytes Read (TB): 56.03
	Total Bytes Written (TB): 11.91
	Encryption Support: Not Supported
	Cache Size (MiB): 512.00
	Read Look-Ahead: Enabled
	Write Cache: Enabled
	Low Current Spinup: Disabled
	SMART Status: Unknown or Not Supported
	ATA Security Information: Supported
	Firmware Download Support: Full, Segmented, Deferred
	Specifications Supported:
		ACS-5
		ACS-4
		ACS-3
		ACS-2
		ATA8-ACS
		ATA/ATAPI-7
		ATA/ATAPI-6
		ATA/ATAPI-5
		SATA 3.3
		SATA 3.2
		SATA 3.1
		SATA 3.0
		SATA 2.6
		SATA 2.5
		SATA II: Extensions
		SATA 1.0a
		ATA8-AST
	Features Supported:
		Sanitize
		SATA NCQ
		SATA Software Settings Preservation [Enabled]
		SATA Device Initiated Power Management
		Power Management
		Security
		SMART [Enabled]
		48bit Address
		PUIS
		GPL
		Streaming
		SMART Self-Test
		SMART Error Logging
		Write-Read-Verify
		DSN
		AMAC
		EPC
		Sense Data Reporting
		SCT Write Same
		SCT Error Recovery Control
		SCT Feature Control
		SCT Data Tables
		Host Logging
		Set Sector Configuration
		Storage Element Depopulation
		Seagate In Drive Diagnostics (IDD)
	Adapter Information:
		Vendor ID: 1000h
		Product ID: 0097h
		Revision: 0002h

@vonericsen
Contributor

@putnam,

Thanks for sharing that additional information.

There are 2 parts to power management of the phy on both SATA and SAS: Host initiated, and Device initiated (sometimes abbreviated HIPM and DIPM).

Please note I DO NOT recommend using openSeaChest_PowerControl to enable the device-initiated power management. If your system has not already enabled it on its own, enabling it may make the drive inaccessible. That option was added to the tool due to some customer request, but if you are not certain if your hardware supports it, I recommend leaving it as-is. The chipset or HBA should be enabling it themselves when it is supported and compatible. I have had a few people report issues around this internally because they enabled it on a system that was unable to wake the phy back up.

There are a few SATA capability bits that are not part of the humanized -i output today that might provide a few more clues.
Can you share the output of openSeaChest_PowerControl -d <handle> -i -v4 | tee verboseIdentify.txt?
This will have the raw data from the drive and I can review it manually to see if that gives some other details that may be useful.

@putnam
Author

putnam commented Nov 27, 2024

Thanks for the response @vonericsen -- I have actually gotten myself into that situation before and can confirm it's not a good idea :)

Here is the output of that command for an example disk that was at the top of the stack on the last set of resets.

verboseIdentify.txt

@putnam
Copy link
Author

putnam commented Dec 5, 2024

EDIT: Most of this is still accurate, but the power transitions reported by smartd (0x81->0xFF) are expected because they're on WD disks that have EPC enabled.

I'm still bashing on this. I've been trying to reduce things down to a reliable repro and I'm not quite there, but let me explain my test setup.

The server has a zpool made up of many vdevs from different vendors; one of the Seagate vdevs is made up of 11x 16TB Exos disks and another is made up of 11x 24TB Exos disks. Both of these sets live on the same backplane, which is attached to a SAS3008-based controller that's built-in on the motherboard, the Supermicro H12SSL-CT. The HBA is functionally the same as a 9300-8i and shares the same firmware image.

I am creating a continuous synthetic load by copying a 100GB random file from a scratch disk into a test dataset (and then deleting it). To be sure the disks are always busy I have two of these running in a loop.

There are a few monitoring processes on this server:

  1. smartd (part of smartmontools)
  2. netdata
  3. hddtemp
  4. storcli64 (part of LSI/Broadcom management tools for their HBAs -- used to check HBA temp)

I decided to start disabling these one-by-one to reduce things talking to the disks. First I disabled my storcli64 script because I've had issues with it in the past. But the resets continued at roughly the same clip. So next I tried disabling smartd. Right away when I disabled smartd the frequency of the resets went down dramatically. Before I disabled smartd, resets would occur fairly reliably under load (but not on a reliable schedule). But after disabling smartd it took over 12 hours of hard writes before it occurred again. When I restart smartd, the frequency increases again.

Here is my smartd config line, for reference:

DEVICESCAN -H -f -l error -l selftest -n standby,q -m [email protected] -M exec /usr/share/smartmontools/smartd-runner -M diminishing

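(As a middle ground between running and fully disabling smartd, the poll interval can be stretched; smartd's `--interval` option takes seconds and defaults to 1800. On Debian the packaged service reads its options from a defaults file, though the exact file and variable name are distribution-specific, so treat this as a sketch:)

```shell
# /etc/default/smartmontools -- poll every 4 hours instead of every 30 minutes
smartd_opts="--interval=14400"
```
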
Note I don't do regular short/long SMART tests with smartd; it's only tracking the health status and error logs. In fact I used to run those, but the automated tests would reliably cause disk resets with Seagate disks, and I never came up with a solution besides disabling the automated tests. In those cases it would affect individual disks, not the whole controller. I think when a SMART test is under way some commands may hang for longer than the kernel likes, which causes the kernel to reset the disk on the HBA (a default behavior of mpt3sas).

So then I tried running smartd in the foreground in debug mode to see if anything strange was happening. Although nothing stuck out immediately, I was surprised to see the occasional note that a drive's power status transitioned when queried. Looking back in journalctl I see these quite frequently since installing the X24 disks. Here are some examples:

Oct 29 15:38:38 dwight smartd[7299]: Device: /dev/sdao [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)
Oct 29 15:38:43 dwight smartd[7299]: Device: /dev/sdau [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)
Oct 29 15:38:53 dwight smartd[7299]: Device: /dev/sday [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)
Oct 29 16:08:43 dwight smartd[7299]: Device: /dev/sday [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)
Oct 29 18:08:43 dwight smartd[7299]: Device: /dev/sdax [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)
Oct 29 18:08:48 dwight smartd[7299]: Device: /dev/sday [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)

Reading the smartd source code, this line prints the old and new power states reported by the disk when smartd sends its query. I didn't know what the 0x81 state was, but I looked it up in the ATA spec (page 344, table 204) and it says that's EPC Idle_A. Now that's weird, because EPC is disabled on these disks. I can confirm it with SeaChest across all of them. Example:

==========================================================================================
 openSeaChest_PowerControl - openSeaChest drive utilities - NVMe Enabled
 Copyright (c) 2014-2021 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 openSeaChest_PowerControl Version: 3.0.2-2_2_3 X86_64
 Build Date: Jun 21 2021
 Today: Thu Dec  5 04:42:11 2024        User: root
==========================================================================================

/dev/sg18 - ST24000NM000C-3WD103 - ZXA0BKFL - ATA
.

===EPC Settings===
        * = timer is enabled
        C column = Changeable
        S column = Savable
        All times are in 100 milliseconds

Name       Current Timer Default Timer Saved Timer   Recovery Time C S
Idle A      0            *1            *1            1             Y Y
Idle B      0             1200          1200         4             Y Y
Idle C      0             6000          6000         20            Y Y
Standby Z   0             9000          9000         110           Y Y

So, why do these sometimes end up in Idle A? I'm not sure. I don't know if this is directly related, either, but I don't have any other logs that show the power state transitions except for smartd, which happens to show them when it does a check on all the disks (which, by default, is every 20 minutes).
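
For reference, the full set of CHECK POWER STATUS codes smartd can print maps out roughly like this, per my reading of the same ACS table cited above (treat the exact labels as an approximation):

```shell
# Decode the ATA CHECK POWER STATUS byte that smartd logs in lines like
# "(0x81 -> 0xff)". 0x81-0x83 are the EPC idle sub-states.
decode_power_state() {
  case "$1" in
    0x00) echo "Standby_Z" ;;
    0x01) echo "Standby_Y" ;;
    0x80) echo "Idle" ;;
    0x81) echo "Idle_A" ;;
    0x82) echo "Idle_B" ;;
    0x83) echo "Idle_C" ;;
    0xff|0xFF) echo "Active or Idle (PM0)" ;;
    *) echo "Unknown ($1)" ;;
  esac
}
```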

Is it possible something in the firmware in the X24 disks is causing power transitions even when EPC is disabled/the timers are all set to 0?

And, why would the querying of SMART data so greatly increase the frequency of these HBA resets, and only on Seagate disks? Again if I disable smartd, the frequency drops dramatically. My theory is that other processes try querying SMART (netdata, hddtemp, maybe others) but they do so less frequently.

I will keep digging and hopefully this is useful.


putnam commented Dec 5, 2024

(I'd edit my post above but I think most of you guys are reading via email and you might not see it)

Apologies, late-night jetlag brain here: almost all of the reported power transitions were actually on WD disks. There was a single Seagate X16 showing a transition, and it did in fact have a timer set. I'm not sure how that happened, and I've disabled it again now, but the resets continue regardless.

So from the above all I can say is disabling smartd, which queries the disks roughly every 20 minutes, greatly reduces the frequency of resets. I think if I could whack-a-mole any process that checks the SMART data I could probably eliminate them entirely. But I don't really get why.

@vonericsen
Contributor

Hi @putnam,

This is really interesting information!

And, why would the querying of SMART data so greatly increase the frequency of these HBA resets, and only on Seagate disks? Again if I disable smartd, the frequency drops dramatically. My theory is that other processes try querying SMART (netdata, hddtemp, maybe others) but they do so less frequently.

Do you know what data is being pulled each time smartd runs? Is it equivalent to the smartctl options -a or -x?

One thing I have observed in the past about resets is that in every operating system, software talking to a drive must provide a timeout value: how long it expects a command to take before it should be considered a failure. openSeaChest usually uses 15 seconds for most commands.
When a command takes longer than this, the OS returns a command-timeout error within about a second of that timeout value. It then also has to perform some amount of error recovery so the drive is not hung for the next process that accesses it, which ends up being a reset (COMRESET on SATA).
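To make that concrete, here is a hedged sketch of how a Linux SG_IO request carries that timeout: the `timeout` field is in milliseconds, and when it expires the kernel fails the command and starts error recovery. The struct layout mirrors `<scsi/sg.h>`; this only builds the header for illustration, it does not issue the ioctl:

```python
import ctypes

SG_DXFER_NONE = -1  # no data transfer (CHECK POWER MODE is non-data)

class SgIoHdr(ctypes.Structure):
    # Field layout mirrors struct sg_io_hdr in <scsi/sg.h>.
    _fields_ = [
        ("interface_id", ctypes.c_int),
        ("dxfer_direction", ctypes.c_int),
        ("cmd_len", ctypes.c_ubyte),
        ("mx_sb_len", ctypes.c_ubyte),
        ("iovec_count", ctypes.c_ushort),
        ("dxfer_len", ctypes.c_uint),
        ("dxferp", ctypes.c_void_p),
        ("cmdp", ctypes.c_void_p),
        ("sbp", ctypes.c_void_p),
        ("timeout", ctypes.c_uint),     # milliseconds
        ("flags", ctypes.c_uint),
        ("pack_id", ctypes.c_int),
        ("usr_ptr", ctypes.c_void_p),
        ("status", ctypes.c_ubyte),
        ("masked_status", ctypes.c_ubyte),
        ("msg_status", ctypes.c_ubyte),
        ("sb_len_wr", ctypes.c_ubyte),
        ("host_status", ctypes.c_ushort),
        ("driver_status", ctypes.c_ushort),
        ("resid", ctypes.c_int),
        ("duration", ctypes.c_uint),
        ("info", ctypes.c_uint),
    ]

# ATA PASS-THROUGH (12) wrapping CHECK POWER MODE (0xE5), non-data protocol,
# ck_cond=1 so the return registers (the power state) come back in sense data.
cdb = (ctypes.c_ubyte * 12)(0xA1, 3 << 1, 0x20, 0, 0, 0, 0, 0, 0, 0xE5, 0, 0)

hdr = SgIoHdr()
hdr.interface_id = ord("S")
hdr.dxfer_direction = SG_DXFER_NONE
hdr.cmd_len = len(cdb)
hdr.cmdp = ctypes.cast(cdb, ctypes.c_void_p)
hdr.timeout = 15_000  # 15 s, same order as openSeaChest's default
# A real tool would now do: fcntl.ioctl(fd, 0x2285, hdr)  # SG_IO
```

If a drive sits in a slow state transition longer than that field allows, the reset behavior described above is exactly what you'd expect to see.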

With that in mind I am thinking of a couple things that could be happening leading to this happening:

  1. The drive has spun down and in the process of spinning back up it's taking longer than the command timeout value used. (Unlikely since you have essentially disabled EPC)
  2. smartd is using a command timeout value that is too short (not sure how likely, have not looked at the code but I would be surprised if it's less than the 15 seconds we use in openSeaChest)
  3. The drive is responding to smartd/smartctl but some piece of data that it wants is missing for some reason (log not supported, or the first few commands read from flash but one reading from disk is taking longer than expected).

openSeaChest does not have an equivalent to smartctl's -a or -x options to do a lot of things all at once, but you can add many options together to get somewhat close:
openSeaChest_SMART -d <handle> --smartAttributes hybrid --showDSTLog --showSMARTErrorLog comprehensive --smartCheck --smartInfo --deviceStatistics

This is close, but not exactly the same. However, I would be curious if running this triggers anything similar to what you are seeing with smartd.
As I mentioned above in point 3, some data is stored in flash and some is stored on the disc so maybe something about this is causing the issue, but I am not certain.

One other difference I know about in openSeaChest is that it is coded to prefer the GPL logs over the SMART logs for DST info and SMART error log info. If I remember correctly, that is not how smartctl works (but maybe this has changed over time). Maybe if it's still querying the SMART logs rather than the GPL logs, that is another part of what is triggering this. I do not have an option to force it down the SMART log path, but I can look into it to see if it helps with debugging.

If you get a chance, can you share the output of openSeaChest_Logs -d <handle> --listSupportedLogs? This may help me understand if that is also a source of difference in how smartctl is running based on what logs the drive is supporting.

Note I don't do regular short/long SMART tests with smartd; it's only tracking the health status and error logs. In fact, I used to actually do these, but the automated tests would reliably cause disk resets with Seagate disks and I never did come up with a solution besides disabling automated tests.

I know of a bug in the Windows version of smartctl running DST in captive/foreground mode where the timeout value is too short, which always ends up in a reset from the system. I do not remember if this also affected Linux; I found it months/a year ago. In background/offline mode it should not be an issue, since those commands return to the host immediately after starting and the drive can continue processing other commands while the test runs.
Our default in openSeaChest is to run in background/offline mode since it tends to be the most compatible, but you can force it with --captive.
openSeaChest_SMART -d <handle> --shortDST --captive

We are setting the timeout for short DST in captive to 2 minutes as the spec requires and to the drive's time estimate for long (I do not recommend running long in captive mode...that will probably never complete without a reset since it can take so many hours).
You can also try this to see if it triggers the same kind of reset scenario you are seeing.
