Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[vioscsi] Extend VioScsiReadRegistryParameter() #1216

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

benyamin-codez
Copy link
Contributor

Adds capability to do per-HBA registry reads.

This permits, e.g. PhysicalBreaks_001, PhysicalBreaks_002, etc.
This is an alternative to \Parameters\Device(d) by instead using the format \Parameters\Device\Valuename_123.

Includes FIXME NOTE regarding StorPortGetSystemPortNumber().
StorPortGetSystemPortNumber() should return ...
...before HBA-specific registry reads can successfully use \Parameters\Device(d).

Coincidentally updated the following in VioScsiFindAdapter():
adaptExt->dump_mode (new tracing)
adaptExt->hba_id (new tracing and new definition)

The HBA ID that determines the "123" in Valuename_123 of ...
...the registry path \Parameters\Device\Valuename_123 is the HBA slot number minus 1.

@benyamin-codez
Copy link
Contributor Author

If anyone knows how to get StorPortGetSystemPortNumber() working, that would be helpful...

As we're using a SAS BusType, we might need to set the adaptExt->wwn, port_wwn and port_idx before calling it...
Or some other technique...
See the FIXME NOTE in the commit for more hints...

@benyamin-codez
Copy link
Contributor Author

@vrozenfe

I was minded to think how this PR might be of benefit to the Jira issue you previously mentioned.

Firstly, the issue mentions enabling SG_IO. You can probably speak best as to whether or not this would still work, as vioscsi is largely a StorPort miniport driver now. It's also possible the fault lies in how the IO was produced.

Anyway, as for this PR it would mean such disks could co-exist with disks of a variety of backing characteristics because each HBA could have it's own NOPB and MaximumTransferLength.

The solution that I imagined, would require a PR for new NOPB calculations, like the one I mentioned here.
This will expose adaptExt->scsi_config.seg_max, which is populated with data reported by QEMU.
FYI, I intend on raising a PR for this soon.

The final part of the solution would be a PR for QEMU to set scsiconf->seg_max to the correct value based on backing device characteristics. These are noteworthy mentions as to how this currently works:
https://github.com/qemu/qemu/blob/ea9cdbcf3a0b8d5497cddf87990f1b39d8f3bb0a/hw/scsi/virtio-scsi.c#L893-L894
https://github.com/qemu/qemu/blob/ea9cdbcf3a0b8d5497cddf87990f1b39d8f3bb0a/hw/scsi/virtio-scsi.c#L1292-L1295

There's probably other ways to do it too - perhaps even solely on the driver side via VPD - but I believe this would require a HBA restart with new parameters rather than single init during FindAdapter().

Best regards,
Ben

Adds capability to do per HBA registry reads.
This permits, e.g. PhysicalBreaks_001, PhysicalBreaks_002, etc. as
an alternative to \Parameters\Device(d) by instead using the
format \Parameters\Device\Valuename_123.
Includes FIXME NOTE regarding StorPortGetSystemPortNumber().
StorPortGetSystemPortNumber() should return before HBA-specific
registry reads can successfully use \Parameters\Device(d).

The HBA ID that determines the "123" in Valuename_123 of the registry
path \Parameters\Device\Valuename_123 is the HBA slot number minus 1.

Credit to @MartinDrab for eagle-eyed code review. 8^D

Signed-off-by: benyamin-codez <[email protected]>
@benyamin-codez benyamin-codez force-pushed the vioscsi-read-reg-per-hba branch from 09b0439 to 07693aa Compare February 18, 2025 12:24
@benyamin-codez
Copy link
Contributor Author

I have reverted adaptExt->hba_id to equal HBA_ID so that it can be used with existing WMI code. If the adapter ever uses STOR_FEATURE_SET_ADAPTER_INTERFACE_TYPE and BusTypeScsi this will need to be used for the adaptor's SCSI ID.

I also split coincidental traces in VioScsiFindAdapter() to another PR.

We now just set the identifier to ConfigInfo->SlotNumber - 1, which is the same as the viostor back-port in PR #1297. This means that if the builtin \Parameters\Device(d) per-HBA function ever works the identifier should match the one used for the alternative \Parameters\Device\Valuename_123 technique...

This one should be right to go... 8^d

@benyamin-codez
Copy link
Contributor Author

Green ✔️ for the HCK-CI/vioscsi-Win2025x64_host check...!! 🚀

Anymore checks for this PR @kostyanf14 ..?

@benyamin-codez
Copy link
Contributor Author

Just a Flush Test failure.

The mentions in PR #1297 apply also to this PR...

Not having a solution for the issue with the built-in \Parameters\Device(d) per-HBA mechanism is not ideal.
I might raise a new issue (bug) for that so as to not get hung up on it here.
Once resolved, we should consider employing registry setting migration via an installer CustomAction.

Did anyone have any comments about the in-line FIXME notes or related comments...?
...as to content and any prudent trimming or the like?

@benyamin-codez
Copy link
Contributor Author

Green ✔️ for the HCK-CI/vioscsi-Win11_24H2x64_host_uefi_viommu check...!! 🚀

@vrozenfe
Copy link
Collaborator

vrozenfe commented Mar 2, 2025

@benyamin-codez
Thank you for your patch. Just to understand it a bit better, can you please a practical example when a per-adapter parameter except for PhysicalBreaks is needed. Historically this parameter was added before adding a proper support for adaptExt->scsi_config.seg_max which IMHO is a proper way to limit the number of SG elements for DMA.

Best,
Vadim.

@benyamin-codez
Copy link
Contributor Author

benyamin-codez commented Mar 2, 2025

@vrozenfe

Just to understand it a bit better, can you please a practical example when a per-adapter parameter except for PhysicalBreaks is needed.

I'm unsure if this will provide sufficient insight or justification, but these are the registry values defined in my WIP:

#define REGISTRY_BUSTYPE                     "BusType"
#define REGISTRY_MAX_PH_BREAKS               "PhysicalBreaks"
#define REGISTRY_MAX_PH_SEGMENTS             "MaxPhysicalSegments"
#define REGISTRY_ACTION_ON_RESET             "VioscsiActionOnReset"
#define REGISTRY_RESP_TIME_LIMIT             "TraceResponseTime"
#define REGISTRY_ALLOC_FOR_CPU_HOTPLUG       "AllocForCpuHotplug"
#define REGISTRY_ONLY_DMA32BITADDRESSES      "OnlyDma32BitAddresses"
#define REGISTRY_INITIATOR_BUS_ID            "InitiatorBusId"

I'm unsure how useful the first and last are to the project. They would be per-HBA. Maybe the same for the reset values? AllocForCpuHotplug is more likely to be global. I guess the idea here is that per-HBA becomes possible.

Historically this parameter was added before adding a proper support for adaptExt->scsi_config.seg_max which IMHO is a proper way to limit the number of SG elements for DMA.

I agree that the best way to initially set this is via adaptExt->scsi_config.seg_max or even adaptExt->scsi_config.max_sectors provided the data is valid. However, QEMU still sets these to 254 and 0xFFFF respectively.

Leaving .max_sectors aside, .seg_max at best is simply MAX_CPU - 2, historically set as a BIOS limit, but unenforced in QEMU in favour of VIRTQUEUE_MAX_SIZE = 1024 ( - 2) - which is 4X the current value.

Workloads involved in the processing and transfer of large files will obviously benefit from larger values of MaxXferLen, which is directly proportional to the number of segments, and this mechanism is really the only way to set this.

The reality is some workload and backing combinations perform better with a higher number of segments and others perform better when the queue characteristics are matched to those of the backing. Per RHEL-56722, sometimes they MUST be the same value. Some work better with a multiple of the backing's max_segments queue value; some don't care if it's a multiple. Some max_segments values are not Base2-aligned and sometimes this matters, and other times not.

One argument for per-HBA tuning would be for the scenarios where we are mixing workloads, e.g. where a transactional RDBMS might be in use, such as in Exchange and SQL Server, where we might have DATA disks with 64KiB clusters and LOG disks with 4KiB clusters, and where a specific number of segments has the best performance for each disk.

The fix really needs to come via QEMU. We need it to report the backing's max_segments queue value for breaking cases such as the one in RHEL-56722. So this then would be safe if not ideal. It would then be up to sys engineers or admins during implementation / commissioning to tune this if necessary or desired.

There are no doubt many things that QEMU devs would need to consider when coming up with a fix. One consideration needs to be: when in non-pass-through mode with multiple backings with different max_segments queue values. I would suggest the HBA would need to be set to the lowest value.

Regardless of any fix to QEMU, having the capability to tune each and every vioscsi disk in a system for its intended workload and backing is surely a feature worth having. There will always be storage teams out there who set the configuration specifications regardless of workload, and sometimes misconfigured backings will remain in use until data can be migrated.

Does that make sense?

Best regards,
Ben

* Storport provided slot_number minus one, padded to hundreds,
* e.g. \Parameters\Device\Valuename_123.
*
* This permits up to 999 HBAs. That ought to be enough... c( O.O )ɔ
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vrozenfe @YanVugenfirer
c( O.O )ɔ <-- This is me looking at you guys in astonishment...
... after saying catering for 999 (incorrectly - it should be 1,000 btw) HBAs in a system should be enough...!
I decided to use a generous value to demonstrate the ASCII conversion below.
If you want to provide a more realistic number it could simplify that part.
You will notice the division is on Base10 boundaries (units, tens, hundreds).
Therefore, the maximums would be 10, 100 and 1,000 HBAs.
I suspect we only need units and tens, catering for up to 100 HBAs, but let me know what you think.

*
* Difficult to determine what is wrong here...
* ¯\_(ツ)_/¯
*
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My present thoughts on this issue with Parameters\Device(d) not initialising:
Being a BusTypeSas adapter, I suspect we may not be setting adaptExt->port_idx early enough.
I'll try to test this in the next few days...
...but if anyone has any insights, please do let me know ASAP before I burn more time on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants