[CI-NO-BUILD] [vioscsi] Implement cold and hot tracing paths #1228
Note somewhat cheeky inclusion to raise the kvm-guest-drivers-windows/vioscsi/vioscsi.vcxproj Lines 141 to 148 in 9b47ab2
EDIT: This was removed.
Addendum to b6edb81, b7904fc and acaf26d. 1. Split NTDDI_ definitions to new file ntddi_ver.h with minor refactor. 2. Minor refactor for RegistryPath reporting to appear outside of conditional. 3. Removed references to obsoleted RUN_MIN_CHECKED definition (PR virtio-win#1228). Signed-off-by: benyamin-codez <[email protected]>
Addendum to 100af3e. 1. Removed references to obsoleted RUN_MIN_CHECKED definition (PR virtio-win#1228). 2. Add comment to conditionally excluded variable. Signed-off-by: benyamin-codez <[email protected]>
Addendum to c4ac94b. 1. Removed references to obsoleted RUN_MIN_CHECKED definition (PR virtio-win#1228). Signed-off-by: benyamin-codez <[email protected]>
EDIT: This was removed. See PR #1309. Still valid: Does this need to be done by a different means anyway, e.g. via
I've left PR #1214 in draft until this PR merges as that one may have hot path entries that I will need to refactor to make use of the
Let me know what might be needed to progress this one. Best regards,
1. Show the RegistryPath. 2. Show the CrashDump Mode. 3. Show StorPortInitialize() return value (including LONG). 4. Show NTDDI_VERSION. 5. Split NTDDI_ definitions to new file ntddi_ver.h and added missing NTDDI definitions. 6. Removed references to obsoleted RUN_MIN_CHECKED definition (PR virtio-win#1228). Signed-off-by: benyamin-codez <[email protected]>
516c3fa to b55e6b1
b55e6b1 to 664dde7
@YanVugenfirer @vrozenfe @kostyanf14
If you're ok with retaining this and there are no other issues, it looks like this one's ready to go... EDIT: This was removed.
664dde7 to a0cb1cf
@YanVugenfirer @vrozenfe @kostyanf14 I have split the cheeky NTDDI version raise to PR #1309 and rebased. This one is probably the plug in the pipe... Any concerns I need to address...?
The Some observations:
Under here is the table's scaffold if you want to reply with corrections.
Any further prudent
I'm guessing we should probably do a
So it looks like
Were there any more checks you wanted to run?
_Outdated content_
I have the following I want to force push:
--- a/vioscsi/vioscsi.c
+++ b/vioscsi/vioscsi.c
@@ -1551,6 +1551,9 @@ VioScsiBuildIo(IN PVOID DeviceExtension, IN PSCSI_REQUEST_BLOCK Srb)
{
SRB_SET_SRB_STATUS(Srb, SRB_STATUS_NO_DEVICE);
StorPortNotification(RequestComplete, DeviceExtension, Srb);
+#if !defined(RUN_UNCHECKED) || !defined(RUN_COLD_PATH_ONLY)
+ EXIT_FN_SRB_HP();
+#endif
return FALSE;
}
@@ -1922,7 +1925,6 @@ PreProcessRequest(IN PVOID DeviceExtension, IN PSRB_TYPE Srb, IN PVOID InlineFun
#if !defined(RUN_UNCHECKED)
RhelDbgPrint(TRACE_LEVEL_INFORMATION, " Completing all pending SRBs\n");
#endif
- // CompletePendingRequestsOnReset(DeviceExtension, DpcLock);
CompletePendingRequestsOnReset(DeviceExtension);
SRB_SET_SRB_STATUS(Srb, SRB_STATUS_SUCCESS);
#if !defined(RUN_UNCHECKED)
Is that ok...? Shall I push that first, before any more checks?
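The first hunk above adds an exit trace on the early-return path of VioScsiBuildIo(). As a rough illustration of why that matters, here is a minimal user-mode sketch, not driver code. The helpers trace_enter() and trace_exit(), the counters, and build_io_sketch() are all hypothetical stand-ins; only the guard expression is copied from the hunk.

```c
#include <stdio.h>

/* Hypothetical counters standing in for real trace output. */
static int enter_count;
static int exit_count;

static void trace_enter(const char *fn)
{
    enter_count++;
    printf("enter %s\n", fn);
}

static void trace_exit(const char *fn)
{
    exit_count++;
    printf("exit %s\n", fn);
}

/* Returns 0 when the device is missing (early return), 1 otherwise. */
static int build_io_sketch(int device_present)
{
#if !defined(RUN_UNCHECKED) || !defined(RUN_COLD_PATH_ONLY)
    trace_enter(__func__);
#endif
    if (!device_present)
    {
        /* Without this exit trace the entry above would be left unmatched. */
#if !defined(RUN_UNCHECKED) || !defined(RUN_COLD_PATH_ONLY)
        trace_exit(__func__);
#endif
        return 0;
    }
#if !defined(RUN_UNCHECKED) || !defined(RUN_COLD_PATH_ONLY)
    trace_exit(__func__);
#endif
    return 1;
}
```

Calling build_io_sketch() once with 0 and once with 1 should leave enter_count and exit_count equal, which is the property the added EXIT_FN_SRB_HP() restores for the early return.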
@kostyanf14 @YanVugenfirer @vrozenfe
I've also noticed I should relocate some of the
_Outdated content_
So I could do these and the new commit I mentioned above in a new PR if preferred. If someone could please let me know which way to go, that would be helpful...? 8^d
fwiw, here are those changes.
_Outdated content_
--- a/vioscsi/helper.c
+++ b/vioscsi/helper.c
@@ -237,7 +237,6 @@ DeviceReset(IN PVOID DeviceExtension)
ULONG fragLen;
ULONG sgElement;
- ENTER_FN();
if (adaptExt->dump_mode)
{
#if !defined(RUN_UNCHECKED)
--- a/vioscsi/vioscsi.c
+++ b/vioscsi/vioscsi.c
@@ -1524,6 +1524,10 @@ VioScsiUnitControl(IN PVOID DeviceExtension, IN SCSI_UNIT_CONTROL_TYPE ControlTy
BOOLEAN
VioScsiBuildIo(IN PVOID DeviceExtension, IN PSCSI_REQUEST_BLOCK Srb)
{
+#if !defined(RUN_UNCHECKED) || !defined(RUN_COLD_PATH_ONLY)
+ ENTER_FN_SRB_HP();
+#endif
+
PCDB cdb;
ULONG i;
ULONG fragLen;
@@ -1536,10 +1540,6 @@ VioScsiBuildIo(IN PVOID DeviceExtension, IN PSCSI_REQUEST_BLOCK Srb)
UCHAR TargetId;
UCHAR Lun;
-#if !defined(RUN_UNCHECKED) || !defined(RUN_COLD_PATH_ONLY)
- ENTER_FN_SRB_HP();
-#endif
-
cdb = SRB_CDB(Srb);
srbExt = SRB_EXTENSION(Srb);
adaptExt = (PADAPTER_EXTENSION)DeviceExtension;
@@ -1677,6 +1677,10 @@ VOID FORCEINLINE DispatchQueue(IN PVOID DeviceExtension, IN ULONG MessageId, IN
VOID ProcessBuffer(IN PVOID DeviceExtension, IN ULONG MessageId, IN STOR_SPINLOCK LockMode)
{
+#if !defined(RUN_UNCHECKED) || !defined(RUN_COLD_PATH_ONLY)
+ ENTER_FN_HP();
+#endif
+
PVirtIOSCSICmd cmd;
unsigned int len;
PADAPTER_EXTENSION adaptExt = (PADAPTER_EXTENSION)DeviceExtension;
@@ -1689,10 +1693,6 @@ VOID ProcessBuffer(IN PVOID DeviceExtension, IN ULONG MessageId, IN STOR_SPINLOC
ULONG vq_req_idx;
PVOID LockContext = NULL; // sanity check for LockMode = InterruptLock or StartIoLock
-#if !defined(RUN_UNCHECKED) || !defined(RUN_COLD_PATH_ONLY)
- ENTER_FN_HP();
-#endif
-
if (QueueNumber >= (adaptExt->num_queues + VIRTIO_SCSI_REQUEST_QUEUE_0))
{
#if !defined(RUN_UNCHECKED)
@@ -1755,10 +1755,10 @@ VOID ProcessBuffer(IN PVOID DeviceExtension, IN ULONG MessageId, IN STOR_SPINLOC
VOID VioScsiCompleteDpcRoutine(IN PSTOR_DPC Dpc, IN PVOID Context, IN PVOID SystemArgument1, IN PVOID SystemArgument2)
{
- ULONG MessageId;
#if !defined(RUN_UNCHECKED) || !defined(RUN_COLD_PATH_ONLY)
ENTER_FN_HP();
#endif
+ ULONG MessageId;
MessageId = PtrToUlong(SystemArgument1);
ProcessBuffer(Context, MessageId, DpcLock);
#if !defined(RUN_UNCHECKED) || !defined(RUN_COLD_PATH_ONLY)
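The hunks above move ENTER_FN_HP() and ENTER_FN_SRB_HP() ahead of the local declarations. The sketch below is a user-mode approximation of a cold-path macro pair and its "_HP" clone, assuming the clones differ only in which path they report to. trace_emit(), the trace_path enum, the counters, and both routines are hypothetical; the real macros in trace.h are not shown in this conversation. Note that a statement placed before declarations relies on C99 mixed declarations and code.

```c
#include <stdio.h>

/* Hypothetical path tags and per-path counters for the sketch. */
enum trace_path { PATH_COLD = 0, PATH_HOT = 1, PATH_COUNT };
static unsigned trace_count[PATH_COUNT];

static void trace_emit(enum trace_path path, const char *what, const char *fn)
{
    trace_count[path]++;
    printf("%s %s%s\n", what, fn, path == PATH_HOT ? " [HP]" : "");
}

/* Cold-path macros and their hot-path clones with the _HP suffix. */
#define ENTER_FN()    trace_emit(PATH_COLD, "enter", __func__)
#define EXIT_FN()     trace_emit(PATH_COLD, "exit", __func__)
#define ENTER_FN_HP() trace_emit(PATH_HOT, "enter", __func__)
#define EXIT_FN_HP()  trace_emit(PATH_HOT, "exit", __func__)

static void cold_routine(void)
{
    ENTER_FN();
    EXIT_FN();
}

static unsigned hot_routine(unsigned message_id)
{
    ENTER_FN_HP();
    /* Declared after the entry trace, as in the relocated hunks (C99). */
    unsigned next_id = message_id + 1; /* arbitrary work for illustration */
    EXIT_FN_HP();
    return next_id;
}
```

One call to each routine should bump each path's counter by two, which makes it easy to confirm that cold and hot messages are accounted for separately.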
Plus new
I'm also having another round at in-lined tracing.
Just to be clear, the problem with ETW / WPP and inline functions is that the built-in variable So I started with building an enum containing all inline function names, both calling and called functions, so that we could use an integer-based index of names when calling inline functions rather than a string literal via PVOID. Spent a bit of time using WPP Then it was on to parameterised
a0cb1cf to b549bdb
I've seen a radical improvement in performance with my new solution for inline functions. It would appear using With
As such, my comments here are perhaps not correct:
...all of which - to a certain degree - diminishes the case for
Of course, there is still a performance overhead when monitoring or capturing a trace, with say
As such, one argument for separate hot and cold code paths might be for when monitoring a trace on a system that must maintain performance without the additional user-mode CPU consumption by ETW. In such (perhaps production) environments, the better approach might be to use a default cold-path-only collection of messages, but then issue all desired/critical hot-path messages in the cold path under
The addition of
I'm leaning towards keeping things as posted in the last push, but I would appreciate some discussion and sharing of alternate views and considerations, to inform me as to where to go from here. In the meantime, I'll drop this to draft.
b549bdb to fe25a69
I built a version without If you are interested in perusing the relevant WIPs: With
The instrumentation in my WIP is already fairly extensive, but if we were to add more (will we though?), I think the latency will grow a bit. Therefore, my thoughts are that it's probably best to keep the hot and cold path variant. The cold path variant is usually also marginally quicker. Arguments for keeping
Please do let me know your views per the request in my last post so that we can progress this.
1. Introduced three (3) compile-time environment variables: RUN_WPP_ALL_PATHS - Compile WPP for the hot path too. Is the inverse of RUN_COLD_PATH_ONLY, which is the default behaviour. FORCE_RUN_UNCHECKED - Run without EVENT_TRACING or DBG. Defines RUN_UNCHECKED. FORCE_RUN_DEBUG - Run with DBG rather than EVENT_TRACING. Undefines EVENT_TRACING. 2. WPP scans for WPP config on a per file basis. Sections of the config file cannot be excluded and conditional macros will be ignored. To solve this, we establish two new files with WPP configuration data. One holds exclusively cold path data (wpp_cold_path.h) and the other holds data for all paths (wpp_all_paths.h). Which file to use is set conditionally per the vioscsi.vcxproj project file configuration. 3. The hot path DBG and WPP macro variants are a clone of the cold path macros, but with a "_HP" suffix. The exception to this is RhelDbgPrint() which we extend as RhelDbgPrintHotPath(). 4. Encapsulated existing cold and hot trace messages in compile-time conditional macros and, in the same manner, dropped *.tmh includes when necessary. 5. Minor refactoring of: a) helper.c - GetScsiConfig() b) helper.c - SetGuestFeatures() 6. Major refactoring of: a) trace.h b) vioscsi.c - VioScsiHwInitialize() - corrects order of init for benefit of clean trace. 7. Major new instrumentation of: a) helper.c - InitVirtIODevice() 8. Reconfigured the following for INLINE function tracing: a) vioscsi.c - HandleResponse() b) vioscsi.c - PreProcessRequest() c) vioscsi.c - PostProcessRequest() d) vioscsi.c - DispatchQueue() 9. Mnemonic rename of VioScsiPassiveInitializeRoutine() to VioScsiPassiveInitializeDpcRoutine(). 10. Correct struct virtio_bar for bMemorySpace rather than bPortSpace (vioscsi.h and virtio_pci.c). 11. Add missing structs for PCI Capabilities (vioscsi.h). 12. Fix various clang-format issues. 13. Also added RhelDebugPrintInclude() and RhelDebugPrintIncludeHotPath(). 14. Inline functions use an enum INL_FUNC_IDX to encode a ULONG index to each function that either calls or is an inline function. We then use inline_func_string_map[] to map / decode the ULONG index to a string literal (the name of the function). This can then be used to properly label the calling and inline function names within messages. 15. Renamed adaptExt->dpc_ok to adaptExt->dpc_ready. Hard to split this one up further methinks. I do have a fair bit of additional instrumentation to merge after this one. Signed-off-by: benyamin-codez <[email protected]>
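Item 14 of the commit message describes an index-to-name scheme for inline functions. The following is a minimal user-mode sketch of that idea under stated assumptions: INL_FUNC_IDX and inline_func_string_map[] are the names given in the commit message, while the enumerator names, inl_func_name(), handle_response_sketch(), and the caller/callee pairing are hypothetical and chosen only for illustration. The actual definitions in the PR may differ.

```c
#include <stddef.h>
#include <stdio.h>

/* One index per function that either calls or is an inline function.
 * Enumerator names here are made up for the sketch. */
typedef enum _INL_FUNC_IDX
{
    INL_IDX_ProcessBuffer = 0, /* treated as a caller in this sketch */
    INL_IDX_DispatchQueue,
    INL_IDX_HandleResponse,
    INL_IDX_PreProcessRequest,
    INL_IDX_PostProcessRequest,
    INL_IDX_COUNT
} INL_FUNC_IDX;

/* Decode table: index to function name as a string literal. */
static const char *const inline_func_string_map[INL_IDX_COUNT] = {
    [INL_IDX_ProcessBuffer] = "ProcessBuffer",
    [INL_IDX_DispatchQueue] = "DispatchQueue",
    [INL_IDX_HandleResponse] = "HandleResponse",
    [INL_IDX_PreProcessRequest] = "PreProcessRequest",
    [INL_IDX_PostProcessRequest] = "PostProcessRequest",
};

/* Bounds-checked lookup so a bad index cannot read past the table. */
static const char *inl_func_name(unsigned long idx)
{
    return idx < (unsigned long)INL_IDX_COUNT ? inline_func_string_map[idx] : "(unknown)";
}

/* The inline helper is handed its caller's index and labels both names itself. */
static inline int handle_response_sketch(unsigned long caller_idx, char *buf, size_t len)
{
    return snprintf(buf,
                    len,
                    "%s (called from %s)",
                    inl_func_name(INL_IDX_HandleResponse),
                    inl_func_name(caller_idx));
}
```

Passing an integer index keeps the call site cheap, and the string lookup happens only where a message is actually formatted.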
fe25a69 to 30d3831
1. Introduced three (3) compile-time environment variables:
   a) RUN_WPP_ALL_PATHS - Compile WPP for the hot path too. Is the inverse of RUN_COLD_PATH_ONLY, which is the default behaviour.
   b) FORCE_RUN_UNCHECKED - Run without EVENT_TRACING or DBG. Defines RUN_UNCHECKED.
   c) FORCE_RUN_DEBUG - Run with DBG rather than EVENT_TRACING. Undefines EVENT_TRACING.
2. WPP scans for WPP config on a per file basis. Sections of the config file cannot be excluded and conditional macros will be ignored. To solve this, we establish two new files with WPP configuration data. One holds exclusively cold path data (wpp_cold_path.h) and the other holds data for all paths (wpp_all_paths.h). Which file to use is set conditionally per the vioscsi.vcxproj project file configuration.
3. The hot path DBG and WPP macro variants are a clone of the cold path macros, but with a "_HP" suffix. The exception to this is RhelDbgPrint() which we extend as RhelDbgPrintHotPath().
4. Encapsulated existing cold and hot trace messages in compile-time conditional macros and, in the same manner, dropped *.tmh includes when necessary. The general format for the cold path is #if !defined(RUN_UNCHECKED) and for the hot path #if !defined(RUN_UNCHECKED) || !defined(RUN_COLD_PATH_ONLY).
5. Minor refactoring of:
   a) helper.c - GetScsiConfig()
   b) helper.c - SetGuestFeatures()
6. Major refactoring of:
   a) trace.h
   b) vioscsi.c - VioScsiHwInitialize() - corrects order of init for benefit of clean trace.
7. Major new instrumentation of:
   a) helper.c - InitVirtIODevice()
8. Reconfigured the following for INLINE function tracing:
   a) vioscsi.c - HandleResponse()
   b) vioscsi.c - PreProcessRequest()
   c) vioscsi.c - PostProcessRequest()
   d) vioscsi.c - DispatchQueue()
9. Mnemonic rename of VioScsiPassiveInitializeRoutine() to VioScsiPassiveDpcInitializeRoutine().
10. Correct struct virtio_bar for bMemorySpace rather than bPortSpace (vioscsi.h and virtio_pci.c).
11. Add missing structs for PCI Capabilities (vioscsi.h).
12. Included stddef.h for the offsetof macro.
13. Also added RhelDebugPrintInclude() and RhelDebugPrintIncludeHotPath().
14. Inline functions use an enum INL_FUNC_IDX to encode a ULONG index to each function that either calls or is an inline function. We then use inline_func_string_map[] to map / decode the ULONG index to a string literal (the name of the function). This can then be used to properly label the calling and inline function names within messages.
15. Renamed adaptExt->dpc_ok to adaptExt->dpc_ready.

Hard to split this one up further methinks.
I do have a fair bit of additional instrumentation to merge after this one.
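For readers following the compile-time switches, here is a hedged sketch of how the three variables might derive RUN_UNCHECKED and RUN_COLD_PATH_ONLY, followed by the two guard forms quoted in the description. The derivation is an assumption based only on the wording above; the real logic in trace.h and vioscsi.vcxproj is not shown here. COLD_GUARD_ACTIVE, HOT_GUARD_ACTIVE, cold_guard() and hot_guard() are hypothetical names added so the result can be inspected.

```c
/* Assumed derivation, following the description's wording only. */
#if defined(FORCE_RUN_UNCHECKED)
#undef EVENT_TRACING
#undef DBG
#define RUN_UNCHECKED 1
#endif

#if defined(FORCE_RUN_DEBUG)
#undef EVENT_TRACING
#endif

/* RUN_COLD_PATH_ONLY is described as the default, inverse of RUN_WPP_ALL_PATHS. */
#if !defined(RUN_WPP_ALL_PATHS)
#define RUN_COLD_PATH_ONLY 1
#endif

/* Cold-path guard form, captured as a constant for inspection. */
#if !defined(RUN_UNCHECKED)
#define COLD_GUARD_ACTIVE 1
#else
#define COLD_GUARD_ACTIVE 0
#endif

/* Hot-path guard form, copied verbatim from the description. */
#if !defined(RUN_UNCHECKED) || !defined(RUN_COLD_PATH_ONLY)
#define HOT_GUARD_ACTIVE 1
#else
#define HOT_GUARD_ACTIVE 0
#endif

/* Runtime model of the same boolean expressions (1 means "defined"). */
static int cold_guard(int run_unchecked)
{
    return !run_unchecked;
}

static int hot_guard(int run_unchecked, int run_cold_path_only)
{
    return !run_unchecked || !run_cold_path_only;
}
```

Read literally, the hot-path expression is false only when both RUN_UNCHECKED and RUN_COLD_PATH_ONLY are defined, so in this sketch a default build with only RUN_COLD_PATH_ONLY defined still has the hot-path guard active. Whether the "_HP" macros themselves expand to nothing in that configuration is not visible in this conversation.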