
Add Compare Runs metric section #2798

Conversation

Gkrumbach07 (Member) commented May 9, 2024

closes: https://issues.redhat.com/browse/RHOAIENG-4772
closes: https://issues.redhat.com/browse/RHOAIENG-4771

Description

Adds a compare runs section for the following metrics:

  • scalar metric table
  • confusion matrix
  • ROC curve
  • markdown tab (added but kept disabled until fetching from S3 is done)

Scalar metric table

  • should be able to see all scalar metrics for each execution.
  • will react to selected runs
  • horizontal scroll
  • ability to hide same metrics

Confusion matrix

  • horizontal scroll
  • has an expand button for the graph, which opens the graph by itself.
    • I moved away from the mocks on how the dropdown works in this state because the dropdown is a checkbox. @yannnz, you can clarify what is intended here

ROC curve

  • curve has legend that matches table under it
  • table can be sorted and filtered
    Screenshot 2024-05-09 at 5 27 14 PM
    Screenshot 2024-05-09 at 5 27 04 PM
    Screenshot 2024-05-09 at 5 26 52 PM
    Screenshot 2024-05-09 at 5 26 47 PM
    Screenshot 2024-05-09 at 5 26 42 PM
    Screenshot 2024-05-09 at 5 26 28 PM

How Has This Been Tested?

Test Impact

Tests are complicated because they involve graphs and MLMD. I am working on a solution now, but I want to get this PR up so it unblocks.

Request review criteria:

Self checklist (all need to be checked):

  • The developer has manually tested the changes and verified that the changes work
  • Commits have been squashed into descriptive, self-contained units of work (e.g. 'WIP' and 'Implements feedback' style messages have been removed)
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has added tests or explained why testing cannot be added (unit or cypress tests for related changes)

If you have UI changes:

  • Included any necessary screenshots or gifs if it was a UI change.
  • Included tags to the UX team if it was a UI/UX change (find relevant UX in the SMEs section).

After the PR is posted & before it merges:

  • The developer has tested their solution on a cluster by using the image produced by the PR to main

cc @yannnz

@Gkrumbach07 Gkrumbach07 force-pushed the story/RHOAIENG-2976-spike-determine-artifact-rendering-details branch from 4cae1ff to 9a969fb Compare May 10, 2024 12:24
jpuzz0 (Contributor) left a comment

Using some existing runs that probably don't have matrix data to display, I'm seeing an infinite loading spinner:
image

@Gkrumbach07 Gkrumbach07 force-pushed the story/RHOAIENG-2976-spike-determine-artifact-rendering-details branch 2 times, most recently from 2fd73b2 to e25ca4e Compare May 14, 2024 15:32
jpuzz0 (Contributor) commented May 14, 2024

Are you sure the intention of Expand link was to just span the whole width of the drawer? Based on this Figma conversation, it seems like the intention for all Expands might be to open them in a modal, although I'm questioning that in this conversation and waiting for Yan's response.

e.g. https://www.figma.com/proto/etbN0ccCcZR1qKCpJXQNAg/Pipeline-phase-2?page-id=615%3A34927&type=design&node-id=1682-101538&viewport=1131%2C-244%2C0.1&t=I8pPcuoDXIu1XQas-1&scaling=scale-down-width&starting-point-node-id=1585%3A95719

jpuzz0 (Contributor) commented May 14, 2024

Why would anything be hidden if only 1 run is shown here?
image

image

Also, why is only 1 run shown, when 3 are selected in this comparison example?
image

jpuzz0 (Contributor) commented May 14, 2024

When selecting the already checked/selected item for the Confusion matrix, this is what I see:
image

Seems like maybe the click handler should do nothing when attempting to click an item that is already selected to prevent this.

jpuzz0 (Contributor) commented May 14, 2024

Should we not show empty state when there are no parameters with differences?
image

jpuzz0 (Contributor) commented May 14, 2024

Wondering why we would use "Series" instead of the run name for the ROC curve legend:
image

If we do keep Series, it seems like there should be a tooltip for each series color in the table showing which series corresponds to the color:
image

jpuzz0 (Contributor) commented May 14, 2024

Should we still show ROC curve data if the search finds none?
image

Gkrumbach07 (Member, Author)

Are you sure the intention of Expand link was to just span the whole width of the drawer? Based on this Figma conversation, it seems like the intention for all Expands might be to open them in a modal, although I'm questioning that in this conversation and waiting for Yan's response.

The expand UX is different on the compare runs page vs. the run details page.

Gkrumbach07 (Member, Author)

Should we still show ROC curve data if the search finds none?

This is what is done in KF, and I did not have a reference for what Yan wanted, so I fell back on what upstream did.

Gkrumbach07 (Member, Author)

Wondering why we would use "Series" instead of the run name for the ROC curve legend:

There could be multiple per run, so it would have to be run name > execution name > artifact name. This can get long, so I assume that's why it just uses the series name. This can be iterated on later if it is confusing.

Gkrumbach07 (Member, Author)

Should we not show empty state when there are no parameters with differences?

We could; this is what upstream had, and there was no mock for what to do in our case.

Gkrumbach07 (Member, Author)

Why would anything be hidden if only 1 run is shown here?

I can remove the switch if there is only one run

Gkrumbach07 (Member, Author)

Also, why is only 1 run shown, when 3 are selected in this comparison example?

It only shows the runs with scalar metrics.

Gkrumbach07 (Member, Author)

When selecting the already checked/selected item for the Confusion matrix, this is what I see:

I think that would be confusing to the user if they could not deselect it.

The blank skeleton was the closest PF-like thing I could use to simulate what KF used.

I think a lot of these concerns need UX to decide what is best; maybe we can wait until the demo and @yannnz can give her opinion.

@Gkrumbach07 Gkrumbach07 force-pushed the story/RHOAIENG-2976-spike-determine-artifact-rendering-details branch 2 times, most recently from 1967ca6 to 5c20c13 Compare May 14, 2024 21:12
jpuzz0 (Contributor) commented May 14, 2024

When selecting the already checked/selected item for the Confusion matrix, this is what I see:

I think that would be confusing to the user if they could not deselect it.

The blank skeleton was the closest PF like thing i could use to simulate what KF used.

I think a lot of the concerns need UX to decide what is best, maybe we can wait until demo and @yannnz can give her opinion

The skeleton is used for loading states, but this doesn't load anything, so I think if we are meant to deselect the item, the content should instead show an empty state?


codecov bot commented May 14, 2024

Codecov Report

Attention: Patch coverage is 29.93197% with 309 lines in your changes are missing coverage. Please review.

❗ No coverage uploaded for pull request base (main@424378e). Click here to learn what that means.

❗ Current head 5c20c13 differs from pull request most recent head c1291d3. Consider uploading reports for the commit c1291d3 to get more accurate results

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #2798   +/-   ##
=======================================
  Coverage        ?   77.35%           
=======================================
  Files           ?     1093           
  Lines           ?    23021           
  Branches        ?     5794           
=======================================
  Hits            ?    17808           
  Misses          ?     5213           
  Partials        ?        0           
Files Coverage Δ
...tend/src/concepts/pipelines/apiHooks/mlmd/types.ts 100.00% <ø> (ø)
...epts/pipelines/apiHooks/mlmd/useGetArtifactById.ts 11.11% <ø> (ø)
...pts/pipelines/apiHooks/mlmd/useGetArtifactsList.ts 6.25% <ø> (ø)
...elines/content/compareRuns/metricsSection/const.ts 100.00% <100.00%> (ø)
...ents/artifacts/ArtifactDetails/ArtifactDetails.tsx 83.33% <ø> (ø)
...ifacts/ArtifactDetails/ArtifactOverviewDetails.tsx 87.50% <100.00%> (ø)
...nes/global/experiments/artifacts/ArtifactsList.tsx 75.00% <ø> (ø)
...es/global/experiments/artifacts/ArtifactsTable.tsx 40.35% <ø> (ø)
...xperiments/compareRuns/CompareRunParamsSection.tsx 97.67% <ø> (ø)
...global/experiments/compareRuns/CompareRunsPage.tsx 100.00% <ø> (ø)
... and 21 more

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 424378e...c1291d3. Read the comment docs.

yannnz commented May 15, 2024

  • For the ROC curve, there is an interaction that was missed: when the user hovers over a series in the legend, the corresponding curve should be highlighted with a thicker stroke.
Screenshot 2024-05-15 at 21 32 51
  • The Experiment column of the run table should be clickable if we keep parity with the KFP UI.
Screenshot 2024-05-15 at 21 36 20
  • Not quite sure why the Markdown tab is disabled. If it is because there is no markdown, I think we need an empty state.
Screenshot 2024-05-15 at 21 38 22
  • On my screen, the space between the Metrics section and the Parameters section could be larger to keep spacing consistent between sections. The space between the run table under the ROC curve and the ROC curve graph could be smaller, because they relate to each other.
Screenshot 2024-05-15 at 21 35 53 Screenshot 2024-05-15 at 21 35 39

yannnz commented May 15, 2024

Should we still show ROC curve data if the search finds none?

I wonder how KFP does it? I still cannot figure out how their filters work. As I understand it, the filter narrows the listed artifact names: if several names match, the UI displays only those, right? If that is the logic, then a filter that ends with no results means there is nothing to show on the graph.

yannnz commented May 15, 2024

Wondering why we would use "Series" instead of the run name for the ROC curve legend:

If we do keep Series, it seems like there should be a tooltip for each series color in the table showing which series corresponds to the color:

I still don't get what the Series are, to be honest. Are they runs? We just followed the KFP UI....

yannnz commented May 15, 2024

Should we not show empty state when there are no parameters with differences?

Yes, I don't think we need to show an empty state for this scenario, because there are still parameters. It might confuse the user. WDYT?

yannnz commented May 15, 2024

When selecting the already checked/selected item for the Confusion matrix, this is what I see:

Seems like maybe the click handler should do nothing when attempting to click an item that is already selected to prevent this.

Thanks for raising this. Yes, this is not a good experience: you already selected something, but it starts reloading just because you clicked it again.

yannnz commented May 15, 2024

Why would anything be hidden if only 1 run is shown here?

Also, why is only 1 run shown, when 3 are selected in this comparison example?

It is weird. Is it because of the KFP API behavior?

jpuzz0 (Contributor) commented May 15, 2024

Wondering why we would use "Series" instead of the run name for the ROC curve legend:
If we do keep Series, it seems like there should be a tooltip for each series color in the table showing which series corresponds to the color:

I still don't get what the Series are, to be honest. Are they runs? We just followed the KFP UI....

@yannnz I'm not sure either. I understood it as them being the runs, since that's what is unique for each table row in this case, but I could be wrong.

jpuzz0 (Contributor) commented May 15, 2024

Why would anything be hidden if only 1 run is shown here?
Also, why is only 1 run shown, when 3 are selected in this comparison example?

It is weird. Is it because of the KFP API behavior?

@yannnz I don't think so. At least in how the Parameters compare table works: if 1 run has data, the other 2 still show, to compare against empty values. This should be achievable here as well, but I'm not sure what was expected.

jpuzz0 (Contributor) commented May 15, 2024

Should we not show empty state when there are no parameters with differences?

Yes, I don't think we need to show an empty state for this scenario, because there are still parameters. It might confuse the user. WDYT?

@yannnz Does it still have parameters? I think the 2nd row here doesn't represent a parameter, just the execution name.

This has a parameter called accuracy:
image

This has no parameters as far as my understanding goes:
image

jpuzz0 (Contributor) commented May 15, 2024

Should we still show ROC curve data if the search finds none?

I wonder how KFP does it? I still cannot figure out how their filters work. As I understand it, the filter narrows the listed artifact names: if several names match, the UI displays only those, right? If that is the logic, then a filter that ends with no results means there is nothing to show on the graph.

@yannnz In the KF UI, when adding filter text to the search box and no results are found, they don't update the table data and instead still show the rows. In our case, we allow users to filter, which hides the rows, yet they're still selected:

KF UI:
image

  .filter((confidenceMetrics) => !!confidenceMetrics.data)
  .map(({ data, fullArtifactPath }, i) => {
    // validate the custom properties
    const confidenceMetricsArray = data?.list as unknown[];
jpuzz0 (Contributor):

Why is this cast to unknown?

Gkrumbach07 (Member, Author):

Because it is coming from custom properties.

Gkrumbach07 (Member, Author):

I changed the logic to be based on the actual value now.
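For context, validating against the actual values could look something like this type guard. This is illustrative only, not the PR's code; the field names follow the KFP ROC-curve artifact shape and are an assumption here:

```typescript
// Illustrative runtime validation of untyped MLMD custom-property data
// before treating it as confidence metrics. The shape
// { confidenceThreshold, falsePositiveRate, recall } is an assumption.
interface ConfidenceMetric {
  confidenceThreshold: number;
  falsePositiveRate: number;
  recall: number;
}

function isConfidenceMetric(value: unknown): value is ConfidenceMetric {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const v = value as Record<string, unknown>;
  return (
    typeof v.confidenceThreshold === 'number' &&
    typeof v.falsePositiveRate === 'number' &&
    typeof v.recall === 'number'
  );
}
```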

jpuzz0 (Contributor) commented May 15, 2024

When the width is small enough and there's enough data, using the maxDimension prop - set to 500 here, the x-axis values overlap for the ROC curve:
image

Gkrumbach07 (Member, Author) commented May 15, 2024

For the ROC curve, there is an interaction that was missed: when the user hovers over a series in the legend, the corresponding curve should be highlighted with a thicker stroke.

This is not possible in Victory charts.

Gkrumbach07 (Member, Author)

@yannnz I don't think so. At least how the Parameters compare table works, if 1 run has data, the other 2 show to compare against empty values. This should be achievable here as well, but not sure what was expected.

KF does not show empty ones, just the ones that have the param. Otherwise it may give the impression that the execution has that scalar metric. And if we do go down this route, we would be showing a bunch of empty cells.

Gkrumbach07 (Member, Author)

@yannnz In KF UI, When adding filter text to the search box and no results are found, they don't update the table data, and instead still show the rows. In our case, we allow users to filter, which hides the rows, but yet they're still selected:

Filtering should not deselect things from the checkboxes; that is not our behavior in other areas, so it should not be here. I guess the idea of the filter is to help find your artifact when you have a lot?
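For illustration, the intended split between filtering and selection is roughly the following. All names here are assumptions for the sketch, not the PR's code:

```typescript
// Sketch: the search filter narrows which rows are rendered, but the
// selection set is separate state and is left untouched, so hidden rows
// remain selected.
function visibleArtifacts(
  artifacts: string[],
  selected: Set<string>,
  filterText: string,
): { shown: string[]; stillSelected: string[] } {
  const shown = artifacts.filter((name) =>
    name.toLowerCase().includes(filterText.toLowerCase()),
  );
  // Only rendering changes; selection is computed from the full list.
  return { shown, stillSelected: artifacts.filter((a) => selected.has(a)) };
}
```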

… and types.ts

Refactor imports and remove unused code in ArtifactDetails.spec.tsx and ArtifactsTable.spec.tsx

Refactor imports and remove unused code in ArtifactDetails.spec.tsx and ArtifactsTable.spec.tsx

Add TestArtifacts component to GlobalPipelineExperimentsRoutes

Refactor ConfusionMatrix component to improve readability and styling

Add ConfusionMatrix component and buildConfusionMatrixConfig function

Add ConfusionMatrix and ROCCurve components

Remove TestArtifacts component and update related routes

Refactor ROC curve table components, update imports, and remove unnecessary code

Refactor ConfusionMatrixSelect component and add skeleton loading

Refactor ROC curve table components and update imports

Refactor ROC curve table components and add related constants

Refactor ROC curve table components and add related constants

Refactor CompareRunsMetricsSection component, update imports, and remove console.log statements

Refactor CompareRunsMetricsSection component, update imports, and add ROC curve tab

Refactor Dockerfile to use linux/amd64 platform

Refactor CompareRunsMetricsSection component, update imports, and add confusion matrix tab

Refactor CompareRunsMetricsSection component, add MetricSectionTabLabels enum, and update related types and imports

Refactor CompareRunsMetricsSection component, add MetricSectionTabLabels enum, and update related types and imports

Refactor CompareRunsMetricsSection component and add MetricSectionTabLabels enum

Refactor and remove unused code in usePipelinesUiRoute.ts and ArtifactUriLink.tsx

Refactor MLMD API hooks and remove unused code

Refactor components to improve readability and add loading spinners

Refactor components to improve readability and add loading spinners

clean up

Refactor imports and remove unused code in ArtifactDetails.spec.tsx and ArtifactsTable.spec.tsx

Refactor ConfusionMatrix component to improve readability and styling

Refactor ROC curve table components and update imports
@Gkrumbach07 Gkrumbach07 force-pushed the story/RHOAIENG-2976-spike-determine-artifact-rendering-details branch from 5c20c13 to c1291d3 Compare May 15, 2024 16:38
};

const ROCCurve: React.FC<ROCCurveProps> = ({ configs, maxDimension }) => {
  const width = Math.min(maxDimension || 800, 500);
jpuzz0 (Contributor):

Maybe this was added just to test out 500 px? Won't this always return 500?
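Tracing the expression as a standalone check (not the PR's code): it only deviates from 500 when maxDimension is truthy and below 500.

```typescript
// Standalone reproduction of the expression under review.
const width = (maxDimension?: number): number =>
  Math.min(maxDimension || 800, 500);

// width()    -> 500 (maxDimension undefined, falls back to 800, capped at 500)
// width(800) -> 500 (capped)
// width(300) -> 300 (below the cap)
// width(0)   -> 500 (0 is falsy, so the 800 fallback applies)
```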

jpuzz0 (Contributor) left a comment

/lgtm /approve

Approving with the expectation that any UX discussions will be made into a separate story and followed up on.


openshift-ci bot commented May 15, 2024

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 8645fe6 into opendatahub-io:main May 15, 2024
6 checks passed