-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NickAkhmetov/CAT-888 Improve combined provenance logic to safeguard against duplicate nodes/edges #3535
Conversation
…gainst duplicate nodes/edges
context/app/static/js/components/detailPage/provenance/utils.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to block the fix, but I'm curious why this case necessitates type assertion?
Object.keys(obj1).forEach((key) => { | ||
// @ts-expect-error -- We're specifically checking for the key in obj2 | ||
if (!(key in obj2) || obj1[key] !== obj2[key]) { | ||
diffKeys.push(key); | ||
} | ||
}); | ||
Object.keys(obj2).forEach((key) => { | ||
// @ts-expect-error -- We're specifically checking for the key in obj1 | ||
if (!(key in obj1) || obj1[key] !== obj2[key]) { | ||
if (!diffKeys.includes(key)) { | ||
diffKeys.push(key); | ||
} | ||
} | ||
}); | ||
return diffKeys; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just brainstorming a potentially simpler method. Feel free to take or leave.
const diffKeys = Object.keys({...obj1, ...obj2}).reduce((acc, k) => {
const v1 = obj1?.[k];
const v2 = obj2?.[k];
if(v1 === v2){
return acc;
}
return [...acc, k];
}, [])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, good call. I was definitely overthinking this.
@@ -8,6 +8,27 @@ export function createProvDataURL(uuid: string, entityEndpoint: string) { | |||
return `${entityEndpoint}/entities/${uuid}/provenance`; | |||
} | |||
|
|||
function findDifference(obj1: object, obj2: object) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why object
as the types here rather than something like Record<string, unknown>
? Is it because the later used merged[key]
is unknown
and we can only use a type guard for object
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've generified this to match nonDestructiveMerge
, thanks for pointing out the improvement!
@@ -21,11 +42,35 @@ export function nonDestructiveMerge<T extends Record<string, unknown>, U extends | |||
): T { | |||
const merged = { ...a } as Record<string, unknown>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we provide stricter types for the provenance data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The types are currently provided via the generics, i.e. this function gets called with ProvData
objects, then gets cast back to the ProvData type at the end with the return merged as T
. Prior to adding this most recent prov-only logic, this was a more generic method we could reuse elsewhere; if we end up needing a similar functionality in the future, we could generify this approach by extracting the prov-specific logic into a function param.
@@ -8,6 +8,18 @@ export function createProvDataURL(uuid: string, entityEndpoint: string) { | |||
return `${entityEndpoint}/entities/${uuid}/provenance`; | |||
} | |||
|
|||
function findDifference<T extends Record<string, unknown>, U extends Record<string, unknown>>(obj1: T, obj2: U) { | |||
const diffKeys = Object.keys({ ...obj1, ...obj2 }).reduce((acc, key) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const diffKeys = Object.keys({ ...obj1, ...obj2 }).reduce((acc, key) => { | |
const diffKeys = Object.keys({ ...obj1, ...obj2 }).reduce<string[]>((acc, key) => { |
Would this avoid the assertion below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, thanks!
Summary
This PR reinforces the logic used to combine provenance graphs to guard against duplicate edges more effectively. Since edge IDs are not globally unique, collisions in edges don't necessarily mean they're referring to the same object - however, this led to cases where two edges with unique ID's led to duplicate connections between nodes, which caused the
cwlInput
conversion to throw errors.With these changes, duplicate edges should no longer propagate to the graph from the combining logic. I've also improved the error thrown by the
makeCwlInput
function so that it is easier to track down exactly what's going wrong.Design Documentation/Original Tickets
https://hms-dbmi.atlassian.net/browse/CAT-888
Testing
Manually tested with the dataset that this issue was occuring on:
graph
tab while logged in no longer causes crashes: http://localhost:5001/browse/dataset/438f7ba3b54f4020e8cbd852594aaca3?redirected=True#provenanceScreenshots/Video
Checklist
CHANGELOG-your-feature-name-here.md
is present in the root directory, describing the change(s) in full sentences.