monarch_semsim doesn't handle identical ID phenotypes well  #31

@oneilsh

This example should produce two pairs of nodes best matching each other in both directions:

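library(monarchr)
library(tidygraph)
library(dplyr)

# Two phenotype sets that share one ID (HP:0001763), each tagged with its origin
phenos1 <- monarch_engine() |>
  fetch_nodes(query_ids = c("HP:0001763", "HP:0006316")) |>
  activate(nodes) |>
  mutate(source = "phenos1")

phenos2 <- monarch_engine() |>
  fetch_nodes(query_ids = c("HP:0001763", "HP:0000678")) |>
  activate(nodes) |>
  mutate(source = "phenos2")

# Best matches in both directions, keeping nodes that have no match
semsim <- monarch_semsim(phenos1,
                         phenos2,
                         include_reverse = TRUE,
                         keep_unmatched = TRUE)

plot(semsim, node_color = source)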

However, the identical node present in both sets has a self loop in phenos1 and no connection to phenos2.

Diagnosis: this comes down to how graph joins work in tidygraph (a natural join over all matching columns) when the two node sets disagree in one of the columns. I think that join behavior is correct and desirable in itself; we're just not keeping the original node sets properly when creating the join.
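
To illustrate the join behavior described above, here is a minimal sketch that uses toy graphs rather than data fetched from Monarch. The node tables, the `id` column name, the placeholder edges, and the `n_nodes` helper are all assumptions made for illustration. It only shows how `tidygraph::graph_join()` treats a node whose ID appears in both sets with a different `source` value. It does not reproduce the internals of `monarch_semsim` and is not a proposed fix.

```r
library(tidygraph)
library(dplyr)

# Toy stand-ins for the two phenotype sets (hypothetical data, not fetched
# from Monarch). Both contain HP:0001763 but carry a different `source` tag.
g1 <- tbl_graph(
  nodes = data.frame(id = c("HP:0001763", "HP:0006316"), source = "phenos1"),
  edges = data.frame(from = 1L, to = 2L)
)
g2 <- tbl_graph(
  nodes = data.frame(id = c("HP:0001763", "HP:0000678"), source = "phenos2"),
  edges = data.frame(from = 1L, to = 2L)
)

# Natural join: matches on every shared column (id AND source), so the
# shared phenotype is kept as two separate nodes.
natural <- graph_join(g1, g2)

# Join on id only: the shared phenotype collapses to a single node and the
# conflicting column is split into source.x / source.y.
by_id <- graph_join(g1, g2, by = "id")

# Hypothetical helper: count the nodes in a tbl_graph.
n_nodes <- function(g) g |> activate(nodes) |> as_tibble() |> nrow()

n_natural <- n_nodes(natural)
n_by_id <- n_nodes(by_id)
by_id_cols <- by_id |> activate(nodes) |> as_tibble() |> names()
```

In this sketch the natural join yields four nodes, with HP:0001763 appearing once per source. Joining on `id` alone yields three. So any edge that refers to the shared phenotype can attach to only one of the two copies, depending on which `source` value it carries.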

Labels: bug
