monarch_semsim doesn't handle identical ID phenotypes well #31

This example should produce two pairs of nodes best matching each other in both directions:

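```r
# Assumes the monarchr and tidygraph packages are installed
library(monarchr)
library(tidygraph)  # provides activate() and re-exports mutate()

# First phenotype set, tagged so the two sets can be told apart in the plot
phenos1 <- monarch_engine() |>
  fetch_nodes(query_ids = c("HP:0001763", "HP:0006316")) |>
  activate(nodes) |>
  mutate(source = "phenos1")

# Second phenotype set; shares HP:0001763 with the first
phenos2 <- monarch_engine() |>
  fetch_nodes(query_ids = c("HP:0001763", "HP:0000678")) |>
  activate(nodes) |>
  mutate(source = "phenos2")

# Best matches in both directions, keeping nodes without a match
semsim <- monarch_semsim(phenos1,
                         phenos2,
                         include_reverse = TRUE,
                         keep_unmatched = TRUE)

plot(semsim, node_color = source)
```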

However, the node present in both sets (HP:0001763) ends up with a self-loop in phenos1 and no connection to phenos2.

Diagnosis: this comes down to how graph joins work in tidygraph (a natural join over all matching columns) when the two node sets disagree in one of those columns. I think that join behavior is correct and what we want; the problem is that we're not properly preserving the original node sets when building the join.
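To illustrate the join behavior in isolation, here is a minimal sketch using plain tidygraph with toy graphs. It does not call monarchr or reproduce its internals; the `id` and `source` columns and the placeholder edges are assumptions made for the example. It only shows that a natural join treats a shared ID as two different nodes once another shared column (here `source`) disagrees.

```r
library(tidygraph)

# Toy stand-ins for the two phenotype sets (hypothetical data, one placeholder edge each)
g1 <- tbl_graph(
  nodes = data.frame(id = c("HP:0001763", "HP:0006316"), source = "phenos1"),
  edges = data.frame(from = 1, to = 2)
)
g2 <- tbl_graph(
  nodes = data.frame(id = c("HP:0001763", "HP:0000678"), source = "phenos2"),
  edges = data.frame(from = 1, to = 2)
)

# Natural join: nodes are matched on every shared column (id AND source),
# so the shared ID does not match across the two sets
joined_natural <- graph_join(g1, g2)
n_natural <- joined_natural |> activate(nodes) |> as_tibble() |> nrow()

# Join on id only: the shared ID collapses to a single node,
# and the disagreeing column is split into source.x / source.y
joined_by_id <- graph_join(g1, g2, by = "id")
n_by_id <- joined_by_id |> activate(nodes) |> as_tibble() |> nrow()

print(c(natural = n_natural, by_id = n_by_id))
```

The natural join keeps four nodes (the shared ID appears once per source), while joining on `id` alone keeps three. Which of the two is appropriate inside `monarch_semsim` depends on how the original node sets are meant to be preserved, which this sketch does not settle.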
