Path diagram (for linear statistical modelling) #5721
Labels: Status: Triage; Type: Enhancement; Type: New Diagram
Proposal
A great deal of statistical modelling makes use of path diagrams. A simple Mermaid 'graph' is already very close to being able to do what's needed. The conventions a path diagram has to support are:
Latent Variables: Represented by ellipses or circles.
Observed Variables: Represented by rectangles or squares.
Straight Arrows: Indicate direct effects or causal relationships, pointing from the cause to the effect.
Curved Arrows: Represent correlations or covariances between variables, typically double-headed.
Whilst ellipses and circles are used interchangeably in path diagrams, circles often take up too much screen real estate, so having an ellipse option would improve things dramatically. Pretty much all dedicated software for this purpose uses ellipses preferentially.
Curved and straight arrows have different meanings in path diagrams, so it's essential to be able to specify which one is used on an arrow-by-arrow basis.
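To illustrate how close the existing syntax already gets, here is a minimal Python sketch that writes out flowchart text for a small path model using only node shapes and arrows Mermaid supports today: `((...))` circles for latent variables, `[...]` rectangles for observed variables, `-->` for direct effects and `<-->` for covariances. The `Edge` class, the `to_mermaid` helper and the variable names are all made up for the example. The two gaps described above remain: latent variables come out as circles rather than ellipses, and the covariance arrow cannot be forced to curve independently of the others.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One arrow in a path model (hypothetical helper type)."""
    source: str
    target: str
    kind: str  # "effect" (single-headed) or "covariance" (double-headed)


def to_mermaid(latent, observed, edges):
    """Return Mermaid flowchart text using only currently supported syntax."""
    lines = ["flowchart LR"]
    for name in latent:
        lines.append(f"    {name}(({name}))")  # circle: closest existing shape
    for name in observed:
        lines.append(f"    {name}[{name}]")    # rectangle
    for edge in edges:
        if edge.kind == "effect":
            arrow = "-->"    # single-headed arrow, cause to effect
        elif edge.kind == "covariance":
            arrow = "<-->"   # double-headed, but not individually curved
        else:
            raise ValueError(f"unknown edge kind: {edge.kind!r}")
        lines.append(f"    {edge.source} {arrow} {edge.target}")
    return "\n".join(lines)


# Made-up model: one latent variable measured by two observed variables
# whose residuals are allowed to covary.
diagram = to_mermaid(
    latent=["ability"],
    observed=["test1", "test2"],
    edges=[
        Edge("ability", "test1", "effect"),
        Edge("ability", "test2", "effect"),
        Edge("test1", "test2", "covariance"),
    ],
)
print(diagram)
```

Pasting the printed text into a Mermaid renderer should give a usable, if not conventional-looking, path diagram.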
Use Cases
Pretty much all data scientists, statisticians, social scientists, etc. will use path diagrams at some stage in their work. Diagramming software is generally part of larger, expensive statistical packages, which puts it out of the reach of some students.
Screenshots
Syntax
Option 1
The lavaan package in R is highly used for path modelling, and its syntax could be borrowed pretty nearly wholesale:
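Here, =~ dictates a straight uni-directional arrow (causation) from a latent variable to each of its indicators, and ~~ a curved bi-directional arrow (correlation). lavaan's regression operator, ~, would likewise map to a straight uni-directional arrow, pointing from each predictor on the right-hand side to the outcome on the left.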
Option 2
Alternatively, it might be easier to just add the required features to the 'graph' type in Mermaid. Since all we need is the ability to have curved arrows and elliptical nodes, perhaps denote ellipses with something like:
(-foobar-)
and curved arrows with <~~>?
The existing features, such as adding subgraphs, are not technically part of path diagrams but could be useful to scientists trying to demonstrate and discuss different parts of a model, so the second option here might actually be better even though it would require one to translate one's statistical code into Mermaid syntax. It would presumably also require much less work for the Mermaid community.
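The translation step mentioned for Option 2 could probably be automated. Below is a rough Python sketch, under stated assumptions, that converts a small subset of lavaan model syntax (`=~`, `~`, `~~`, `#` comments, `+` between terms) into graph text using the notation suggested above. Note that `(-name-)` and `<~~>` are only the syntax proposed in this issue, so the output is not expected to render in Mermaid as it stands. The function name `lavaan_to_mermaid` and the model are made up, and things like intercepts (`y ~ 1`), multi-line formulas and `:=` definitions are not handled.

```python
import re

# One formula per line: left-hand side, operator, right-hand side.
# "=~" and "~~" are listed before "~" so the longer operators win.
FORMULA = re.compile(r"^([\w.]+)\s*(=~|~~|~)\s*(.+)$")


def lavaan_to_mermaid(model: str) -> str:
    """Translate a subset of lavaan syntax into the PROPOSED graph notation."""
    latent = set()  # names defined with "=~"
    names = []      # every variable, in order of first appearance
    edges = []      # (source, arrow, target) triples

    def seen(name):
        if name not in names:
            names.append(name)

    for raw in model.splitlines():
        line = raw.split("#", 1)[0].strip()  # "#" starts a lavaan comment
        if not line:
            continue
        match = FORMULA.match(line)
        if match is None:
            raise ValueError(f"unsupported line: {raw!r}")
        lhs, op, rhs = match.groups()
        seen(lhs)
        if op == "=~":
            latent.add(lhs)
        for term in rhs.split("+"):
            # Drop modifiers such as "0.5*x1" or "a*x1"; keep the variable name.
            name = term.split("*")[-1].strip()
            seen(name)
            if op == "=~":
                edges.append((lhs, "-->", name))   # latent -> indicator
            elif op == "~":
                edges.append((name, "-->", lhs))   # predictor -> outcome
            elif name != lhs:
                edges.append((lhs, "<~~>", name))  # covariance (proposed arrow)
            # "x ~~ x" is a variance; this sketch does not draw it.

    lines = ["graph LR"]
    for name in names:
        if name in latent:
            lines.append(f"    {name}(-{name}-)")  # proposed ellipse syntax
        else:
            lines.append(f"    {name}[{name}]")    # existing rectangle syntax
    lines.extend(f"    {src} {arrow} {dst}" for src, arrow, dst in edges)
    return "\n".join(lines)


# Made-up model, for illustration only.
example_model = """
# measurement model
visual  =~ x1 + x2
textual =~ x4 + x5
# regression
textual ~ visual
# residual covariance
x1 ~~ x2
"""
translated = lavaan_to_mermaid(example_model)
print(translated)
```

Something along these lines would let people keep writing models in lavaan while still using subgraphs and the other 'graph' features for presentation.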
Implementation
This is a proposal which I'd love to see built into Mermaid by the wonderful community.