Path diagram (for linear statistical modelling) #5721

Open
drleehw opened this issue Aug 19, 2024 · 0 comments
Labels
Status: Triage Needs to be verified, categorized, etc Type: Enhancement New feature or request Type: New Diagram

Proposal

A great deal of statistical modelling makes use of path diagrams, and a simple Mermaid 'graph' is already very close to being able to draw them. A path diagram is built from four elements:

Latent Variables: Represented by ellipses or circles.
Observed Variables: Represented by rectangles or squares.
Straight Arrows: Indicate direct effects or causal relationships, pointing from the cause to the effect.
Curved Arrows: Represent correlations or covariances between variables, typically double-headed.

Whilst ellipses and circles are used interchangeably in path diagrams, circles often take up too much screen real estate, so having an ellipse option would improve things dramatically. Pretty much all dedicated software for this purpose uses ellipses preferentially.

Curved and straight arrows have different meanings in path diagrams, so it's essential to be able to specify which one to use on an arrow-by-arrow basis.
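To show how close the existing flowchart syntax already gets, here is a minimal Python sketch that writes out a path diagram using only node and link forms Mermaid supports today. The function name `to_flowchart` and its arguments are made up for illustration. Circles stand in for the missing ellipses, and straight `<-->` links stand in for the curved double-headed arrows.

```python
# Sketch only: build Mermaid flowchart text for a path diagram.
# Stand-ins: circle nodes "((...))" instead of ellipses, and
# straight "<-->" links instead of curved double-headed arrows.

def to_flowchart(latent, observed, effects, covariances):
    lines = ["flowchart TD"]
    for name in latent:
        lines.append(f"    {name}(({name}))")      # circle node
    for name in observed:
        lines.append(f"    {name}[{name}]")        # rectangle node
    for cause, effect in effects:
        lines.append(f"    {cause} --> {effect}")  # direct effect
    for a, b in covariances:
        lines.append(f"    {a} <--> {b}")          # covariance
    return "\n".join(lines)


diagram = to_flowchart(
    latent=["spatial", "verbal"],
    observed=["visperc", "cubes", "paragrp", "sentence"],
    effects=[
        ("spatial", "visperc"),
        ("spatial", "cubes"),
        ("verbal", "paragrp"),
        ("verbal", "sentence"),
    ],
    covariances=[("spatial", "verbal")],
)
print(diagram)
```

The printed text can be pasted into any Mermaid renderer. What it cannot express is exactly what this proposal asks for: an elliptical node shape and a per-arrow choice between straight and curved.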

Use Cases

Pretty much all data scientists, statisticians, social scientists, etc. will use path diagrams at some stage in their work. Diagramming software is generally bundled into larger, expensive statistical packages, which puts it out of the reach of some students and others.

Screenshots

Syntax

Option 1

The lavaan package in R is widely used for path modelling, and its syntax could be borrowed pretty nearly wholesale:

  # Latent variables
  spatial =~ visperc + cubes + lozenges
  verbal =~ paragrp + sentence + wordmean

  # Error terms
  visperc ~~ err_v*visperc
  cubes ~~ err_c*cubes
  lozenges ~~ err_l*lozenges
  paragrp ~~ err_p*paragrp
  sentence ~~ err_s*sentence
  wordmean ~~ err_w*wordmean

  # Covariance between latent variables
  spatial ~~ verbal

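Here, =~ defines a latent variable in terms of its indicators, drawn as straight uni-directional arrows from the latent variable to each indicator (causation), and ~~ specifies a variance or covariance, drawn as a curved bi-directional arrow (correlation). A variable paired with itself, as in the error terms above, denotes its own residual variance.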
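As a rough idea of how little parsing the example above needs, here is a Python sketch that splits such a model into loadings (=~) and variances/covariances (~~). It is an assumption-laden illustration, not lavaan's actual parser: `parse_lavaan` is a hypothetical name, only the two operators used above are handled, and variable names are assumed to be plain word characters.

```python
import re


def parse_lavaan(model):
    """Parse a small subset of lavaan model syntax (sketch only).

    Returns two lists of (lhs, rhs, label) tuples: one for "=~"
    statements and one for "~~" statements. label is None if absent.
    """
    loadings, covariances = [], []
    for raw in model.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        match = re.match(r"^(\w+)\s*(=~|~~)\s*(.+)$", line)
        if match is None:
            # e.g. "~" regressions, which this sketch does not cover
            raise ValueError(f"unsupported line: {raw!r}")
        lhs, op, rhs = match.groups()
        for term in rhs.split("+"):
            # "label*variable" -> (label, variable); the label is optional
            label, _, var = term.strip().rpartition("*")
            entry = (lhs, var.strip(), label.strip() or None)
            (loadings if op == "=~" else covariances).append(entry)
    return loadings, covariances


model = """
# Latent variables
spatial =~ visperc + cubes + lozenges
# Error terms
visperc ~~ err_v*visperc
# Covariance between latent variables
spatial ~~ verbal
"""
loadings, covariances = parse_lavaan(model)
print(loadings)
print(covariances)
```

Each loading would become a straight arrow and each covariance a curved one, with the optional label available as arrow text.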

Option 2

Alternatively, it might be easier to just add the required features to the 'graph' type in Mermaid. Since all we need is the ability to have curved arrows and elliptical nodes, perhaps ellipses could be denoted with something like (-foobar-) and curved arrows with <~~>?

Existing features such as subgraphs are not technically part of path diagrams, but they could be useful to scientists trying to demonstrate and discuss different parts of a model. The second option might therefore actually be better, even though it would require one to translate one's statistical code into Mermaid syntax. It would presumably also require much less work for the Mermaid community.
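The translation step mentioned above could be quite mechanical. The following Python sketch turns (lhs, operator, rhs) statements into the notation floated in Option 2. Note that `(-name-)` and `<~~>` are only the suggestions from this issue, not documented Mermaid syntax, and `to_proposed_syntax` is a hypothetical helper. A variable is treated as latent if it appears on the left of =~.

```python
def to_proposed_syntax(statements):
    """statements: list of (lhs, op, rhs), with op either "=~" or "~~"."""
    latent = {lhs for lhs, op, _ in statements if op == "=~"}

    def node(name):
        # "(-name-)" is the ellipse notation suggested in this issue
        # (hypothetical); observed variables use ordinary rectangles.
        return f"{name}(-{name}-)" if name in latent else f"{name}[{name}]"

    lines = ["graph TD"]
    for lhs, op, rhs in statements:
        # "<~~>" is the suggested curved double-headed arrow (hypothetical)
        arrow = "-->" if op == "=~" else "<~~>"
        lines.append(f"    {node(lhs)} {arrow} {node(rhs)}")
    return "\n".join(lines)


text = to_proposed_syntax([
    ("spatial", "=~", "visperc"),
    ("verbal", "=~", "paragrp"),
    ("spatial", "~~", "verbal"),
])
print(text)
```

Because the output is ordinary 'graph' text, the user could still wrap parts of it in subgraphs by hand afterwards.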

Implementation

This is a proposal which I'd love to see built into Mermaid by the wonderful community.
