Commit

Merge pull request #222 from savitakartik/documentation
docs page for Nodes
benjeffery authored Nov 14, 2024
2 parents ab2307a + ce1ed81 commit 98b0aa5
Showing 8 changed files with 73 additions and 4 deletions.
5 changes: 5 additions & 0 deletions docs/_toc.yml
@@ -9,7 +9,12 @@ parts:
- caption: Pages
chapters:
- file: overview
- file: tables
- file: mutations
- file: edges
- file: nodes
- file: trees


- caption: Extras
chapters:
4 changes: 2 additions & 2 deletions docs/edges.md
@@ -4,11 +4,11 @@

![Edges Page](tsbrowse:example.tsbrowse:edges)

An edge represents the relationship between a pair of nodes (parent, child) in the ARGs over time (assigning the node occurring in more recent time as the child).
An edge represents the relationship between a pair of nodes (parent, child) in an ARG over time (assigning the node occurring in more recent time as the child).

The interactive plot on the top row allows us to visualise these relationships as horizontal lines denoting the length of the edge on the sequence (X-axis) and time of one of the nodes (Y-axis). On mouse-over of an individual edge, the tool-tip displays the parent and child node IDs, length of sequence the edge spans, and the amount of time it spans (branch length).

The bottom panel contains three static histograms which. left to right, summarise the span of edges over the sequence, the span of edges over times (branch lengths), and the product of the two spans.
The bottom panel contains three static histograms which, left to right, summarise the span of edges over the sequence, the span of edges over time (branch lengths), and the product of the two spans.


## Plot controls (sidebar):
3 changes: 3 additions & 0 deletions docs/intro.md
@@ -2,4 +2,7 @@

# Introduction to tsbrowse

# Data Model

# How to
TODO! Add a quick start guide with example tree sequences
20 changes: 20 additions & 0 deletions docs/mutations.md
@@ -0,0 +1,20 @@
(sec_mutations)=

# Mutations page

![Mutations Page](tsbrowse:example.tsbrowse:mutations)

This page plots the mutations on a 2D plot of time and genomic position, coloured
by the number of inheritors. The surrounding
histograms then give mutation density in time and space, along with breakpoint density.
As there are often more mutations than can be displayed in a performant way, the plot
summarises the data on a discrete grid when there are more than 1000 mutations in the view.
Zooming in (using the mouse wheel or top right controls) will show the individual mutations.
Once the individual mutations are displayed, hovering over one will show a tooltip with
more information about the mutation. Clicking the mutation will open a popup window with
the population frequency of the mutation, along with its full data. This popup can be
moved around by dragging its title bar, and closed by clicking whitespace on the plot.

## Plot controls (sidebar):
* The `Log Y-axis` checkbox plots node times on log scale.
* The `X Range` control allows the user to set the range of the X-axis, using a comma separated pair of `start:stop` values.
16 changes: 16 additions & 0 deletions docs/nodes.md
@@ -0,0 +1,16 @@
(sec_nodes)=

# Nodes page

![Nodes Page](tsbrowse:example.tsbrowse:nodes)

A node defines a sampled or ancestral sequence represented in an ARG. It is identified by a numerical ID and can be present in many marginal trees.

The interactive top plot visualises the total span on the sequence for each ancestral node over time.

The histograms at the bottom show the distributions of node spans over different dimensions. The leftmost histogram summarises the span of nodes on the sequence; the middle plot summarises the span of nodes over time and the rightmost plot summarises the edge "area" defined as the product of sequence span and time span for each node.

## Plot controls (sidebar):
* The `Node flags` checkbox group
The tskit Nodes table includes a bitwise `flags` column used to store information about a node. For example, a value of 1 indicates that the node is a sample. Using these checkboxes, it is possible to select which nodes to include in the node spans plot, based on their flag values.
* The `log y-axis` checkbox plots node time on log scale.
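
As a rough illustration of how a bitwise `flags` column can be used to select nodes, the sketch below tests each node's flags against a mask. It is not tsbrowse's implementation: the `node_flags` values and the `has_flag` helper are hypothetical, and only the meaning of the value 1 (the node is a sample) comes from the text above.

```python
# Bit mask for "this node is a sample" (value 1, as described above).
NODE_IS_SAMPLE = 1

# Hypothetical flags column, one integer per node ID; values are made up.
# Values 2 and 3 assume a second, unrelated flag stored in bit 1.
node_flags = [1, 0, 0, 1, 2, 3]


def has_flag(flags: int, mask: int) -> bool:
    """Return True if every bit in `mask` is set in `flags`."""
    return (flags & mask) == mask


# Keep the IDs of nodes whose sample bit is set, whatever other bits they carry.
sample_ids = [
    node_id
    for node_id, flags in enumerate(node_flags)
    if has_flag(flags, NODE_IS_SAMPLE)
]

print(sample_ids)  # [0, 3, 5]
```

Testing with a mask rather than comparing for equality means that a node carrying several flags at once (such as the value 3 here) is still recognised as a sample.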
8 changes: 6 additions & 2 deletions docs/overview.md
@@ -2,6 +2,10 @@

# Overview page

TODO! Explain page

![Overview Page](tsbrowse:example.tsbrowse:overview)

This page gives high-level details on the loaded tree sequence including
the number of each type of object and the (uncompressed) disk space used
by each.
Provenance information is also displayed, showing the commands used to
generate the tree sequence, if available.
16 changes: 16 additions & 0 deletions docs/tables.md
@@ -0,0 +1,16 @@
(sec_tables)=

# Tables page

![Tables Page](tsbrowse:example.tsbrowse:tables)

In tskit, tree sequences are represented as a collection of tables.
This page allows the querying and inspection of the raw data in these tables.
For more detail on what the columns mean see the [tskit data model](https://tskit.dev/tskit/docs/stable/data-model.html). Note that additional columns, used by tsbrowse, are added to the data model and displayed here. These include convenience columns such as the number of
inheritors for a given mutation.


## Controls (sidebar):
There are two sidebar controls: the first allows the user to select which table to display, and the second allows the user to filter the rows of the table based on a Python-like expression. This expression is passed to the [pandas query method](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html). Column names can be used as variables in the expression, and the expression should evaluate to a boolean value.
For example, to filter the mutations table to only show mutations with a derived state of 'C' that have more than 20,000 inheritors, you could use the expression
`derived_state == 'C' and num_inheritors > 20000`.
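
To try such an expression outside the browser, the same string can be passed to `DataFrame.query` directly. The sketch below uses a small made-up DataFrame as a stand-in for the mutations table. The column names `derived_state` and `num_inheritors` come from the example above, while `position` and all the values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the mutations table; all values are invented.
mutations = pd.DataFrame(
    {
        "position": [101.0, 250.0, 377.0, 512.0],
        "derived_state": ["C", "T", "C", "C"],
        "num_inheritors": [25000, 30000, 150, 48000],
    }
)

# The filter expression, written exactly as it would be typed in the sidebar.
expression = "derived_state == 'C' and num_inheritors > 20000"

# DataFrame.query evaluates the expression with column names as variables
# and keeps the rows for which it is True.
filtered = mutations.query(expression)

print(filtered)
```

Only the rows that satisfy both conditions are kept; a row with a derived state of 'C' but too few inheritors is dropped, as is a row with enough inheritors but a different derived state.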
5 changes: 5 additions & 0 deletions docs/trees.md
@@ -0,0 +1,5 @@
(sec_trees)=

# Trees page

![Trees Page](tsbrowse:example.tsbrowse:trees)
