\documentclass{article}
\usepackage[round]{natbib}
\usepackage{listings}
\usepackage[english]{babel}%
\usepackage[T1]{fontenc}%
\usepackage[utf8]{inputenc}%
\usepackage{amsmath,amssymb,amsfonts}%
\usepackage{geometry}%
\usepackage{color}
\usepackage{graphicx}
\usepackage{dsfont}
\usepackage{verbatim}%
\usepackage{environ}%
\usepackage[right]{lineno}%
\usepackage{nameref}
%\usepackage{showkeys}
% local definitions
\newcommand{\msprime}[0]{\texttt{msprime}}
\newcommand{\tskit}[0]{\texttt{tskit}}
\newcommand{\SLiM}[0]{\texttt{SLiM}}
\newcommand{\fwdpy}[0]{\texttt{fwdpy11}}
\newcommand{\jkcomment}[1]{\textcolor{red}{#1}}
\begin{document}
\title{Tskit: a portable library for population-scale genealogical analysis}
\author{Author list to be filled in
}
% \address{
%%Affiliations:
% Franz Baumdicker:
% Cluster of Excellence "Controlling Microbes to Fight Infections", Mathematical and Computational Population Genetics, University of Tübingen, Germany
% \section*{Contact:} \href{[email protected]}{[email protected]}
\maketitle
\begin{abstract}
The ability to store and analyse related genetic sequences is an
essential requirement for simulation, inference and analysis in both
population genetics and phylogenetics. The recent introduction of the
succinct tree sequence data structure has provided a way to achieve
this at population scale. Here we present the \tskit\ software,
a high-performance, portable, open-source, community-driven library.
\tskit\ allows the creation, manipulation and analysis of succinct tree
sequences, with first-class support for provenance and user-defined metadata.
\tskit\ enables a common foundation across software projects that use
succinct tree sequences, which results in unprecedented interoperability,
reproducibility and maintainability.
\end{abstract}
\textbf{Keywords:} Tree sequences, Python
\section*{Introduction}
\citep{kelleher2018efficient}
\section*{Results}
\subsection*{Data model}
\jkcomment{Describe the tree sequence data model and the Tables API, with diagram}
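
As a minimal sketch of the idea, assuming the usual tabular encoding in which
each node row records a sample flag and a time, and each edge row records
(left, right, parent, child), the following plain Python recovers the tree at a
given genomic position from the two tables. The names \texttt{Node},
\texttt{Edge} and \texttt{parent\_array}, and the example data, are hypothetical
illustrations and are not part of the \tskit\ API.

\begin{lstlisting}[language=Python]
# Plain-Python illustration of node and edge tables; NOT the tskit API.
from collections import namedtuple

Node = namedtuple("Node", ["is_sample", "time"])
Edge = namedtuple("Edge", ["left", "right", "parent", "child"])

NULL = -1  # marker for "no parent"

# Hypothetical example: three samples (0, 1, 2) on a genome of length 10.
nodes = [
    Node(True, 0.0),   # node 0
    Node(True, 0.0),   # node 1
    Node(True, 0.0),   # node 2
    Node(False, 1.0),  # node 3 (internal)
    Node(False, 2.0),  # node 4 (root)
]
# Each edge applies on the half-open interval [left, right).
edges = [
    Edge(0, 5, 3, 0),
    Edge(0, 10, 3, 1),
    Edge(5, 10, 3, 2),
    Edge(0, 5, 4, 2),
    Edge(5, 10, 4, 0),
    Edge(0, 10, 4, 3),
]


def parent_array(position):
    """Return parent[u] for every node u in the tree covering position."""
    parent = [NULL] * len(nodes)
    for edge in edges:
        if edge.left <= position < edge.right:
            parent[edge.child] = edge.parent
    return parent


print(parent_array(2))  # tree on [0, 5)
print(parent_array(7))  # tree on [5, 10)
\end{lstlisting}

In this toy example the edge joining node 1 to node 3 spans the whole genome and
is stored once, even though it appears in both trees; sharing edges between
adjacent trees in this way is what keeps the encoding compact.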
\subsection*{In-memory data structures}
\jkcomment{Explain the quintuply linked tree and give an example and reasons for its
efficiency, with possible comparisons}
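
The following is a minimal sketch of a quintuply linked tree, assuming the five
links per node are the parent, leftmost child, rightmost child, left sibling and
right sibling, each stored as a flat integer array indexed by node ID. The
helper names \texttt{build\_links} and \texttt{children} are hypothetical, and
the code is plain Python for illustration rather than the \tskit\ implementation.

\begin{lstlisting}[language=Python]
# Illustrative sketch only; NOT the tskit implementation.
NULL = -1  # marker for "no such node"


def build_links(parent):
    """Derive the four remaining link arrays from a parent array."""
    n = len(parent)
    left_child = [NULL] * n
    right_child = [NULL] * n
    left_sib = [NULL] * n
    right_sib = [NULL] * n
    for u in range(n):
        p = parent[u]
        if p == NULL:
            continue  # u is a root (or not in the tree)
        last = right_child[p]
        if last == NULL:
            left_child[p] = u  # u is the first child of p
        else:
            right_sib[last] = u  # append u after the current last child
            left_sib[u] = last
        right_child[p] = u
    return left_child, right_child, left_sib, right_sib


def children(u, left_child, right_sib):
    """List the children of u by walking the sibling links."""
    out = []
    v = left_child[u]
    while v != NULL:
        out.append(v)
        v = right_sib[v]
    return out


# Hypothetical tree ((0,1)3,2)4 given as a parent array.
parent = [3, 3, 4, 4, NULL]
left_child, right_child, left_sib, right_sib = build_links(parent)
print(children(4, left_child, right_sib))
print(children(3, left_child, right_sib))
\end{lstlisting}

Under these assumptions, appending a child updates a constant number of array
entries, and a node's children can be listed without allocating a child list
for each node.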
\subsection*{Metadata and provenance}
\jkcomment{Describe the metadata implementation, how it solves common issues with
separate metadata files, and how the schema allows for self-describing data. Explain the
provenance mechanisms and how they promote traceability and reproducibility}
\subsection*{APIs}
\jkcomment{Briefly describe each of the C, Rust and Python APIs and their uses}
\subsection*{Efficiency when working with large trees}
\jkcomment{Show \tskit\ working with a large COVID-19 tree, with a possible comparison
to UShER}
\section*{Discussion}
\section*{Acknowledgments}
\bibliographystyle{plainnat}
\bibliography{paper}
%% local definitions for section multiple merger coalescents
\newcommand{\be}{\begin{equation}}
\newcommand{\ee}{\end{equation}}
\newcommand{\bd}{\begin{displaymath}}
\newcommand{\ed}{\end{displaymath}}
\newcommand{\IN}{\ensuremath{\mathds{N}}}%
\newcommand{\EE}[1]{\ensuremath{\mathds{E}\left[ #1 \right]}}%
\newcommand{\one}[1]{\ensuremath{\mathds{1}_{\left\{ #1 \right\}}}}%
\newcommand{\prb}[1]{\ensuremath{\mathds{P}\left( #1 \right) } }%
\NewEnviron{esplit}[1]{%
\begin{equation}
\label{#1}
\begin{split}
\BODY
\end{split}\end{equation}
}
\setcounter{secnumdepth}{2} % Print out appendix section numbers
\appendix
\end{document}