C API 0.99.3
Pre-release
C API release.
Breaking changes
- `tsk_mutation_table_add_row` has an extra `time` argument. If the time is unknown, `TSK_UNKNOWN_TIME` should be passed. (@benjeffery, #672)
- Change genotypes from unsigned to signed to accommodate missing data (see #144 for discussion). This only affects users of the `tsk_vargen_t` class. Genotypes are now stored as `int8_t` and `int16_t` types rather than the former unsigned types. The field names in the genotypes union of the `tsk_variant_t` struct returned by `tsk_vargen_next` have been renamed to `i8` and `i16` accordingly; care should be taken when updating client code to ensure that types are correct. The number of distinct alleles supported by 8 bit genotypes has therefore dropped from 255 to 127, with a similar reduction for 16 bit genotypes.
- Change the `tsk_vargen_init` method to take an extra parameter `alleles`. To keep the current behaviour, set this parameter to `NULL`.
- Edges can now have metadata. Hence edge methods now take two extra arguments: `metadata` and `metadata_length`. The file format has also changed to accommodate this, but is backwards compatible. Edge metadata can be disabled for a table collection with the `TSK_NO_EDGE_METADATA` flag. (@benjeffery, #496, #712)
- Migrations can now have metadata. Hence migration methods now take two extra arguments: `metadata` and `metadata_length`. The file format has also changed to accommodate this, but is backwards compatible. (@benjeffery, #505)
- The text dump of tables with metadata now includes the metadata schema as a header. (@benjeffery, #493)
- Bad tree topologies are detected earlier, so that it is no longer possible to create a `tsk_treeseq_t` object which contains a parent with contradictory children on an interval. Previously an error occurred when some operation building the trees was attempted. (@jeromekelleher, #709)
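
The move to signed genotypes is the change most likely to break client code silently, so here is a minimal sketch of the pitfall. It does not use the tskit API: the function names and the `SKETCH_MISSING_DATA` constant are hypothetical, and the sketch assumes that a missing genotype is marked with a negative sentinel such as -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel for this sketch: a negative value marks a missing
 * genotype, which is only representable once the type is signed. */
#define SKETCH_MISSING_DATA (-1)

/* Count samples whose 8 bit genotype is marked missing. */
size_t
sketch_count_missing_i8(const int8_t *genotypes, size_t num_samples)
{
    size_t j;
    size_t count = 0;

    assert(genotypes != NULL || num_samples == 0);
    for (j = 0; j < num_samples; j++) {
        if (genotypes[j] == SKETCH_MISSING_DATA) {
            count++;
        }
    }
    return count;
}

/* What client code still reading the value as unsigned would see. */
unsigned int
sketch_legacy_unsigned_view(int8_t genotype)
{
    return (unsigned int) (uint8_t) genotype;
}
```

Code that still reads a missing genotype through an unsigned type would see 255 instead of -1 and could treat it as a valid allele index. This is why the types should be checked when updating to the `i8` and `i16` fields.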
New features
- New methods to perform set operations on table collections. `tsk_table_collection_subset` subsets and reorders table collections by nodes (@mufernando, @petrelharp, #663, #690). `tsk_table_collection_union` forms the node-wise union of two table collections. (@mufernando, @petrelharp, #381, #623)
- Mutations now have an optional double-precision floating-point `time` column. If not specified, this defaults to a particular `NaN` value (`TSK_UNKNOWN_TIME`) indicating that the time is unknown. For a tree sequence to be considered valid it must meet new criteria for mutation times, see Mutation requirements. Add `tsk_table_collection_compute_mutation_times` and a new flag for `tsk_table_collection_check_integrity`: `TSK_CHECK_MUTATION_TIME`. Table sorting orders mutations by non-increasing time per-site, which is also a requirement for a valid tree sequence. (@benjeffery, #672)
- Add `metadata` and `metadata_schema` fields to table collection, with accessors on tree sequence. These store arbitrary bytes and are optional in the file format. (@benjeffery, #641)
- Add the `TSK_KEEP_UNARY` option to simplify (@gtsambos). See #1 and #143.
- Add a `set_root_threshold` option to `tsk_tree_t` which allows us to set the number of samples a node must be an ancestor of to be considered a root. (#462)
- Change the semantics of `tsk_tree_t` so that sample counts are always computed, and add a new `TSK_NO_SAMPLE_COUNTS` option to turn this off. (#462)
- Tables with metadata now have an optional `metadata_schema` field that can contain arbitrary bytes. (@benjeffery, #493)
- Tables loaded from a file can now be edited in the same way as any other table collection (@jeromekelleher, #536, #530)
- Support for reading/writing to arbitrary file streams with the `loadf`/`dumpf` variants for tree sequence and table collection load/dump. (@jeromekelleher, @grahamgower, #565, #599)
- Add low-level sorting API and `TSK_NO_CHECK_INTEGRITY` flag. (@jeromekelleher, #627, #626)
- Add extension of Kendall-Colijn tree distance metric for tree sequences computed by `tsk_treeseq_kc_distance` (@daniel-goldstein, #548)
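
Because the unknown mutation time is described as one particular `NaN` value, it cannot be detected with `==` (a NaN never compares equal to anything, itself included). A plain NaN test would also accept any other NaN. The following sketch shows one way to test for a specific NaN by comparing bit patterns. All names are hypothetical, the bit pattern is chosen for illustration only and should not be read as tskit's definition, and the sketch assumes IEEE 754 64-bit doubles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative bit pattern: a quiet NaN carrying a nonzero payload. */
#define SKETCH_UNKNOWN_TIME_BITS UINT64_C(0x7FF8000000000001)

/* Build the sentinel double from its bit pattern. */
double
sketch_unknown_time(void)
{
    uint64_t bits = SKETCH_UNKNOWN_TIME_BITS;
    double t;

    assert(sizeof(t) == sizeof(bits));
    memcpy(&t, &bits, sizeof(t));
    return t;
}

/* True only for the sentinel; ordinary times do not match. */
bool
sketch_is_unknown_time(double t)
{
    uint64_t bits;

    memcpy(&bits, &t, sizeof(bits));
    return bits == SKETCH_UNKNOWN_TIME_BITS;
}
```

Using `memcpy` rather than a pointer cast keeps the bit-level comparison within defined behaviour in C.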
Deprecated
- The `TSK_SAMPLE_COUNTS` option is now ignored and prints a warning if used. (#462)