Add a note about what is serialised to file #3075

Draft: wants to merge 1 commit into base: main


hyanwong
Member

This is information that I had to ask about (here). I'm still not 100% sure what is saved into a tree sequence file (e.g. not just the indexes, but various cached properties too, I suspect), so at the moment this doc change says:

When serializing (e.g. storing a {class}`TreeSequence` to disk), the underlying tables
are stored along with the indexes and other stuff. When the tree sequence is loaded
from file, it is then guaranteed to be valid, with pre-calculated indexes and cached
properties (which ones?) immediately available.

If a {class}`TableCollection` is saved to file, then any indexes are also stored in the
file. A {class}`TableCollection` that has been loaded from a file is not, however,
guaranteed to be a valid tree sequence.

Once I have the information to fill in the "and other stuff" bits, I'll take this PR out of draft mode.

codecov bot commented Dec 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.86%. Comparing base (14f8ed2) to head (6f5127c).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3075   +/-   ##
=======================================
  Coverage   89.86%   89.86%           
=======================================
  Files          29       29           
  Lines       32150    32150           
  Branches     5768     5768           
=======================================
  Hits        28890    28890           
  Misses       1859     1859           
  Partials     1401     1401           
| Flag | Coverage Δ |
|---|---|
| c-tests | 86.71% <ø> (ø) |
| lwt-tests | 80.78% <ø> (ø) |
| python-c-tests | 89.05% <ø> (ø) |
| python-tests | 98.98% <ø> (ø) |

Flags with carried forward coverage won't be shown. Click here to find out more.

@jeromekelleher
Member

jeromekelleher commented Dec 16, 2024

The other stuff is pretty minimal:

$ msp ancestry 10 -o tmp.trees
$ kastore ls tmp.trees 
edges/child
edges/left
edges/metadata
edges/metadata_offset
edges/metadata_schema
edges/parent
edges/right
format/name
format/version
indexes/edge_insertion_order
indexes/edge_removal_order
individuals/flags
individuals/location
individuals/location_offset
individuals/metadata
individuals/metadata_offset
individuals/metadata_schema
individuals/parents
individuals/parents_offset
metadata
metadata_schema
migrations/dest
migrations/left
migrations/metadata
migrations/metadata_offset
migrations/metadata_schema
migrations/node
migrations/right
migrations/source
migrations/time
mutations/derived_state
mutations/derived_state_offset
mutations/metadata
mutations/metadata_offset
mutations/metadata_schema
mutations/node
mutations/parent
mutations/site
mutations/time
nodes/flags
nodes/individual
nodes/metadata
nodes/metadata_offset
nodes/metadata_schema
nodes/population
nodes/time
populations/metadata
populations/metadata_offset
populations/metadata_schema
provenances/record
provenances/record_offset
provenances/timestamp
provenances/timestamp_offset
sequence_length
sites/ancestral_state
sites/ancestral_state_offset
sites/metadata
sites/metadata_offset
sites/metadata_schema
sites/position
time_units
uuid

So, tables, top-level metadata, indexes and a handful of properties: format info, sequence_length, time_units and uuid (which we don't really use).
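
As a rough illustration of that summary, the sketch below groups a hand-copied subset of the keys from the listing above by their prefix. The `group_keys` helper and the `(top-level)` label are made up for this example and are not part of tskit or kastore; in practice the keys would be read from the file rather than hard-coded.

```python
from collections import defaultdict

# A subset of the keys from the `kastore ls tmp.trees` listing above,
# copied by hand (a few per group; the full listing groups the same way).
KEYS = [
    "edges/child", "edges/left", "edges/parent", "edges/right",
    "format/name", "format/version",
    "indexes/edge_insertion_order", "indexes/edge_removal_order",
    "metadata", "metadata_schema",
    "nodes/flags", "nodes/time",
    "sequence_length", "sites/position", "time_units", "uuid",
]


def group_keys(keys):
    """Group 'prefix/column' keys by prefix; keys without '/' are top-level.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for key in keys:
        prefix, sep, column = key.partition("/")
        if sep:
            groups[prefix].append(column)
        else:
            groups["(top-level)"].append(key)
    return dict(groups)


groups = group_keys(KEYS)
for prefix in sorted(groups):
    print(f"{prefix}: {', '.join(groups[prefix])}")
```

Grouped this way, the `indexes` prefix holds only the two edge orderings, and the keys with no prefix are the top-level metadata plus the few scalar properties mentioned above.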
