
Grouping of genomes

Adrian Fritz edited this page Sep 12, 2018 · 5 revisions

(Sub-)Communities

Genomes can be grouped into 'communities' or 'sub-communities', either by splitting the genome metadata or by providing multiple metadata files from the start. Grouping makes it possible to set a specific abundance ratio between groups. For example, the data could be split into archaea and bacteria with a one-to-one ratio, which ensures that the summed abundance of one group is identical to that of the other. In the CAMI-challenge, microorganisms formed one group and circular elements a second group. A ratio of 1 to 15 was used to reflect the fact that there are far more circular elements than genomes.

Abundances

The abundances of each group are drawn independently of the other groups, from a distribution that can be set for each group individually. For the CAMI-challenge, all groups used the same distribution. The ratios are then applied by scaling the abundances of each group up or down, while keeping the relative abundances of the genomes within a group unchanged.
Finally, all groups are combined into a single abundance file for further processing by a read simulator.
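
The scaling step described above can be illustrated with a minimal Python sketch. It is not the simulator's actual code: the function names `draw_abundances` and `combine_groups`, the genome identifiers, and the choice of a log-normal distribution with `mu=1.0` and `sigma=2.0` are all assumptions made for illustration. Each group is normalized to sum to one and then multiplied by its share of the ratio, so the within-group proportions are preserved.

```python
import random


def draw_abundances(genome_ids, mu=1.0, sigma=2.0, rng=None):
    """Draw one raw abundance per genome.

    Assumption: a log-normal distribution is used here purely as an
    example; each group could use a different distribution.
    """
    rng = rng or random.Random()
    return {genome_id: rng.lognormvariate(mu, sigma) for genome_id in genome_ids}


def combine_groups(groups, ratios):
    """Scale each group to its share of the ratio and merge the groups.

    groups: list of dicts mapping genome id -> raw abundance
    ratios: list of positive numbers, one per group (e.g. [1, 15])
    Returns one dict whose values sum to 1.
    """
    if len(groups) != len(ratios):
        raise ValueError("need exactly one ratio per group")
    ratio_sum = sum(ratios)
    combined = {}
    for abundances, ratio in zip(groups, ratios):
        group_total = sum(abundances.values())
        share = ratio / ratio_sum  # fraction of total abundance for this group
        for genome_id, value in abundances.items():
            # Dividing by group_total keeps within-group proportions intact.
            combined[genome_id] = value / group_total * share
    return combined


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical genome identifiers for two groups.
    microbes = draw_abundances(["genome_1", "genome_2"], rng=rng)
    circular = draw_abundances(["plasmid_1", "plasmid_2", "plasmid_3"], rng=rng)
    merged = combine_groups([microbes, circular], ratios=[1, 15])
    for genome_id, abundance in merged.items():
        print(f"{genome_id}\t{abundance:.6f}")
```

With a ratio of 1 to 15, the first group receives 1/16 of the total abundance and the second group 15/16, regardless of how many genomes each group contains.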
