Grouping of genomes
Genomes can be grouped into what could be called 'communities' or 'sub-communities', either by splitting the genome metadata or by providing multiple metadata files from the start. Grouping makes it possible to set a specific abundance ratio between groups. For example, the data could be split into archaea and bacteria with a one-to-one ratio, which ensures that the summed abundance of one group equals that of the other. In the CAMI-challenge, microorganisms were used as one group and circular elements as a second group. A ratio of 1 to 15 was used to reflect the fact that there are far more circular elements than genomes.
The abundances of all groups are drawn independently of each other from a distribution that can be set for each group individually.
For the CAMI-challenge all groups were set to the same distribution.
Using the ratios, the abundances of each group are then scaled up or down, while the relative abundances of the genomes within a group stay the same.
All groups are combined into a single abundance file for further processing by a read simulator.
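The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual implementation: the function names, the genome identifiers, the log-normal distribution with its parameters, and the normalization of the combined abundances to a total of 1 are all hypothetical choices made for the example.

```python
import random


def draw_group_abundances(genome_ids, mu, sigma, rng):
    """Draw one raw abundance per genome (log-normal is an assumption)."""
    return {genome_id: rng.lognormvariate(mu, sigma) for genome_id in genome_ids}


def combine_groups(groups, ratios):
    """Scale each group so the group totals follow `ratios`, then merge.

    Every genome in a group is multiplied by the same factor, so the
    relative abundances within a group do not change.
    Assumes genome identifiers are unique across groups.
    """
    ratio_sum = sum(ratios.values())
    combined = {}
    for group_name, abundances in groups.items():
        group_total = sum(abundances.values())
        target_share = ratios[group_name] / ratio_sum  # share of the whole sample
        scale = target_share / group_total
        for genome_id, value in abundances.items():
            combined[genome_id] = value * scale
    return combined


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Each group is drawn independently; both use the same distribution here.
groups = {
    "microorganisms": draw_group_abundances(
        ["genome_1", "genome_2", "genome_3"], 1.0, 2.0, rng
    ),
    "circular_elements": draw_group_abundances(
        ["plasmid_1", "plasmid_2"], 1.0, 2.0, rng
    ),
}

# Ratio of 1 to 15 between the two groups, as in the example above.
ratios = {"microorganisms": 1, "circular_elements": 15}

combined = combine_groups(groups, ratios)
for genome_id, abundance in combined.items():
    print(f"{genome_id}\t{abundance:.6f}")
```

With these ratios, the microorganisms sum to 1/16 of the total abundance and the circular elements to 15/16, while the proportions between genomes of the same group are the ones that were originally drawn.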