Grouping of genomes

(Sub-)Communities

Genomes can be grouped into what could be called 'communities' or 'sub-communities', either by splitting the metadata of the genomes or by using multiple metadata files in the first place. This makes it possible to set a specific abundance ratio between groups. One possibility would be to split the data into archaea and bacteria with a one-to-one ratio, which ensures that the summed abundance of one group is identical to that of the other. In the CAMI challenge, microorganisms were used as one group and circular elements as a second group. A ratio of 1 to 15 was used to reflect the fact that there are far more circular elements than genomes.

Abundances

The abundances of each group are drawn independently of the other groups, from a distribution that can be set for each group individually. For the CAMI challenge, all groups were set to the same distribution. The abundances of each group are then scaled up or down according to the ratios, while the relative abundances of the genomes within a group stay the same.
All groups are combined into a single abundance file for further processing by a read simulator.
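
The scaling step described above can be illustrated with a minimal Python sketch. It is not the simulator's actual code: the function names, the genome identifiers, the log-normal distribution with its parameters, and the normalisation of the combined abundances to a total of 1 are all assumptions made for illustration. Only the 1 to 15 ratio is taken from the text.

```python
import random


def draw_group_abundances(genome_ids, mu, sigma, rng):
    """Draw one raw abundance per genome.

    A log-normal distribution is assumed here for illustration only;
    the distribution can be chosen per group.
    """
    return {genome_id: rng.lognormvariate(mu, sigma) for genome_id in genome_ids}


def combine_groups(groups, ratios):
    """Rescale each group so the group totals follow `ratios`, then merge.

    Within a group every abundance is multiplied by the same factor,
    so the relative abundances inside the group are unchanged.
    The combined abundances are normalised to sum to 1 (an assumption).
    """
    ratio_sum = sum(ratios.values())
    combined = {}
    for name, abundances in groups.items():
        group_total = sum(abundances.values())
        share = ratios[name] / ratio_sum  # target fraction for this group
        for genome_id, value in abundances.items():
            combined[genome_id] = value / group_total * share
    return combined


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical genome identifiers, two groups as in the CAMI example.
groups = {
    "microorganisms": draw_group_abundances(
        ["genome_a", "genome_b", "genome_c"], 1.0, 2.0, rng
    ),
    "circular_elements": draw_group_abundances(
        ["plasmid_a", "plasmid_b"], 1.0, 2.0, rng
    ),
}
ratios = {"microorganisms": 1, "circular_elements": 15}

combined = combine_groups(groups, ratios)
for genome_id, abundance in combined.items():
    print(f"{genome_id}\t{abundance:.6f}")
```

With these ratios, the circular elements together make up 15/16 of the combined abundance and the microorganisms 1/16, while the proportions between genomes of the same group are the same as in the raw draw.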
