This repository has been archived by the owner on Jun 21, 2023. It is now read-only.
In our analysis of copy number variation (CNV), we currently exclude a manually curated set of regions where copy number calls tend to be inaccurate. This list is not tied to a public resource that is easily referenced and downloaded. It also seems to miss many false-positive-prone regions, as indicated by strong banding patterns that appear independent of sample type in figures such as this heatmap of CN variation.
The ENCODE blacklist may provide an alternative resource for identifying regions that are problematic in CNV analysis, as well as in other analyses where mismapping can lead to errors.
A previous version of this blacklist, along with a discussion of the situations where its use is recommended, is published here:
Amemiya, H.M., Kundaje, A. & Boyle, A.P. The ENCODE Blacklist: Identification of Problematic Regions of the Genome. Sci Rep 9, 9354 (2019). https://doi.org/10.1038/s41598-019-45839-z
An updated version of the blacklist has just been released at the following location.
https://www.encodeproject.org/files/ENCFF356LFX/
What changes need to be made? Please provide enough detail for another participant to make the update.
The ENCODE blacklist should be compared to our current CNV exclusion lists in the copy_number_consensus_call module. If there is sufficient overlap with the current set of excluded regions, we may be able to simply replace the current exclusion regions with the ENCODE blacklist. Otherwise, we may consider adding the ENCODE blacklist to the list of excluded regions.
Results should be checked to see if apparent false positives are reduced, while preserving known signals.
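As a starting point for the comparison, here is a minimal sketch of how the overlap between the two region sets could be measured in base pairs. It assumes both lists are available as BED-style text (chromosome, start, end, 0-based half-open) on the same genome build and with the same chromosome naming. The function names are hypothetical and are not part of the copy_number_consensus_call module.

```python
from collections import defaultdict


def read_bed(lines):
    """Parse BED-like lines into {chrom: [(start, end), ...]} (0-based, half-open)."""
    regions = defaultdict(list)
    for line in lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions[chrom].append((int(start), int(end)))
    return regions


def merge(intervals):
    """Sort intervals and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlap_bp(a, b):
    """Count bases shared by two merged, sorted interval lists."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        total += max(0, min(a[i][1], b[j][1]) - max(a[i][0], b[j][0]))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def fraction_covered(current, blacklist):
    """Fraction of bases in `current` that also fall in `blacklist`."""
    covered = size = 0
    for chrom, intervals in current.items():
        cur = merge(intervals)
        size += sum(end - start for start, end in cur)
        covered += overlap_bp(cur, merge(blacklist.get(chrom, [])))
    return covered / size if size else 0.0
```

Calling `fraction_covered(current, blacklist)` and then `fraction_covered(blacklist, current)` gives both directions of the comparison: how much of our current exclusion list the ENCODE blacklist already covers, and how much of the blacklist is new. With real files, `read_bed` can be passed an open file handle. What counts as "sufficient overlap" is still a judgment call for whoever makes the update.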
Other analyses, including SNV calling, should also be checked against the ENCODE blacklist to avoid potential false positive signals.
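For point variants, the check could be as simple as flagging calls whose position falls inside a blacklisted region. The sketch below is one way to do that with a sorted lookup. It assumes regions are given as 0-based half-open intervals (as in BED) and that variant positions are 1-based (as in VCF or MAF). All names here are hypothetical.

```python
from bisect import bisect_right


def build_index(regions):
    """Build per-chromosome sorted, merged interval lists for lookup.

    `regions` maps chrom -> list of (start, end), 0-based half-open as in BED.
    """
    index = {}
    for chrom, intervals in regions.items():
        merged = []
        for start, end in sorted(intervals):
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def in_blacklist(index, chrom, pos):
    """Return True if a 1-based position lies inside any indexed region."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    zero_based = pos - 1  # convert to BED coordinates
    # Find the last region starting at or before the position.
    i = bisect_right(starts, zero_based) - 1
    return i >= 0 and zero_based < ends[i]
```

Flagging variants rather than dropping them outright would make it easier to confirm that known signals are preserved before any filter is adopted.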