Error and Warning messages
SnpEff defines several messages in roughly 3 categories:
- INFO: An informative message
- WARNING: A problem in the reference genome definition that MAY result in an incorrect variant annotation
- ERROR: A problem in the reference genome definition that WILL ALMOST CERTAINLY result in an incorrect variant annotation
The variant has been realigned to the most 3-prime position within the transcript.
This is usually done to comply with the HGVS specification, which requires always reporting the most 3-prime position. VCF requires variants to be aligned to the left-most position in the reference genome, whereas HGVS requires alignment to the most 3-prime position. These two specifications contradict each other in some cases, so a local realignment is sometimes required in order to comply with HGVS.
IMPORTANT: This message only indicates that a realignment was performed, so **when this INFO message is present, the original coordinates from the VCF file are not exactly the same as the coordinates used to calculate the variant annotation**.
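The difference between the two conventions can be illustrated with a deletion inside a short repeat. The following is a minimal sketch, not SnpEff's implementation: the function names are hypothetical, coordinates are 0-based, VCF anchor bases are ignored, and the transcript is assumed to lie on the forward strand (so "most 3-prime" means right-most; on the reverse strand it would be the left-most genomic position).

```python
def shift_deletion_right(seq: str, pos: int, length: int) -> int:
    """Return the right-most 0-based start of an equivalent deletion.

    Deleting seq[pos:pos+length] yields the same sequence as deleting
    seq[pos+1:pos+1+length] whenever seq[pos] == seq[pos+length].
    """
    while pos + length < len(seq) and seq[pos] == seq[pos + length]:
        pos += 1
    return pos


def shift_deletion_left(seq: str, pos: int, length: int) -> int:
    """Return the left-most 0-based start of an equivalent deletion."""
    while pos > 0 and seq[pos - 1] == seq[pos + length - 1]:
        pos -= 1
    return pos


def apply_deletion(seq: str, pos: int, length: int) -> str:
    """Remove `length` bases starting at 0-based `pos`."""
    return seq[:pos] + seq[pos + length:]


reference = "GCACACAT"  # toy sequence with a CA repeat
left = shift_deletion_left(reference, 3, 2)    # VCF-style, left-most
right = shift_deletion_right(reference, 3, 2)  # HGVS-style, forward strand
print(left, right)
```

Both positions describe the same resulting sequence, which is why the coordinates used for annotation can differ from those in the VCF file without changing the variant itself.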
The exon does not have reference sequence information. The annotation may not be calculated (e.g. incomplete transcripts).
The genome reference does not match the variant's reference.
For example, if the VCF file indicates that the reference at a certain location is 'A', while SnpEff's database indicates that the reference should be 'C', this WARNING would be added.
Under normal circumstances, there should be none of these warnings (or at most a handful).
IMPORTANT: If too many of these warnings are seen, this indicates a severe problem (a version mismatch between your VCF files and the reference genome). A typical case is annotating with a different genome than the one used for alignment (e.g. reads are aligned to hg19 but variants are annotated using hg38).
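A quick way to see how widespread the problem is would be to compare each variant's REF allele with the reference sequence before annotating. The sketch below assumes the variants have already been parsed into `(chrom, pos, ref)` tuples with 1-based positions and that the reference is available as an in-memory dictionary; both the function name and the data layout are illustrative assumptions, not part of SnpEff.

```python
def find_ref_mismatches(variants, genome):
    """Return variants whose REF allele disagrees with the reference.

    variants: iterable of (chrom, pos, ref), pos is 1-based as in VCF.
    genome:   dict mapping chromosome name -> sequence string.
    """
    mismatches = []
    for chrom, pos, ref in variants:
        seq = genome.get(chrom, "")
        observed = seq[pos - 1 : pos - 1 + len(ref)]
        if observed.upper() != ref.upper():
            mismatches.append((chrom, pos, ref, observed))
    return mismatches


toy_genome = {"1": "ACGTACGT"}
toy_variants = [("1", 1, "A"), ("1", 2, "A"), ("1", 3, "GT")]
bad = find_ref_mismatches(toy_variants, toy_genome)
print(f"{len(bad)} of {len(toy_variants)} variants do not match the reference")
```

A handful of mismatches may be tolerable, whereas a large fraction suggests the wrong genome version.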
The number of coding bases is NOT a multiple of 3, so there is missing information for at least one codon. This indicates an error in the reference genome gene and/or transcript definition. This can happen in genomes that are not well understood.
Multiple STOP codons found in a CDS. There should be only one STOP codon at the end of the transcript, but in this case, the transcript has multiple STOP codons, which is unlikely to be real.
This usually indicates an error in the reference genome (or database). It could, for example, indicate frame errors in the reference genome for one or more exons in this transcript.
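To inspect a suspicious transcript by hand, the coding sequence can be split into codons and scanned for in-frame stops. This is a minimal sketch that assumes the standard genetic code and a hypothetical helper name; it ignores any incomplete trailing codon rather than reporting it.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def stop_codon_indexes(cds: str, stop_codons=STOP_CODONS):
    """Return the 0-based codon indexes of all in-frame stop codons."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete trailing codon
    return [i // 3 for i in range(0, usable, 3) if cds[i:i + 3] in stop_codons]


good_cds = "ATGAAATGA"           # ATG AAA TGA
suspect_cds = "ATGAAATAGCCCTGA"  # ATG AAA TAG CCC TGA
print(stop_codon_indexes(good_cds))
print(stop_codon_indexes(suspect_cds))
```

A well-formed CDS is expected to yield a single index, pointing at the last codon.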
Start codon does not match any 'start' codon in the CodonTable.
This usually indicates an error in the reference genome (or database), but could also be due to a misconfigured codon table for the genome. You should check that the codon table is properly set in snpEff.config.
Stop codon does not match any 'stop' codon in the CodonTable.
This usually indicates an error in the reference genome (or database), but could also be due to a misconfigured codon table for the genome. You should check that the codon table is properly set in snpEff.config.
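The effect of a misconfigured codon table on the start and stop codon checks can be sketched as follows. The two tables below are simplified, illustrative subsets (the table names and the dictionary layout are assumptions made for this example, not SnpEff's internal representation): a mitochondrial transcript checked against the standard table fails both checks, yet passes with the mitochondrial table.

```python
# Illustrative subsets only; real codon tables list every codon.
CODON_TABLES = {
    "Standard": {
        "start": {"ATG"},
        "stop": {"TAA", "TAG", "TGA"},
    },
    "Vertebrate_Mitochondrial": {
        "start": {"ATG", "ATA", "ATT", "ATC", "GTG"},
        "stop": {"TAA", "TAG", "AGA", "AGG"},
    },
}


def check_terminal_codons(cds: str, table: dict) -> dict:
    """Report whether the first and last codons are valid for `table`."""
    cds = cds.upper()
    return {
        "start_ok": cds[:3] in table["start"],
        "stop_ok": cds[-3:] in table["stop"],
    }


mito_cds = "ATAAAAAGA"  # toy CDS: ATA AAA AGA
for name, table in CODON_TABLES.items():
    print(name, check_terminal_codons(mito_cds, table))
```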
Chromosome name not found. This is typically due to a mismatch in chromosome naming conventions between the variants file and the database, but it can indicate a more severe problem (e.g. a different reference genome).
See more details [here](https://github.com/pcingola/SnpEff/wiki/ERROR_CHROMOSOME_NOT_FOUND)
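A naming-convention mismatch is easy to spot by comparing the set of chromosome names in the variants file with the names known to the database. The sketch below assumes both lists have already been extracted; the helper names are hypothetical, and stripping a `chr` prefix is only one possible normalization.

```python
def missing_chromosomes(vcf_chroms, db_chroms):
    """Return chromosome names used in the VCF but absent from the database."""
    return sorted(set(vcf_chroms) - set(db_chroms))


def strip_chr_prefix(name: str) -> str:
    """Drop a leading 'chr' (any case), e.g. 'chr1' -> '1'."""
    return name[3:] if name.lower().startswith("chr") else name


vcf_names = ["chr1", "chr2", "chrM"]
db_names = ["1", "2", "MT"]

print(missing_chromosomes(vcf_names, db_names))
normalized = [strip_chr_prefix(n) for n in vcf_names]
print(missing_chromosomes(normalized, db_names))
```

In this toy case, removing the prefix resolves most names, while the mitochondrial chromosome still differs (`M` versus `MT`) and would need to be handled separately.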
Variant's genomic position is outside chromosome's range.
Simply put, the variant coordinate is beyond the length of the chromosome in the reference genome.
IMPORTANT: If too many of these messages are seen, this indicates a severe problem (a version mismatch between your VCF files and the reference genome). A typical case is annotating with a different genome than the one used for alignment (e.g. reads are aligned to hg19 but variants are annotated using hg38).
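The range check itself amounts to comparing each position with the chromosome length. The sketch below is illustrative only (hypothetical function name, 1-based positions, lengths supplied by the caller) and uses the published chromosome 1 lengths for hg19 and hg38 to show how a position that is valid in one assembly falls outside the other.

```python
def out_of_range(variants, chrom_lengths):
    """Return (chrom, pos) pairs whose 1-based position exceeds the chromosome.

    Chromosomes missing from `chrom_lengths` are skipped here, since
    they are a different problem (chromosome not found).
    """
    return [
        (chrom, pos)
        for chrom, pos in variants
        if chrom in chrom_lengths and not 1 <= pos <= chrom_lengths[chrom]
    ]


HG19_LENGTHS = {"chr1": 249_250_621}
HG38_LENGTHS = {"chr1": 248_956_422}

variants = [("chr1", 1_000_000), ("chr1", 249_100_000)]
print(out_of_range(variants, HG19_LENGTHS))
print(out_of_range(variants, HG38_LENGTHS))
```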
An exonic variant falls outside the exon.
Missing coding sequence information. In this case, the full variant annotation cannot be calculated due to missing CDS information.
This usually indicates an error in the reference genome (or database).