Note: The following is included for reference, since it was an issue in earlier versions of the Illumina pipeline and can lead to spurious reversions to reference bases.
bcftools consensus calls a consensus sequence by “applying” variants to a reference sequence. However, the alignment file might contain regions where coverage is too low to call variants reliably, for example because of amplicon dropout. If such regions are not masked properly (typically with N), the generated consensus sequence will contain reference bases in place of any real variants that might be present in the “true” consensus sequence. To avoid this, mask regions of low coverage in the reference using tools like bedtools genomecov + bedtools maskfasta, and supply this masked reference sequence to bcftools consensus to call a reliable consensus sequence.
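The effect of masking can be illustrated with a small, self-contained Python sketch. This is a toy model only, not how bedtools or bcftools are implemented: the function names, the 10-base reference, the per-base depths, and the depth threshold of 10 are all assumptions made up for illustration. It treats the sample as having a real variant in a low-coverage region that the variant caller missed, and compares the consensus built from an unmasked reference with one built from a masked reference.

```python
from typing import Dict, List, Tuple


def low_coverage_intervals(depth: List[int], min_depth: int) -> List[Tuple[int, int]]:
    """Return 0-based, half-open intervals (BED-style) where depth < min_depth."""
    intervals = []
    start = None
    for pos, d in enumerate(depth):
        if d < min_depth:
            if start is None:
                start = pos  # open a new low-coverage run
        elif start is not None:
            intervals.append((start, pos))  # close the current run
            start = None
    if start is not None:
        intervals.append((start, len(depth)))
    return intervals


def mask_sequence(seq: str, intervals: List[Tuple[int, int]]) -> str:
    """Replace every base inside the given intervals with N."""
    bases = list(seq)
    for start, end in intervals:
        bases[start:end] = "N" * (end - start)
    return "".join(bases)


def apply_snps(seq: str, snps: Dict[int, Tuple[str, str]]) -> str:
    """Apply SNPs given as {0-based position: (ref, alt)}; masked positions stay N."""
    bases = list(seq)
    for pos, (ref, alt) in snps.items():
        if bases[pos] == "N":
            continue  # never overwrite a masked base
        if bases[pos] != ref:
            raise ValueError(f"REF mismatch at position {pos}")
        bases[pos] = alt
    return "".join(bases)


# Hypothetical data: positions 3-5 have almost no coverage (e.g. amplicon dropout).
reference = "ACGTACGTAC"
depth = [30, 30, 30, 2, 1, 0, 30, 30, 30, 30]
# Only the SNP at position 1 was called; a real variant at position 4
# went undetected because too few reads covered it.
called_snps = {1: ("C", "T")}

low_cov = low_coverage_intervals(depth, min_depth=10)
masked_reference = mask_sequence(reference, low_cov)

unmasked_consensus = apply_snps(reference, called_snps)
masked_consensus = apply_snps(masked_reference, called_snps)

print(low_cov)             # [(3, 6)]
print(unmasked_consensus)  # ATGTACGTAC  (positions 3-5 silently show reference bases)
print(masked_consensus)    # ATGNNNGTAC  (positions 3-5 are reported as unknown)
```

In the unmasked result, positions 3 to 5 look like confident reference calls even though there was no data to support them; in the masked result the same positions are reported as N, which makes the lack of evidence explicit.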