Misphased simple tandem repeat at chr10_PATERNAL:98945239-98945283 #669
Labels
small_error
This error spans fewer than 50 bases in the assembly
v1.0
This is an issue/error in the hg002v1.0 assembly
Have you confirmed that this issue hasn't already been reported?
Issue location in assembly
chr10_PATERNAL:98945239-98945283
Description of the issue
The v1.0 assembly correctly represents this TAAA tandem repeat as heterozygous, but it assigns the two alleles to the wrong haplotypes. At chr10_MATERNAL:99073347-99073387 the assembly has 10 TAAA's (followed by a T), and at chr10_PATERNAL:98945239-98945283 it has 11 TAAA's (followed by a T). Short reads (e.g., Element 1000-base insert reads for the entire trio, aligned separately to the maternal and paternal chromosomes of the assembly, shown below) show that the longer haplotype must have been inherited from mom (HG004) and the shorter haplotype from dad (HG003).

Most long reads (all HiFi Revio and duplex reads, and the majority of UL reads, but not all) align perfectly to the two haplotypes as assembled, because this heterozygous position lies in a long run of homozygosity (ROH) in HG002: the nearest neighboring heterozygous site is about 115kb downstream (5') or 180kb upstream (3'). Only reads long enough to reach one of those sites can be assigned to a parental haplotype, so few reads align to the correct haplotype here, but the ones that do indicate that this spot on the paternal chromosome should have 10 TAAA's in its repeat, not 11. DeepTrio calls on the Element data for the trio also support this assessment.
VCF patch:
chr10_PATERNAL 98945238 chr10_PATERNAL_98945238_patch CTAAA C 40 . . GT 1/1
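As a sanity check on the patch, here is a minimal Python sketch that applies the record above to a toy sequence and counts the repeat units before and after. The toy sequence is an assumption, not sequence pulled from the assembly: it is built from the description as the anchor base C at position 98945238 (taken from the REF allele), then 11 TAAA's, then a T. The helper names (`parse_vcf_record`, `apply_variant`, `count_units`) are hypothetical and written only for this illustration.

```python
import re

def parse_vcf_record(line):
    """Return (chrom, pos, ref, alt) from a whitespace-separated VCF data line."""
    fields = line.split()
    chrom, pos, _id, ref, alt = fields[:5]
    return chrom, int(pos), ref, alt

def apply_variant(seq, seq_start, pos, ref, alt):
    """Apply one REF->ALT substitution to seq, whose first base sits at 1-based seq_start."""
    offset = pos - seq_start
    if seq[offset:offset + len(ref)] != ref:
        raise ValueError("REF allele does not match the sequence at this position")
    return seq[:offset] + alt + seq[offset + len(ref):]

def count_units(seq, unit="TAAA"):
    """Count the copies of unit in the longest uninterrupted run of that unit."""
    runs = re.finditer(f"(?:{re.escape(unit)})+", seq)
    longest = max(runs, key=lambda m: len(m.group()), default=None)
    return len(longest.group()) // len(unit) if longest else 0

record = "chr10_PATERNAL 98945238 chr10_PATERNAL_98945238_patch CTAAA C 40 . . GT 1/1"
chrom, pos, ref, alt = parse_vcf_record(record)

# Assumed toy sequence (not read from the assembly): anchor C, 11 TAAA's, trailing T.
toy_start = 98945238
toy = "C" + "TAAA" * 11 + "T"

patched = apply_variant(toy, toy_start, pos, ref, alt)
print(count_units(toy), "->", count_units(patched))
```

Under these assumptions the patch removes exactly one TAAA unit, taking the paternal repeat from 11 copies to 10, which matches the count the description says the paternal chromosome should have.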