Misphased simple tandem repeat at chr10_PATERNAL:98945239-98945283 #669

Open
1 task done
nhansen opened this issue Feb 9, 2024 · 2 comments
Labels
small_error This error spans fewer than 50 bases in the assembly v1.0 This is an issue/error in the hg002v1.0 assembly

Comments

@nhansen
Collaborator

nhansen commented Feb 9, 2024

Have you confirmed that this issue hasn't already been reported?

  • I have confirmed in the UCSC browser hub that this is a new issue (required)

Issue location in assembly

chr10_PATERNAL:98945239-98945283

Description of the issue

The v1.0 assembly correctly represents this TAAA tandem repeat as heterozygous, but it assigns the two alleles to the wrong haplotypes. At chr10_MATERNAL:99,073,347-99,073,387 the assembly has 10 copies of TAAA (followed by a T), and at chr10_PATERNAL:98945239-98945283 it has 11 copies of TAAA (followed by a T). Short reads (e.g., Element 1000-base insert reads for the entire trio, aligned separately to the maternal and paternal chromosomes of the assembly, shown below) show that the longer allele must have been inherited from mom (HG004) and the shorter allele from dad (HG003).

Most long reads (all HiFi Revio and duplex reads, and most but not all UL reads) align perfectly to the two haplotypes as assembled, because this heterozygous position lies in a long run of homozygosity (ROH) in HG002: the nearest neighboring heterozygous sites are about 115 kb downstream (5') and 180 kb upstream (3'). Few reads therefore align to the correct haplotype here, but the ones that do indicate that the repeat on the paternal chromosome should have 10 copies of TAAA, not 11. DeepTrio calls on the Element data for the trio also support this assessment.
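As a quick sanity check on the coordinates quoted above, the span of each region should equal the copy count times the 4-base unit plus the trailing T. The sketch below is only an illustration: the helper names `region_length` and `expected_length` are made up here, and the regions are assumed to be 1-based and inclusive.

```python
UNIT = "TAAA"  # repeat unit named in the issue


def region_length(region: str) -> int:
    """Length of a region such as 'chr:start-end' (assumed 1-based, inclusive)."""
    coords = region.rsplit(":", 1)[1].replace(",", "")
    start, end = (int(x) for x in coords.split("-"))
    return end - start + 1


def expected_length(copies: int, unit: str = UNIT, trailing: int = 1) -> int:
    """Bases covered by `copies` repeat units plus the trailing base(s)."""
    return copies * len(unit) + trailing


maternal = "chr10_MATERNAL:99,073,347-99,073,387"
paternal = "chr10_PATERNAL:98945239-98945283"

print(region_length(maternal), expected_length(10))
print(region_length(paternal), expected_length(11))
```

Both regions are consistent with the stated copy numbers: 10 copies on the maternal interval and 11 on the paternal interval.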

VCF patch:
chr10_PATERNAL 98945238 chr10_PATERNAL_98945238_patch CTAAA C 40 . . GT 1/1
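The patch record above replaces CTAAA with C at the base just before the repeat, which removes one TAAA unit. A minimal sketch of that effect on a toy sequence follows; the flanking bases, the toy position, and the helpers `apply_variant` and `longest_run` are hypothetical and are not taken from the assembly or from any patching tool.

```python
import re


def apply_variant(seq: str, pos: int, ref: str, alt: str) -> str:
    """Apply one REF->ALT substitution at a 1-based position, checking REF first."""
    i = pos - 1
    if seq[i:i + len(ref)] != ref:
        raise ValueError("REF allele does not match the sequence at this position")
    return seq[:i] + alt + seq[i + len(ref):]


def longest_run(seq: str, unit: str = "TAAA") -> int:
    """Number of units in the longest uninterrupted run of `unit`."""
    runs = re.findall(f"(?:{unit})+", seq)
    return max((len(r) // len(unit) for r in runs), default=0)


# Toy stand-in for the paternal locus: made-up flanks around 11 x TAAA + T.
toy = "GGC" + "TAAA" * 11 + "T" + "GG"

# In the toy sequence the anchoring C sits at 1-based position 3.
patched = apply_variant(toy, 3, "CTAAA", "C")

print(longest_run(toy), longest_run(patched))
```

On the toy sequence the run shrinks from 11 units to 10 and the sequence gets 4 bases shorter, which is the change the issue asks for on the paternal haplotype.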

@nhansen nhansen added v1.0 This is an issue/error in the hg002v1.0 assembly small_error This error spans fewer than 50 bases in the assembly labels Feb 9, 2024
@nhansen
Collaborator Author

nhansen commented Feb 9, 2024

image

@nhansen
Collaborator Author

nhansen commented Feb 9, 2024

image
