Double counting validPairs #68

Open
WeiqiangZhou opened this issue May 17, 2019 · 2 comments

Comments

@WeiqiangZhou

Hi Caleb,

I think hichipper is double counting the validPairs. The HiC-Pro output folder /hic_results/data/sample/ contains a number of "*.validPairs" files and an "allValidPairs" file, and the "allValidPairs" file should be the same as the merged "*.validPairs" files. I found that hichipper searches for all "*Pairs" files in the folder, which means it counts the validPairs twice. I think this affects a number of steps in the hichipper pipeline, including peak calling and counting reads in peak regions.
I used some tricks to work around it, but I thought you should know about this bug.
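
To illustrate the idea, here is a minimal sketch of one way to pick input files so that no pair is read twice. It is not hichipper's actual code and not necessarily the workaround used here. The function name `select_valid_pairs` and the example file names are hypothetical, and it assumes the merged file name ends in "allValidPairs" while the per-chunk files end in ".validPairs".

```python
from pathlib import Path


def select_valid_pairs(sample_dir):
    """Return the valid-pairs files for one sample without double counting.

    Hypothetical helper: prefers the merged "allValidPairs" file and falls
    back to the per-chunk "*.validPairs" files only when no merged file exists.
    """
    files = [p for p in Path(sample_dir).iterdir() if p.is_file()]
    # Merged file, e.g. "sample.allValidPairs" (naming is an assumption).
    merged = sorted(p for p in files if p.name.endswith("allValidPairs"))
    # Per-chunk files, e.g. "sample_1.validPairs" (naming is an assumption).
    chunks = sorted(p for p in files if p.name.endswith(".validPairs"))
    # Use one set or the other, never both.
    return merged if merged else chunks
```

Matching on the exact suffix, rather than on a broad "*Pairs" pattern, keeps the merged file and its chunks from both being selected.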

Ken

@caleblareau
Contributor

caleblareau commented May 28, 2019 via email

This is a good point. Thanks for catching it. My sense is that double-counted fragments would have little impact because of MACS2 duplicate removal, but I will keep this in mind. Thanks Ken.

@WeiqiangZhou
Author

Thanks Caleb. In my experience, hichipper generates significantly more peaks without a fix for this bug (e.g., N=268,065) than with one (e.g., N=187,489). This is based on the following peak calling setting:
peaks:
  - EACH,ALL
