Double counting validPairs #68
This is a good point. Thanks for catching it.
My sense is that double-counted fragments would have little impact, given MACS2 duplicate removal, but I will keep this in mind. Thanks, Ken.
… On May 17, 2019, at 12:46 PM, Weiqiang Zhou ***@***.***> wrote:
Hi Caleb,
I think hichipper is double counting the validPairs. In the HiC-Pro output folder /hic_results/data/sample/, there will be a number of "*.validPairs" files and an "allValidPairs" file. The "allValidPairs" file should be the same as the result of merging the "*.validPairs" files. I found that hichipper searches for all "*Pairs" files in the folder, which means it counts the validPairs twice. I think this affects a number of steps in the hichipper pipeline, including peak calling and counting reads in peak regions.
I used some tricks to work around it, but it may be good for you to know about this bug.
Ken
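To illustrate the pattern overlap described above, here is a minimal Python sketch. It is not hichipper's actual code: the file names are hypothetical stand-ins for a HiC-Pro output folder, and `select_pair_files` is an invented helper showing one possible way to avoid reading the same pairs twice.

```python
from fnmatch import fnmatchcase

# Hypothetical file names modeled on the folder layout described above;
# real HiC-Pro output names may differ.
files = [
    "sample_R1.validPairs",
    "sample_R2.validPairs",
    "sample.allValidPairs",
]


def match(names, pattern):
    """Return the names matching a shell-style pattern (case-sensitive)."""
    return [n for n in names if fnmatchcase(n, pattern)]


def select_pair_files(names):
    """Hypothetical workaround: prefer the merged file if it exists,
    otherwise fall back to the per-chunk *.validPairs files."""
    merged = match(names, "*allValidPairs")
    if merged:
        return merged
    return match(names, "*.validPairs")


# A broad "*Pairs" pattern picks up both the chunks and the merged file,
# so every valid pair would be read twice.
broad = match(files, "*Pairs")
chosen = select_pair_files(files)

print("broad pattern:", broad)
print("selected:", chosen)
```

Selecting either the merged file or the per-chunk files, but never both, keeps each valid pair in the input exactly once.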
Thanks, Caleb. In my experience, hichipper generates significantly more peaks without correcting for this bug (e.g., N=268,065) than with the bug corrected (e.g., N=187,489). This is based on the following peak calling setting: