Need help to determine method for inference of convergent evolution #66

Open
qianxuans opened this issue Jan 30, 2023 · 1 comment

@qianxuans

Hi,
I am analyzing a longitudinal study to infer convergent evolution in bacteria.
Several clones of the same bacterium are being studied to determine whether they show within-host convergent evolution. For each clone, samples were collected at different time points. The idea is similar to the research described in #62.
If I want to test whether convergent evolution has occurred among several clones, which method should I use?

  1. Should I call SNPs and use the original SNPGenie, or should I use the within-group analysis with an MSA? If I use an MSA instead of VCF files, would it be overkill, as in the situation described in "analysing within-host diversity" #44?
  2. Will VCFgenie be helpful in this case?

Thank you so much for your help!

@singing-scientist
Contributor

Greetings, @qianxuans !

To me, the question 'is there convergent evolution?' could simply mean 'does the same mutation arise independently in different lineages?' Alternatively, it could mean 'does the same mutation arise independently and also increase in frequency to >50% in different lineages?' In the first case, it might be enough to determine whether the variant is present in multiple clones. The second might be even simpler: check whether the same variant is present in the consensus sequence of multiple clones at the end of the study. If you do find such a variant, you'd probably want to deep sequence the original/source sample to see whether the variant was already present at low frequency, or whether it arose de novo in multiple lineages.
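
As a rough illustration of those two checks, here is a minimal Python sketch. It is not part of SNPGenie or VCFgenie. The input table, the clone names, and the `shared_variants` helper are all hypothetical, and it assumes you have already extracted per-clone variant frequencies (for example, from quality-filtered VCF files).

```python
from collections import defaultdict

# Hypothetical input: clone -> {(position, ref, alt): variant frequency}.
# For the ">50%" question these would be frequencies at the final time point.
final_freqs = {
    "clone_A": {(1204, "C", "T"): 0.92, (3310, "G", "A"): 0.08},
    "clone_B": {(1204, "C", "T"): 0.61, (5002, "A", "G"): 0.74},
    "clone_C": {(1204, "C", "T"): 0.03, (3310, "G", "A"): 0.12},
}

def shared_variants(freqs_by_clone, min_clones=2, min_freq=0.0):
    """Return variants seen above min_freq in at least min_clones clones."""
    clones_by_variant = defaultdict(list)
    for clone, variants in freqs_by_clone.items():
        for variant, freq in variants.items():
            if freq > min_freq:
                clones_by_variant[variant].append(clone)
    return {
        variant: sorted(clones)
        for variant, clones in clones_by_variant.items()
        if len(clones) >= min_clones
    }

# Question 1: is the same variant present in multiple clones at all?
present = shared_variants(final_freqs)

# Question 2: is the same variant the majority (consensus) allele in multiple clones?
majority = shared_variants(final_freqs, min_freq=0.5)

print(present)
print(majority)
```

A variant flagged this way is only a candidate. As noted above, you would still need to check the source sample to rule out shared low-frequency standing variation.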

I'm not sure what to advise, because the best approach will depend on your specific question. VCFgenie is up and running, and would be useful for quality filtering VCF files to help determine which variants are real (not sequencing errors). SNPGenie can use those VCF files to estimate natural selection, if that's part of your goal. If you choose the MSA version, you'd probably be comparing consensus sequences from different time points, which is a different approach from analyzing within-timepoint variants.

Let me know if that helps!
Chase
