[FEATURE REQUEST] Different gene caller option(s) for anvi-gen-contigs-database
#2298
Comments
Using
Thanks @xvazquezc, I just found out about all the unfixed bugs in prodigal that were fixed in prodigal-gv and pyrodigal/pyrodigal-gv! Just for documentation, here are some known issues:
We should default to pyrodigal/pyrodigal-gv.
I don't know how multi-threading is managed in anvi'o, but maybe this will be relevant: althonos/pyrodigal#57
Depending on how you are parsing Prodigal's outputs, you might need to change the parsing code a bit because
Thank you very much for your input, @apcamargo. I will work on this and try to come up with a modular solution.
Sure! Let me know if there's anything I can help with. Another (minor) consequence of changing the gene caller that I just remembered, and that is somewhat related to an issue that I opened a few months ago (#2195), is that the alternative genetic codes are not taken into account in In my data, I wrote the code to compute pN/pS from scratch (due to the bug in the potential computation I linked above) and, as far as I remember, the effect of alternative genetic codes on pN/pS was negligible. So, I don't think this is super important, but it could be good to keep in mind.
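On the multi-threading point raised earlier in the thread, here is a minimal sketch of what "managing the multi-threading on our own" could look like if the gene finder is a thread-safe callable. Everything named here (`call_genes_in_parallel`, `toy_gene_finder`, `num_threads`) is hypothetical and is not anvi'o code. The toy finder only reports ATG positions so that the example runs without extra dependencies. With pyrodigal, one would presumably pass the `find_genes` method of a gene finder object instead, but that wiring is an assumption and has not been checked against anvi'o.

```python
from concurrent.futures import ThreadPoolExecutor


def call_genes_in_parallel(contigs, find_genes, num_threads=4):
    """Run `find_genes` on every contig using a pool of worker threads.

    contigs    : dict mapping contig name -> sequence
    find_genes : any thread-safe callable that takes a single sequence
    Returns a dict mapping contig name -> whatever `find_genes` returned.
    """
    names = list(contigs)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map() yields results in input order, so they line up with `names`.
        results = list(pool.map(find_genes, (contigs[name] for name in names)))
    return dict(zip(names, results))


def toy_gene_finder(sequence):
    """Stand-in for a real gene caller: start positions of every ATG (hypothetical)."""
    return [i for i in range(len(sequence) - 2) if sequence[i:i + 3] == "ATG"]


if __name__ == "__main__":
    contigs = {"contig_1": "ATGAAATGA", "contig_2": "CCC"}
    print(call_genes_in_parallel(contigs, toy_gene_finder, num_threads=2))
```

Keeping the gene finder as a plain callable would let the threading logic stay the same regardless of which caller is plugged in. Whether threads give a real speed-up depends on the library releasing the GIL during gene finding.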
A small project to improve anvi'o, based upon feedback/ideas @FlorianTrigodet and I heard from our colleagues at the QIB in Norwich.
The need
There is interest in being able to use alternative gene calling software in addition to prodigal, within anvi'o (i.e., instead of having to run gene calling outside of anvi'o and using external gene calls). We've heard specifically about prodigal-gv, a fork of prodigal that has additions to improve gene calling for viruses, and pyrodigal/pyrodigal-gv, which are the respective Python modules for using these tools directly in the code. However, there could be other gene callers of interest to the community.
The solution
This small project is flexible in scope depending on which gene calling software we want to support and how far you (the developer) want to go with the refactor. Here are some possibilities:
- prodigal-gv could be as simple as adding a variable to store either prodigal or prodigal-gv according to user input, and replacing all instances of calling prodigal with this variable. It would use the same driver/parser modules as prodigal uses, and in theory no further changes would be necessary.
- pyrodigal options would require changes to how we actually run the gene calling step. We would no longer use a driver program that runs the prodigal binary, but would switch that to using the pyrodigal classes directly. Multi-threading and parsing of the results would also have to change to be compatible with those classes (they are thread-safe but it looks like we would still manage the multi-threading on our own).
- anvi-export-gene-calls #2181 and [FEATURE REQUEST] Refactoring Anvio to be more eukaryote friendly/account for different genetic architectures #2297. That would require much more extensive code changes.
Beneficiaries
All users of anvi'o, but (in the case of prodigal-gv) especially those who work on viruses.
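To illustrate the first possibility under "The solution", here is a minimal sketch of selecting the executable from user input and building the command from a single variable. The names `SUPPORTED_GENE_CALLERS` and `build_gene_calling_command` are hypothetical and do not come from the anvi'o codebase. The sketch assumes prodigal-gv accepts the same `-i`, `-o` and `-a` flags as prodigal, which is what the "same driver/parser modules" idea relies on.

```python
# Hypothetical registry: user-facing gene caller name -> executable to run.
# Assumption: prodigal-gv takes the same command-line flags as prodigal.
SUPPORTED_GENE_CALLERS = {
    "prodigal": "prodigal",
    "prodigal-gv": "prodigal-gv",
}


def build_gene_calling_command(gene_caller, fasta, genes_out, peptides_out):
    """Return the argument list for the requested gene caller.

    -i : input FASTA, -o : gene coordinates output, -a : protein translations.
    """
    try:
        executable = SUPPORTED_GENE_CALLERS[gene_caller]
    except KeyError:
        known = ", ".join(sorted(SUPPORTED_GENE_CALLERS))
        raise ValueError(
            f"Unknown gene caller '{gene_caller}'. Known gene callers: {known}"
        ) from None
    return [executable, "-i", str(fasta), "-o", str(genes_out), "-a", str(peptides_out)]


if __name__ == "__main__":
    # The resulting list could be handed to subprocess.run(cmd, check=True).
    print(build_gene_calling_command("prodigal-gv", "contigs.fa", "genes.txt", "peptides.faa"))
```

Since only the executable name changes, the existing output parser would stay untouched as long as both programs write the same format. That is the assumption stated in the issue and is not verified here.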