The need
Functional enrichment is a widely used piece of the pangenomics workflow, but as with any automated statistical procedure, it's important to give clear guidance on its use case and monitor its use in the wild. A quick survey of papers citing Shaiber et al. (by @ivagljiva and @adw96) suggested that most users are doing a great job of using the method appropriately. That said, there are a few things that I (as the original author of the underpinning script) could do to clarify its use case, point out some potential pitfalls, and generally guide people in the right direction.
The solution
I aspire to do the following:
- documentation
  - clarify that the use case is for pre-determined groups: groups should not be determined using the pangenome and then tested for differentially enriched functions
  - clarify that the use case is a two-group comparison
  - point people to the blog post for more complex designs
- blog post
  - how to pull out the relevant data and import it into R
  - showcase the flexibility of the general procedure
  - provide a clear interpretation of the estimated parameters
  - how to incorporate additional covariates
  - how you could look at time series data
  - how you could do a global test, e.g. if you have >2 groups
  - how to fit a different model or run a different test, showcasing happi as an example 🥕
A challenge will be that, unlike in the two-group comparison case, users now need to choose which model is reasonable. While many users have fantastic intuition for this, writing out the "rules" is very difficult, and many people aren't going to get good statistical instruction (especially not from ChatGPT/the internet). So, how do we guide people without writing a textbook? (Could point them to the in-press NM paper?)
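To make the ">2 groups" item concrete, here is a minimal sketch of one possible global test: a permutation test on a Pearson chi-square statistic for the presence or absence of a single function across genomes. This is an illustration only, not the procedure the enrichment script uses, and it is written in Python rather than the R planned for the blog post. The function names (`presence_chi2`, `permutation_global_test`) and the toy data are hypothetical. The permutation argument assumes the group labels were fixed in advance, which is exactly the pre-determined-groups requirement above.

```python
import random


def presence_chi2(labels, present):
    """Pearson chi-square statistic for a groups x (present/absent) table."""
    n = len(labels)
    total_present = sum(present)
    stat = 0.0
    for group in sorted(set(labels)):
        members = [p for lab, p in zip(labels, present) if lab == group]
        n_group = len(members)
        obs_present = sum(members)
        # Compare observed and expected counts for the "present" and "absent" cells.
        cells = (
            (obs_present, total_present),
            (n_group - obs_present, n - total_present),
        )
        for observed, column_total in cells:
            expected = n_group * column_total / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def permutation_global_test(labels, present, n_perm=2000, seed=0):
    """Shuffle group labels to approximate the null distribution of the statistic."""
    rng = random.Random(seed)
    observed = presence_chi2(labels, present)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if presence_chi2(shuffled, present) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 18 genomes in three pre-determined groups,
# with 1 meaning the function is present in that genome.
labels = ["A"] * 6 + ["B"] * 6 + ["C"] * 6
present = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

stat, p_value = permutation_global_test(labels, present)
print(f"chi-square statistic = {stat:.2f}, permutation p-value = {p_value:.4f}")
```

A small p-value here only says that presence of the function differs somewhere among the groups; it does not say which groups differ, and it does not account for covariates or for testing many functions at once.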
I aspire to have a draft on a branch by the beginning of February. I will ask @ivagljiva and @tucker4 for feedback.
Beneficiaries
Folx using the pangenomics workflow.