Cannot encode fasta to BioSequences.DNAAlphabet when N present #28

Open
pdimens opened this issue Nov 1, 2023 · 0 comments

Comments

pdimens commented Nov 1, 2023

Hello, I'm working through the tutorial/example in the docs using a D. melanogaster genome instead of the E. coli one (it's what I have on hand). I cannot get Pseudoseq to load the FASTA file into Molecules() unless I remove the N characters, which in my case I did by replacing them with A.
(image attached in the original issue)
The genome is otherwise ordinary, and the only unusual thing in it is 7 N nucleotides. It's pretty normal to have N's in an assembly (although I expect they are less useful in a simulation context). Is the intolerance of N nucleotides specific to Pseudoseq, or does it come from FASTX/BioSequences?
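
For anyone wanting to check their own assembly before loading it, here is a minimal sketch of how the non-ACGT symbols can be counted per record. It is written in plain Python rather than Julia so that it does not rely on any Pseudoseq, FASTX, or BioSequences API; the function name `non_acgt_symbols` and the example records are hypothetical and only for illustration.

```python
from collections import Counter


def non_acgt_symbols(fasta_text):
    """Map each record name to a Counter of its symbols other than A, C, G, T.

    Records that contain only A, C, G and T are left out of the result.
    """
    report = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            report[name] = Counter()
        elif name is not None:
            # Case-insensitive: soft-masked (lowercase) bases count as ACGT too.
            report[name].update(ch for ch in line.upper() if ch not in "ACGT")
    return {rec: counts for rec, counts in report.items() if counts}


# Hypothetical two-record input; only the first record contains N.
example = ">chr2L test\nACGTNNAC\nGGNT\n>chr3R\nACGT\n"
print(non_acgt_symbols(example))
```

Running something like this over the real file would show which records hold the N's, which may be more informative than blanket-replacing them with A.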

pdimens changed the title from "Cannot encode fasta to BioSequences.DNAAlphabet" to "Cannot encode fasta to BioSequences.DNAAlphabet when N present" on Nov 1, 2023