Hello again. I found a rather bizarre case in my testing and I'm a bit stuck now. I'm trying to run anvi-cluster-contigs with MaxBin2, and it's failing due to a single missing coverage value. I confirmed that the /tmp files indeed are missing that value, and it's precisely the last one (last contig), which is very suspicious. Strangely enough, I only have this issue with one out of three samples.
In the temporary folder, sequence_contigs.fa has 34,990 sequences, but contig_coverages.txt only has values for 34,989 contigs (plus the header line). The one missing is the very last one. After a few minutes, the command returns:
✖ anvi-cluster-contigs encountered an error after 0:11:43.825390
Config Error: Some critical output files are missing. Please take a look at the log file:
/tmp/tmp87wnge7z/logs.txt
And the logs.txt file has:
# DATE: 23 Jul 24 20:37:22
# CMD LINE: run_MaxBin.pl -contig /tmp/tmp87wnge7z/sequence_contigs.fa -abund /tmp/tmp87wnge7z/contig_coverages.txt -out /tmp/tmp87wnge7z/MAXBIN_ -thread 1
MaxBin 2.2.7
Input contig: /tmp/tmp87wnge7z/sequence_contigs.fa
Located abundance file [/tmp/tmp87wnge7z/contig_coverages.txt]
out header: /tmp/tmp87wnge7z/MAXBIN_
Thread: 1
Searching against 107 marker genes to find starting seed contigs for [/tmp/tmp87wnge7z/sequence_contigs.fa]...
Running FragGeneScan....
Running HMMER hmmsearch....
Done data collection. Running MaxBin...
Command: /gpfs/gpfs1/scratch/c9881009/local/.conda/envs/anvio-dev/opt/MaxBin-2.2.7/src/MaxBin -fasta /tmp/tmp87wnge7z/MAXBIN_.contig.tmp -abund /tmp/tmp87wnge7z/MAXBIN_.contig.tmp.abund1 -seed /tmp/tmp87wnge7z/MAXBIN_.seed -out /tmp/tmp87wnge7z/MAXBIN_ -min_contig_length 1000
Failed to get Abundance information for contig [d071_br05_S7_000000034990] in file [/tmp/tmp87wnge7z/MAXBIN_.contig.tmp.abund1]
Error encountered while running core MaxBin program. Error recorded in /tmp/tmp87wnge7z/MAXBIN_.log.
Program Stop.
I checked other drivers (e.g., CONCOCT or MetaBat2), and the same issue is present (i.e., one missing coverage value), but they don't fail. My guess is that those programs silently ignore missing values.
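For anyone who wants to pin down which contigs lack a coverage row, here is a minimal sketch that compares the two temporary files. It assumes contig_coverages.txt is tab-delimited with one header line and the contig name in the first column; that layout is an assumption, and the function names are made up for illustration, not part of anvi'o.

```python
def fasta_ids(path):
    """Return contig names from the FASTA deflines, in file order."""
    ids = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                # Keep only the first whitespace-delimited token of the defline.
                tokens = line[1:].split()
                if tokens:
                    ids.append(tokens[0])
    return ids


def coverage_ids(path, has_header=True):
    """Return the set of contig names found in the first column.

    Assumption: the file is tab-delimited with the contig name first.
    """
    names = set()
    with open(path) as handle:
        if has_header:
            next(handle, None)  # skip the header line, if any
        for line in handle:
            line = line.rstrip("\n")
            if line:
                names.add(line.split("\t")[0])
    return names


def missing_coverage(fasta_path, coverage_path, has_header=True):
    """List contigs present in the FASTA but absent from the coverage file."""
    covered = coverage_ids(coverage_path, has_header)
    return [name for name in fasta_ids(fasta_path) if name not in covered]
```

Calling missing_coverage("/tmp/tmp87wnge7z/sequence_contigs.fa", "/tmp/tmp87wnge7z/contig_coverages.txt") should list every contig without a coverage value, which also shows whether the last contig is really the only one affected.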
The files are relatively big, so I'm making them available only temporarily here (apologies if you find this issue years from now and want to reproduce it!):
03_CONTIGS/d071_br05_S7-contigs.db (927M)
06_MERGED/d071_br05_S7/PROFILE.db (1.6G)
Thank you!
Miguel.
Some (potentially) important additional information: I just noticed that the "missing coverage" is for a contig shorter than the limit I had set. I ran anvi-profile with --min-contig-length 5000, but this contig is 4,167 bp in length, so it shouldn't even be here. Any ideas on how it made it through?
Apologies for the frustration here, @lmrodriguezr. My general response for #2309 sadly applies here as well. But there is certainly a bug here given you mentioned this:
> Some (potentially) important additional information: I just noticed that the "missing coverage" is for a contig shorter than the limit I had set. I ran anvi-profile with --min-contig-length 5000, but this contig is 4,167 bp in length, so it shouldn't even be here. Any ideas on how it made it through?
I think the problem here likely stems from anvi-cluster-contigs reading from the contigs-db rather than the profile-db to figure out which contigs to report. That will always lead to an issue, since you may have more contigs in the contigs-db than in the linked profile-db as a function of flags like --min-contig-length, which exclude some contigs from the profiled results.
One quick question: I presume there are many more contigs in the contigs-db that are shorter than 4,167, right? Because if that is the case, perhaps the bug is not coming from where I think it is :)
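A rough way to answer that question is to query the contigs-db directly, since it is an SQLite file. The sketch below assumes contig lengths live in a table named contigs_basic_info with columns contig and length; treat those names as assumptions and adjust them if the schema differs. The helper name is hypothetical.

```python
import sqlite3
from contextlib import closing


def contigs_shorter_than(db_path, cutoff, table="contigs_basic_info"):
    """Return (contig, length) rows with length below `cutoff`.

    Assumption: `table` has `contig` and `length` columns.
    """
    query = (
        f'SELECT contig, length FROM "{table}" '
        "WHERE length < ? ORDER BY contig"
    )
    # closing() makes sure the connection is released after the query.
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute(query, (cutoff,)).fetchall()
```

For example, len(contigs_shorter_than("03_CONTIGS/d071_br05_S7-contigs.db", 5000)) would count the contigs below the --min-contig-length used for profiling. If that number is large while only one contig is missing a coverage value, the bug is probably somewhere other than the contigs-db versus profile-db mismatch.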
anvi'o version
System info
OS: Rocky Linux 8.6 (Green Obsidian). Installed using the instructions for developer version.
Detailed description of the issue
I'm running a metagenome workflow using three samples, and only one of them causes the issue. The specific step is anvi-cluster-contigs with the MaxBin2 driver.