Output files
The goal of the allele calling feature in taranis is to collect as much information as possible from the sample files and the schema.
For that reason, running taranis creates a large number of files, grouped into 3 main folders.
The alignments folder groups the matching alignment files. Matching alignment information is generated each time blastn is executed for a core gene against the sample file and the result is neither an exact match nor locus not found (LNF).
The files follow this naming convention to make them easy to identify: match_alignment_<core_gene_name>_<sample_name>_paired_assembly.txt. Each file contains the heading
Core Gene | Sample Name | Alignment | Sequence |
---|---|---|---|
followed by 3 rows: the alignment sequence of the sample, the schema sequence, and a row in between that marks each position with "|" for a match or a space " " for a mismatch. An example of a matching alignment is:
Core Gene | Sample Name | Alignment | Sequence |
---|---|---|---|
lmo0359 | RA-L2073 | sample | --C--GTAG-- |
lmo0359 | RA-L2073 | match | --!--!--!--!--!--- |
lmo0359 | RA-L2073 | schema | GCAGTAGG |
Note that the matching alignment file is a tab-separated file, but the "txt" extension is used so that the match/mismatch characters stay in the right position.
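As a rough illustration of the naming convention and the tab-separated layout described above, the following Python sketch builds the expected file name and reads the three rows into a dictionary. The function names are hypothetical (they are not part of taranis), the column order is assumed to match the heading shown above, and the example content is made up to imitate the layout.

```python
import csv
import io


def match_alignment_filename(core_gene, sample_name):
    """Build the file name described by the naming convention above."""
    return f"match_alignment_{core_gene}_{sample_name}_paired_assembly.txt"


def read_match_alignment(handle):
    """Return {row label: sequence} from a tab-separated alignment file.

    Assumes the first line is the heading and that the columns are
    Core Gene, Sample Name, Alignment, Sequence, in that order.
    """
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the heading line
    rows = {}
    for fields in reader:
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        # Keep the sequence untouched: spaces in the match row are meaningful.
        rows[fields[2].strip()] = fields[3]
    return rows


# Made-up content that only imitates the layout; not real taranis output.
example = (
    "Core Gene\tSample Name\tAlignment\tSequence\n"
    "lmo0359\tRA-L2073\tsample\tGCAGTCGG\n"
    "lmo0359\tRA-L2073\tmatch\t||||| ||\n"
    "lmo0359\tRA-L2073\tschema\tGCAGTAGG\n"
)
alignment = read_match_alignment(io.StringIO(example))
mismatches = alignment["match"].count(" ")  # spaces mark mismatching positions
```

To read a real file, pass an open file object instead of the `io.StringIO` wrapper, for example `open(path, newline="")`.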
The proteins folder groups the files with the coding sequences translated to protein. The protein information is also generated when blastn is executed for a core gene against the sample file and the result is neither an exact match nor locus not found (LNF).
The files follow this naming convention to make them easy to identify: protein_<core_gene_name>_<sample_name>_paired_assembly.txt. Each file contains the heading
Core Gene | Sample Name | Protein in | Protein Sequence |
---|---|---|---|
followed by 3 rows: the protein sequence of the sample, the protein sequence of the schema, and a row in between that marks each position with "|" for a match or a space " " for a mismatch. An example of a protein file is:
Core Gene | Sample Name | Protein in | Protein Sequence |
---|---|---|---|
lmo0359 | RA-L2073 | sample | LTAVAIGTLAG |
lmo0359 | RA-L2073 | match | --------------!----- |
lmo0359 | RA-L2073 | schema | MLYTMKDLLA |
The file has the "txt" extension so that it can be opened in a text editor, which keeps the rows of the alignment lined up.
The graphic folder contains plots with statistics about the allele calling.
In addition to these folders, the following files are created under the output folder:
- deletions.tsv
- inferred_alleles.tsv
- insertions.tsv
- matching_contings.tsv
- paralog.tsv
- plot.tsv
- result.tsv
- snp.tsv
- summary_result.tsv
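
A quick way to confirm that a run produced the files listed above is to check for them by name. The sketch below is only an illustration: the helper `missing_output_files` is hypothetical (not part of taranis), and it assumes the files sit directly under the output folder with exactly the names listed.

```python
from pathlib import Path

# File names copied from the list above.
EXPECTED_FILES = [
    "deletions.tsv",
    "inferred_alleles.tsv",
    "insertions.tsv",
    "matching_contings.tsv",
    "paralog.tsv",
    "plot.tsv",
    "result.tsv",
    "snp.tsv",
    "summary_result.tsv",
]


def missing_output_files(output_dir):
    """Return the expected file names that are not present in output_dir.

    Assumption: the files are located directly under output_dir.
    """
    output_dir = Path(output_dir)
    return [name for name in EXPECTED_FILES if not (output_dir / name).is_file()]
```

Calling `missing_output_files("path/to/output")` returns an empty list when every listed file is present.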