This is a time-saving automation tool for finding potential splice sites in sequence data files (such as SAM or FASTQ files). Given a transcript and a position (or a delimited file containing a transcript and position on each row), it looks up the transcript in the December 2013 archive of the Ensembl Genome Browser and retrieves its cDNA FASTA sequence. The sequence found at the given position is then searched for in the sequence data files, and any matching lines are written to a new file.
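The search step can be illustrated with a minimal sketch. This is not the tool's actual code: the function names, the `flank` parameter (how many bases to keep on each side of the position), and the sample data are all assumptions made for illustration.

```javascript
// Hypothetical sketch: names and the window size are assumptions,
// not taken from the splice-site-search source.

// Return the stretch of cDNA centred on a 1-based position.
// `flank` is the assumed number of bases kept on each side.
function sequenceAtPosition(cdna, position, flank) {
  const index = position - 1; // convert to a 0-based index
  const start = Math.max(0, index - flank);
  const end = Math.min(cdna.length, index + flank + 1);
  return cdna.slice(start, end);
}

// Keep only the lines of a sequence data file that contain the query.
function matchingLines(text, query) {
  return text.split(/\r?\n/).filter((line) => line.includes(query));
}

// Made-up cDNA and reads, for illustration only.
const cdna = 'ATGGCCTTAGGCATCCGATAG';
const query = sequenceAtPosition(cdna, 10, 3);
const reads = ['r1\tTTAGGCA', 'r2\tCCCCCCC', 'r3\tGGTTAGGCATT'].join('\n');

console.log(query); // TTAGGCA
console.log(matchingLines(reads, query)); // the r1 and r3 lines
```

A plain substring test is used here because it is the simplest way to show the idea; the real program may match differently (for example, allowing for reverse complements).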
This application is written in JavaScript and requires Node.js to run. If Node.js isn't installed, the easiest thing to do is to download the binaries for your operating system and place them somewhere in your path.
With Node.js installed you can either clone this repository, or download and extract the ZIP:
$ git clone https://github.com/dsusco/splice-site-search.git
Next, install the module globally with:
$ npm install -g splice-site-search/
If you run into problems here, you might need to run the command with sudo. If you don't have sudo access, you can either install Node.js yourself (placing its bin directory somewhere in your path) or run the program with node splice-site-search/index.js instead.
To confirm that the program is working, run the following to display the help information:
$ splice-site-search -h
The program can be run in two ways:
This searches the files given for the sequence found for the given transcript and position.
$ splice-site-search [options] -t <transcript> -p <integer> <files...>
This searches the files given for the sequence found for each of the given transcripts and positions in the potential splice sites file.
$ splice-site-search [options] -s <file> <files...>
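As a rough illustration of what reading a potential splice sites file might involve, the sketch below parses delimited rows into transcript/position pairs. The column order, the tab delimiter, the `parseSpliceSites` name, and the placeholder transcript IDs are assumptions; consult the command's help for the format the program actually expects.

```javascript
// Hypothetical sketch: column order (transcript, position) and the
// tab delimiter are assumptions, not the tool's documented format.
function parseSpliceSites(text, delimiter = '\t') {
  return text
    .split(/\r?\n/)
    .filter((row) => row.trim() !== '') // skip blank rows
    .map((row) => {
      const [transcript, position] = row.split(delimiter);
      return {
        transcript: transcript.trim(),
        position: parseInt(position, 10), // NaN if the column is missing
      };
    });
}

// Placeholder IDs, for illustration only.
const sample = 'ENST00000000001\t120\nENST00000000002\t45\n';
console.log(parseSpliceSites(sample));
```

Each resulting pair corresponds to one run of the single transcript and position form of the command.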
Additional options are described in the command's help information:
$ splice-site-search -h