Preprocessing Script Tutorial

Marcus Fedarko edited this page Jan 7, 2018 · 4 revisions

This tutorial walks you through converting an assembly graph file into a SQLite3 database file that can be visualized in MetagenomeScope's viewer interface application. To do this, you will use MetagenomeScope's preprocessing script, a command-line tool.

Installing the script

Full instructions for installing the preprocessing script are given in INSTALLING.txt.

Running the script

Sample data

The preprocessing script should work with any assembly graph in GFA, GML, or LastGraph format. If you don't have any assembly graph data readily available, we've provided a sample test file (in GML format) for reference. (The same assembly graph is also used in the viewer interface tutorial.)

Commands

TODO: The installation location of collate.py should be covered in INSTALLING.txt or in the installation section above in this tutorial. The path used in the command here should match it.

Our objective is to convert the given assembly graph file into a database file ready for visualization in the viewer interface. The simplest way to do this is to run the following command:

python graph_collator/collate.py -i sample.gml -o sample

This will create a database file sample.db in the current working directory.
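
To confirm that the output really is a readable SQLite3 database, you can open it with Python's standard sqlite3 module. The following is a minimal sketch, not part of MetagenomeScope itself: the helper name list_tables is hypothetical, and it makes no assumptions about which tables the preprocessing script creates. It only lists whatever tables are present.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in the SQLite database at db_path.

    Hypothetical helper for checking the preprocessing script's output;
    it does not assume any particular MetagenomeScope table names.
    """
    # sqlite3.connect() would silently create an empty file if the path
    # did not exist, so check for the file first.
    if not Path(db_path).is_file():
        raise FileNotFoundError("No database file at {}".format(db_path))
    conn = sqlite3.connect(str(db_path))
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling list_tables("sample.db") from the directory where you ran the command should return a non-empty list if the conversion succeeded.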

The preprocessing script accepts a number of other arguments that change its behavior. See the Preprocessing Script Settings page for more information on these settings.