sharkLoc/rust-in-bioinformatics

rust in bioinformatics

A collection of genomics software tools written in Rust

index section

bam
  • alignoth : Creating alignment plots from bam files
  • bamrescue : Utility to check Binary Sequence Alignment / Map (BAM) files for corruption and repair them
  • best : Bam Error Stats Tool (best): analysis of error types in aligned reads
  • modkit : A bioinformatics tool for working with modified bases
  • mapAD : An aDNA aware short-read mapper
  • perbase : Per-base per-nucleotide depth analysis
  • rustybam : bioinformatics toolkit in rust
csv
  • csview : 📠 Pretty and fast csv viewer for cli with cjk/emoji support
  • csvlens : csvlens is a command line CSV file viewer. It is like less but made for CSV.
  • madato : Markdown Cmd Line, Rust and JS library for Excel to Markdown Tables
  • tabiew : A lightweight TUI app to view and query CSV files
  • tv : 📺(tv) Tidy Viewer is a cross-platform CLI csv pretty printer that uses column styling to maximize viewer enjoyment.
  • xan : The CSV magician
  • xsv : A fast CSV command line toolkit written in Rust.  
  • xtab : CSV command line utilities
dna
  • fakit : fakit: a simple program for fasta file manipulation
  • filterx : process any file in tabular format. Fasta/fastq/GTF/GFF/VCF/SAM/BED
  • fq : Command line utility for manipulating Illumina-generated FASTQ files.
  • gsearch : Approximate nearest neighbour search for microbial genomes based on hash metric
  • Hyper-Gen : HyGen: Compact and Efficient Genome Sketching using Hyperdimensional Vectors
  • kfc : KFC (K-mer Fast Counter) is a fast and space-efficient k-mer counter based on hyper-k-mers.
  • ngs : Command line utility for working with next-generation sequencing files.
  • nail : Nail is an Alignment Inference tooL
  • palindrome-finder : A bioinformatics tool written in Rust to find palindromic sequences in DNA
  • poasta : Fast and exact gap-affine partial order alignment
  • rust-bio-tools : A set of command line utilities based on Rust-Bio.
  • skc : Shared k-mer content between two genomes
  • sketchy : Genomic neighbor typing of bacterial pathogens using MinHash 🐀
  • tidk : Identify and find telomeres, or telomeric repeats in a genome.
  • xgt : Efficient and fast querying and parsing of GTDB's data
fastq
  • fasten : 👷 Fasten toolkit, for streaming operations on fastq files
  • faster : A (very) fast program for getting statistics about a fastq file, the way I need them, written in Rust
  • fqgrep : Grep for FASTQ files
  • fqkit : 🦀 Fqkit: A simple and cross-platform program for fastq file manipulation  
  • fqtk : Fast FASTQ sample demultiplexing in Rust.
  • rasusa : Randomly subsample sequencing reads

format

  • bigtools : A high-performance BigWig and BigBed library in Rust
  • d4tools : The D4 Quantitative Data Format
  • gfa2bin : Convert various graph-related data to PLINK file. In addition, we offer multiple commands for filtering or modifying the generated PLINK files.
  • gia : gia: Genomic Interval Arithmetic
  • granges : A Rust library and command line tool for working with genomic ranges and their data.
  • intspan : Command line tools for IntSpan related bioinformatics operations
  • recmap : A command line tool and Rust library for working with recombination maps.
  • thirdkind : Drawing reconciled phylogenetic trees allowing 1, 2 or 3 reconciliation levels
gff3
  • atg : A Rust library and CLI tool to handle genomic transcripts
  • gffkit : a simple program for gff3 file manipulation
longreads
  • Autocycler : A tool for generating consensus long-read assemblies for bacterial genomes
  • chopper : Rust implementation of NanoFilt+NanoLyse, both originally written in Python. This tool, intended for long read sequencing such as PacBio or ONT, filters and trims a fastq file.
  • herro : HERRO is a highly-accurate, haplotype-aware, deep-learning tool for error correction of Nanopore R10.4.1 or R9.4.1 reads (read length of >= 10 kbps is recommended).
  • HiPhase : Small variant, structural variant, and short tandem repeat phasing tool for PacBio HiFi reads
  • longshot : diploid SNV caller for error-prone reads
  • lrge : Genome size estimation from long read overlaps
  • Polypolish : a short-read polishing tool for long-read assemblies
  • nextpolish2 : Repeat-aware polishing genomes assembled using HiFi long reads
  • nanoq : Minimal but speedy quality control for nanopore reads in Rust 🐻
  • smrest : Tumour-only somatic mutation calling using long reads
  • trgt : Tandem repeat genotyping and visualization from PacBio HiFi data
metagenomics
  • coverm : Read coverage calculator for metagenomics
  • kun_peng : Kun-peng: an ultra-fast, low-memory footprint and accurate taxonomy classifier for all
  • kmertools : kmer based feature extraction tool for bioinformatics, metagenomics, AI/ML and more
  • nohuman : Remove human reads from a sequencing run
  • skani : Fast, robust ANI and aligned fraction for (metagenomic) genomes and contigs.
  • sourmash : Quickly search, compare, and analyze genomic and metagenomic data sets.
  • sylph : ultrafast genome querying and taxonomic profiling for metagenomic samples by abundance-corrected minhash.
  • vircov : Viral genome coverage evaluation for metagenomic diagnostics 🩸
pangenomics
  • impg : implicit pangenome graph
  • panacus : Panacus is a tool for computing statistics for GFA-formatted pangenome graphs
phylogenomics
  • nextclade : Viral genome alignment, mutation calling, clade assignment, quality checks and phylogenetic placement
  • unicore : Universal and efficient core gene phylogeny with Foldseek and ProstT5
  • segul : An ultrafast and memory efficient tool for phylogenomics
proteomics
  • align-cli : A CLI for pairwise alignment of sequences, using both normal and mass based alignment.
  • foldmason : Foldmason builds multiple alignments of large structure sets.
  • sage : Proteomics search & quantification so fast that it feels like magic
rna
  • oarfish : long read RNA-seq quantification
  • rnapkin : drawing RNA secondary structure with style; instantly
  • R2Dtool : R2Dtool is a set of genomics utilities for handling, integrating, and visualising isoform-mapped RNA feature data.
  • squab : Alignment-based gene expression quantification
singlecell
  • alevin-fry : 🐟 🔬🦀 alevin-fry is an efficient and flexible tool for processing single-cell sequencing data, currently focused on single-cell transcriptomics and feature barcoding.
  • cellranger : 10x Genomics Single Cell Analysis
  • precellar : Single-cell genomics preprocessing package
  • SnapATAC2 : Single-cell epigenomics analysis tools
slurm
  • ssubmit : Submit slurm sbatch jobs without the need to create a script
vcf
  • echtvar : using all the bits for echt rapid variant annotation and filtering
  • mehari : VEP-like tool for sequence ontology and HGVS annotation of VCF files
  • vcf2parquet : Convert VCF to Parquet
  • vcfexpress : expressions on VCFs
other
  • htsget-rs : A server implementation of the htsget protocol for bioinformatics in Rust
  • ibu : a rust library for high throughput binary encoding of genomic sequences
  • scidataflow : Command line scientific data management tool
  • sufr : Parallel Construction of Suffix Arrays in Rust
