Genomics

Pilon

Pilon is a software tool that can be used to (1) automatically improve draft assemblies and (2) find variation among strains, including large event detection.

java -jar /opt/pilon-1.22.jar
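A typical polishing run might look like the sketch below. It assumes a draft assembly in FASTA format and a coordinate-sorted, indexed BAM file of paired-end reads mapped to that draft. The file names, the output prefix and the 8 GB Java heap size are placeholders, not settings taken from this server. The command is only printed unless all input files are present.

```shell
# Hypothetical inputs: replace with your own draft assembly and alignments.
PILON_JAR=/opt/pilon-1.22.jar
DRAFT=draft.fasta
BAM=reads_sorted.bam   # sorted and indexed BAM of reads mapped to DRAFT

# --genome: assembly to polish; --frags: paired-end alignments;
# --output: prefix for result files; --changes: also list every edit made.
# The paths must not contain spaces, because the string is word-split when run.
PILON_CMD="java -Xmx8G -jar $PILON_JAR --genome $DRAFT --frags $BAM --output polished --changes"

# Run only when every input exists; otherwise just show the command.
if [ -f "$PILON_JAR" ] && [ -f "$DRAFT" ] && [ -f "$BAM" ]; then
    $PILON_CMD
else
    echo "$PILON_CMD"
fi
```

With the prefix above, Pilon writes the corrected assembly to polished.fasta and the list of edits to polished.changes.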

QUAST

version 4.5

QUAST evaluates genome assemblies

quast

CGAL

CGAL is a tool for computing genome assembly likelihoods. It computes the likelihood of reads with respect to the assembly and a statistical model which can be used as a metric for evaluating assemblies.

under construction

SOAPdenovo

Version 2.04

Next generation sequencing reads de novo assembler.

MIRA

version 4.0.0 (the newest version 4.0.2 is not working properly)

MIRA is a multi-pass DNA sequence data assembler/mapper for whole genome and EST/RNASeq projects.

cicuta only

edena

version 3

De novo short-read assembler.

cicuta only

velvet

version 1.2.10

Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454.

cicuta only

GRAbB

GRAbB (Genome Region Assembly by Baiting) is a program designed to assemble selected regions of the genome or transcriptome using reference sequences and NGS data.

grabb

cicuta only

REAPR

version: 1.0.18

REAPR is a tool that evaluates the accuracy of a genome assembly using mapped paired end reads, without the use of a reference genome for comparison. It can be used in any stage of an assembly pipeline to automatically break incorrect scaffolds and flag other errors in an assembly for manual inspection. It reports mis-assemblies and other warnings, and produces a new broken assembly based on the error calls.

reapr

cicuta only

TransDecoder

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.

/opt/TransDecoder-TransDecoder-v5.0.2/TransDecoder.LongOrfs
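TransDecoder is normally run in two steps: first extract long open reading frames, then predict which of them are likely coding regions. The sketch below assumes that TransDecoder.Predict sits in the same directory as TransDecoder.LongOrfs (only the latter path is listed above) and uses a placeholder transcript file name. The commands are only printed unless the input file and the program are present.

```shell
# Hypothetical input: a FASTA file of assembled transcripts.
TD_HOME=/opt/TransDecoder-TransDecoder-v5.0.2
TRANSCRIPTS=transcripts.fasta

# Step 1: extract long open reading frames from the transcripts.
LONGORFS_CMD="$TD_HOME/TransDecoder.LongOrfs -t $TRANSCRIPTS"
# Step 2: predict likely coding regions (location of this script is an assumption).
PREDICT_CMD="$TD_HOME/TransDecoder.Predict -t $TRANSCRIPTS"

# Run only when the input and the program exist; otherwise show the commands.
if [ -f "$TRANSCRIPTS" ] && [ -x "$TD_HOME/TransDecoder.LongOrfs" ]; then
    $LONGORFS_CMD && $PREDICT_CMD
else
    echo "$LONGORFS_CMD"
    echo "$PREDICT_CMD"
fi
```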

mapping

bowtie2

version 2.2.6

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences.

tophat

version 2.1.0

TopHat is a fast splice junction mapper for RNA-Seq reads. Please note that TopHat has entered a low maintenance, low support stage as it is now largely superseded by HISAT2.

bwa

version 0.7.12-r1039

BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome.

gmap

version 2015-12-31

GMAP: a genomic mapping and alignment program for mRNA and EST sequences

aligner tutorial

STAR

version 2.5.0a

STAR: ultrafast universal RNA-seq aligner

STAR

hisat2

version 2.1.0

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of genomes (as well as to a single reference genome).

hisat2 hisat2-align-s hisat2-align-l hisat2-build hisat2-build-s hisat2-build-l hisat2-inspect hisat2-inspect-s hisat2-inspect-l
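As a rough illustration of how these programs fit together, the sketch below builds an index with hisat2-build and then aligns paired-end reads with hisat2. All file names, the index base name and the thread count are placeholders. The commands are only printed unless hisat2 is on the PATH and the input files exist.

```shell
# Hypothetical inputs: a reference genome and one pair of FASTQ files.
GENOME=genome.fa
INDEX=genome_index       # base name for the index files
R1=reads_1.fastq
R2=reads_2.fastq

# Step 1: build the index once per reference.
BUILD_CMD="hisat2-build $GENOME $INDEX"
# Step 2: align paired-end reads (-1/-2) with 4 threads (-p), SAM output (-S).
ALIGN_CMD="hisat2 -p 4 -x $INDEX -1 $R1 -2 $R2 -S aligned.sam"

# Run only when the tool and the inputs are available; otherwise show the commands.
if command -v hisat2 >/dev/null 2>&1 && [ -f "$GENOME" ] && [ -f "$R1" ] && [ -f "$R2" ]; then
    $BUILD_CMD && $ALIGN_CMD
else
    echo "$BUILD_CMD"
    echo "$ALIGN_CMD"
fi
```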

File processing

fastx-toolkit

version 0.0.14

The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.

seqtk

version 1.0-r31

Seqtk is a fast and lightweight tool for processing sequences in the FASTA or FASTQ format.

seqtk

bamtools

version 2.4.0

Bamtools is a command-line toolkit for reading, writing, and manipulating BAM (genome alignment) files.

samtools

Version: 1.6

SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.

samtools cheatsheet
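The sorting, indexing and per-position steps mentioned above could be chained roughly as follows. This is a sketch with placeholder file names, assuming a SAM file produced by one of the aligners listed under mapping and the reference it was aligned to. The commands are only printed unless samtools is on the PATH and the input files exist.

```shell
# Hypothetical inputs: alignments in SAM format and the matching reference.
SAM=aligned.sam
REF=genome.fa
SORTED=aligned.sorted.bam

# Sort by coordinate and write BAM.
SORT_CMD="samtools sort -o $SORTED $SAM"
# Index the sorted BAM for fast random access.
INDEX_CMD="samtools index $SORTED"
# Produce the per-position (pileup) format on standard output.
PILEUP_CMD="samtools mpileup -f $REF $SORTED"

# Run only when the tool and the inputs are available; otherwise show the commands.
if command -v samtools >/dev/null 2>&1 && [ -f "$SAM" ] && [ -f "$REF" ]; then
    $SORT_CMD && $INDEX_CMD && $PILEUP_CMD > aligned.pileup
else
    echo "$SORT_CMD"
    echo "$INDEX_CMD"
    echo "$PILEUP_CMD > aligned.pileup"
fi
```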

bedtools

version 2.25.0

prinseq

PRINSEQ-lite 0.20.4

PRINSEQ will help you to preprocess your genomic or metagenomic sequence data in FASTA or FASTQ format

BBMap

BBMap is a short-read aligner distributed together with other bioinformatics tools.

/opt/BBMap/

bcftools

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF.

Annotation

cicuta only

RepeatModeler

version open-1.0.11

RepeatModeler is a de-novo repeat family identification and modeling package. At the heart of RepeatModeler are two de-novo repeat finding programs ( RECON and RepeatScout ) which employ complementary computational methods for identifying repeat element boundaries and family relationships from sequence data.

RepeatModeler

RepeatMasker

version open-4.0.7

RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns).

RepeatMasker
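The difference between the default masking (repeats replaced by Ns) and soft-masking (repeats in lower case) can be sketched as two invocations. The genome file, the custom repeat library (for example consensus sequences from RepeatModeler), the output directories and the thread count are placeholders. The commands are only printed unless RepeatMasker is on the PATH and the input files exist.

```shell
# Hypothetical inputs: a genome and a custom repeat library in FASTA format.
GENOME=genome.fa
REPEAT_LIB=custom_repeats.fa

# Default behaviour: annotated repeats are replaced by Ns (hard-masking).
HARD_CMD="RepeatMasker -pa 4 -lib $REPEAT_LIB -dir masked_hard $GENOME"
# -xsmall: repeats are written in lower case instead (soft-masking).
SOFT_CMD="RepeatMasker -pa 4 -xsmall -lib $REPEAT_LIB -dir masked_soft $GENOME"

# Run only when the tool and the inputs are available; otherwise show the commands.
if command -v RepeatMasker >/dev/null 2>&1 && [ -f "$GENOME" ] && [ -f "$REPEAT_LIB" ]; then
    $HARD_CMD && $SOFT_CMD
else
    echo "$HARD_CMD"
    echo "$SOFT_CMD"
fi
```

Each output directory then holds the masked sequence (genome.fa.masked) next to the annotation table of detected repeats.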

CIRI

Version: 2.0.6

CIRI: an efficient and unbiased algorithm for de novo circular RNA identification

perl /opt/CIRI_v2.0.6/CIRI2.pl

STARChip

This software is designed to take the chimeric output from the STAR alignment tool and discover high confidence fusions and circular RNA in the data.

Anna Karnkowska

Department of Molecular Phylogenetics and Evolution repository
