Quality control and filtering data
- clean_reads clean_reads.
- condetri condetri.
- cutadapt cutadapt removes adapter
sequences from next-generation sequencing data (Illumina, SOLiD and 454).
It is especially useful when the read length of the sequencing machine is
longer than the sequenced molecule, as is the case for microRNA.
- FastQC FastQC
is a quality control tool for high-throughput sequence data, developed in Java at the Babraham Institute. Data can be
imported from FastQ, BAM or SAM files. The tool
flags problematic areas and provides summary graphs and
tables for rapid assessment of the data. Results are presented as permanent HTML
reports. FastQC can be run as a stand-alone application or integrated
into a larger pipeline. See also seqanswers/FastQC.
- FASTX FASTX Toolkit is a
set of command-line tools for manipulating reads in FASTA or FASTQ
files. These commands make it possible to preprocess the files before mapping
with tools like Bowtie. Supported tasks include conversion from
FASTQ to FASTA format, reporting quality statistics, removing
sequencing adapters, filtering and trimming sequences based on quality, and
DNA/RNA conversion.
- Flexbar Flexbar removes
adapter sequences and provides trimming and filtering features.
- FreClu FreClu
improves overall alignment accuracy by performing sequencing-error correction
through trimming of short reads, based on a clustering methodology.
- HTSeq HTSeq.
- htSeqTools htSeqTools
is a Bioconductor package for quality control, data processing
and visualization. htSeqTools makes it possible to visualize sample
correlations, remove over-amplification artifacts, assess enrichment
efficiency, correct strand bias and visualize hits.
- PRINSEQ PRINSEQ
generates summary statistics for sequence data: sequence length, GC
content, quality scores, n-plicates, complexity, tag sequences, poly-A/T
tails and odds ratios.
- qrqc qrqc
is a Bioconductor package for quick read quality control.
- RNA-SeQC RNA-SeQC
is a tool with applications in experiment design, process optimization and
quality control before computational analysis. It provides three
types of quality control: read counts (such as duplicate reads, mapped
reads and mapped unique reads, rRNA reads, transcript-annotated reads,
strand specificity), coverage (such as mean coverage, mean coefficient of
variation, 5’/3’ coverage, gaps in coverage, GC bias) and expression
correlation (the tool provides RPKM-based estimation of expression
levels). RNA-SeQC is implemented in Java and requires no installation;
it can also be run through the GenePattern web interface. The input can be
one or more BAM files. HTML reports are generated as output.
- RSeQC RSeQC analyzes diverse
aspects of RNA-Seq experiments: sequence quality, sequencing depth, strand
specificity, GC bias, read distribution over the genome structure and
coverage uniformity. The input can be SAM, BAM, FASTA or BED files, or a
chromosome size file (a two-column, plain-text file). Results can be
visualized in genome browsers such as UCSC, IGB and IGV, or
with R scripts.
- Sabre sabre.
- SAMStat SAMStat identifies problems
and reports several statistics at different phases of the process. This
tool evaluates unmapped, poorly and accurately mapped sequences
independently to infer possible causes of poor mapping.
- Scythe scythe.
- SEECER seecer SEECER is a sequencing error
correction algorithm for RNA-seq data sets. It takes as input the raw read
sequences produced by a next-generation sequencing platform, such as machines
from Illumina or Roche. SEECER removes mismatch and indel errors from the
raw reads, reportedly improving downstream analysis of the data.
This matters most when the RNA-Seq data is used to produce a de novo transcriptome
assembly, where running SEECER can substantially affect the quality of the
assembly.
- Sickle Sickle.
- ShortRead ShortRead
is a package for the R/Bioconductor environment
that allows input, manipulation, quality assessment and output of
next-generation sequencing data. It supports data manipulation
such as filtering to remove reads based on predefined
criteria. ShortRead can be complemented with several Bioconductor
packages for further analysis and visualization (Biostrings, BSgenome, IRanges, and so on). See also seqanswers/ShortRead.
- Trimmomatic Trimmomatic
performs trimming for Illumina platforms and works with FASTQ reads
(single or paired-end). Its tasks include cutting adapters, cutting
bases at optional positions based on quality thresholds, cutting reads to a
specific length and converting quality scores to Phred-33/64.
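The trimming tasks described in this list (adapter clipping, quality-based cutting, Phred-33/64 conversion) can be illustrated with a minimal Python sketch. The function names, the default threshold of 20 and the exact-match adapter search are assumptions made for illustration; they are not the implementation of cutadapt, Trimmomatic or any other tool listed here.

```python
def phred64_to_phred33(quality: str) -> str:
    """Re-encode a Phred+64 quality string as Phred+33."""
    return "".join(chr(ord(c) - 64 + 33) for c in quality)


def trim_3prime(seq: str, quality: str, min_q: int = 20, offset: int = 33):
    """Drop bases from the 3' end while their quality is below min_q."""
    end = len(seq)
    while end > 0 and ord(quality[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], quality[:end]


def clip_adapter(seq: str, quality: str, adapter: str, min_overlap: int = 3):
    """Remove an exact 3' adapter match.

    The adapter may occur in full inside the read, or only its first
    bases (at least min_overlap of them) may hang off the read end.
    """
    pos = seq.find(adapter)
    if pos == -1:
        # Look for a partial adapter at the very end of the read,
        # trying the longest possible prefix first.
        for k in range(min(len(adapter) - 1, len(seq)), min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                pos = len(seq) - k
                break
    if pos == -1:
        return seq, quality
    return seq[:pos], quality[:pos]
```

Real trimmers also tolerate mismatches in the adapter and use sliding-window quality criteria; this sketch only handles exact matches and a simple per-base threshold.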
Pre-processing data
- DeconRNASeq DeconRNASeq
is an R package for deconvolution of heterogeneous tissues based on mRNA-Seq
data.
- FastQ Screen FastQ Screen screens FASTQ-format sequences against a set of databases to
confirm that the sequences contain what is expected (such as species
content, adapters, vectors, etc.).
- FLASH FLASH is a read
pre-processing tool. FLASH combines paired-end reads which overlap and
converts them to single long reads.
- IDCheck IDCheck.
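The idea behind combining overlapping paired-end reads into single long reads, as described for FLASH, can be sketched as follows. This is a simplified illustration, not FLASH's algorithm: it accepts the longest overlap that passes a mismatch threshold, and the names `merge_pair`, `min_overlap` and `max_mismatch_frac` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1: str, read2: str, min_overlap: int = 10,
               max_mismatch_frac: float = 0.1):
    """Merge a read pair whose ends overlap, or return None.

    read2 is reverse-complemented so both reads are on the same strand,
    then the suffix of read1 is compared with the prefix of the mate.
    """
    mate = reverse_complement(read2)
    # Try the longest candidate overlap first.
    for olen in range(min(len(read1), len(mate)), min_overlap - 1, -1):
        tail, head = read1[-olen:], mate[:olen]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * olen:
            return read1 + mate[olen:]
    return None
```

For a 30-base fragment sequenced as two 20-base reads, the two reads share a 10-base overlap and `merge_pair` reconstructs the original fragment.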
Alignment Tools
Short (Unspliced) aligners
- BFAST BFAST
aligns short reads to reference sequences and is particularly
sensitive to errors, SNPs, insertions and deletions. BFAST uses
the Smith-Waterman algorithm. See also seqanswers/BFAST.
- Bowtie Bowtie is a
fast short-read aligner using an algorithm based on the Burrows-Wheeler transform and the FM-index.
Bowtie tolerates a small number of mismatches. See also seqanswers/Bowtie.
- Burrows-Wheeler Aligner (BWA) BWA
implements two algorithms, mainly based on the Burrows–Wheeler transform. The first
algorithm is used for reads with a low error rate (<3%). The second
algorithm was designed to handle more errors and implements a combined
strategy: the Burrows–Wheeler transform and the Smith-Waterman
method. BWA allows mismatches and small gaps (insertions and deletions).
The output is presented in SAM format. See also seqanswers/BWA.
- Short Oligonucleotide Analysis Package (SOAP) SOAP.
- GNUMAP GNUMAP performs alignment
using a probabilistic Needleman-Wunsch algorithm. This tool is able
to handle alignment in repetitive regions of a genome without losing
information. The program's output is designed to allow easy
visualization with available software.
- Maq Maq first aligns
reads to reference sequences and then performs a consensus stage. In the
first stage it performs only ungapped alignment and tolerates up to 3
mismatches. See also seqanswers/Maq.
- Mosaik Mosaik. Mosaik
is able to align reads containing short gaps using the Smith-Waterman algorithm, making it well suited to
handling SNPs, insertions and deletions. See also seqanswers/Mosaik.
- NovoAlign (commercial) NovoAlign
is a short-read aligner for the Illumina platform based on the Needleman-Wunsch algorithm. NovoAlign tolerates
up to 8 mismatches per read, and up to 7 bp of indels. It is able to deal
with bisulphite data. Output is in SAM format. See also seqanswers/NovoAlign.
- RazerS RazerS. See also seqanswers/RazerS.
- SEAL SEAL uses a MapReduce
model to distribute computation across clusters of computers. SEAL uses
BWA to perform alignment and Picard MarkDuplicates
to detect and remove duplicate reads. See also seqanswers/SEAL.
- SeqMap SeqMap. See
also seqanswers/SeqMap.
- SHRiMP SHRiMP employs two
techniques to align short reads. First, a q-gram
filtering technique based on multiple seeds identifies candidate regions.
Second, these regions are investigated in detail using the Smith-Waterman
algorithm. See also seqanswers/SHRiMP.
- SMALT Smalt.
- Stampy Stampy combines the
sensitivity of hash tables and the speed of BWA. Stampy is designed to
align reads containing sequence variation such as insertions and
deletions. It can handle reads of up to 4500 bases and presents the
output in SAM format. See also seqanswers/Stampy.
- ZOOM (commercial) ZOOM is
a short-read aligner for the Illumina/Solexa 1G platform. ZOOM uses an extended
spaced-seed methodology, building hash tables for the reads, and tolerates
mismatches, insertions and deletions.
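Several aligners in this list (BFAST, BWA, Mosaik, SHRiMP) rely on the Smith-Waterman algorithm for gapped local alignment. A minimal sketch of its scoring recurrence is shown below. The scoring values (match 2, mismatch -1, linear gap -2) are illustrative assumptions; the tools above use their own scoring schemes, affine gaps and heavily optimized implementations.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   gap: int = -2) -> int:
    """Return the best local alignment score between a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming matrix, since only the score is needed.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

With these scores, aligning `ACGTACGT` against `ACGTTACGT` opens one gap to skip the extra `T`, which scores higher than any ungapped alignment of the two strings.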
Spliced aligners
Aligners based on known splice junctions (annotation-guided aligners)
- Erange Erange is a tool for
alignment and data quantification of mammalian transcriptomes. See also seqanswers/Erange.
- IsoformEx IsoformEx.
- MapAL MapAL.
- OSA OSA.
- RNA-MATE RNA-MATE is a
computational pipeline for aligning data from the Applied Biosystems SOLiD system. It provides
quality control and trimming of reads. The genome
alignments are performed using mapreads, and the splice junctions
are identified based on a library of known exon-junction sequences. This
tool allows visualization of alignments and tag counting. See also seqanswers/RNA-MATE.
- RUM RUM performs alignment
with a pipeline based on Bowtie and BLAT that can handle reads spanning splice junctions.
The pipeline starts by aligning reads against a
genome and a transcriptome database with Bowtie. The next step
aligns the unmapped sequences to the reference genome
using BLAT. In the final step all alignments are merged to get the final
alignment. The input files can be in FASTA or FASTQ format. The output is
presented in RUM and SAM format.
- RNASEQR RNASEQR. See
also seqanswers/RNASEQR.
- SAMMate SAMMate. See also seqanswers/SAMMate.
- SpliceSeq SpliceSeq.
- X-Mate X-Mate.
De novo Splice Aligners
- ABMapper ABMapper. See
also seqanswers/ABMapper.
- ContextMap ContextMap
was developed to overcome some limitations of other mapping approaches,
such as the resolution of ambiguities. The central idea of this tool is to
consider reads in their gene expression context, thereby improving alignment
accuracy. ContextMap can be used as a stand-alone program or on top of
mappers that produce a SAM file as output (e.g. TopHat or MapSplice). In
stand-alone mode it aligns reads to a genome, to a transcriptome database or
to both.
- CRAC CRAC proposes a novel way of
analyzing reads that integrates genomic locations and local coverage, and
detects candidate mutations, indels, splice or fusion junctions in each
single read. Notably, CRAC's predictive performance improves when it is
supplied with e.g. 200 nt reads, and it should fit future needs of read analysis.
- GSNAP GSNAP. See also seqanswers/GSNAP.
- HMMSplicer HMMSplicer
can identify canonical and non-canonical splice junctions in short reads.
First, unspliced reads are removed with Bowtie. The
remaining reads are then divided in half one at a time, each half is
seeded against a genome and the exon borders are determined with a hidden Markov model. A quality score is
assigned to each junction, which is useful for detecting false positives. See also
seqanswers/HMMSplicer.
- MapSplice MapSplice.
See also seqanswers/MapSplice.
- OLego OLego.
See also seqanswers/OLego.
- PALMapper PALMapper. See
also seqanswers/PALMapper.
- Pass Pass aligns
gapped and ungapped reads as well as bisulfite sequencing data. It includes
the option to filter data before alignment (removal of adapters).
Pass uses the Needleman-Wunsch and Smith-Waterman
algorithms, and performs alignment in 3 stages: scanning positions of seed
sequences in the genome, testing the contiguous regions and finally
refining the alignment. See also seqanswers/Pass.
- PASSion PASSion.
- PASTA PASTA.
- QPALMA QPALMA predicts
splice junctions using machine learning algorithms. The
training set is a set of spliced reads with quality information and
already known alignments. See also seqanswers/QPALMA.
- SeqSaw SeqSaw.
- SoapSplice SoapSplice.
- SpliceMap SpliceMap.
See also seqanswers/SpliceMap.
- SplitSeek SplitSeek. See
also seqanswers/SplitSeek.
- SuperSplat SuperSplat was
developed to find all types of splice junctions. The algorithm iteratively splits each
read into all possible two-chunk combinations and
tries to align each chunk. Output is in “Supersplat” format. See also
seqanswers/SuperSplat.
- Subread Subread[2]
is a superfast, accurate and scalable read aligner. It uses the
seed-and-vote mapping paradigm to determine the mapping location of the
read by using its largest mappable region. It automatically decides whether
the read should be globally mapped or locally mapped. For RNA-seq data,
Subread should be used for the purpose of expression analysis. Subread is
very powerful in mapping gDNA-seq reads as well. See also seqanswers/Subread.
- Subjunc Subjunc[2]
is a specialized version of Subread. It uses all mappable regions in an
RNA-seq read to discover exons and exon-exon junctions. It uses the
donor/acceptor signals to find the exact splicing locations. Subjunc
yields full alignments for every RNA-seq read including exon-spanning
reads, in addition to the discovered exon-exon junctions. Subjunc should
be used for the purpose of junction detection and genomic variation
detection in RNA-seq data. See also seqanswers/Subjunc.
- TrueSight TrueSight.
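The two-chunk splitting strategy described for SuperSplat can be sketched in a few lines. This is a toy illustration under stated assumptions: chunks are matched exactly with `str.find` against a single reference string, and the parameters `min_chunk`, `min_intron` and `max_intron` are hypothetical names, not SuperSplat options.

```python
def two_chunk_splits(read: str, min_chunk: int = 6):
    """Yield every (left, right) split of read with both chunks >= min_chunk."""
    for cut in range(min_chunk, len(read) - min_chunk + 1):
        yield read[:cut], read[cut:]


def find_spliced_hits(read: str, reference: str, min_chunk: int = 6,
                      min_intron: int = 20, max_intron: int = 10000):
    """Find exact placements of both chunks separated by an intron-sized gap.

    Returns tuples (left_start, left_end, right_start, right_end) in
    reference coordinates (end positions are exclusive).
    """
    hits = []
    for left, right in two_chunk_splits(read, min_chunk):
        start = reference.find(left)
        while start != -1:
            left_end = start + len(left)
            right_start = reference.find(right, left_end + min_intron)
            while right_start != -1 and right_start - left_end <= max_intron:
                hits.append((start, left_end, right_start,
                             right_start + len(right)))
                right_start = reference.find(right, right_start + 1)
            start = reference.find(left, start + 1)
    return hits
```

A read built from the last 8 bases of one exon and the first 8 bases of the next yields a single hit whose gap corresponds to the intron between them.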
De novo Splice Aligners that also use annotation optionally
- GEM.
- MapNext MapNext.
See also seqanswers/MapNext.
- STAR STAR is an ultrafast
tool that employs “sequential maximum mappable seed search in uncompressed
suffix arrays followed by seed clustering and stitching procedure”. It
detects canonical and non-canonical splice junctions and chimeric (fusion)
sequences. It is already adapted to align long reads (third-generation
sequencing technologies) and can reach speeds of 45 million paired reads
per hour per processor.[3]
See also seqanswers/STAR.
- TopHat TopHat[4]
is designed to find de novo junctions. TopHat aligns reads in two steps.
First, unspliced reads are aligned with Bowtie, and the aligned reads
are assembled with Maq into islands of sequences. Second, the
splice junctions are determined based on the initially unmapped reads and
the possible canonical donor and acceptor sites within the island
sequences. See also seqanswers/TopHat.
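The search for canonical donor and acceptor sites mentioned in the TopHat entry can be illustrated with a short sketch. It assumes the common GT...AG intron boundary signal and simply enumerates candidate introns in a sequence; the function name and the length limits are hypothetical, and this is not TopHat's junction-finding code.

```python
def canonical_introns(seq: str, min_len: int = 20, max_len: int = 1000):
    """Yield (start, end) pairs such that seq[start:end] is a candidate intron.

    A candidate begins with the donor dinucleotide GT, ends with the
    acceptor dinucleotide AG, and has a length within [min_len, max_len].
    """
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    # Store the exclusive end position just after each AG.
    acceptors = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    for d in donors:
        for a in acceptors:
            if min_len <= a - d <= max_len:
                yield d, a
```

A spliced aligner would then test whether the unmapped reads span any of these candidate junctions; that step is omitted here.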
Other Spliced Aligners
- G.Mo.R-Se G.Mo.R-Se is a
method that uses RNA-Seq reads to build de novo gene models.
Quantitative analysis and Differential Expression
- ALDex ALDex.
- Alexa-Seq Alexa-Seq
is a pipeline for gene expression analysis,
transcript-specific expression analysis, exon junction expression and
quantitative alternative analysis. It supports wide-ranging alternative expression
visualization, statistics and graphs. See also seqanswers/Alexa-Seq.
- ASC ASC. See also seqanswers/ASC.
- BaySeq BaySeq
is a Bioconductor package to identify differential expression using
next-generation sequencing data, via empirical Bayesian methods. There is an option of using
the “snow” package for parallelisation of computer data processing,
recommended when dealing with large data sets. See also seqanswers/BaySeq.
- BBSeq BBSeq.
See also seqanswers/BBSeq.
- BitSeq BitSeq.
- CEDER CEDER.
- CPTRA CPTRA.
- casper casper
is a Bioconductor package to quantify expression at the isoform level. It
combines informative data summaries, flexible estimation of
experimental biases and statistical precision considerations, which
(reportedly) provide substantial reductions in estimation error.
- Cufflinks Cufflinks is appropriate for
measuring global de novo transcript isoform expression. It performs
assembly of transcripts, estimates abundances and determines
differential expression (Cuffdiff) and regulation in RNA-Seq samples.[5] See
also seqanswers/Cufflinks.
- DESeq DESeq
is a Bioconductor package to perform differential gene expression analysis
based on negative binomial distribution. See also seqanswers/DESeq.
- DEGSeq DEGSeq.
See also seqanswers/DEGSeq.
- DEXSeq DEXSeq
is a Bioconductor package that finds differential exon usage
based on RNA-Seq exon counts between samples. DEXSeq employs the negative
binomial distribution and provides options for visualization and exploration
of the results.
- DEXUS dexus
is a Bioconductor package that identifies differentially expressed genes
in RNA-Seq data under all possible study designs such as studies without
replicates, without sample groups, and with unknown conditions.[6]
In contrast to other methods, DEXUS does not need replicates to detect
differentially expressed transcripts, since the replicates (or conditions)
are estimated by the EM method for each transcript.
- DiffSplice DiffSplice
is a method for differential expression detection and visualization that does not
depend on gene annotations. This method relies on the identification
of alternative splicing modules (ASMs) that diverge in the different
isoforms. A non-parametric test is applied to each ASM to identify
significant differential transcription with a measured false discovery
rate.
- EBSeq EBSeq.
- EdgeR EdgeR
is an R package for differential expression analysis of data from DNA
sequencing methods such as RNA-Seq, SAGE or ChIP-Seq. edgeR employs
statistical methods based on the negative binomial distribution as a model
for count variability. See also seqanswers/EdgeR.
- ESAT The End Sequence Analysis Toolkit (ESAT) is specially designed for
the quantification of annotation of specialized RNA-Seq gene
libraries that target the 5' or 3' ends of transcripts.
- eXpress eXpress
performs transcript-level RNA-Seq quantification and
allele-specific and haplotype analysis, and can estimate transcript
abundances of the multiple isoforms present in a gene. Although it can be
coupled directly with aligners (like Bowtie), eXpress can also be used
with de novo assemblers, so a reference genome is not needed to perform
alignment. It runs on Linux, Mac and Windows.
- ERANGE ERANGE performs
alignment, normalization and quantification of expressed genes. See also seqanswers/ERANGE.
- featureCounts featureCounts is an
efficient general-purpose read quantifier. It is part of the SourceForge Subread package
and the Bioconductor
Rsubread package.
- FDM FDM
- GPSeq GPSeq
- MATS MATS.
- MMSEQ MMSEQ is a pipeline
for estimating isoform expression and allelic imbalance in diploid
organisms based on RNA-Seq. The pipeline employs tools like Bowtie,
TopHat, ArrayExpressHTS and SAMtools, and can use edgeR or DESeq to perform
differential expression analysis. See also seqanswers/MMSEQ.
- Myrna Myrna
is a pipeline tool that runs in a cloud environment (Elastic MapReduce)
or on a single computer for estimating differential gene expression in
RNA-Seq datasets. Bowtie is employed for short read alignment and R
algorithms for interval calculations, normalization, and statistical
processing. See also seqanswers/Myrna.
- NEUMA NEUMA is a tool to estimate RNA
abundances using length normalization, based on uniquely aligned reads and
mRNA isoform models. NEUMA uses known transcriptome data available in
databases like RefSeq.
- NOISeq NOISeq. See
also seqanswers/NOISeq.
- NPEBseq NPEBseq is a
nonparametric empirical Bayesian-based method for differential expression
analysis.
- NSMAP NSMAP allows
inference of isoforms as well as estimation of expression levels, without
annotated information. The exons are aligned and splice junctions are
identified using TopHat. All the possible isoforms are computed by
combination of the detected exons.
- RNAeXpress RNAeXpress can be run with a Java GUI or
from the command line on Mac, Windows and Linux. It can be configured to perform read
counting, feature detection or GTF comparison on mapped RNA-Seq data.
- rSeq rSeq
- RSEM RSEM. See also seqanswers/RSEM.
- rQuant rQuant is a web
service (Galaxy
installation) that determines abundances of transcripts per gene locus,
based on quadratic programming. rQuant is able to
evaluate biases introduced by experimental conditions. A combination of
tools is employed: PALMapper (read alignment), mTiM and mGene (inference
of new transcripts).
- Scotty Scotty
performs power analysis to estimate the number of replicates and depth of
sequencing required to call differential expression.
- SpliceTrap SpliceTrap.
- SplicingCompass SplicingCompass.
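Several packages in this list (DESeq, DEXSeq, edgeR) model read counts with the negative binomial distribution. The sketch below shows that distribution in a mean/dispersion parameterization, in which the variance is the mean plus the dispersion times the squared mean. The function names are hypothetical, and the code only evaluates the distribution; it does not reproduce the estimation or testing procedures of those packages.

```python
import math


def nb_log_pmf(k: int, mean: float, dispersion: float) -> float:
    """log P(K = k) for a negative binomial with the given mean and dispersion.

    Parameterized so that Var(K) = mean + dispersion * mean**2.
    """
    r = 1.0 / dispersion   # "size" parameter
    p = r / (r + mean)     # success probability
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log(1.0 - p))


def nb_variance(mean: float, dispersion: float) -> float:
    """Variance implied by the mean/dispersion parameterization."""
    return mean + dispersion * mean ** 2
```

As the dispersion approaches zero the variance approaches the mean, which is the Poisson case; a positive dispersion lets the model absorb the extra variability typically seen between biological replicates.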
Multi-tool solutions
- DEB DEB is a
web interface/pipeline for comparing the results of significantly
expressed genes from different tools. Three algorithms are currently available:
edgeR, DESeq and BaySeq.
Commercial Solutions
- Avadis NGS Avadis NGS.
- CLC Genomics Workbench CLC Genomics Workbench
- DNASTAR DNASTAR
- GeneSpring GX GeneSpring GX
- geospiza geospiza
- Golden Helix Golden Helix
- NextGENe NextGENe
- Partek Partek
Open Source Solutions
- ArrayExpressHTS ArrayExpressHTS
(and ebi_ArrayExpressHTS)
is a Bioconductor package that allows preprocessing, quality assessment
and estimation of expression of RNA-Seq datasets. It can be run remotely
on the European Bioinformatics Institute cloud or locally. The package
makes use of several tools: ShortRead (quality control), Bowtie, TopHat or
BWA (alignment to a reference genome), SAMtools format, Cufflinks or MMSEQ
(expression estimation). See also seqanswers/ArrayExpressHTS.
- Chipster Chipster.
- easyRNASeq easyRNASeq.
- ExpressionPlot ExpressionPlot.
- FX FX.
- Galaxy: Galaxy is a general purpose
workbench platform for computational biology. There are several publicly
accessible Galaxy servers that support RNA-Seq tools and workflows,
including NBIC's Andromeda, the CBIIT-Giga server, the Galaxy Project's public server, the GeneNetwork
Galaxy server, the University of Oslo's Genomic Hyperbrowser, URGI's server (which supports
S-MART), and many others.
- GENE-Counter GENE-Counter
is a Perl pipeline for RNA-Seq differential gene expression analyses.
GENE-Counter performs alignments with CASHX, Bowtie, BWA or another aligner
that outputs SAM. Differential gene expression is run with three optional
packages (NBPSeq, edgeR and DESeq) using negative binomial distribution
methods. Results are stored in a MySQL database
to enable additional analyses.
- GenePattern GenePattern
offers integrated solutions to RNA-Seq analysis (Broad
Institute).
- GeneProf GeneProf: Freely accessible, easy to
use analysis pipelines for RNA-seq and ChIP-seq experiments.
- MultiExperiment Viewer (MeV) MeV
is suited to analysis, data mining and visualization of
large-scale genomic data. The MeV modules include a variety of algorithms
for tasks like clustering and classification, Student's t-test, Gene Set Enrichment
Analysis or Significance Analysis. MeV runs on Java. See also seqanswers/MeV.
- NGS-Trex NGS-Trex.
- NGSUtils NGSUtils.
- RobiNA RobiNA provides a
graphical user interface to R/Bioconductor packages. RobiNA
provides a package that automatically installs all required external tools
(R/Bioconductor frameworks and Bowtie). This
tool offers a variety of quality control methods and can
produce many tables and plots supplying detailed results for differential
expression. Furthermore, the results can be visualized and manipulated
with MapMan and PageMan. RobiNA runs on Java version 6.
- S-MART S-MART handles
mapped RNA-Seq data, and essentially performs data manipulation
(selection/exclusion of reads, clustering and differential expression
analysis) and visualization (read information, distribution, comparison
with epigenomic ChIP-Seq data). It can be run on any laptop by a person
without a computing background. A friendly graphical user interface makes
the tools easy to operate. See also seqanswers/S-MART.
- Taverna Taverna.
- TCW TCW. TCW is a
Transcriptome Computational Workbench.
- wapRNA wapRNA.
Alternative Splicing Analysis
- Alt Event Finder Alt Event Finder.
- Asprofile asprofile.
- AStalavista AStalavista.
- MISO MISO quantifies the
expression level of splice variants from RNA-Seq data and is able to
recognize differentially regulated exons/isoforms across different
samples. MISO uses a probabilistic method (Bayesian inference) to
calculate the probability of each read's origin. See also seqanswers/MISO.
Bias Correction
- EDASeq EDASeq
is a Bioconductor package to perform GC-Content Normalization for RNA-Seq
Data.
- GeneScissors GeneScissors.
- SysCall SysCall is a
classifier tool for the identification and correction of systematic errors in
high-throughput sequence data.
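To make the idea of GC-content normalization concrete, the sketch below computes GC content per sequence and rescales counts so that every GC bin has the same median count. This bin-and-scale scheme is an assumption chosen for illustration, and `gc_bin_normalize` is a hypothetical helper; it is not the implementation used by EDASeq or any other package listed here.

```python
from statistics import median


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return float("nan")
    return (seq.count("G") + seq.count("C")) / acgt


def gc_bin_normalize(counts, gc, n_bins: int = 4):
    """Scale counts so each GC bin's median equals the overall median.

    counts and gc are parallel lists (one entry per gene). Genes are
    sorted by GC content and split into n_bins groups of equal size.
    """
    order = sorted(range(len(counts)), key=lambda i: gc[i])
    overall = median(counts)
    out = [0.0] * len(counts)
    size = -(-len(order) // n_bins)  # ceiling division
    for b in range(0, len(order), size):
        idx = order[b:b + size]
        m = median(counts[i] for i in idx)
        scale = overall / m if m > 0 else 1.0
        for i in idx:
            out[i] = counts[i] * scale
    return out
```

After normalization, genes with low and high GC content have comparable typical counts, so a remaining difference between them is less likely to reflect GC bias alone.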
Fusion genes/chimeras/translocation finders/structural variations
- BreakDancer BreakDancer. See also seqanswers/BreakDancer.
- ChimeraScan ChimeraScan.
- EBARDenovo EBARDenovo.
- FusionAnalyser FusionAnalyser.
- FusionCatcher FusionCatcher.
- FusionHunter FusionHunter
identifies fusion transcripts without depending on already known
annotations. It uses Bowtie as a first aligner and paired-end reads. See
also seqanswers/FusionHunter.
- FusionMap FusionMap.
- FusionSeq FusionSeq.
See also seqanswers/FusionSeq.
- SOAPFuse SOAPFuse.
- SOAPfusion Soapfusion.
- TopHat-Fusion TopHat-Fusion
is based on TopHat and was developed to handle reads resulting
from fusion genes. It does not require previous data about known genes and
uses Bowtie to align continuous reads. See also seqanswers/TopHat-Fusion.
- ViralFusionSeq ViralFusionSeq
is a high-throughput sequencing (HTS) tool for discovering viral integration
events and reconstructing fusion transcripts at single-base resolution. See
also hkbic/VFS and SEQWiki/VFS.
- DeFuse DeFuse.
Copy Number Variation identification
- CNVseq CNVseq detects copy number variations using a
statistical model derived from array-comparative genomic
hybridization. Sequence alignment is performed by BLAT, calculations are
executed by R modules and the whole process is fully automated using Perl. See also seqanswers/CNVseq.
RNA-Seq simulators
- BEERS Simulator BEERS
is designed for mouse or human data, and paired-end reads sequenced on the
Illumina platform. BEERS generates reads starting from a pool of gene
models coming from different published annotation sources. Some genes are
chosen randomly, errors are then deliberately introduced (such as
indels, base changes and low-quality tails), followed by construction of
novel splice junctions.
- dwgsim dwgsim.
- Flux simulator Flux Simulator
implements a computational pipeline that simulates an RNA-Seq experiment.
All component steps that influence RNA-Seq are taken into account (reverse
transcription, fragmentation, adapter ligation, PCR amplification, gel
segregation and sequencing) in the simulation. These steps present
experimental attributes that can be measured, and the approximate
experimental biases are captured. Flux Simulator allows joining each of
these steps as modules to analyse different types of protocols. See also seqanswers/Flux.
- rlsim rlsim is a software
package for simulating RNA-seq library preparation with parameter
estimation.
- RSEM Read Simulator rsem-simulate-reads.
- RNASeqReadSimulator RNASeqReadSimulator
contains a set of simple, command-line-driven Python scripts. It generates
random expression levels of transcripts, simulates reads (single or
paired-end) with a specific positional bias pattern and generates
random errors from sequencing platforms.
- RNA Seq Simulator RNA Seq Simulator.
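The core of what these simulators do (pick a transcript according to its expression level, sample a read position, add sequencing errors) can be sketched as follows. This is a minimal illustration under stated assumptions: single-end reads, uniform start positions, substitution errors only, and hypothetical names such as `simulate_reads`. It does not model the library-preparation steps that tools like Flux Simulator or rlsim account for.

```python
import random


def simulate_reads(transcripts, abundances, n_reads, read_len=50,
                   error_rate=0.01, seed=0):
    """Simulate single-end reads from transcripts with substitution errors.

    transcripts: dict mapping name -> sequence
    abundances:  dict mapping name -> relative expression level
    Returns a list of (transcript_name, start, read_sequence) tuples.
    """
    rng = random.Random(seed)
    names = [n for n in transcripts if len(transcripts[n]) >= read_len]
    # Sampling weight: abundance times the number of possible start positions.
    weights = [abundances[n] * (len(transcripts[n]) - read_len + 1)
               for n in names]
    reads = []
    for _ in range(n_reads):
        name = rng.choices(names, weights=weights)[0]
        seq = transcripts[name]
        start = rng.randrange(len(seq) - read_len + 1)
        bases = list(seq[start:start + read_len])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                # Substitute with a different base to mimic a sequencing error.
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((name, start, "".join(bases)))
    return reads
```

Because the true origin of every read is recorded, simulated data of this kind can be used to check how often an aligner or quantifier recovers the correct answer.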
Transcriptome assemblers
Genome-Guided assemblers
- Cufflinks Cufflinks.
- iReckon iReckon.
- IsoInfer IsoInfer.
- IsoLasso IsoLasso.
- RNAeXpress RNAeXpress.
- Scripture Scripture.
See also seqanswers/Scripture.
- SLIDE SLIDE.
Genome-Independent assemblers
- KISSPLICE KISSPLICE.
- Oases Oases. See also seqanswers/Oases.
- Rnnotator.
- SOAPdenovo SOAPdenovo. See
also seqanswers/SOAPdenovo.
- Scaffolding Translation Mapping (STM).
- Trans-ABySS Trans-AByss.
See also seqanswers/Trans-ABySS.
- Trinity Trinity. See also seqanswers/Trinity.
- Velvet Velvet.[7] Velvet (EMBL-EBI).
See also seqanswers/Velvet.
Visualization tools
- Artemis Artemis.
- Apollo Apollo.
- Degust Degust.
- EagleView EagleView.
- GBrowse GBrowse.
- Integrated Genome Browser IGB.
- Integrative Genomics Viewer (IGV) IGV.
- GenomeView genomeview.
- MapView MapView.
- Samscope Samscope.
- SeqMonk SeqMonk.
See also seqanswers/SeqMonk.
- Vespa Vespa.
Functional, Network & Pathway Analysis Tools
- GAGE is
applicable independent of sample sizes, experimental design, assay
platforms, and other types of heterogeneity. This
Bioconductor package also provides functions and data for pathway, GO and
gene set analysis in general. Tutorials describe both RNA-Seq
pathway analysis workflows and microarray
analysis workflows. The RNA-Seq workflows cover preparation,
read counting, data preprocessing, gene set testing and pathway
visualization in about 40 lines of code.
- Ingenuity Systems (commercial) iReport & IPA: Ingenuity’s IPA and iReport applications let users
upload, analyze, and visualize RNA-Seq datasets. Both IPA and iReport support
identification, analysis and interpretation of differentially expressed
isoforms between condition and control samples, and support interpretation
and assessment of expression changes in the context of biological
processes, disease and cellular phenotypes, and molecular interactions.
Ingenuity iReport supports the upload of the native Cuffdiff file format as
well as gene expression lists. IPA supports the upload of gene expression
lists.
- Gene Set Association Analysis for RNA-Seq (GSAASeq):
GSAASeq is a set of computational methods that assess the differential expression
of a pathway/gene set between two biological states based on sequence
count data.
Further annotation tools for RNA-Seq data
- seq2HLA seq2HLA
is an annotation tool for obtaining an individual's HLA class I and II
type and expression using standard NGS RNA-Seq data in FASTQ format. It
maps RNA-Seq reads against a reference database of HLA
alleles using Bowtie, then
determines and reports HLA type, confidence score and locus-specific
expression level. This tool is developed in Python and R. It is available as a console tool or
Galaxy module.
See also seqanswers/seq2HLA.
- HLAminer HLAminer
is a computational method for identifying HLA alleles directly from whole
genome, exome and transcriptome shotgun sequence datasets. HLA allele
predictions are derived by targeted assembly of shotgun sequence data and
comparison to a database of reference allele sequences. This tool is
developed in Perl
and is available as a console tool.
- pasa pasa.
RNA-Seq Databases
- queryable-rna-seq-database queryable-rna-seq-database.
- RNA-Seq Atlas RNA-Seq Atlas.
- SRA SRA.
(ref: http://en.wikipedia.org/wiki/List_of_RNA-Seq_bioinformatics_tools#Alignment_Tools)