Child pages
  • Classify Sequences with MetaPhlAn2

MetaPhlAn2 (METAgenomic PHyLogenetic ANalysis) is a tool for profiling the composition of microbial communities (bacteria, archaea, eukaryotes, and viruses) from whole-metagenome shotgun sequencing data.

Parameters in GUI

 

Parameter

Description

Default value

Input data

To classify single-end (SE) reads or contigs obtained by de novo assembly of reads, set this parameter to "SE reads or contigs".

To classify paired-end (PE) reads, set the value to "PE reads".

SE reads or contigs
Input file format

Set the type of the input file (--input_type). Each input file usually contains many sequences to classify.

FASTA

 

Database

A path to a folder with the MetaPhlAn2 database: Bowtie2 index files built from reference genomes, and a *.pkl file (--mpa_pkl, --bowtie2db).

By default, the "mpa_v20_m200" database is provided (if it has been downloaded). The database was built on ~1M unique clade-specific marker genes identified from ~17,000 reference genomes (~13,500 bacterial and archaeal, ~3,500 viral, and ~110 eukaryotic).

 
Number of threads

The number of CPUs to use for parallelizing the mapping (--nproc).

8
Analysis type

Specify the type of analysis to perform:

  • Relative abundance - profiling of metagenomes in terms of relative abundances (corresponds to "-t rel_ab")
  • Relative abundance with reads statistics - profiling of metagenomes in terms of relative abundances, with an estimate of the number of reads coming from each clade ("-t rel_ab_w_read_stats")
  • Reads mapping - mapping from reads to clades, the output contains reads that hit a marker only ("-t reads_map")
  • Clade profiles - normalized marker counts for clades with at least one non-null marker ("-t clade_profiles")
  • Marker abundance table - normalized marker counts, reported only when > 0.0 and optionally normalized by metagenome size ("-t marker_ab_table"), see also "Normalize by metagenome size" parameter
  • Marker presence table - list of markers present in the sample ("-t marker_pres_table"), see also "Presence threshold" parameter


Relative abundance
Tax level

The taxonomic level for the relative abundance output: all, kingdoms (Bacteria and Archaea) only, phyla only, etc. (--tax_lev).

All
Bowtie2 output file

The file for saving the output of Bowtie2 (--bowtie2out). In case of PE reads, one file is created for each pair of files.

Auto
Output file

MetaPhlAn2 output depends on the "Analysis type" parameter. By default, it is a tab-delimited file with the predicted taxon relative abundances.

Auto
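
The GUI parameters above correspond to MetaPhlAn2 command-line options. The following is a minimal sketch, not the way UGENE itself launches the tool: it assumes the executable is called "metaphlan2.py", that the input file is passed as the first positional argument, and that the database is given as a *.pkl path plus a Bowtie2 index prefix. The function name build_metaphlan2_command and its argument names are hypothetical. Flags are spelled with underscores, as on the MetaPhlAn2 command line. Only a single input file (SE reads or contigs) is handled.

```python
from typing import List, Optional

# Analysis types listed for the "Analysis type" parameter.
ANALYSIS_TYPES = {
    "rel_ab",
    "rel_ab_w_read_stats",
    "reads_map",
    "clade_profiles",
    "marker_ab_table",
    "marker_pres_table",
}


def build_metaphlan2_command(
    input_file: str,
    input_type: str,
    mpa_pkl: str,
    bowtie2db: str,
    nproc: int = 8,
    analysis_type: str = "rel_ab",
    tax_lev: Optional[str] = None,
    bowtie2out: Optional[str] = None,
    executable: str = "metaphlan2.py",  # assumed executable name
) -> List[str]:
    """Assemble an argument list from the documented flags (illustrative only)."""
    if analysis_type not in ANALYSIS_TYPES:
        raise ValueError(f"unknown analysis type: {analysis_type}")
    cmd = [
        executable,
        input_file,
        "--input_type", input_type,
        "--mpa_pkl", mpa_pkl,
        "--bowtie2db", bowtie2db,
        "--nproc", str(nproc),
        "-t", analysis_type,
    ]
    # Optional flags are added only when a value was supplied.
    if tax_lev is not None:
        cmd += ["--tax_lev", tax_lev]
    if bowtie2out is not None:
        cmd += ["--bowtie2out", bowtie2out]
    return cmd
```

The returned list can be passed to subprocess.run without shell quoting; the paths used when calling the function are placeholders chosen by the caller.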

Parameters in Workflow File

Type: metaphlan2-classify

Parameter

Parameter in the GUI

Type

input-data

Input data

string

input-format

Input file format

string

database

Database

string

threads

Number of threads

number

analysis-type

Analysis type

string

tax-level

Tax level

string

bowtie2-output-url

Bowtie2 output file

string

output-url

Output file

string
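
As an illustration of how the MetaPhlAn2 parameter names and types in the table above might be used, the sketch below checks a dictionary of settings against them. The helper name check_settings is hypothetical and is not part of UGENE, which performs its own validation; "number" is treated here as a Python int or float, which is an assumption.

```python
# Workflow-file parameter names and their types, as listed in the table above.
NUMBER = (int, float)

WORKFLOW_PARAMETERS = {
    "input-data": str,
    "input-format": str,
    "database": str,
    "threads": NUMBER,
    "analysis-type": str,
    "tax-level": str,
    "bowtie2-output-url": str,
    "output-url": str,
}


def check_settings(settings):
    """Return a list of problems found in a {parameter: value} mapping."""
    problems = []
    for name, value in settings.items():
        expected = WORKFLOW_PARAMETERS.get(name)
        if expected is None:
            problems.append(f"unknown parameter: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for {name}")
    return problems
```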

Input/Output Ports

The element has 1 input port:

Name in GUI: Input sequences

URL(s) to FASTQ or FASTA file(s) should be provided. In case of SE reads or contigs, use the "Input URL 1" slot only. In case of PE reads, input "left" reads to "Input URL 1" and "right" reads to "Input URL 2". See also the "Input data" parameter of the element.

Name in Workflow File: in

Slots:

Slot in GUI

Slot in Workflow File

Type

Input URL

url

string
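
The slot usage described for this port (one file for SE reads or contigs, a "left" and a "right" file for PE reads) can be sketched as follows. The function name assign_input_slots is hypothetical; the slot labels "Input URL 1" and "Input URL 2" and the two "Input data" values are taken from the description of the element.

```python
def assign_input_slots(input_data, files):
    """Map input files to the port slots according to the "Input data" value."""
    if input_data == "SE reads or contigs":
        if len(files) != 1:
            raise ValueError("SE reads or contigs require exactly one file")
        return {"Input URL 1": files[0]}
    if input_data == "PE reads":
        if len(files) != 2:
            raise ValueError("PE reads require a left and a right file")
        left, right = files
        return {"Input URL 1": left, "Input URL 2": right}
    raise ValueError(f"unknown input data value: {input_data}")
```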

The element has 1 output port:

Name in GUI: WEVOTE Classification:  

A map of sequence names with the associated taxonomy IDs.

Name in Workflow File: out

Slots:

Slot in GUI

Slot in Workflow File

Type

Taxonomy classification data

tax-data

tax-classification