Build CLARK Database

Build a CLARK database from a set of reference sequences ("targets"). NCBI taxonomy data are used to map the accession number found in each reference sequence to its taxonomy ID.

Parameters in GUI

 

| Parameter | Description | Default value |
| --- | --- | --- |
| Database | A folder used to store the database files. | |
| Genomic library | Genomes used to build the database ("targets"), specified in FASTA format, with one FASTA file per reference sequence. Each sequence header must contain an accession number (i.e., >accession.number ... or >gi\|number\|ref\|accession.number\| ...). | |
| Taxonomy rank | The taxonomy rank for the database. CLARK classifies metagenomic samples at a single taxonomy rank. As a general rule, start with the genus or species rank; if a high proportion of reads cannot be classified, redefine the targets at a higher taxonomy rank (e.g., family or phylum). | Species |
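
The header requirement for the genomic library can be checked before a build. The following is a minimal Python sketch, not part of UGENE or CLARK: the function names are hypothetical, the accession pattern (letters, digits, or underscores, then a dot and a version number) is an assumption, and "one FASTA file per reference sequence" is read here as "exactly one header line per file".

```python
import re

# Assumed accession shape: letters/digits/underscores, a dot, a numeric version.
ACCESSION = r"[A-Za-z0-9_]+\.\d+"

# The two header shapes named in the parameter description:
#   >gi|number|ref|accession.number| ...
#   >accession.number ...
HEADER_RE = re.compile(
    rf"^>(?:gi\|\d+\|[A-Za-z]+\|(?P<gi_acc>{ACCESSION})\|"
    rf"|(?P<acc>{ACCESSION})(?:\s|$))"
)


def extract_accession(header):
    """Return the accession number from a FASTA header line, or None."""
    match = HEADER_RE.match(header)
    if match is None:
        return None
    return match.group("gi_acc") or match.group("acc")


def check_fasta_text(text):
    """Check the contents of one FASTA file.

    Returns (accession, problems), where problems is a list of messages.
    """
    headers = [line for line in text.splitlines() if line.startswith(">")]
    problems = []
    if len(headers) != 1:
        problems.append(f"expected 1 sequence, found {len(headers)}")
    accession = None
    for header in headers:
        found = extract_accession(header)
        if found is None:
            problems.append(f"no accession number in header: {header}")
        elif accession is None:
            accession = found
    return accession, problems
```

Calling `check_fasta_text` on the text of each file in the genomic library gives a quick list of files whose headers would likely fail the accession-to-taxonomy mapping step.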

Parameters in Workflow File

Type: clark-build

| Parameter | Parameter in the GUI | Type |
| --- | --- | --- |
| database | Database | string |
| taxonomy | Genomic library | url-datasets |
| taxonomy-rank | Taxonomy rank | number |

Input/Output Ports

The element has 1 output port:

Name in GUI: Output CLARK database

Name in Workflow File: out

Slots:

| Slot in GUI | Slot in Workflow File | Type |
| --- | --- | --- |
| Output URL | url | string |