
Calls SNPs and INDELs with SAMtools mpileup and bcftools.

Element type: call_variants
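
The tool tags in the parameter descriptions on this page (mpileup, bcf view, varFilter) suggest a three-stage pipeline. The sketch below only assembles the three command lines; it does not run them and is not a statement of how the element invokes the tools internally. The helper names, the file names, and the base flags -u, -f, -v and -c are assumptions taken from the legacy samtools/bcftools 0.1.x command line, not from this page. The other flags and the workflow-file parameter names are copied from the parameter table.

```python
# Hypothetical sketch: build (but do not execute) the three commands that the
# tool tags in the parameter table point to. File names are placeholders.

# Workflow-file parameter name -> (stage, flag, takes_value).
# Only a handful of rows from the parameter table are included here.
FLAG_TABLE = {
    "disable_baq": ("mpileup", "-B", False),
    "max_depth": ("mpileup", "-d", True),
    "min_baseq": ("mpileup", "-Q", True),
    "no_indel": ("mpileup", "-I", False),
    "call_gt": ("bcf view", "-g", False),
    "acgt_only": ("bcf view", "-N", False),
    "min-dep": ("varFilter", "-d", True),
    "max-dep": ("varFilter", "-D", True),
}


def stage_flags(params, stage):
    """Translate the parameters that belong to one stage into CLI flags."""
    flags = []
    for name, value in params.items():
        tool, flag, takes_value = FLAG_TABLE[name]
        if tool != stage:
            continue
        if takes_value:
            flags += [flag, str(value)]
        elif value:  # boolean switch: emit the flag only when it is set
            flags.append(flag)
    return flags


def build_pipeline(params, reference="ref.fa", bam="reads.bam"):
    """Return the three argument lists in pipeline order."""
    # -u / -f and -v / -c are assumed base options (legacy 0.1.x tools).
    mpileup = (["samtools", "mpileup", "-u", "-f", reference]
               + stage_flags(params, "mpileup") + [bam])
    call = ["bcftools", "view", "-v", "-c"] + stage_flags(params, "bcf view") + ["-"]
    filt = ["vcfutils.pl", "varFilter"] + stage_flags(params, "varFilter")
    return [mpileup, call, filt]


if __name__ == "__main__":
    for cmd in build_pipeline({"disable_baq": True, "max_depth": 250, "call_gt": True}):
        print(" ".join(cmd))
```

Keeping the name-to-flag mapping in one dictionary makes it easy to compare against the "Parameter in Workflow File" column when more rows are added.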

Parameters

Parameter | Description | Default value | Parameter in Workflow File | Type
Output variants file | The URL of the file with the extracted variations. | | |
Reference | Specify a file with the reference sequence. The sequence will be used as the reference for all datasets with NGS assemblies. | | |
Use reference from | Specify "File" to set a single reference sequence for all input NGS assemblies; the reference should be set in the "Reference" parameter. Specify "Input port" to set different references for different NGS assemblies; the references should be supplied via the "Input sequences" port (e.g. use datasets in the "Read Sequence" element). | File | |
Illumina-1.3+ encoding | Assume the quality is in the Illumina 1.3+ encoding (mpileup)(-6). | False | illumina13-encoding | boolean
Count anomalous read pairs | Do not skip anomalous read pairs in variant calling (mpileup)(-A). | False | use_orphan | boolean
Disable BAQ computation | Disable probabilistic realignment for the computation of base alignment quality (BAQ). BAQ is the Phred-scaled probability of a read base being misaligned. Applying this option greatly helps to reduce false SNPs caused by misalignments (mpileup)(-B). | False | disable_baq | boolean
Mapping quality downgrading coefficient | Coefficient for downgrading mapping quality for reads containing excessive mismatches. Given a read with a Phred-scaled probability q of being generated from the mapped position, the new mapping quality is about sqrt((INT-q)/INT)*INT. A zero value disables this functionality; if enabled, the recommended value for BWA is 50 (mpileup)(-C). | 0 | capq_thres | numeric
Max number of reads per input BAM | At a position, read at most INT reads per input BAM (mpileup)(-d). | 250 | max_depth | numeric
Extended BAQ computation | Extended BAQ computation. This option helps sensitivity, especially for MNPs, but may hurt specificity a little (mpileup)(-E). | False | ext_baq | boolean
BED or position list file | BED or position list file containing a list of regions or sites where pileup or BCF should be generated (mpileup)(-l). | | bed | string
Pileup region | Only generate pileup in region STR (mpileup)(-r). | | reg | string
Minimum mapping quality | Minimum mapping quality for an alignment to be used (mpileup)(-q). | | min_mq | numeric
Minimum base quality | Minimum base quality for a base to be considered (mpileup)(-Q). | 13 | min_baseq | numeric
Gap extension error | Phred-scaled gap extension sequencing error probability. Reducing INT leads to longer indels (mpileup)(-e). | 20 | extQ | numeric
Homopolymer errors coefficient | Coefficient for modeling homopolymer errors. Given an l-long homopolymer run, the sequencing error of an indel of size s is modeled as INT*s/l (mpileup)(-h). | 100 | tandemQ | numeric
No INDELs | Do not perform INDEL calling (mpileup)(-I). | False | no_indel | boolean
Max INDEL depth | Skip INDEL calling if the average per-sample depth is above INT (mpileup)(-L). | 250 | max_indel_depth | numeric
Gap open error | Phred-scaled gap open sequencing error probability. Reducing INT leads to more indel calls (mpileup)(-o). | 40 | openQ | numeric
List of platforms for indels | Comma-delimited list of platforms (determined by @RG-PL) from which indel candidates are obtained. It is recommended to collect indel candidates from sequencing technologies that have a low indel error rate, such as ILLUMINA (mpileup)(-P). | | pl_list | string
Retain all possible alternate | Retain all possible alternate alleles at variant sites. By default, the view command discards unlikely alleles (bcf view)(-A). | False | keepalt | boolean
Indicate PL | Indicate PL is generated by r921 or before (ordering is different) (bcf view)(-F). | False | fix_pl | boolean
No genotype information | Suppress all individual genotype information (bcf view)(-G). | False | no_geno | boolean
A/C/G/T only | Skip sites where the REF field is not A/C/G/T (bcf view)(-N). | False | acgt_only | boolean
List of sites | List of sites at which information is output (bcf view)(-l). | | bcf_bed | string
QCALL likelihood | Output the QCALL likelihood format (bcf view)(-Q). | False | qcall | boolean
List of samples | List of samples to use. The first column in the input gives the sample names and the second gives the ploidy, which can only be 1 or 2. When the second column is absent, the sample ploidy is assumed to be 2. In the output, the ordering of samples will be identical to the one in FILE (bcf view)(-s). | | samples | string
Min samples fraction | Skip loci where the fraction of samples covered by reads is below FLOAT (bcf view)(-d). | 0 | min_smpl_frac | numeric
Per-sample genotypes | Call per-sample genotypes at variant sites (bcf view)(-g). | True | call_gt | boolean
INDEL-to-SNP Ratio | Ratio of INDEL-to-SNP mutation rate (bcf view)(-i). | -1 | indel_frac | numeric
Max P(ref|D) | A site is considered to be a variant if P(ref|D) < FLOAT (bcf view)(-p). | 0.5 | pref | numeric
Prior allele frequency spectrum | STR can be full, cond2, flat, or a file consisting of error output from a previous variant calling run (bcf view)(-P). | full | ptype | string
Mutation rate | Scaled mutation rate for variant calling (bcf view)(-t). | 0.001 | theta | numeric
Pair/trio calling | Enable pair/trio calling. For trio calling, option -s usually needs to be applied to configure the trio members and their ordering. In the file supplied to option -s, the first sample must be the child, the second the father and the third the mother. The valid values of STR are ‘pair’, ‘trioauto’, ‘trioxd’ and ‘trioxs’, where ‘pair’ calls differences between two input samples, and ‘trioxd’ (‘trioxs’) specifies that the input is from the X chromosome non-PAR regions and the child is a female (male) (bcf view)(-T). | | ccall | string
N group-1 samples | Number of group-1 samples. This option is used for dividing the samples into two groups for contrast SNP calling or association test. When this option is in use, the following VCF INFO fields will be output: PC2, PCHI2 and QCHI2 (bcf view)(-1). | 0 | n1 | numeric
N permutations | Number of permutations for association test (effective only with -1) (bcf view)(-U). | 0 | n_perm | numeric
Min P(chi^2) | Only perform permutations for P(chi^2) < FLOAT (bcf view)(-X). | 0.01 | min_perm_p | numeric
Minimum RMS quality | Minimum RMS mapping quality for SNPs (varFilter) (-Q). | 10 | min-qual | numeric
Minimum read depth | Minimum read depth (varFilter) (-d). | 2 | min-dep | numeric
Maximum read depth | Maximum read depth (varFilter) (-D). | 10000000 | max-dep | numeric
Alternate bases | Minimum number of alternate bases (varFilter) (-a). | 2 | min-alt-bases | numeric
Gap size | Filter SNPs within INT bp around a gap (varFilter) (-w). | 3 | gap-size | numeric
Window size | Window size for filtering adjacent gaps (varFilter) (-W). | 10 | window | numeric
Strand bias | Minimum P-value for strand bias (given PV4) (varFilter) (-1). | 0.0001 | min-strand | numeric
BaseQ bias | Minimum P-value for baseQ bias (varFilter) (-2). | 1e-100 | min-baseQ | string
MapQ bias | Minimum P-value for mapQ bias (varFilter) (-3). | 0 | min-mapQ | numeric
End distance bias | Minimum P-value for end distance bias (varFilter) (-4). | 0.0001 | min-end-distance | numeric
HWE | Minimum P-value for HWE (plus F<0) (varFilter) (-e). | 0.0001 | min-hwe | numeric
Log filtered | Print filtered variants into the log (varFilter) (-p). | False | print-filtered | boolean
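
Two of the mpileup descriptions above contain formulas: the mapping quality downgrading coefficient (-C) and the homopolymer errors coefficient (-h). The following is a minimal worked sketch of both, using only the expressions quoted in the table. The function names are hypothetical, and rejecting q outside the range 0..INT is a choice made for this sketch, since the quoted expression is undefined for q greater than INT; it is not a description of what SAMtools does in that case.

```python
import math


def downgraded_mapq(q: float, coeff: float) -> float:
    """Approximate new mapping quality for -C INT: sqrt((INT - q) / INT) * INT."""
    if coeff <= 0:
        raise ValueError("a zero coefficient disables downgrading")
    if not 0 <= q <= coeff:
        # Assumption of this sketch: the square root is undefined for q > INT.
        raise ValueError("this sketch only handles 0 <= q <= INT")
    return math.sqrt((coeff - q) / coeff) * coeff


def homopolymer_indel_error(coeff: float, indel_size: int, run_length: int) -> float:
    """Modeled sequencing error for -h INT: INT * s / l."""
    return coeff * indel_size / run_length


if __name__ == "__main__":
    # Recommended BWA coefficient from the table (50) with an example q of 32.
    print(downgraded_mapq(32, 50))
    # Default homopolymer coefficient (100), 1-base indel in a 4-base run.
    print(homopolymer_indel_error(100, 1, 4))
```

With INT = 50 the first expression gives 50 at q = 0, about 30 at q = 32, and 0 at q = 50. With the default coefficient of 100, a 1-base indel in a 4-base homopolymer run is modeled as 100*1/4 = 25.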

Input/Output Ports

The element has 2 input ports:

Name in GUI: Input assembly

Name in Workflow File: in-assembly

Slots:

Slot In GUI | Slot in Workflow File | Type
Dataset name | dataset | string
Source url | url | string

Name in GUI: Input sequences

Name in Workflow File: in-sequence

Slots:

Slot In GUI | Slot in Workflow File | Type
Source url | url | string

And 1 output port:

Name in GUI: Output variations

Name in Workflow File: out-variations

Slots:

Slot In GUI | Slot in Workflow File | Type
Variation track | variation-track | variation
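
The port layout above can be summarized as a small data structure. This is only an illustrative sketch: the port and slot names are copied from the tables above, while the dictionary layout and the missing_slots helper are hypothetical and not part of any workflow API.

```python
# Port and slot names are taken from the tables above; the structure itself
# and the helper below are hypothetical.
PORTS = {
    "in-assembly": {
        "direction": "input",
        "slots": {"dataset": "string", "url": "string"},
    },
    "in-sequence": {
        "direction": "input",
        "slots": {"url": "string"},
    },
    "out-variations": {
        "direction": "output",
        "slots": {"variation-track": "variation"},
    },
}


def missing_slots(port: str, message: dict) -> list:
    """Return the slot names the port declares but the message lacks, sorted."""
    return sorted(set(PORTS[port]["slots"]) - set(message))


if __name__ == "__main__":
    # A message for the assembly port that carries only the source URL.
    print(missing_slots("in-assembly", {"url": "reads.bam"}))
```

A check like this can catch a misspelled slot name before a workflow file is run.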



