
...

HTML
<center>
  <br>
  <img src="/wiki/download/attachments/14058845/Raw DNA-Seq Processing_1.png"/>
  <br> 
</center>
Workflow Wizard

The workflows have similar wizards. The wizard for paired-end reads has 5 pages.

  1. Input data: On this page you must input the FASTQ files obtained from the sequencer. One or several files can be specified as input.

    HTML
    <center>
      <br>
      <img src="/wiki/download/attachments/16122726/Raw DNA-Seq Processing_27.png"/>
      <br> 
    </center>
  2. Pre-processing: This page contains the parameters for trimming sequencing reads and the default file with the adapter sequences to be cut. You can modify the filtration parameters here.

    HTML
    <center>
      <br>
      <img src="/wiki/download/attachments/16122726/Raw DNA-Seq Processing_38.png"/>
      <br> 
    </center>

    The following parameters are available:

    Quality threshold – trim all bases with quality score lower than the value of the parameter.

    Min length – the minimum length of a trimmed read. 

    Adapters – the list of adapter sequences to be trimmed by the cutadapt tool.

    Base quality – quality threshold for trimming.

    Reads length – reads that are too short are discarded by the filter.

    Trim both ends – whether to trim both ends of a read. Usually, set True for Sanger sequencing and False for NGS.

    Base quality for pairs – quality threshold for trimming.

    Reads length for pairs – reads that are too short are discarded by the filter.

    Trim both ends for pairs – whether to trim both ends of a read. Usually, set True for Sanger sequencing and False for NGS.

    Adapters – a FASTA file with one or multiple adapter sequences that were ligated to the 3' end. The adapter itself and anything that follows it are trimmed. If an adapter sequence ends with the '$' character, the adapter is anchored to the end of the read and is only found if it is a suffix of the read.

    Adapters for pairs – a FASTA file with one or multiple adapter sequences that were ligated to the 3' end. The adapter itself and anything that follows it are trimmed. If an adapter sequence ends with the '$' character, the adapter is anchored to the end of the read and is only found if it is a suffix of the read.

  3. Mapping: On this page you must specify the reference genome and can optionally modify the advanced parameters.

    HTML
    <center>
      <br>
      <img src="/wiki/download/attachments/16122726/Raw DNA-Seq Processing_49.png"/>
      <br> 
    </center>
  4. Post-processing: On this page you can modify post-processing parameters. 

    HTML
    <center>
      <br>
      <img src="/wiki/download/attachments/16122726/Raw DNA-Seq Processing_5.png"/>
      <br> 
    </center>
  5. Output data: On this page you must specify the output parameters.

    HTML
    <center>
      <br>
      <img src="/wiki/download/attachments/16122726/Raw DNA-Seq Processing_6.png"/>
      <br> 
    </center>
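
The pre-processing parameters described above (quality threshold, minimum length, adapters) can be illustrated with a minimal Python sketch. It is not the wizard's implementation. It assumes Phred+33 quality encoding, trims bases one by one while they are below the threshold, and matches adapters exactly, whereas tools such as cutadapt use a more elaborate quality rule and tolerate mismatches. All function names here are hypothetical.

```python
PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding


def trim_low_quality(seq, qual, threshold, both_ends=False):
    """Drop bases from the 3' end (and optionally the 5' end) while their
    quality score is lower than the threshold."""
    scores = [ord(c) - PHRED_OFFSET for c in qual]
    end = len(scores)
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    start = 0
    if both_ends:
        while start < end and scores[start] < threshold:
            start += 1
    return seq[start:end], qual[start:end]


def trim_adapter(seq, qual, adapter):
    """Remove a 3' adapter and everything that follows it.
    A trailing '$' anchors the adapter to the end of the read."""
    if adapter.endswith("$"):
        core = adapter[:-1]
        if core and seq.endswith(core):
            cut = len(seq) - len(core)
            return seq[:cut], qual[:cut]
        return seq, qual
    pos = seq.find(adapter)  # exact match only in this sketch
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


def process_read(seq, qual, threshold, min_length, adapters, both_ends=False):
    """Apply quality trimming, then adapter trimming, then the length filter.
    Returns None when the trimmed read is shorter than min_length."""
    seq, qual = trim_low_quality(seq, qual, threshold, both_ends)
    for adapter in adapters:
        seq, qual = trim_adapter(seq, qual, adapter)
    if len(seq) < min_length:
        return None
    return seq, qual
```

For example, `process_read("ACGTAC", "IIII##", 20, 3, [])` keeps the first four bases, because `#` encodes a quality of 2, which is below the threshold of 20.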

The wizard for paired-end reads has 5 pages.

  1. Input data: On this page you must input FASTQ file(s). 

    HTML
    <center>
      <br>
      <img src="/wiki/download/attachments/16122726/Raw DNA-Seq Processing_7.png"/>
      <br> 
    </center>
  2. Pre-processing: On this page you can modify filtration parameters. 

    HTML
    <center>
      <br>
      <img src="/wiki/download/attachments/16122726/Raw DNA-Seq Processing_8.png"/>
      <br> 
    </center>
  3. Mapping: On this page you must specify the reference genome and can optionally modify the advanced parameters.

    HTML
    <center>
      <br>
      <img src="/wiki/download/attachments/16122726/Raw DNA-Seq Processing_9.png"/>
      <br> 
    </center>

    The following parameters are available:

    Reference genome – path to the indexed reference genome.

    Number of threads – number of threads (-t).

    Min seed length – minimum seed length (-k).

    Band width – band width for banded alignment (-w).

    Dropoff – off-diagonal X-dropoff (-d).

    Internal seed length – look for internal seeds inside a seed longer than {-k} * FLOAT (-r).

    Skip seed threshold – skip seeds with more than INT occurrences (-c).

    Drop chain threshold – drop chains shorter than FLOAT fraction of the longest overlapping chain (-D).

    Rounds of mate rescues – perform at most INT rounds of mate rescues for each read (-m).

    Skip mate rescue – skip mate rescue (-S).

    Skip pairing – skip pairing; mate rescue is performed unless -S is also in use (-P).

    Matching score – score for a sequence match (-A).

    Mismatch penalty – penalty for a mismatch (-B).

    Gap open penalty – gap open penalty (-O).

    Gap extension penalty – gap extension penalty; a gap of size k costs {-O} + {-E}*k (-E).

    Penalty for clipping – penalty for clipping (-L).

    Penalty unpaired – penalty for an unpaired read pair (-U).

    Score threshold – minimum score to output (-T).
  4. Post-processing: On this page you can modify post-processing parameters. 

    HTML
    <center>
      <br>
      <img src="/wiki/download/attachments/16122726/Raw DNA-Seq Processing_10.png"/>
      <br> 
    </center>

    The following parameters are available:

    MAPQ threshold – minimum MAPQ quality score.

    Skip flag – skip alignments with the selected items. Select items in the combo box to configure the bit flag. Leave all items unselected to disable filtering by this parameter.

    Region – regions to filter. For BAM output only. Use chr2 to output the whole chr2, chr2:1000 to output the region of chr2 starting from position 1000, and chr2:1000-2000 to output the region of chr2 between 1000 and 2000, including the end point. To input multiple regions, separate them with spaces (e.g. chr1 chr2 chr3:1000-2000).

    For single-end reads – remove duplicates for single-end reads.

  5. Output data: On this page you must specify the output parameters.

    HTML
    <center>
      <br>
      <img src="/wiki/download/attachments/16122726/Raw DNA-Seq Processing_11.png"/>
      <br> 
    </center>
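
The flags quoted on the Mapping page (-t, -k, -w and so on) are `bwa mem` options, so the wizard values can be read as a command line. The Python sketch below assembles such a command, plus a filtering command for the Post-processing page. Two things are assumptions rather than facts from this page: that the post-processing filter behaves like `samtools view` (`-q` for the MAPQ threshold, `-F` for the skip flag, regions as trailing arguments), and the keyword names (`threads`, `min_seed_length`, ...), which are hypothetical. The sketch only builds argument lists and does not run anything.

```python
# Hypothetical keyword names mapped to the bwa mem flags quoted on the Mapping page.
VALUE_FLAGS = {
    "threads": "-t", "min_seed_length": "-k", "band_width": "-w",
    "dropoff": "-d", "internal_seed_length": "-r", "skip_seed_threshold": "-c",
    "drop_chain_threshold": "-D", "mate_rescue_rounds": "-m",
    "match_score": "-A", "mismatch_penalty": "-B", "gap_open_penalty": "-O",
    "gap_extension_penalty": "-E", "clipping_penalty": "-L",
    "unpaired_penalty": "-U", "score_threshold": "-T",
}
SWITCH_FLAGS = {"skip_mate_rescue": "-S", "skip_pairing": "-P"}

# Bit values from the SAM format specification.
SAM_FLAGS = {"unmapped": 0x4, "secondary": 0x100, "qc_fail": 0x200, "duplicate": 0x400}


def bwa_mem_command(reference, reads, **params):
    """Build a 'bwa mem [options] <reference> <reads...>' argument list."""
    args = ["bwa", "mem"]
    for name, value in params.items():
        if name in SWITCH_FLAGS:
            if value:  # boolean switches take no value
                args.append(SWITCH_FLAGS[name])
        elif name in VALUE_FLAGS:
            args += [VALUE_FLAGS[name], str(value)]
        else:
            raise ValueError(f"unknown parameter: {name}")
    return args + [reference, *reads]


def combine_flags(names):
    """OR together the selected SAM flag bits to get the skip flag value."""
    value = 0
    for name in names:
        value |= SAM_FLAGS[name]
    return value


def samtools_view_command(alignment, mapq_threshold=0, skip_flag=0, regions=()):
    """Build a 'samtools view' argument list with BAM output (assumed tool)."""
    args = ["samtools", "view", "-b"]
    if mapq_threshold:
        args += ["-q", str(mapq_threshold)]
    if skip_flag:  # 0 means no filtering by flag
        args += ["-F", str(skip_flag)]
    return args + [alignment, *regions]


def gap_cost(size, gap_open, gap_extension):
    """Cost of a gap of the given size: {-O} + {-E} * size."""
    return gap_open + gap_extension * size
```

Selecting "unmapped" and "duplicate" gives `combine_flags(["unmapped", "duplicate"]) == 1028`, which would be passed as `-F 1028`. Region strings such as `chr2:1000-2000` are appended unchanged after the alignment file name.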

 

...