Suppose you have several genomes and want to characterize them. One way to do this is to build a table showing which genes are present in each genome and which are absent.

  1. Create a local BLAST database from your genome sequence or contigs, one database per genome.
  2. Create a file with the sequences of the genes you want to explore. This file is the input file for the workflow.
  3. Set the location and name of the BLAST database you created for the first genome.
  4. Set up the output files: the report location and the output file with the BLAST-annotated sequences. You can delete the "Write Sequence" element if you do not need the output sequences.
  5. Run the workflow.
  6. Run the workflow again with the same input and output files, changing the BLAST database for each remaining genome.
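
Step 1 can also be done outside UGENE. The following is a minimal sketch, assuming the NCBI BLAST+ command-line tools are installed and that the genomes are nucleotide FASTA files. The function names, file names, and directory layout are hypothetical. The sketch builds one `makeblastdb` command per genome and only executes the commands when `run=True`.

```python
import subprocess
from pathlib import Path


def makeblastdb_command(genome_fasta, db_dir):
    """Build the makeblastdb command for one genome (one database per genome)."""
    fasta = Path(genome_fasta)
    # Use the FASTA file name without extension as the database base name.
    out_prefix = Path(db_dir) / fasta.stem
    return [
        "makeblastdb",
        "-in", str(fasta),
        "-dbtype", "nucl",  # assumption: nucleotide genome sequences
        "-out", str(out_prefix),
    ]


def build_databases(genome_fastas, db_dir, run=False):
    """Return the commands for all genomes; execute them only if run=True."""
    commands = [makeblastdb_command(f, db_dir) for f in genome_fastas]
    if run:
        Path(db_dir).mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            # Requires makeblastdb (NCBI BLAST+) to be available on PATH.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical file names, printed as a dry run.
    for cmd in build_databases(["genome_A.fasta", "genome_B.fasta"], "blastdb"):
        print(" ".join(cmd))
```

With this layout, the directory passed as `db_dir` and the FASTA base name correspond to the "Database Path" and "Database Name" parameters of the workflow.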

As a result, you get a report file with "Yes" and "No" fields. A "Yes" answer means that the gene is in the genome. A "No" answer MIGHT mean that the gene is absent from the genome. It is a good idea to check every "No" sequence in the annotated files: open the file and find the sequence named after the gene that has a "No" result.
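
The per-genome answers can be combined into the gene-by-genome table mentioned at the start. The sketch below is illustrative only: the report file format is not described here, so it assumes the "Yes"/"No" answers have already been read into a dictionary per genome. The function names and the "?" placeholder for a missing answer are hypothetical.

```python
def presence_table(results_by_genome):
    """Build a gene-by-genome table.

    results_by_genome maps a genome name to a dict of
    gene name -> "Yes" or "No" (assumed already parsed from the report).
    """
    genomes = sorted(results_by_genome)
    genes = sorted({gene for res in results_by_genome.values() for gene in res})
    rows = [["gene"] + genomes]
    for gene in genes:
        # "?" marks a gene with no answer recorded for that genome.
        rows.append([gene] + [results_by_genome[g].get(gene, "?") for g in genomes])
    return rows


def format_table(rows):
    """Render the rows as tab-separated text."""
    return "\n".join("\t".join(row) for row in rows)


if __name__ == "__main__":
    # Hypothetical results for two genomes.
    results = {
        "genome_A": {"geneX": "Yes", "geneY": "No"},
        "genome_B": {"geneX": "Yes", "geneY": "Yes"},
    }
    print(format_table(presence_table(results)))
```

Every "No" cell in such a table is a candidate for manual inspection in the annotated sequence file.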

If you haven't used the workflow samples in UGENE before, look at the "How to Use Sample Workflows" section of the documentation.

Workflow Sample Location

The workflow sample "Gene-by-gene Approach for Characterization of Genomes" can be found in the "Scenarios" section of the Workflow Designer samples.

Workflow Image

The workflow looks as follows:

<center>
  <br>
  <img src="/wiki/download/attachments/16122734/Gene-by-gene Approach for Characterization of Genomes.png"/>
  <br> 
</center>

Workflow Wizard

The wizard has 3 pages.

  1. Input sequence(s): On this page, you must specify the input sequence(s).

    <center>
      <br>
      <img src="/wiki/download/attachments/16122734/Gene-by-gene Approach for Characterization of Genomes_1.png"/>
      <br> 
    </center>
  2. BLAST search: On this page you can modify BLAST search parameters. 

    <center>
      <br>
      <img src="/wiki/download/attachments/16122734/Gene-by-gene Approach for Characterization of Genomes_2.png"/>
      <br> 
    </center>

    The following parameters are available:

    Search type: Select type of BLAST searches.
    Database Path: Path with database files.
    Database Name: Base name for BLAST DB files.
    Expected value: This setting specifies the statistical significance threshold for reporting matches against database sequences.
    Annotate as: Name for annotations.
    Gapped alignment: Perform gapped alignment.
    Tool Path: External tool path.
    BLAST output: Location of BLAST output file.
    BLAST output type: Type of BLAST output file.
    Temporary directory: Directory for temporary files.
    Gap costs: Cost to create and extend a gap in an alignment.
    Match scores: Reward and penalty for matching and mismatching bases.
  3. Output data: On this page you can modify output parameters. 

    <center>
      <br>
      <img src="/wiki/download/attachments/16122734/Gene-by-gene Approach for Characterization of Genomes_3.png"/>
      <br> 
    </center>