| tutorials:eccb_t2_badasp [2012/09/07 09:31] – romainstuder | tutorials:eccb_t2_badasp [2012/09/08 16:15] (current) – romainstuder | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ==== BADASP ==== | ==== BADASP ==== | ||
| + | |||
| + | BADASP can produce different measures: | ||
| + | |||
| + | * bad: similar to **Type II** functional divergence. The threshold to choose depends on whether we want to be stringent (i.e. BAD > 4) or more relaxed (BAD > 2). | ||
| + | * badn = BADN variant of BAD: similar to **Type I** functional divergence, between __two__ groups. | ||
| + | * badx = BADX variant of BAD: similar to **Type II** functional divergence, between __many__ groups. | ||
| + | * ssc = Livingstone & Barton method (SSC) => doesn' | ||
| + | * pdad = Property Difference After Duplication (PDAD) method | ||
| + | * eta = Basic Evolutionary Trace Analysis (ETA) => Strictly conserved residues = 1, else = 0. | ||
| + | * etaq = Quantitative variant of ETA | ||
| + | |||
| + | All these methods are described in detail in the manual, **chapter 3.1: Functional Specificity Prediction**. | ||
| === Installation === | === Installation === | ||
| Download the badasp archive and unzip it: | Download the badasp archive and unzip it: | ||
| + | [[http:// | ||
| < | < | ||
| - | http:// | + | wget http:// |
| - | http:// | + | |
| unzip badasp.zip | unzip badasp.zip | ||
| </ | </ | ||
| - | === Execution | + | === Analysis of the V-type proton ATPase 116 kDa subunit a gene family |
| + | |||
| + | We want to identify the residues that differ between **isoform 1** and **isoform 4** of the V-type proton ATPase 116 kDa subunit a. | ||
| + | |||
| + | First, briefly visualise the multiple alignment in Jalview. (File " | ||
| - | < | ||
| - | cd ./ | ||
| - | </ | ||
| Execute **badasp** by importing the multiple alignment in FASTA format (" | Execute **badasp** by importing the multiple alignment in FASTA format (" | ||
| - | < | + | |
| + | < | ||
| + | cd ./ | ||
| + | python badasp.py seqin=badasp_eg.fas i=1</ | ||
| Badasp will ask for the associated tree, in Newick format (" | Badasp will ask for the associated tree, in Newick format (" | ||
| Line 27: | Line 43: | ||
| => Press enter | => Press enter | ||
| + | </ | ||
| Display Tree, with two groups of sequences: | Display Tree, with two groups of sequences: | ||
| V-type proton ATPase 116 kDa subunit a | V-type proton ATPase 116 kDa subunit a | ||
| - | - VPP1 = VPP Isoform 1 (8 genes) | + | |
| - | - NVL = VPP Isoform 4 (3 genes) | + | * NVL = VPP Isoform 4 (3 genes) |
| + | < | ||
| Rooted Tree (1000 bootstraps). Branch Lengths given. 21 nodes. | Rooted Tree (1000 bootstraps). Branch Lengths given. 21 nodes. | ||
| => Press enter | => Press enter | ||
| Line 47: | Line 63: | ||
| </ | </ | ||
| - | We have a tree and we need to define the two groups to analyse: | + | The tree is now loaded |
| < | < | ||
| Line 55: | Line 71: | ||
| => Press enter | => Press enter | ||
| - | # We need to split the tree at node 21, so we need to define two groups from the child nodes 20 (= VPP1 subfamily) and 19 (= VPP4 subfamily). | + | # We need to split the tree at node 21, |
| + | # so we need to define two groups from the child nodes 20 (= VPP1 subfamily) and 19 (= VPP4 subfamily). | ||
| => Press M, then enter. | => Press M, then enter. | ||
| (Tree displayed) | (Tree displayed) | ||
| - | Choice? [default=Q]: | + | Choice? [default=Q]: |
| Node [default=0]: | Node [default=0]: | ||
| => Type VPP1, then Press enter | => Type VPP1, then Press enter | ||
| - | Choice? [default=Q]: | + | Choice? [default=Q]: |
| Node [default=0]: | Node [default=0]: | ||
| => Type VPP4, then Press enter | => Type VPP4, then Press enter | ||
| Line 80: | Line 97: | ||
| </ | </ | ||
| - | Badasp will now perform some computation. It will reconstruct the ancestral sequences at each node of the tree, using the [[http: | + | Badasp will now perform some computations. It will reconstruct the ancestral sequences at each node of the tree, using GASP (ref: http: |
| + | < | ||
| Making Ancestral Sequences - Variable PAM Weighting | Making Ancestral Sequences - Variable PAM Weighting | ||
| Reading PAM1 matrix from jones.pam | Reading PAM1 matrix from jones.pam | ||
| Line 149: | Line 166: | ||
| - | === Analysis | + | === Analysis |
| Open the file in your spreadsheet (or cut& | Open the file in your spreadsheet (or cut& | ||
| Line 166: | Line 183: | ||
| Put a vertical line at the root of the tree to split the tree in two. | Put a vertical line at the root of the tree to split the tree in two. | ||
| - | Position 3 BAD | + | Some sites are interesting, |
| - | Position 762 BAD | + | |
| - | Position 223 BADX | + | * Position 762 BAD |
| + | * Position 223 BADX | ||
| + | |||
| + | There are only three genes in the VPP4 group, which explains why the BADX scores are very close to the BAD scores. | ||
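
The spreadsheet step can also be scripted. The sketch below shows one way to apply the two thresholds mentioned for the BAD score (BAD > 4 for a stringent selection, BAD > 2 for a relaxed one) to a tab-delimited table. It is only an illustration: the column names `Pos`, `BAD` and `BADX`, the helper `select_sites` and the example values are assumptions made for this sketch, not the documented BADASP output format, so adjust them to the header of your own results file.

```python
import csv
import io

# Thresholds quoted in the tutorial: BAD > 4 is stringent, BAD > 2 is relaxed.
STRINGENT = 4.0
RELAXED = 2.0


def select_sites(rows, column, threshold):
    """Return (position, score) pairs whose score in `column` exceeds `threshold`."""
    hits = []
    for row in rows:
        try:
            position = int(row["Pos"])   # "Pos" is an assumed column name
            score = float(row[column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric value
        if score > threshold:
            hits.append((position, score))
    # Highest scores first, as when sorting the column in a spreadsheet.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up example data, NOT real BADASP output.
EXAMPLE = "Pos\tBAD\tBADX\n1\t0.5\t0.4\n2\t4.6\t4.5\n3\t2.7\t2.9\n4\t-1.0\t-0.8\n"
rows = list(csv.DictReader(io.StringIO(EXAMPLE), delimiter="\t"))

stringent_sites = select_sites(rows, "BAD", STRINGENT)
relaxed_sites = select_sites(rows, "BAD", RELAXED)
print("Stringent (BAD > 4):", stringent_sites)
print("Relaxed (BAD > 2):", relaxed_sites)
```

To use it on a real results file, replace `io.StringIO(EXAMPLE)` with an opened file handle and pass `"BADX"` as the column to rank sites by the BADX score instead.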