Differences

This shows you the differences between two versions of the page.

tutorials:eccb_t2_badasp [2012/09/07 09:34]
romainstuder
tutorials:eccb_t2_badasp [2012/09/08 16:15] (current)
romainstuder
Line 1: Line 1:
 ==== BADASP ==== ==== BADASP ====
 +
 +BADASP can produce different measures:
 +
 +   * bad: similar to **Type II** functional divergence. The threshold depends on whether we want to be stringent (i.e. BAD > 4) or more relaxed (BAD > 2).
 +   * badn = BADN variant of BAD: similar to **Type I** functional divergence, between __two__ groups.
 +   * badx = BADX variant of BAD: similar to **Type II** functional divergence, between __many__ groups.
 +   * ssc = Livingstone & Barton method (SSC) => doesn't use ancestral reconstruction. Was developed prior to BAD.
 +   * pdad = Property Difference After Duplication (PDAD) method
 +   * eta = Basic Evolutionary Trace Analysis (ETA) => Strictly conserved residues = 1, else = 0.
 +   * etaq = Quantitative variant of ETA
 +
 +All these methods are described in detail in the manual, **chapter 3.1: Functional Specificity Prediction**.
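
The two BAD cutoffs mentioned above (BAD > 4 for a stringent selection, BAD > 2 for a relaxed one) can be applied to a table of per-site scores with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, and the positions and scores in `example_scores` are made up for illustration; they are not BADASP output.

```python
# Illustrative only: positions and scores below are invented, not BADASP results.
STRINGENT_CUTOFF = 4.0  # "BAD > 4" from the text
RELAXED_CUTOFF = 2.0    # "BAD > 2" from the text


def classify_bad(score):
    """Return which cutoff, if any, a BAD score passes."""
    if score > STRINGENT_CUTOFF:
        return "stringent"
    if score > RELAXED_CUTOFF:
        return "relaxed"
    return "below both cutoffs"


def sites_above(scores, cutoff):
    """Return the sorted positions whose score is strictly above the cutoff."""
    return sorted(pos for pos, score in scores.items() if score > cutoff)


# Hypothetical alignment position -> BAD score
example_scores = {10: 4.6, 25: 2.8, 40: 1.1, 57: 4.0}

print(sites_above(example_scores, STRINGENT_CUTOFF))
print(sites_above(example_scores, RELAXED_CUTOFF))
```

The comparison is strict on purpose, so a site scoring exactly 4 only passes the relaxed cutoff.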
  
 === Installation === === Installation ===
  
 Download the badasp archive and unzip it: Download the badasp archive and unzip it:
-http://www.southampton.ac.uk/~re1u06/software/badasp/index.html+[[http://www.southampton.ac.uk/~re1u06/software/badasp/index.html]]
 <code> <code>
-wget [[http://www.southampton.ac.uk/~re1u06/software/downloads/badasp.zip]]+wget http://www.southampton.ac.uk/~re1u06/software/downloads/badasp.zip
 unzip badasp.zip unzip badasp.zip
 </code> </code>
  
-=== Execution ===+=== Analysis of the V-type proton ATPase 116 kDa subunit a gene family === 
 + 
 +We want to identify the residues that differ between **isoform 1** and **isoform 4** of the V-type proton ATPase 116 kDa subunit a. 
 + 
 +First, briefly inspect the multiple alignment in Jalview (file "badasp_eg.fas" in the badasp folder).
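
Besides looking at the alignment in Jalview, a quick scripted check can confirm that the FASTA file really is an alignment, meaning every sequence has the same length once gaps are included. The sketch below is an assumption-laden helper, not part of BADASP: `read_fasta` and `alignment_lengths` are hypothetical names, and the parser only handles plain FASTA with `>` header lines.

```python
import os


def read_fasta(path):
    """Parse a FASTA file into a dict of header (without '>') -> sequence."""
    sequences = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                sequences[name] = []
            elif name is not None:
                sequences[name].append(line)
    return {key: "".join(parts) for key, parts in sequences.items()}


def alignment_lengths(sequences):
    """Return the set of sequence lengths; an alignment should yield exactly one."""
    return {len(seq) for seq in sequences.values()}


if __name__ == "__main__":
    path = "badasp_eg.fas"  # example file named in the tutorial
    if os.path.exists(path):
        seqs = read_fasta(path)
        print(len(seqs), "sequences; lengths:", alignment_lengths(seqs))
```

If `alignment_lengths` returns more than one value, the file is probably not aligned and should be checked before it is passed to badasp.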
  
-<code> 
-cd ./badasp  # Folder of installation 
-</code> 
  
 Execute **badasp** by importing the multiple alignment in FASTA format ("badasp_eg.fas") and activating the interactive mode (i=1): Execute **badasp** by importing the multiple alignment in FASTA format ("badasp_eg.fas") and activating the interactive mode (i=1):
-<code>python badasp.py seqin=badasp_eg.fas i=1</code>+ 
 +<code> 
 +cd ./badasp  # Folder of installation 
 +python badasp.py seqin=badasp_eg.fas i=1</code>
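
The same call can be launched from a small Python wrapper, which is convenient when the command has to be repeated for several alignments. This is a sketch under assumptions: `build_badasp_command` and `run_badasp` are hypothetical helper names, only the `seqin=` and `i=` options shown above are used, and which interpreter `python` resolves to depends on your environment.

```python
import subprocess


def build_badasp_command(seqin, interactive=1, script="badasp.py"):
    """Build the argument list for the command shown in the tutorial."""
    return ["python", script, "seqin={}".format(seqin), "i={}".format(interactive)]


def run_badasp(seqin, workdir="./badasp"):
    """Run badasp from its installation folder.

    The terminal stays attached, so the interactive prompts (i=1)
    can still be answered by hand.
    """
    return subprocess.run(build_badasp_command(seqin), cwd=workdir, check=True)


if __name__ == "__main__":
    print(" ".join(build_badasp_command("badasp_eg.fas")))
```

Calling `run_badasp("badasp_eg.fas")` then starts the interactive session described in the following steps.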
  
 Badasp will ask for the associated tree, in newick format ("badasp_eg.nsf"): Badasp will ask for the associated tree, in newick format ("badasp_eg.nsf"):
Line 27: Line 43:
  
 => Press enter => Press enter
 +</code>
 Display Tree, with two groups of sequences: Display Tree, with two groups of sequences:
 V-type proton ATPase 116 kDa subunit a V-type proton ATPase 116 kDa subunit a
-VPP1 = VPP Isoform 1 (8 genes) +   * VPP1 = VPP Isoform 1 (8 genes) 
-NVL = VPP Isoform 4 (3 genes) +   * NVL = VPP Isoform 4 (3 genes) 
 +<code>
 Rooted Tree (1000 bootstraps). Branch Lengths given. 21 nodes.  <ENTER> to continue. Rooted Tree (1000 bootstraps). Branch Lengths given. 21 nodes.  <ENTER> to continue.
 => Press enter => Press enter
Line 47: Line 63:
 </code> </code>
  
-We have a tree and we need to define the two groups to analyse:+The tree is now loaded and we need to define the two groups to analyse:
  
 <code> <code>
Line 55: Line 71:
 => Press enter => Press enter
  
-# We need to split the tree on the node 21, so we need to define two groups from the children nodes 20 (= VPP1 subfamily) and 19 (= VPP4 subfamily) .+# We need to split the tree on the node 21, 
 +# so we need to define two groups from the child nodes 20 (= VPP1 subfamily) and 19 (= VPP4 subfamily).
 => Press M, then enter.  # Manual grouping => Press M, then enter.  # Manual grouping
 (Tree displayed) (Tree displayed)
-Choice? [default=Q]:  c  # We collapse node+Choice? [default=Q]:  c  # We collapse nodes
 Node [default=0]: 20 Node [default=0]: 20
 => Type VPP1, then Press enter => Type VPP1, then Press enter
  
-Choice? [default=Q]:  c  # We collapse node+Choice? [default=Q]:  c  # We collapse nodes
 Node [default=0]:  19 Node [default=0]:  19
 => Type VPP4, then Press enter => Type VPP4, then Press enter
Line 80: Line 97:
 </code> </code>
  
-Badasp will now perform some computation. It will reconstruct the ancestral sequences at each node of the tree, using the [[http:dx.doi.org/10.1186/1471-2105-5-123|GASP (Gapped Ancestral Sequence Predictionmethod]]: +Badasp will now perform some computations. It will reconstruct the ancestral sequences at each node of the tree, using GASP (ref: http://dx.doi.org/10.1186/1471-2105-5-123):
  
 +<code>
 Making Ancestral Sequences - Variable PAM Weighting Making Ancestral Sequences - Variable PAM Weighting
 Reading PAM1 matrix from jones.pam Reading PAM1 matrix from jones.pam
Line 149: Line 166:
  
  
-=== Analysis+=== Analysis ===
  
 Open the file in your spreadsheet (or cut & paste). Open the file in your spreadsheet (or cut & paste).
Line 166: Line 183:
 Put a vertical line at the root of the tree to split the tree in two. Put a vertical line at the root of the tree to split the tree in two.
  
-Positon 3 BAD +Some sites are interesting, e.g.: 
-Position 762 BAD +   * Position 3 BAD 
-Position 223 BADX+   * Position 762 BAD 
 +   * Position 223 BADX 
 + 
 +There are only three genes in the VPP4 group, which explains why the BADX scores are very close to the BAD scores.
  