Differences

This shows you the differences between two versions of the page.

--- tutorials:eccb_t2_badasp [2012/09/06 16:27] – created romainstuder
+++ tutorials:eccb_t2_badasp [2012/09/08 16:15] (current) – romainstuder
@@ Line 1: / Line 1: @@
+==== BADASP ====
-cd /home/bsm4/rstuder/Dropbox/ECCB2012/Tutorial/badasp  # Folder of installation
+BADASP can produce different measures:
-# Run badasp
+   * bad: similar to **Type II** functional divergence. The threshold to choose depends on whether we want to be stringent (e.g. BAD > 4) or more relaxed (BAD > 2).
-<code>python badasp.py seqin=badasp_eg.fas i=1</code>
+   * badn = BADN variant of BAD: similar to **Type I** functional divergence, between __two__ groups.
+   * badx = BADX variant of BAD: similar to **Type II** functional divergence, between __many__ groups.
+   * ssc = Livingstone & Barton method (SSC) => doesn't use ancestral reconstruction. Was developed prior to BAD.
+   * pdad = Property Difference After Duplication (PDAD) method
+   * eta = Basic Evolutionary Trace Analysis (ETA) => Strictly conserved residues = 1, else = 0.
+   * etaq = Quantitative variant of ETA
+All these methods are described in detail in the manual, **chapter 3.1: Functional Specificity Prediction**.
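The two BAD cut-offs mentioned in the list above (stringent: BAD > 4, relaxed: BAD > 2) can be expressed as a small helper. This is only a sketch: the function name `bad_tier` and the tier labels are hypothetical and not part of BADASP, and the example scores are made up for illustration.

```python
# Hypothetical helper, not part of BADASP: sorts a BAD score into the
# two stringency levels quoted in the text.
STRINGENT_CUTOFF = 4.0  # "BAD > 4" (stringent)
RELAXED_CUTOFF = 2.0    # "BAD > 2" (relaxed)


def bad_tier(score):
    """Return the stringency tier that a BAD score falls into."""
    if score > STRINGENT_CUTOFF:
        return "stringent"
    if score > RELAXED_CUTOFF:
        return "relaxed"
    return "below threshold"


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    for score in (5.2, 3.0, 1.1):
        print(score, bad_tier(score))
```

Both comparisons are strict, so a score of exactly 4 only passes the relaxed cut-off.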
+=== Installation ===
+Download the badasp archive and unzip it:
+[[http://www.southampton.ac.uk/~re1u06/software/badasp/index.html]]
+<code>
+wget http://www.southampton.ac.uk/~re1u06/software/downloads/badasp.zip
+unzip badasp.zip
+</code>
+=== Analysis of the V-type proton ATPase 116 kDa subunit a gene family ===
+We want to identify the residues that differ between **isoform 1** and **isoform 4** of the V-type proton ATPase 116 kDa subunit a.
+First, briefly inspect the multiple alignment in Jalview (file "badasp_eg.fas" in the badasp folder).
+Execute **badasp** by importing the multiple alignment in FASTA format ("badasp_eg.fas") and activating the interactive mode (i=1):
+<code>
+cd ./badasp  # Folder of installation
+python badasp.py seqin=badasp_eg.fas i=1</code>
+Badasp will ask for the associated tree, in Newick format ("badasp_eg.nsf"):
 <code>
-# Ask for a tree
 Looking for treefile badasp_eg.nsf.
 Tree: ['seqin=badasp_eg.fas', 'i=1', 'nsfin=badasp_eg.nsf']  <ENTER> to continue
@@ Line 12: / Line 43: @@
 => Press enter
+</code>
 Display Tree, with two groups of sequences:
 V-type proton ATPase 116 kDa subunit a
-- VPP1 = VPP Isoform 1 (8 genes)
+   * VPP1 = VPP Isoform 1 (8 genes)
-- NVL = VPP Isoform 4 (3 genes)
+   * NVL = VPP Isoform 4 (3 genes)
+<code>
 Rooted Tree (1000 bootstraps). Branch Lengths given. 21 nodes.  <ENTER> to continue.
 => Press enter
 Tree is rooted at node 21 => perfect
@@ Line 31: / Line 61: @@
 Choice [default=Q]:  q
 Quit Tree Menu? (y/n) [default=Y]:  y
+</code>
+The tree is now loaded and we need to define the two groups to analyse:
+<code>
 #*# Grouping Summary #*#
@@ Line 37: / Line 71: @@
 => Press enter
-We need to split the tree on the node 21, so we need to define two groups from the children nodes 20 and 19.
+# We need to split the tree on the node 21,
+# so we need to define two groups from the children nodes 20 (= VPP1 subfamily) and 19 (= VPP4 subfamily).
 => Press M, then enter.  # Manual grouping
 (Tree displayed)
-Choice? [default=Q]:  c  # We collapse node
+Choice? [default=Q]:  c  # We collapse nodes
 Node [default=0]: 20
 => Type VPP1, then Press enter
-Choice? [default=Q]:  c  # We collapse node
+Choice? [default=Q]:  c  # We collapse nodes
 Node [default=0]:  19
 => Type VPP4, then Press enter
@@ Line 60: / Line 95: @@
 Use badasp_eg for output filenames? (y/n) [default=Y]:  enter
 Use these parameters? (y/n) [default=Y]:  enter
+</code>
+Badasp will now perform some computations. It will reconstruct the ancestral sequences at each node of the tree, using GASP (ref: http://dx.doi.org/10.1186/1471-2105-5-123).
+<code>
 Making Ancestral Sequences - Variable PAM Weighting
 Reading PAM1 matrix from jones.pam
@@ Line 73: / Line 111: @@
 ...Done!  <ENTER> to continue.
 ...win(0)  <ENTER> to continue. # (many times !)
+</code>
+Now, Badasp will ask which kinds of output you want.
+Let's say yes to everything.
+<code>
 Output additional, filtered results? (y/n) [default=N]:  y
 Name for partial results file? [default=badasp_eg.partial.badasp]: enter
@@ Line 84: / Line 128: @@
 Output PDAD results? (y/n) [default=Y]:  y
 Output ETA results? (y/n) [default=Y]:  y
 Output ETAQ results? (y/n) [default=Y]:  y
 Output Info results? (y/n) [default=Y]:  y
 Output PCon_Abs results? (y/n) [default=Y]:  y
 Output PCon_Mean results? (y/n) [default=Y]:  y
 Output QPCon_Mean results? (y/n) [default=Y]:  y
 Output QPCon_Abs results? (y/n) [default=Y]:  y
 Filter Rows by Results VALUES? (y/n) [default=Y]:  y
 Min. value for BAD? [default=-6.708333]:
 => New value = "-6.708333"? (y/n) [default=Y]:
 Min. value for BADN? [default=-6.708333]:
 => New value = "-6.708333"? (y/n) [default=Y]:
 Min. value for BADX? [default=-3.500000]:
 => New value = "-3.500000"? (y/n) [default=Y]:
 Min. value for SSC? [default=0.000000]:
 => New value = "0.000000"? (y/n) [default=Y]:
 Min. value for PDAD? [default=-0.297619]:
 => New value = "-0.297619"? (y/n) [default=Y]:
+ Min. value for ETA? [default=0.000000]:
-Min. value for ETA? [default=0.000000]:
 => New value = "0.000000"? (y/n) [default=Y]:
+ Min. value for ETAQ? [default=0.000000]:
-Min. value for ETAQ? [default=0.000000]:
 => New value = "0.000000"? (y/n) [default=Y]:
 Min. value for Info? [default=0.424111]:
 => New value = "0.424111"? (y/n) [default=Y]:
 Min. value for PCon_Abs? [default=1.000000]:
 => New value = "1.000000"? (y/n) [default=Y]:
 Min. value for PCon_Mean? [default=5.000000]:
 => New value = "5.000000"? (y/n) [default=Y]:
 Min. value for QPCon_Mean? [default=9.375000]:
 => New value = "9.375000"? (y/n) [default=Y]:
 Min. value for QPCon_Abs? [default=0.000000]:
 => New value = "0.000000"? (y/n) [default=Y]:
 BADASP Partial Results Output (badasp_eg.partial.badasp) ... Done!
 #LOG    00:23:06        BADASP V:1.3 End: Thu Sep  6 13:59:24 2012
+</code>
-### Analysis
+=== Analysis ===
 Open the file in your spreadsheet (or cut&paste).
@@ Line 161: / Line 174: @@
 Color the "BAD", "BADN" and "BADX" columns with conditional formatting, for values > 3.
-</code>
 == In Jalview: ==
@@ Line 172: / Line 183: @@
 Put a vertical line at the root of the tree to split the tree in two.
+Some sites are interesting, i.e.:
+   * Position 3 BAD
+   * Position 762 BAD
+   * Position 223 BADX
-Positon 3 BAD
+There are only three genes in the VPP4 group, which explains why the BADX scores are very close to the BAD scores.
-Position 762 BAD
-Position 223 BADX
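
The spreadsheet step in the Analysis section (highlighting score columns with values > 3) could also be done in a script. The sketch below rests on several assumptions: that the results file is a delimited text table with a header row, that the score columns are named "BAD", "BADN" and "BADX" (assuming BADX is the third column meant), and that there is a position column, here hypothetically called "Pos". Check the header of your own results file and adjust the names and delimiter accordingly. The demo table is made up and is not real BADASP output.

```python
import csv
import io

SCORE_COLUMNS = ("BAD", "BADN", "BADX")  # assumed column names
CUTOFF = 3.0                             # "value > 3" from the text


def flag_sites(handle, delimiter="\t"):
    """Yield (row, flagged_columns) for rows where any score exceeds CUTOFF."""
    for row in csv.DictReader(handle, delimiter=delimiter):
        flagged = []
        for column in SCORE_COLUMNS:
            try:
                if float(row[column]) > CUTOFF:
                    flagged.append(column)
            except (KeyError, TypeError, ValueError):
                # Column missing or not numeric: skip it rather than fail.
                continue
        if flagged:
            yield row, flagged


if __name__ == "__main__":
    # Made-up table for illustration only; "Pos" is a hypothetical column name.
    demo = "Pos\tBAD\tBADN\tBADX\n1\t0.5\t1.0\t0.2\n2\t3.5\t1.0\t4.1\n"
    for row, flagged in flag_sites(io.StringIO(demo)):
        print(row["Pos"], ", ".join(flagged))
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` demo, for example `flag_sites(open("badasp_eg.partial.badasp"))`, provided that file matches the assumptions above.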
