==== BADASP ====

BADASP can produce different measures:

   * bad: similar to **Type II** functional divergence. The threshold to choose depends on whether we want to be stringent (i.e. BAD > 4) or more relaxed (BAD > 2).
   * badn = BADN variant of BAD: similar to **Type I** functional divergence, between __two__ groups.
   * badx = BADX variant of BAD: similar to **Type II** functional divergence, between __many__ groups.
   * ssc = Livingstone & Barton method (SSC) => does not use ancestral reconstruction. It was developed before BAD.
   * pdad = Property Difference After Duplication (PDAD) method
   * eta = Basic Evolutionary Trace Analysis (ETA) => Strictly conserved residues = 1, else = 0.
   * etaq = Quantitative variant of ETA

All these methods are described in detail in the manual, **chapter 3.1: Functional Specificity Prediction**.
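As a rough illustration of the two BAD cut-offs quoted in the list above, the sketch below labels a score as passing the stringent or the relaxed threshold. The function name ''classify_bad'' and the labels are hypothetical and are not part of BADASP; only the values 4 and 2 come from this page.

```python
def classify_bad(score, stringent=4.0, relaxed=2.0):
    """Label a BAD score using the two cut-offs quoted on this page."""
    if score > stringent:
        return "stringent"
    if score > relaxed:
        return "relaxed"
    return "below threshold"


# Illustrative scores only, not real BADASP output.
for value in (5.2, 3.1, 0.4):
    print(value, classify_bad(value))
```

A site that passes the stringent cut-off also passes the relaxed one, so the function reports the strictest label that applies.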

=== Installation ===

Download the badasp archive and unzip it:
[[http://www.southampton.ac.uk/~re1u06/software/badasp/index.html]]
<code>
wget http://www.southampton.ac.uk/~re1u06/software/downloads/badasp.zip
unzip badasp.zip
</code>

=== Analysis of the V-type proton ATPase 116 kDa subunit a gene family ===

We want to identify the residues that differ between **isoform 1** and **isoform 4** of the V-type proton ATPase 116 kDa subunit a.

First, take a quick look at the multiple alignment in Jalview (file "badasp_eg.fas" in the badasp folder).


Run **badasp** on the multiple alignment in FASTA format ("badasp_eg.fas") with the interactive mode activated (i=1):

<code>
cd ./badasp  # Folder of installation
python badasp.py seqin=badasp_eg.fas i=1
</code>
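If you prefer to launch the same command from a script, a minimal sketch could look like the following. The helper names ''build_badasp_command'' and ''run_badasp'' are hypothetical. The only options used are ''seqin'', ''i'' and ''nsfin'', which all appear in the session shown on this page. The run is still interactive, so it has to be started from a terminal.

```python
import subprocess


def build_badasp_command(seqin, interactive=1, nsfin=None):
    """Build the argument list for badasp.py from option=value pairs."""
    cmd = ["python", "badasp.py", f"seqin={seqin}", f"i={interactive}"]
    if nsfin is not None:
        cmd.append(f"nsfin={nsfin}")
    return cmd


def run_badasp(cmd, folder="./badasp"):
    """Run the command inside the installation folder (prompts stay interactive)."""
    return subprocess.run(cmd, cwd=folder, check=True)


cmd = build_badasp_command("badasp_eg.fas", nsfin="badasp_eg.nsf")
print(" ".join(cmd))
```

Calling ''run_badasp(cmd)'' would then start BADASP with the tree file already given, assuming the archive was unzipped into ''./badasp''.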

BADASP will ask for the associated tree, in Newick format ("badasp_eg.nsf"):
<code>
Looking for treefile badasp_eg.nsf.
Tree: ['seqin=badasp_eg.fas', 'i=1', 'nsfin=badasp_eg.nsf']  <ENTER> to continue

=> nsfin=badasp_eg.nsf

=> Press enter
</code>
BADASP displays the tree, which contains two groups of V-type proton ATPase 116 kDa subunit a sequences:
   * VPP1 = VPP Isoform 1 (8 genes)
   * NVL = VPP Isoform 4 (3 genes)
<code>
Rooted Tree (1000 bootstraps). Branch Lengths given. 21 nodes.  <ENTER> to continue.
=> Press enter

Tree is rooted at node 21 => perfect
=> Press 0, then enter.

 *** Tree Menu *** 

Sequence Data are already imported => we quit the menu.

Choice [default=Q]:  q 
Quit Tree Menu? (y/n) [default=Y]:  y
</code>
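To check a Newick file before loading it, the leaves can be counted with a few lines of Python. This is a minimal sketch: the function names and the toy tree are hypothetical, and the regular expression assumes simple unquoted leaf names. For a rooted, fully bifurcating tree, n leaves give 2n - 1 nodes, which is consistent with the 11 sequences and 21 nodes reported by BADASP here.

```python
import re


def leaf_names(newick):
    """Return leaf names: labels that directly follow '(' or ',' in the string."""
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


def rooted_binary_node_count(n_leaves):
    """Total number of nodes (leaves + internal) in a rooted bifurcating tree."""
    return 2 * n_leaves - 1


# Toy tree with bootstrap values and branch lengths, for illustration only.
toy = "((a:0.1,b:0.2)90:0.05,(c:0.1,(d:0.1,e:0.1)80:0.1)70:0.2);"
names = leaf_names(toy)
print(names, rooted_binary_node_count(len(names)))
```

Internal labels such as bootstrap values follow a closing parenthesis, so they are not picked up as leaves.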

The tree is now loaded and we need to define the two groups to analyse:

<code>
#*# Grouping Summary #*#

Currently 0 groups. (11 Orphans)
=> Press enter

# We need to split the tree at node 21 (the root),
# so we define two groups from its child nodes 20 (= VPP1 subfamily) and 19 (= VPP4 subfamily).
=> Press M, then enter.  # Manual grouping
(Tree displayed)
Choice? [default=Q]:  c  # We collapse nodes
Node [default=0]: 20
=> Type VPP1, then Press enter

Choice? [default=Q]:  c  # We collapse nodes
Node [default=0]:  19
=> Type VPP4, then Press enter

Choice? [default=Q]:  Q, then enter  # We quit the tree edit menu
Quit Tree Edit? (y/n) [default=Y]:  y

#*# Grouping Summary #*#
<ENTER> to continue.
Choice for Grouping? [default=K]: K, then enter
Keep Groups? (y/n) [default=Y]:  Y, then enter
Save groups? (y/n) [default=Y]:  y
Name of Groupfile? [default=badasp_eg.grp]:  enter
Write Group Names? (y/n) [default=N]:  N
Use badasp_eg for output filenames? (y/n) [default=Y]:  enter
Use these parameters? (y/n) [default=Y]:  enter
</code>

BADASP will now perform the computations. It first reconstructs the ancestral sequences at each node of the tree, using GASP (ref: http://dx.doi.org/10.1186/1471-2105-5-123).

<code>
Making Ancestral Sequences - Variable PAM Weighting
Reading PAM1 matrix from jones.pam

# Start computing
Saving Ancestral Sequences in badasp_eg.anc.fas...  <ENTER> to continue.
Method BADX needs query but none given. Drop BADX from specificity methods? (y/n) [default=Y]:  n
Method BADX needs query but none given. Use sequence 1 (vpp1_HUMAN/Q8N5G7)? (y/n) [default=N]: y 

Calculating ['BAD', 'BADN', 'BADX', 'SSC', 'PDAD', 'ETA', 'ETAQ'] scores... (849 residues) ...win(0)  <ENTER> to continue.
...Done!  <ENTER> to continue.
...win(0)  <ENTER> to continue. # (many times !)
</code>
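To give an idea of what reconstructing ancestral states means, here is a sketch of the bottom-up pass of Fitch parsimony for a single alignment column. This is **not** the GASP algorithm used by BADASP (which, according to the log above, relies on PAM weighting). It is a simpler, swapped-in technique shown only for intuition. The tree encoding (nested tuples) and the function name are hypothetical.

```python
def fitch_states(tree):
    """Return the set of candidate residues at the root of `tree` for one column.

    A leaf is a one-letter string; an internal node is a (left, right) tuple.
    Fitch rule: keep the intersection of the children sets if it is not empty,
    otherwise keep their union.
    """
    if isinstance(tree, str):
        return {tree}
    left, right = (fitch_states(child) for child in tree)
    common = left & right
    return common if common else left | right


# Toy column: one clade has A everywhere, the other has S everywhere.
column = ((("A", "A"), "A"), ("S", "S"))
print(sorted(fitch_states(column[0])), sorted(fitch_states(column[1])))
print(sorted(fitch_states(column)))
```

In this toy column each clade has a single unambiguous ancestral residue while the root stays ambiguous, which is the kind of pattern that methods comparing ancestors of two subfamilies look for.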

BADASP will now ask which kinds of output you want.
Let's say yes to everything and keep the default minimum values.

<code>
Output additional, filtered results? (y/n) [default=N]:  y
Name for partial results file? [default=badasp_eg.partial.badasp]: enter 

Output subfam 1 (VPP4) details (pos,aa & win)? (y/n) [default=Y]:  y
Output subfam 2 (VPP1) details (pos,aa & win)? (y/n) [default=Y]:  y
Output BAD results? (y/n) [default=Y]:  
Output BADN results? (y/n) [default=Y]:  y
Output BADX results? (y/n) [default=Y]:  y
Output SSC results? (y/n) [default=Y]:  y
Output PDAD results? (y/n) [default=Y]:  y
Output ETA results? (y/n) [default=Y]:  y
Output ETAQ results? (y/n) [default=Y]:  y
Output Info results? (y/n) [default=Y]:  y
Output PCon_Abs results? (y/n) [default=Y]:  y
Output PCon_Mean results? (y/n) [default=Y]:  y
Output QPCon_Mean results? (y/n) [default=Y]:  y
Output QPCon_Abs results? (y/n) [default=Y]:  y
Filter Rows by Results VALUES? (y/n) [default=Y]:  y
Min. value for BAD? [default=-6.708333]:  
=> New value = "-6.708333"? (y/n) [default=Y]:  
Min. value for BADN? [default=-6.708333]:  
=> New value = "-6.708333"? (y/n) [default=Y]:  
Min. value for BADX? [default=-3.500000]:  
=> New value = "-3.500000"? (y/n) [default=Y]:  
Min. value for SSC? [default=0.000000]:  
=> New value = "0.000000"? (y/n) [default=Y]:  
Min. value for PDAD? [default=-0.297619]:  
=> New value = "-0.297619"? (y/n) [default=Y]:  
Min. value for ETA? [default=0.000000]:  
=> New value = "0.000000"? (y/n) [default=Y]:  
Min. value for ETAQ? [default=0.000000]:  
=> New value = "0.000000"? (y/n) [default=Y]:  
Min. value for Info? [default=0.424111]:  
=> New value = "0.424111"? (y/n) [default=Y]:  
Min. value for PCon_Abs? [default=1.000000]:  
=> New value = "1.000000"? (y/n) [default=Y]:  
Min. value for PCon_Mean? [default=5.000000]:  
=> New value = "5.000000"? (y/n) [default=Y]:  
Min. value for QPCon_Mean? [default=9.375000]:  
=> New value = "9.375000"? (y/n) [default=Y]:  
Min. value for QPCon_Abs? [default=0.000000]:  
=> New value = "0.000000"? (y/n) [default=Y]:  

BADASP Partial Results Output (badasp_eg.partial.badasp) ... Done!

#LOG    00:23:06        BADASP V:1.3 End: Thu Sep  6 13:59:24 2012
</code>


=== Analysis ===

Open the results file ("badasp_eg.partial.badasp") in your spreadsheet (or copy and paste its content).

The columns are separated by a tab.

Color the "BAD", "BADN" and "BADX" columns with conditional formatting, for values > 3.
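The same filtering can be done without a spreadsheet. The sketch below reads a tab-separated table and keeps the rows where at least one of the score columns exceeds 3. The function name, the "Pos" column and the sample values are made up for illustration. The header of the real file may differ and it may contain non-numeric cells, so adapt the column names before using it.

```python
import csv
import io


def rows_above(handle, columns=("BAD", "BADN", "BADX"), threshold=3.0):
    """Yield rows in which at least one of `columns` exceeds `threshold`."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if any(float(row[col]) > threshold for col in columns):
            yield row


# Made-up sample; a real results file has more columns and rows.
sample = (
    "Pos\tBAD\tBADN\tBADX\n"
    "1\t0.5\t1.0\t0.2\n"
    "2\t4.5\t2.0\t3.5\n"
    "3\t1.0\t3.2\t0.0\n"
)
hits = list(rows_above(io.StringIO(sample)))
print([row["Pos"] for row in hits])

# With a real file (assuming its header contains the three column names):
# with open("badasp_eg.partial.badasp", newline="") as handle:
#     for row in rows_above(handle):
#         print(row)
```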


== In Jalview: ==

Load multiple alignment: badasp_eg.fas

Load tree: badasp_eg.nsf

Put a vertical line at the root of the tree to split it into two groups.

Some sites are interesting, e.g.:
   * Position 3 BAD
   * Position 762 BAD
   * Position 223 BADX

There are only three genes in the VPP4 group, which explains why the BADX scores are very close to the BAD scores.