TDG09
Installation
Download from here: http://www.homepages.ucl.ac.uk/~ucbtaut/
http://bit.ly/fVbnDr ⇒ tdg09.zip
unzip tdg09.zip
Detection of sites in the Influenza Virus Hemagglutinin HA1 chain
cd ./Tutorial/tdg09 # Folder of installation
Running TDG09 on this dataset can take 20 minutes or more, depending on the available computing power.
Launch TDG09 with one of the following commands:
With Linux or MacOSX, you can run the command like this:
./run.sh etc/H1.faa etc/H1.tree > tdg.out
With Windows, you need to execute the whole command:
java -cp lib\commons-lang-2.4.jar;lib\flanagan.jar;lib\pal-1.5.1.jar;dist\tdg09.jar models.MainAvHu09 etc\H1.faa etc\H1.tree > tdg.out
The output is written to the file tdg.out. We need to transform it with some Unix tools (the four lines below form a single command):
grep "Site\|Parameters\|Log\-likelihood" tdg.out \
| tr '\n' ' ' | sed "s/Site: /\\`echo -e '\n\r'`/g" \
| awk '{$1=$1}1' OFS=" " \
| cut -d' ' -f1,4,7,10,13 > tdg2.out
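Before moving on to R, it can help to check that the transformed file has the expected shape. The sketch below counts how many lines of tdg2.out have each number of space-separated fields. It assumes that sites with both models fitted produce five fields and that conserved sites produce shorter lines; this is inferred from the R step that removes conserved sites, not from the TDG09 documentation. The function name field_counts is only illustrative.

```shell
# Summarise how many lines of a file have each number of space-separated fields.
# Prints "<field count> <number of lines>" pairs, sorted by field count.
field_counts() {
    awk '{ n[NF]++ } END { for (k in n) print k, n[k] }' "$1" | sort -n
}

# Run the check only if the transformed file from the previous step exists.
if [ -f tdg2.out ]; then
    field_counts tdg2.out
fi
```

If most lines do not have five fields, the grep/sed/cut command probably did not split the output as intended and should be re-run before loading the file into R.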
Now load the file tdg2.out into R.
# Launch R
R
Then enter these commands:
tdg.out <- read.csv('tdg2.out', sep=' ', header=F) # Load file
tdg.out <- tdg.out[!is.na(tdg.out$V2),] # Remove conserved sites
tdg.out$lrt <- pchisq(2 * (tdg.out$V5 - tdg.out$V3), df=(tdg.out$V4 - tdg.out$V2), lower.tail=F) # Perform likelihood test
tdg.out$fdr <- tdg.out$lrt * length(tdg.out$lrt) / rank(tdg.out$lrt) # Get false discovery rate (FDR)
tdg.out[tdg.out$fdr < 0.20, "V1"] # Print all sites under FDR=20% (very relaxed)
tdg.out[tdg.out$fdr < 0.05, "V1"] # Print all sites under FDR=5% (medium)
tdg.out[tdg.out$fdr < 0.01, "V1"] # Print all sites under FDR=1% (stringent)
[1] 2 9 62 130 155 168 169 173 177 202 203 204 212 239 252 253 275 276 286
[20] 289 300 303 315 325 416 421 460 471
Jalview
Load multiple alignment: H1.faa (you will need to remove the first line, as it is in Phylip format). ⇒ Sort by ID.
Load tree: H1.tree. Put a vertical line at the root of the tree to split the tree in two.
⇒ Visualise the position of these sites
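The first line of H1.faa, which has to be removed before loading the alignment in Jalview, can be stripped from the command line instead of in a text editor. This is a minimal sketch: it assumes you are still in the tdg09 folder so that etc/H1.faa exists, and both the function name strip_first_line and the output name H1_jalview.faa are illustrative. The original file is left untouched because TDG09 reads it as is.

```shell
# Copy a file without its first line (here: the Phylip header line).
# $1 = input alignment, $2 = output file (any name of your choice).
strip_first_line() {
    tail -n +2 "$1" > "$2"
}

# Example call, assuming the tutorial's etc/H1.faa is present;
# H1_jalview.faa is an illustrative output name.
if [ -f etc/H1.faa ]; then
    strip_first_line etc/H1.faa H1_jalview.faa
fi
```

The resulting copy is the file to open in Jalview.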