Differences

This shows you the differences between two versions of the page.

data_curation:domain_chopping_documentation:index [2023/09/28 16:47]
vwaman
data_curation:domain_chopping_documentation:index [2023/09/29 14:40] (current)
vwaman
Line 1: Line 1:
-[[data_curation:domain_chopping_documentation:index| **Data curation : Domain Chopping (DomChop) tutorial**]] +==== Data curation : Domain Chopping (DomChop) tutorial (last updated August, 2023)==== 
- +(Tutorial documented by: Dr. Natalie Dawson and Dr. Vaishali Waman; DomChop webpage created by Dr. Ian Sillitoe)
-(last updated August, 2023)+
  
 ==== What is domain chopping? ==== ==== What is domain chopping? ====
Line 13: Line 12:
  
 • When prompted to login, make sure you select the 'Current CATH database (production)' option from the Database list • When prompted to login, make sure you select the 'Current CATH database (production)' option from the Database list
- 
-Once you've logged in, you will get a confirmation screen. Check that the login details table details the correct: 
-• database (cathdb_current) 
- 
-• host (rodan) 
- 
-• username 
  
 Select 'Continue', then select the 'DomChop' link. Select 'Continue', then select the 'DomChop' link.
Line 26: Line 18:
  
 Please note that it can take a few seconds for pages to be loaded due to database read/write processes. Please avoid clicking buttons multiple times if a page is still loading otherwise this can cause page errors. Please note that it can take a few seconds for pages to be loaded due to database read/write processes. Please avoid clicking buttons multiple times if a page is still loading otherwise this can cause page errors.
- 
-**RasMol setup** 
- 
-To view the putative domains suggested by each algorithm, RasMol needs to interpret the .rasscript file downloaded from the CATH DomChop pages. If you are using Windows, please create a .bat file containing the following: 
- 
-cd "c:\Program Files\RasWin" # <change to directory containing your RasMol program> 
- 
-raswin.exe -script %1 
- 
-Please then use this .bat file to open any RasMol files downloaded from the DomChop pages. 
- 
-If you are using Linux, please create a file called rasscript.com, containing the following (assuming that the RasMol application is in your $PATH): 
- 
-#!/bin/sh 
- 
-rasmol -script "$1" 
- 
-Please then use this rasscript.com file to open any RasMol files downloaded from the DomChop pages. 
-  
- 
- 
- 
- 
- 
-  
-RasMol colouring issues for lowercase chain ids 
- 
-Please note that there is a known issue with RasMol where the domain colouring is not accurate for PDB chains whose ids are lowercase. In these cases, please use either the 3D View tool or download the Pymol script to view with Pymol. 
- 
  
 **Checking the literature** **Checking the literature**
Line 62: Line 25:
  
  
-**How to identify domain boundaries, AKA choosing a chopping (quick overview) +**How to identify domain boundaries, AKA choosing a chopping (quick overview)** 
-** + 
-On the DomChop home page, select 'Get New Chain'. This will load a new chain for you to process.+On the DomChop home page, select '**Get New Chain**'. This will load a new chain for you to process.
  
 First, load the image of the chain to get an idea of how the structure looks. To view an illustration of the chain, select the RasMol icon in the top right-hand box that contains the chain's image. If the 3D structure is very unpacked and does not have a compact, globular structure, add the comment (under the 'Comments' tab) "Unpacked chain" and move onto the next chain. If the 3D structure consists of a fragment, for example a single helix (e.g. as in 5lv6A), add the comment "Fragment" and move onto the next chain. First, load the image of the chain to get an idea of how the structure looks. To view an illustration of the chain, select the RasMol icon in the top right-hand box that contains the chain's image. If the 3D structure is very unpacked and does not have a compact, globular structure, add the comment (under the 'Comments' tab) "Unpacked chain" and move onto the next chain. If the 3D structure consists of a fragment, for example a single helix (e.g. as in 5lv6A), add the comment "Fragment" and move onto the next chain.
Line 72: Line 35:
 Please note that the values and scores provided below are only guidelines. For example, even if the ChopClose result has a bad SSAP score, it could still be the case that it provides an accurate chopping for your chain. Please always view the 3D structure before making a decision on which result to choose. Please note that the values and scores provided below are only guidelines. For example, even if the ChopClose result has a bad SSAP score, it could still be the case that it provides an accurate chopping for your chain. Please always view the 3D structure before making a decision on which result to choose.
  
-1. ChopClose (CC)+1. **ChopClose (CC)**
 If there is a CC result available, we would first look at the superposition of our query chain with the matching chopped chain from CATH. Typically we would expect a good superposition if the "NW sequence identity" field is at least 30%, if the SSAP score is >= 70 (preferably >= 80), and if the RMSD is <= 5 Angstroms. If there is a CC result available, we would first look at the superposition of our query chain with the matching chopped chain from CATH. Typically we would expect a good superposition if the "NW sequence identity" field is at least 30%, if the SSAP score is >= 70 (preferably >= 80), and if the RMSD is <= 5 Angstroms.
  
Line 82: Line 45:
 The CC superpositions comprise the new query chain aligned with the best-matching chain that has already been chopped in CATH. The darker colours represent the new query and the lighter colours represent the best-match in CATH. The CC superpositions comprise the new query chain aligned with the best-matching chain that has already been chopped in CATH. The darker colours represent the new query and the lighter colours represent the best-match in CATH.
    
- +2. **CATHEDRAL**
- +
- +
-  +
-2. CATHEDRAL+
 This is the next result to check after CC. Any putative domains that match CATH domains with a SSAP score >= 70 (preferably >= 80) indicate a good match. This is the next result to check after CC. Any putative domains that match CATH domains with a SSAP score >= 70 (preferably >= 80) indicate a good match.
  
Line 92: Line 51:
 The CATHEDRAL superpositions comprise the new query chain aligned with the best-matching domains that have already been chopped in CATH. The darker colours represent the new query and the lighter colours represent the best-matching domains in CATH. The CATHEDRAL superpositions comprise the new query chain aligned with the best-matching domains that have already been chopped in CATH. The darker colours represent the new query and the lighter colours represent the best-matching domains in CATH.
  
-3. HMM+3. **HMM**
 Any putative domain that matches a CATH domain with an E-value below 1x10^-5 represents a good match. Any putative domain that matches a CATH domain with an E-value below 1x10^-5 represents a good match.
  
-4. PUU, Detective, Domak+4. **PUU, Detective, Domak**
 These are ab initio-based algorithms and do not produce scores. These algorithms are very useful in providing results when the query PDB chains do not have any closely-related matches in CATH. If you don't find any chopping you are happy with in the previous steps, have a look at these results. Sometimes, these three algorithms can help to confirm the above-mentioned results. These are ab initio-based algorithms and do not produce scores. These algorithms are very useful in providing results when the query PDB chains do not have any closely-related matches in CATH. If you don't find any chopping you are happy with in the previous steps, have a look at these results. Sometimes, these three algorithms can help to confirm the above-mentioned results.
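The guideline values quoted above for ChopClose, CATHEDRAL and HMM can be sketched as a small set of checks. This is only an illustration: the class and function names below are hypothetical and are not part of DomChop, and the numbers are the guideline thresholds from this tutorial, not hard rules. The 3D structure should still be viewed before choosing a chopping.

```python
from dataclasses import dataclass


@dataclass
class ChopCloseResult:
    """Hypothetical container for the three ChopClose values shown in DomChop."""
    nw_seq_identity_pct: float  # "NW sequence identity" field, in percent
    ssap_score: float           # SSAP score
    rmsd_angstroms: float       # RMSD of the superposition, in Angstroms


def ssap_rating(score: float) -> str:
    """Rate a SSAP score using the tutorial's guideline cut-offs (70 and 80)."""
    if score >= 80:
        return "preferred"
    if score >= 70:
        return "acceptable"
    return "weak"


def chopclose_looks_promising(result: ChopCloseResult) -> bool:
    """True if all three ChopClose guideline values are met."""
    return (
        result.nw_seq_identity_pct >= 30
        and result.ssap_score >= 70
        and result.rmsd_angstroms <= 5
    )


def hmm_looks_promising(e_value: float) -> bool:
    """True if the HMM E-value is below the 1x10^-5 guideline."""
    return e_value < 1e-5


if __name__ == "__main__":
    example = ChopCloseResult(nw_seq_identity_pct=35.0, ssap_score=82.0, rmsd_angstroms=2.1)
    print(chopclose_looks_promising(example), ssap_rating(example.ssap_score))
```

A result that fails these checks is not automatically wrong. As noted above, a ChopClose result with a bad SSAP score can still provide an accurate chopping.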
  
  
-Submitting a chopping to the curator for review+**Submitting a chopping to the curator for review**
  
 If you are completely satisfied with a chopping proposed by one of the above algorithms, please select the 'Send for review' button next to the appropriate chopping. This chain will then be sent to the curator for reviewing. If you are completely satisfied with a chopping proposed by one of the above algorithms, please select the 'Send for review' button next to the appropriate chopping. This chain will then be sent to the curator for reviewing.
Line 142: Line 101:
 http://update.cathdb.info/cgi-bin/DomChop.pl?chain_id=4uj8B http://update.cathdb.info/cgi-bin/DomChop.pl?chain_id=4uj8B
  
-4uj8B +  - 4uj8B 
-4y25A +  - 4y25A 
-4znoB +  - 4znoB 
-5a57A +  - 5a57A 
-5a8jA +  - 5a8jA 
-5aoqA +  - 5aoqA 
-5axgA +  - 5axgA 
-5b04I +  - 5b04I 
-5c0xK +  - 5c0xK 
-5c14A +  - 5c14A 
-5c1fA +  - 5c1fA 
-5c1sA +  - 5c1sA 
-5c22C +  - 5c22C 
-5c2wD +  - 5c2wD 
-5c4nD +  - 5c4nD 
-5c6tA +  - 5c6tA 
-5cwwB +  - 5cwwB 
-5cylF +  - 5cylF 
-5cyxA +  - 5cyxA 
-5cz3A +  - 5cz3A 
-5dcpA +  - 5dcpA 
-5dcqF +  - 5dcqF 
-5dqrA +  - 5dqrA 
-5du3A +  - 5du3A 
-5fx0A+  - 5fx0A
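To open each of the chains listed above in turn, the DomChop URL pattern shown earlier can be filled in programmatically. This is a minimal sketch: the function name `domchop_url` is hypothetical, only the first few chain ids from the list are used, and the URL pattern is simply the one given in this tutorial.

```python
from urllib.parse import urlencode

# Base URL taken from the example link in this tutorial.
BASE_URL = "http://update.cathdb.info/cgi-bin/DomChop.pl"

# First few chain ids from the list above; extend with the rest as needed.
CHAIN_IDS = ["4uj8B", "4y25A", "4znoB", "5a57A", "5a8jA"]


def domchop_url(chain_id: str) -> str:
    """Build the DomChop page URL for one chain id (hypothetical helper)."""
    return f"{BASE_URL}?{urlencode({'chain_id': chain_id})}"


if __name__ == "__main__":
    for chain_id in CHAIN_IDS:
        print(domchop_url(chain_id))
```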
  
 Chopping summary acronyms Chopping summary acronyms
Line 217: Line 176:
  
  
-==== Happy  domain chopping !!! ====+**Happy  domain chopping !!!!** 
  
  