Differences
This shows you the differences between two versions of the page.
data_curation:domain_chopping_documentation:index [2023/09/29 13:26] vwaman |
data_curation:domain_chopping_documentation:index [2023/09/29 13:40] (current) vwaman |
||
---|---|---|---|
Line 1: | Line 1: | ||
==== Data curation : Domain Chopping (DomChop) tutorial (last updated August, 2023)==== | ==== Data curation : Domain Chopping (DomChop) tutorial (last updated August, 2023)==== | ||
+ | (Tutorial documented by: Dr. Natalie Dawson and Dr. Vaishali Waman; DomChop webpage created by Dr. Ian Sillitoe) | ||
==== What is domain chopping? ==== | ==== What is domain chopping? ==== | ||
Line 18: | Line 18: | ||
Please note that pages can take a few seconds to load because of database read/write processes. Please avoid clicking buttons multiple times while a page is still loading, as this can cause page errors. | Please note that pages can take a few seconds to load because of database read/write processes. Please avoid clicking buttons multiple times while a page is still loading, as this can cause page errors. | ||
- | |||
- | **RasMol setup** | ||
- | |||
- | To view the putative domains suggested by each algorithm, RasMol needs to interpret the .rasscript file downloaded from the CATH DomChop pages. If you are using Windows, please create a .bat file containing the following: | ||
- | |||
- | cd "c:\Program Files\RasWin" # <change to directory containing your RasMol program> | ||
- | |||
- | raswin.exe script %1 | ||
- | |||
- | Please then use this .bat file to open any RasMol files downloaded from the DomChop pages. | ||
- | |||
- | If you are using Linux, please create a file called rasscript.com, containing the following (assuming that the RasMol application is in your $PATH): | ||
- | |||
- | #!/bin/sh | ||
- | |||
- | rasmol script "$1" | ||
- | |||
- | Please then use this rasscript.com file to open any RasMol files downloaded from the DomChop pages. | ||
- | |||
- | |||
- | RasMol colouring issues for lowercase chain ids | ||
- | |||
- | Please note that there is a known issue with RasMol where the domain colouring is not accurate for PDB chains whose ids are lowercase. In these cases, please use either the 3D View tool or download the Pymol script to view with Pymol. | ||
- | |||
**Checking the literature** | **Checking the literature** | ||
Line 49: | Line 25: | ||
- | **How to identify domain boundaries, AKA choosing a chopping (quick overview) | + | **How to identify domain boundaries, AKA choosing a chopping (quick overview)** |
- | ** | + | |
- | On the DomChop home page, select 'Get New Chain'. This will load a new chain for you to process. | + | On the DomChop home page, select '**Get New Chain**'. This will load a new chain for you to process. |
First, load the image of the chain to get an idea of how the structure looks. To view an illustration of the chain, select the RasMol icon in the top right-hand box that contains the chain's image. If the 3D structure is largely unpacked and does not have a compact, globular structure, add the comment (under the 'Comments' tab) "Unpacked chain" and move on to the next chain. If the 3D structure consists of a fragment, for example a single helix (e.g. as in 5lv6A), add the comment "Fragment" and move on to the next chain. | First, load the image of the chain to get an idea of how the structure looks. To view an illustration of the chain, select the RasMol icon in the top right-hand box that contains the chain's image. If the 3D structure is largely unpacked and does not have a compact, globular structure, add the comment (under the 'Comments' tab) "Unpacked chain" and move on to the next chain. If the 3D structure consists of a fragment, for example a single helix (e.g. as in 5lv6A), add the comment "Fragment" and move on to the next chain. | ||
Line 59: | Line 35: | ||
Please note that the values and scores provided below are only guidelines. For example, even if the ChopClose result has a bad SSAP score, it could still be the case that it provides an accurate chopping for your chain. Please always view the 3D structure before making a decision on which result to choose. | Please note that the values and scores provided below are only guidelines. For example, even if the ChopClose result has a bad SSAP score, it could still be the case that it provides an accurate chopping for your chain. Please always view the 3D structure before making a decision on which result to choose. | ||
- | 1. ChopClose (CC) | + | 1. **ChopClose (CC)** |
If there is a CC result available, first look at the superposition of the query chain with the matching chopped chain from CATH. Typically we would expect a good superposition if the "NW sequence identity" field is at least 30%, if the SSAP score is >= 70 (preferably >= 80), and if the RMSD is <= 5 Angstroms. | If there is a CC result available, first look at the superposition of the query chain with the matching chopped chain from CATH. Typically we would expect a good superposition if the "NW sequence identity" field is at least 30%, if the SSAP score is >= 70 (preferably >= 80), and if the RMSD is <= 5 Angstroms. | ||
Line 69: | Line 45: | ||
The CC superpositions comprise the new query chain aligned with the best-matching chain that has already been chopped in CATH. The darker colours represent the new query and the lighter colours represent the best-match in CATH. | The CC superpositions comprise the new query chain aligned with the best-matching chain that has already been chopped in CATH. The darker colours represent the new query and the lighter colours represent the best-match in CATH. | ||
- | + | 2. **CATHEDRAL** | |
- | + | ||
- | + | ||
- | + | ||
- | 2. CATHEDRAL | + | |
This is the next result to check after CC. Any putative domain that matches a CATH domain with a SSAP score >= 70 (preferably >= 80) indicates a good match. | This is the next result to check after CC. Any putative domain that matches a CATH domain with a SSAP score >= 70 (preferably >= 80) indicates a good match. | ||
Line 79: | Line 51: | ||
The CATHEDRAL superpositions comprise the new query chain aligned with the best-matching domains that have already been chopped in CATH. The darker colours represent the new query and the lighter colours represent the best-matching domains in CATH. | The CATHEDRAL superpositions comprise the new query chain aligned with the best-matching domains that have already been chopped in CATH. The darker colours represent the new query and the lighter colours represent the best-matching domains in CATH. | ||
- | 3. HMM | + | 3. **HMM** |
Any putative domain that matches a CATH domain with an E-value below 1 x 10^-5 represents a good match. | Any putative domain that matches a CATH domain with an E-value below 1 x 10^-5 represents a good match. | ||
- | 4. PUU, Detective, Domak | + | 4. **PUU, Detective, Domak** |
These are ab initio-based algorithms and do not produce scores. These algorithms are very useful in providing results when the query PDB chains do not have any closely-related matches in CATH. If you don't find any chopping you are happy with in the previous steps, have a look at these results. Sometimes, these three algorithms can help to confirm the above-mentioned results. | These are ab initio-based algorithms and do not produce scores. These algorithms are very useful in providing results when the query PDB chains do not have any closely-related matches in CATH. If you don't find any chopping you are happy with in the previous steps, have a look at these results. Sometimes, these three algorithms can help to confirm the above-mentioned results. | ||
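The guideline values quoted above for ChopClose, CATHEDRAL and HMM can be collected into a small checklist. The following is only an illustrative sketch: the class and function names are hypothetical and are not part of DomChop, and the HMM cut-off assumes that the E-value guideline above means 1e-5. As the tutorial stresses, these numbers are guidelines only, so a check like this cannot replace viewing the 3D structure.

```python
from dataclasses import dataclass


@dataclass
class ChopCloseResult:
    """Hypothetical container for the values shown with a ChopClose hit."""
    nw_sequence_identity: float  # "NW sequence identity", in percent
    ssap_score: float            # SSAP score
    rmsd: float                  # RMSD, in Angstroms


def cc_superposition_looks_good(result: ChopCloseResult) -> bool:
    """Guideline: identity >= 30%, SSAP >= 70 and RMSD <= 5 Angstroms."""
    return (
        result.nw_sequence_identity >= 30
        and result.ssap_score >= 70
        and result.rmsd <= 5
    )


def ssap_label(ssap_score: float) -> str:
    """Label a SSAP score against the >= 70 (preferably >= 80) guideline."""
    if ssap_score >= 80:
        return "preferred"
    if ssap_score >= 70:
        return "acceptable"
    return "below guideline"


def hmm_match_looks_good(e_value: float) -> bool:
    """Guideline: E-value below 1e-5 (assumed reading of the tutorial text)."""
    return e_value < 1e-5


example = ChopCloseResult(nw_sequence_identity=45.0, ssap_score=82.0, rmsd=2.1)
print(cc_superposition_looks_good(example), ssap_label(example.ssap_score))
```

A result that fails these checks may still provide an accurate chopping, so treat the output as a prompt for closer inspection rather than as a decision.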
- | Submitting a chopping to the curator for review | + | **Submitting a chopping to the curator for review** |
If you are completely satisfied with a chopping proposed by one of the above algorithms, please select the 'Send for review' button next to the appropriate chopping. This chain will then be sent to the curator for reviewing. | If you are completely satisfied with a chopping proposed by one of the above algorithms, please select the 'Send for review' button next to the appropriate chopping. This chain will then be sent to the curator for reviewing. | ||
Line 129: | Line 101: | ||
http://update.cathdb.info/cgi-bin/DomChop.pl?chain_id=4uj8B | http://update.cathdb.info/cgi-bin/DomChop.pl?chain_id=4uj8B | ||
- | 4uj8B | + | - 4uj8B |
- | 4y25A | + | - 4y25A |
- | 4znoB | + | - 4znoB |
- | 5a57A | + | - 5a57A |
- | 5a8jA | + | - 5a8jA |
- | 5aoqA | + | - 5aoqA |
- | 5axgA | + | - 5axgA |
- | 5b04I | + | - 5b04I |
- | 5c0xK | + | - 5c0xK |
- | 5c14A | + | - 5c14A |
- | 5c1fA | + | - 5c1fA |
- | 5c1sA | + | - 5c1sA |
- | 5c22C | + | - 5c22C |
- | 5c2wD | + | - 5c2wD |
- | 5c4nD | + | - 5c4nD |
- | 5c6tA | + | - 5c6tA |
- | 5cwwB | + | - 5cwwB |
- | 5cylF | + | - 5cylF |
- | 5cyxA | + | - 5cyxA |
- | 5cz3A | + | - 5cz3A |
- | 5dcpA | + | - 5dcpA |
- | 5dcqF | + | - 5dcqF |
- | 5dqrA | + | - 5dqrA |
- | 5du3A | + | - 5du3A |
- | 5fx0A | + | - 5fx0A |
Chopping summary acronyms | Chopping summary acronyms | ||
Line 205: | Line 177: | ||
**Happy domain chopping !!!!** | **Happy domain chopping !!!!** | ||
+ | |||