
Differences

This shows you the differences between two versions of the page.

data_curation:superfamily_naming_tutorial:index [2023/09/28 16:02]
vwaman
data_curation:superfamily_naming_tutorial:index [2023/09/28 16:24] (current)
vwaman
Line 1:
-[[data_curation:superfamily_naming_tutorial:index|Superfamily Naming exercise (Last updated in Sept 2023)]] 
 === Superfamily Naming exercise (Last updated in Sept 2023) ===
  
-Useful websites:
+**Useful websites:**
 https://www.cathdb.info/
 http://sfam.cathdb.info/
 
 **Part I: Steps followed for naming a superfamily**
-- Look through representative domains as: ‘domain only’ to understand common secondary structures; as ‘domain in chain’ to observe the location of the domain in the chain; as ‘domain in PDB’ to understand the domain’s function and location in the protein.
+  * Look through representative domains as: ‘domain only’ to understand common secondary structures; as ‘domain in chain’ to observe the location of the domain in the chain; as ‘domain in PDB’ to understand the domain’s function and location in the protein.
-- Check through FunFams/SwissProt/Keywords and refer to the most abundant name when naming.
+  * Check through FunFams/SwissProt/Keywords and refer to the most abundant name when naming.
-- Check through enzymes (EC number if available), GO terms  and species  to get a rough idea of domain function.
+  * Check through enzymes (EC number if available), GO terms  and species  to get a rough idea of domain function.
-- Refer to Pfam and InterPro  entries for general idea of protein domain function and/or structure.
+  * Refer to Pfam and InterPro  entries for general idea of protein domain function and/or structure.
-- Check through papers associated with PDB entry for better understanding of protein and protein domain structure and/or function .
+  * Check through papers associated with PDB entry for better understanding of protein and protein domain structure and/or function .
-- In ‘Description section’,  provide an overview of structure and function. In larger superfamilies, you may have to refer to specific PDB IDs.
+  * In ‘Description section’,  provide an overview of structure and function. In larger superfamilies, you may have to refer to specific PDB IDs.
-- Check references are correct: [InterPro:] [Pfam:] [PMID:] [DOI:]
+  * Check references are correct: [InterPro:] [Pfam:] [PMID:] [DOI:]
-- Check other names in the database, either to avoid duplicate names or to identify potential cross-hits
+  * Check other names in the database, either to avoid duplicate names or to identify potential cross-hits
-- Check names of other domains in the same chain to keep the name similar.
+  * Check names of other domains in the same chain to keep the name similar.
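The "Check references are correct" step in Part I could be partly automated. The following is a minimal sketch, not part of the CATH curation tooling. It assumes each reference is written as `[Prefix:identifier]` (inferred from the `[PMID:]` notation above), and the function name `check_reference_tags` is hypothetical. The identifier patterns follow the usual public accession formats (InterPro `IPR` plus six digits, Pfam `PF` plus five digits, numeric PMIDs, DOIs starting with `10.`).

```python
import re

# Assumed tag layout: "[Prefix:identifier]", e.g. "[PMID:12345678]".
# The bracket layout is an assumption based on the tutorial's "[PMID:]" notation.
TAG_PATTERNS = {
    "InterPro": re.compile(r"IPR\d{6}"),
    "Pfam": re.compile(r"PF\d{5}"),
    "PMID": re.compile(r"\d+"),
    "DOI": re.compile(r"10\.\d{4,9}/\S+"),
}

# Captures the prefix and everything up to the closing bracket.
TAG_RE = re.compile(r"\[([A-Za-z]+):([^\]]*)\]")


def check_reference_tags(description: str) -> list[str]:
    """Return a list of human-readable problems found in the reference tags."""
    problems = []
    for prefix, identifier in TAG_RE.findall(description):
        pattern = TAG_PATTERNS.get(prefix)
        if pattern is None:
            problems.append(f"unknown tag prefix: {prefix}")
        elif not pattern.fullmatch(identifier.strip()):
            problems.append(f"malformed {prefix} identifier: {identifier!r}")
    return problems


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    draft = "Binds nucleotides [PMID:12345678] [Pfam:PF1234]"
    for problem in check_reference_tags(draft):
        print(problem)
```

A check like this only catches formatting slips such as an empty `[PMID:]` or a truncated accession. Whether the cited entry actually supports the description still has to be verified by reading it.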
  
-Part II: General observations and tips
+**Part II: General observations and tips**
 
 Dos
-Check other names in CATH to not make duplicates (i.e. make sure the assigned name is unique)
+  * Check other names in CATH to not make duplicates (i.e. make sure the assigned name is unique)
- Make superfamily names consistent with other domains of same protein
+  * Make superfamily names consistent with other domains of same protein
- Start with smaller families until you get the hang of it
+  * Start with smaller families until you get the hang of it
- For larger superfamily- it is a good idea to check FunFam
+  * For larger superfamily- it is a good idea to check FunFam
- When looking at a protein on InterPro, see if there are other domains that don’t have a name yet on the same protein - it will be easy to name that one
+  * When looking at a protein on InterPro, see if there are other domains that don’t have a name yet on the same protein - it will be easy to name that one
- Work in groups for larger superfamilies
+  * Work in groups for larger superfamilies
- Choose superfamily entries with FunFams, Pfams, or InterPro associated
+  * Choose superfamily entries with FunFams, Pfams, or InterPro associated
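The first item in the Dos list (make sure the assigned name is unique) can be sketched as a simple comparison against names already in use. This is an illustrative sketch only: it assumes the existing superfamily names are available as a plain list of strings (for example, copied from the CATH website), and the helper names `normalise` and `find_name_clashes` are hypothetical.

```python
def normalise(name: str) -> str:
    """Lower-case the name and collapse runs of whitespace for comparison."""
    return " ".join(name.lower().split())


def find_name_clashes(candidate: str, existing_names: list[str]) -> list[str]:
    """Return existing names that match the candidate after normalisation."""
    key = normalise(candidate)
    return [name for name in existing_names if normalise(name) == key]


if __name__ == "__main__":
    # Hypothetical list of names already assigned, for illustration only.
    existing = ["Winged helix DNA-binding domain", "Rossmann fold"]
    clashes = find_name_clashes("winged  helix DNA-binding domain", existing)
    if clashes:
        print("Name already in use:", clashes)
```

Only exact matches after case and whitespace normalisation are reported. Near-duplicates, such as a name differing by one word, would still need a manual look and may point to a potential cross-hit.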
  
 Don’ts
-Make description without sourcing references
+  * Make description without sourcing references
- Make description without actually really understanding it
+  * Make description without actually really understanding it
- Spend 3 hours on a very small superfamily
+  * Spend 3 hours on a very small superfamily
- Look at every single PDB for big superfamilies
+  * Look at every single PDB for big superfamilies
- For smaller representative domains, don’t put too much confidence in InterPro/Pfam - it may be better to look at PDB paper for the specific domain
+  * For smaller representative domains, don’t put too much confidence in InterPro/Pfam - it may be better to look at PDB paper for the specific domain
- Assume it is the exact same domain if it has good mapping to Pfam
+  * Assume it is the exact same domain if it has good mapping to Pfam
- Choose a superfamily entry with no annotation or too many annotation
+  * Choose a superfamily entry with no annotation or too many annotation
+
+(Last updated in September 2023, Written by summer interns since 2020-2023 (Barbara, Oliver, Natalie, Charling, Ruiqi, Lorna, Katie, Charlotte, Hazuki) and CATH curators (Vaishali Waman, Ian Sillitoe)
  