data_curation:superfamily_naming_tutorial:index

Differences

This shows you the differences between two versions of the page.

data_curation:superfamily_naming_tutorial:index [2023/09/28 16:02] vwaman
data_curation:superfamily_naming_tutorial:index [2023/09/28 16:24] (current) vwaman
Line 1: Line 1:
-[[data_curation:superfamily_naming_tutorial:index|Superfamily Naming exercise (Last updated in Sept 2023)]] 
 === Superfamily Naming exercise (Last updated in Sept 2023) ===
  
-Useful websites:
+**Useful websites:**
 https://www.cathdb.info/
 http://sfam.cathdb.info/
 
 **Part I: Steps followed for naming a superfamily**
-- Look through representative domains as: ‘domain only’ to understand common secondary structures; as ‘domain in chain’ to observe the location of the domain in the chain; as ‘domain in PDB’ to understand the domain’s function and location in the protein.
+  * Look through representative domains as: ‘domain only’ to understand common secondary structures; as ‘domain in chain’ to observe the location of the domain in the chain; as ‘domain in PDB’ to understand the domain’s function and location in the protein.
-- Check through FunFams/SwissProt/Keywords and refer to the most abundant name when naming.
+  Check through FunFams/SwissProt/Keywords and refer to the most abundant name when naming.
-- Check through enzymes (EC number if available), GO terms  and species  to get a rough idea of domain function.
+  Check through enzymes (EC number if available), GO terms  and species  to get a rough idea of domain function.
-- Refer to Pfam and InterPro  entries for general idea of protein domain function and/or structure.
+  Refer to Pfam and InterPro  entries for general idea of protein domain function and/or structure.
-- Check through papers associated with PDB entry for better understanding of protein and protein domain structure and/or function .
+  Check through papers associated with PDB entry for better understanding of protein and protein domain structure and/or function .
-- In ‘Description section’,  provide an overview of structure and function. In larger superfamilies, you may have to refer to specific PDB IDs.
+  In ‘Description section’,  provide an overview of structure and function. In larger superfamilies, you may have to refer to specific PDB IDs.
-- Check references are correct: [InterPro:] [Pfam:] [PMID:] [DOI:]
+  Check references are correct: [InterPro:] [Pfam:] [PMID:] [DOI:]
-- Check other names in the database, either to avoid duplicate names or to identify potential cross-hits
+  Check other names in the database, either to avoid duplicate names or to identify potential cross-hits
-- Check names of other domains in the same chain to keep the name similar.
+  Check names of other domains in the same chain to keep the name similar.
  
-Part II: General observations and tips
+**Part II: General observations and tips**
 
 Dos
-Check other names in CATH to not make duplicates (i.e. make sure the assigned name is unique)
+  * Check other names in CATH to not make duplicates (i.e. make sure the assigned name is unique)
-Make superfamily names consistent with other domains of same protein
+  Make superfamily names consistent with other domains of same protein
-Start with smaller families until you get the hang of it
+  Start with smaller families until you get the hang of it
-For larger superfamily- it is a good idea to check FunFam
+  For larger superfamily- it is a good idea to check FunFam
-When looking at a protein on InterPro, see if there are other domains that don’t have a name yet on the same protein - it will be easy to name that one
+  When looking at a protein on InterPro, see if there are other domains that don’t have a name yet on the same protein - it will be easy to name that one
-Work in groups for larger superfamilies
+  Work in groups for larger superfamilies
-Choose superfamily entries with FunFams, Pfams, or InterPro associated
+  Choose superfamily entries with FunFams, Pfams, or InterPro associated
  
 Don’ts
-Make description without sourcing references
+  * Make description without sourcing references
-Make description without actually really understanding it
+  Make description without actually really understanding it
-Spend 3 hours on a very small superfamily
+  Spend 3 hours on a very small superfamily
-Look at every single PDB for big superfamilies
+  Look at every single PDB for big superfamilies
-For smaller representative domains, don’t put too much confidence in InterPro/Pfam - it may be better to look at PDB paper for the specific domain
+  For smaller representative domains, don’t put too much confidence in InterPro/Pfam - it may be better to look at PDB paper for the specific domain
-Assume it is the exact same domain if it has good mapping to Pfam
+  Assume it is the exact same domain if it has good mapping to Pfam
-Choose a superfamily entry with no annotation or too many annotation
+  Choose a superfamily entry with no annotation or too many annotation
 + 
 +(Last updated in September 2023, Written by summer interns since 2020-2023 (Barbara, Oliver, Natalie, Charling, Ruiqi, Lorna, Katie, Charlotte, Hazuki) and CATH curators (Vaishali Waman, Ian Sillitoe) 
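
The "Check references are correct" step in Part I lends itself to a quick automated pre-check before manual review. What follows is a minimal Python sketch, not part of the tutorial or of any CATH tooling. It assumes references are written as `[Source:identifier]` (the tutorial itself only lists the empty templates `[InterPro:] [Pfam:] [PMID:] [DOI:]`), and the function name `find_bad_references` is hypothetical. It only checks that each identifier has the expected shape, for example `IPR` plus six digits for InterPro or `PF` plus five digits for Pfam. It does not confirm that the entry exists or that it is the right one for the superfamily.

```python
import re

# Assumed tag syntax: "[Source:identifier]". Only the empty templates
# [InterPro:] [Pfam:] [PMID:] [DOI:] appear in the tutorial; the filled-in
# form and the function name below are illustrative assumptions.
TAG_RE = re.compile(r"\[(InterPro|Pfam|PMID|DOI):([^\]\s]*)\]")

# Expected shape of each identifier type (format check only).
ID_PATTERNS = {
    "InterPro": re.compile(r"IPR\d{6}"),     # e.g. IPR000719
    "Pfam": re.compile(r"PF\d{5}"),          # e.g. PF00069
    "PMID": re.compile(r"\d+"),              # numeric PubMed ID
    "DOI": re.compile(r"10\.\d{4,9}/\S+"),   # DOI prefix/suffix
}


def find_bad_references(description):
    """Return (source, identifier) pairs whose identifier looks malformed."""
    bad = []
    for source, identifier in TAG_RE.findall(description):
        if not ID_PATTERNS[source].fullmatch(identifier):
            bad.append((source, identifier))
    return bad


# Usage: the identifiers here are format examples only.
text = (
    "Kinase-like domain [InterPro:IPR000719] [Pfam:PF00069] "
    "[PMID:] [DOI:10.1000/xyz123]"
)
print(find_bad_references(text))  # [('PMID', '')]
```

A tag left as an empty template, such as `[PMID:]` above, is reported so it can be filled in or removed. Whether the cited entry actually supports the description still needs checking by hand.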
  
data_curation/superfamily_naming_tutorial/index.1695916973.txt.gz · Last modified: by vwaman