Superfamily Naming exercise (Last updated in Sept 2023)

Useful websites: https://www.cathdb.info/ http://sfam.cathdb.info/

Part I: Steps followed for naming a superfamily

 - Look through the representative domains in three views: as ‘domain only’ to understand the common secondary structures; as ‘domain in chain’ to see where the domain sits in the chain; and as ‘domain in PDB’ to understand the domain’s function and location in the protein.
 - Check through FunFams/SwissProt/Keywords and use the most abundant name as a reference when naming.
 - Check the enzymes (EC number, if available), GO terms and species to get a rough idea of the domain’s function.
 - Refer to Pfam and InterPro entries for a general idea of the protein domain’s function and/or structure.
 - Check the papers associated with the PDB entry for a better understanding of the protein and of the protein domain’s structure and/or function.
 - In the ‘Description’ section, provide an overview of structure and function. In larger superfamilies, you may have to refer to specific PDB IDs.
 - Check that the references are correct: [InterPro:] [Pfam:] [PMID:] [DOI:]
 - Check other names in the database, either to avoid duplicate names or to identify potential cross-hits.
 - Check the names of other domains in the same chain to keep the name similar.
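Two of the steps above, picking the most abundant name and checking the reference tags, are mechanical enough to sketch in a few lines of Python. This is only an illustration and not part of the CATH tooling: the function names are hypothetical, the input is assumed to be a plain list of names copied from the FunFam/SwissProt pages, and the accession patterns (IPR plus six digits, PF plus five digits, a numeric PMID, a DOI starting with "10.") are assumptions about what a filled-in tag should look like.

```python
import re
from collections import Counter

# Assumed accession formats for the four tag types listed in the checklist.
TAG_PATTERNS = {
    "InterPro": re.compile(r"IPR\d{6}"),
    "Pfam": re.compile(r"PF\d{5}"),
    "PMID": re.compile(r"\d+"),
    "DOI": re.compile(r"10\.\d{4,9}/\S+"),
}

# Matches any "[Tag:value]" occurrence, including an empty value.
TAG_RE = re.compile(r"\[([^\[\]:]+):([^\[\]]*)\]")


def most_abundant_name(names):
    """Return the most frequent non-empty name, ignoring case and outer spaces."""
    counts = Counter(n.strip().lower() for n in names if n.strip())
    return counts.most_common(1)[0][0] if counts else None


def find_bad_references(description):
    """Return the reference tags that are of an unknown type or malformed."""
    bad = []
    for match in TAG_RE.finditer(description):
        tag, value = match.group(1), match.group(2).strip()
        pattern = TAG_PATTERNS.get(tag)
        if pattern is None or not pattern.fullmatch(value):
            bad.append(match.group(0))
    return bad
```

For example, calling `find_bad_references` on a description that still contains an unfilled `[PMID:]` tag returns that tag, which makes empty placeholders easy to spot before saving. A pattern match only shows that a tag is well formed; whether the accession actually points at the right entry still has to be checked by hand.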

Part II: General observations and tips

Dos
 • Check other names in CATH to avoid duplicates (i.e. make sure the assigned name is unique)
 • Make superfamily names consistent with other domains of the same protein
 • Start with smaller families until you get the hang of it
 • For larger superfamilies, it is a good idea to check the FunFams
 • When looking at a protein on InterPro, see if there are other domains on the same protein that don’t have a name yet; those will be easy to name
 • Work in groups for larger superfamilies
 • Choose superfamily entries that have associated FunFams, Pfam or InterPro entries
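The uniqueness check in the first tip can be approximated with a small helper when the already assigned names are available as a list. The sketch below is an assumption-laden illustration: `normalise` and `is_duplicate` are hypothetical names, the list of existing names is assumed to have been exported by hand, and "duplicate" is taken to mean identical after lower-casing and collapsing whitespace.

```python
def normalise(name):
    """Lower-case a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def is_duplicate(candidate, existing_names):
    """Return True if `candidate` equals an existing name after normalisation."""
    taken = {normalise(n) for n in existing_names}
    return normalise(candidate) in taken
```

This only catches names that differ in case or spacing; near-synonyms (for example, an abbreviation against its spelled-out form) still need a manual look through the database.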

Don’ts
 • Write a description without sourcing references
 • Write a description without really understanding it
 • Spend 3 hours on a very small superfamily
 • Look at every single PDB entry for big superfamilies
 • Put too much confidence in InterPro/Pfam for smaller representative domains; it may be better to look at the PDB paper for the specific domain
 • Assume it is exactly the same domain just because it maps well to Pfam
 • Choose a superfamily entry with no annotations or too many annotations
