This shows you the differences between two versions of the page.

index [2014/11/07 14:11]
index [2017/10/16 16:36] (current)
Line 1: Line 1:
-This is the home for all the external documentation for the CATH database. It is an area that we are looking to expand greatly in the future. If you have any questions about CATH that you can't find the answer to in the [[FAQ]] or through the [[search_wiki|wiki search]] then please [[contact|email us]] or post your suggestion [[suggestions|here]] and we will try to add the information as soon as we can.
-===== General Contents =====
 +===== What is CATH? =====
 +The CATH database is a free, publicly available online resource that provides
 +information on the evolutionary relationships of protein domains. It was  
 +created in the mid-1990s by [[|Professor Christine Orengo]] and colleagues, and continues to be developed by the [[cathteam:index|Orengo group]] at University College London.
-  * [[FAQ]] - answers to the most frequently asked questions about CATH
-  * [[release_notes|Release Notes]] notes for each of the CATH releases
-  * [[data:|File Formats]] - notes for CATH data files available for download
-  * [[Tutorials:]] - tutorials on how to find useful information from CATH
-  * [[Blog:]] - general news and information from members of the CATH team
-  * [[Glossary:]] - description of terms frequently used in protein structure and CATH
-  * [[Projects:]] - other projects going on in the Orengo group including [[Projects:publications | a list of group publications]]
-  * [[Cathteam:]] - personal pages of the CATH team members
-  * [[Suggestions]] - suggestions for the CATH database and wiki
 +===== How is CATH-Gene3D created? =====
 +Experimentally determined protein three-dimensional structures are obtained from
 +the Protein Data Bank and split into their constituent polypeptide chains, where
 +applicable. Protein domains are identified within these chains using a mixture
 +of automatic methods and manual curation. The domains are then classified within
 +the CATH structural hierarchy: at the [[glossary:class|Class]] (C) level, domains are assigned
 +according to their secondary structure content, i.e. all alpha, all beta, a
 +mixture of alpha and beta, or little secondary structure; at the [[glossary:architecture|Architecture]]
 +(A) level, information on the secondary structure arrangement in
 +three-dimensional space is used for assignment; at the [[glossary:topology|Topology/fold]] (T) level,
 +information on how the secondary structure elements are connected and arranged
 +is used; assignments are made to the [[glossary:homologous_superfamily|Homologous superfamily]] (H) level if there is good evidence that the domains are related by evolution, i.e. they are
 +homologous. To browse the classification hierarchy, see [[|CATH hierarchy]].
 +Additional sequence data for domains with no experimentally determined
 +structures are provided by our sister resource, [[|Gene3D]], and are used to populate the homologous superfamilies. Protein sequences from UniProtKB and
 +Ensembl are scanned against CATH HMMs to predict domain sequence boundaries and
 +make homologous superfamily assignments.
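
The four levels described above (Class, Architecture, Topology, Homologous superfamily) nest inside one another, which can be illustrated with a small sketch. It assumes that a domain's position in the hierarchy is written as four dot-separated integers, one per level, and that classes are numbered 1 to 4 in the order listed above; the `CathId` class, the `same_level` helper and the example identifiers are hypothetical names and values used only for illustration, not part of any CATH software.

```python
from dataclasses import dataclass

# Class descriptions paraphrased from the text above. The numeric codes 1-4
# are an assumption made for this sketch.
CLASS_LABELS = {
    1: "all alpha",
    2: "all beta",
    3: "mixture of alpha and beta",
    4: "little secondary structure",
}


@dataclass(frozen=True)
class CathId:
    """A position in the C.A.T.H hierarchy (hypothetical helper type)."""

    class_: int
    architecture: int
    topology: int
    superfamily: int

    @classmethod
    def parse(cls, text: str) -> "CathId":
        """Parse an identifier assumed to look like 'C.A.T.H'."""
        parts = text.strip().split(".")
        if len(parts) != 4:
            raise ValueError(f"expected four dot-separated integers, got {text!r}")
        try:
            numbers = [int(p) for p in parts]
        except ValueError:
            raise ValueError(f"expected four dot-separated integers, got {text!r}")
        return cls(*numbers)

    def level(self, depth: int) -> str:
        """Return the identifier truncated to `depth` levels (1 = C ... 4 = H)."""
        if not 1 <= depth <= 4:
            raise ValueError("depth must be between 1 and 4")
        numbers = (self.class_, self.architecture, self.topology, self.superfamily)
        return ".".join(str(n) for n in numbers[:depth])


def same_level(a: CathId, b: CathId, depth: int) -> bool:
    """True if two domains share the same assignment down to `depth` levels."""
    return a.level(depth) == b.level(depth)


if __name__ == "__main__":
    first = CathId.parse("3.40.50.300")
    second = CathId.parse("3.40.50.720")
    print(CLASS_LABELS[first.class_])
    print(same_level(first, second, 3))  # same class, architecture and topology
    print(same_level(first, second, 4))  # different homologous superfamily
```

Truncating the identifier is enough to compare two domains at any level, because each level refines the one before it.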
 +===== CATH Releases =====
 +==== CATH (daily snapshot) ====
 +We provide a daily snapshot of the very latest classifications and annotations as they happen in our pipeline. This enables users to find the most up-to-date information about their particular structure of interest. The amount of data we provide at this stage is limited mainly to domain boundaries and superfamily classification.
 +==== CATH-Plus (full release) ====
 +We aim to provide full releases of CATH (CATH-Plus) every 12 months. CATH-Plus adds a significant amount of data on top of the core classification information available in CATH. The CATH-Plus release process includes a number of manual annotation checks in addition to adding a huge amount of information combining protein structure, sequence and function. As a result, there is a greater depth of information available in CATH-Plus, though it may be missing information on the most recent structures.
 +CATH-Plus data includes:
 +=== FunFams (Functional Families) ===
 +The homologous superfamilies in CATH-Gene3D can often be functionally and structurally diverse even though they share a conserved structural core. Therefore, the superfamilies have been sub-classified into functional families (FunFams) using a subclassification protocol purely based on sequence patterns. Relatives within these FunFams are likely to share highly similar structures and functions. The FunFams are useful in function prediction and in providing information on the evolution of function.
 +=== Structural clusters ===
 +The structures within a homologous superfamily have been clustered at < 9 Å RMSD to form structural clusters, also known as structurally-similar groups (SSGs). These structural clusters are useful for understanding the structural diversity of a superfamily.
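
As an illustration of clustering under an RMSD cut-off, the sketch below groups domains so that any pair with a pairwise RMSD below 9 Å ends up in the same cluster. The text does not say which clustering method is used, so single linkage is an assumption made here for simplicity; the function name `cluster_by_rmsd`, the domain labels and the RMSD values are all made up for the example.

```python
from itertools import combinations


def cluster_by_rmsd(domains, rmsd, threshold=9.0):
    """Single-linkage clustering: join two domains if their RMSD < threshold.

    `domains` is a list of labels; `rmsd` maps (label, label) pairs to a
    pairwise RMSD in angstroms. Pairs missing from `rmsd` are never joined.
    """
    parent = {d: d for d in domains}

    def find(d):
        # Follow parent links to the cluster representative, shortening
        # the path as we go.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    for a, b in combinations(domains, 2):
        value = rmsd.get((a, b), rmsd.get((b, a)))
        if value is not None and value < threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for d in domains:
        clusters.setdefault(find(d), []).append(d)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    # Hypothetical labels and RMSD values, for illustration only.
    labels = ["d1", "d2", "d3", "d4"]
    pairwise = {
        ("d1", "d2"): 2.5,
        ("d2", "d3"): 8.0,
        ("d1", "d3"): 10.5,
        ("d1", "d4"): 15.0,
        ("d2", "d4"): 14.0,
        ("d3", "d4"): 12.0,
    }
    print(cluster_by_rmsd(labels, pairwise))
```

With single linkage, `d1` and `d3` fall into the same cluster through `d2` even though their own RMSD is above the cut-off; a stricter linkage rule would split them.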
 +=== Structural superpositions ===
 +The conserved structural core of a homologous superfamily can be observed in the structural superpositions generated from its representative domains by [[cath_tools#cath_tools|CATH Tools]]. These superpositions are an effective way of observing structural conservation and diversity across the superfamily.
 +See [[release_notes|release notes]] for information on the statistics for specific releases.
 +CATH and CATH-Plus data for all releases can be downloaded from [[data:index|Data Downloads]].
 +===== Open Source Software =====
 +CATH is proud to be a member of the open source software community. Our developers use and contribute towards the development and maintenance of a number of open source tools. For a full list of the open source software used in the making of this resource (both in the pipeline and our web pages), please visit the [[CATH tools]] page.
 +===== Contact us =====
 +If you have any comments/suggestions/criticisms, please let us know: