blog:2008:08:22faq_questions_about_the_definition_of_cath_code [2008/08/22 08:31] – created sillitoe
blog:2008:08:22faq_questions_about_the_definition_of_cath_code [2015/08/17 18:36] (current) sillitoe
 +====== FAQ: Questions about the definition of CATH code ======
 +
 +The following is our reply to a recent email. It may help others understand how CATH code numbering works at the sequence cluster (SOLID) levels.
 +
 +Received on Aug 21, 2008 (reproduced with permission)
 +
 +<code>
 +
 +Hi, CATH team:
 +
 +Having read the original paper published in 1997 and visited your website, I am 
 +still confused about the definition of the CATH code at the sequence family levels. 
 +Taking the three proteins below from CATH 3.1 as an example, my questions are as 
 +follows:
 +
 +1) the code in CATHSOLI level of 2a8vA01 and 1a8vA01 are all the same, but why 
 +their codes in D level were different?
 +
 +    Does 2a8vA01 and 1a8vA01 not belong to the same s100 family?
 +
 +2) the code in CATHSO  ID level of 1a8vA01 and 1a62001 are all the same, but why 
 +their codes in L level were different?
 +
 +    If 1a8vA01 and 1a62001 are 100% sequence identical, why they were assigned to 
 +    different 95% sequence group? 
 +
 +Sincerely.
 +
 +backy
 +
 + 
 +                                   35%   60%   95%  100%
 +             C     A     T     H     S     O     L     I     D
 +
 +2a8vA01      1    10   720    10     2     1     2     1     3    47  2.400
 +1a8vA01      1    10   720    10     2     1     2     1     1    49  2.000
 +1a62001      1    10   720    10     2     1     1     1     1    44  1.550
 +
 +</code>
 +
 +Reply sent on Aug 21, 2008
 +
 +<code>
 +
 +Hi Backy,
 +
 +Thanks for getting in touch with us, hopefully I can answer your questions below:
 + 
 +    1) the code in CATHSOLI level of 2a8vA01 and 1a8vA01 are all the same, but why 
 +       their codes in D level were different?
 +
 +
 +The D level stands for "Domain Count" and is just there to provide a unique code 
 +for every domain - so if two domains are identical (i.e. they share everything up 
 +to the I, or 100% Identical, code) then we use the D level to differentiate 
 +between them - this is just a sequential counter.
 + 
 +
 +        Does 2a8vA01 and 1a8vA01 not belong to the same s100 family?
 +
 +
 +Yes, they do - they share up to the I count so they are 100% identical - as mentioned
 +above - the domain level is just a counter to differentiate between domains in the 
 +same I cluster.
 +
 +    2) the code in CATHSO  ID level of 1a8vA01 and 1a62001 are all the same, but 
 +       why their codes in L level were different?
 +
 +
 +You need to bear in mind that CATH is a tree-like hierarchy with the trunk of the
 +tree represented on the left of the CATHSOLID classification (e.g. the C code) and
 +the leaves of the tree on the right (e.g. the D code). In the example you give above 
 +- you have to read the CATH codes from left to right and stop the first time one of 
 +the codes differs. In this case, they differ at the 'L' code so they are in different 
 +S95% clusters. It doesn't matter that the numbers after this (I, D) are the same, 
 +because they refer to different branches of the tree.
 +
 +        If 1a8vA01 and 1a62001 are 100% sequence identical, why they were assigned 
 +        to different 95% sequence group? 
 +
 +The simple answer is that they aren't 100% identical: they have a sequence identity 
 +of 94.7%, so they have different L codes. As mentioned above, the I and D codes 
 +happen to be the same, but that doesn't mean anything if the L code is different 
 +(CATHSOLID needs to be read from left to right).
 +
 +So for the following three domains:
 +
 +2a8vA01 1.10.720.10.2.1.2.1.3
 +1a8vA01 1.10.720.10.2.1.2.1.1
 +1a62001 1.10.720.10.2.1.1.1.1
 +
 +The tree/hierarchy would look something like:
 +
 +C                        1
 +A                       10
 +T                      720
 +H                       10
 +S                        2
 +O                        1
 +L              1                   2
 +I              1                   1
 +D              1              1         3
 +            1a62001        1a8vA01   2a8vA01
 +
 +This seems like a good question/answer to add to our FAQ section of the website - would you mind?
 +
 +Best wishes,
 +
 +Ian Sillitoe
 +CATH Team
 +</code>
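
The left-to-right reading rule described in the reply can be sketched in a few lines of Python. This is an illustrative sketch only, not CATH's own tooling: the helper names `parse_code` and `first_difference` are hypothetical, and the three codes are the ones quoted in the reply.

```python
from typing import List, Optional

# Level letters in CATHSOLID order, from the trunk (C) to the leaves (D).
LEVELS = "CATHSOLID"


def parse_code(code: str) -> List[int]:
    """Split a dotted CATHSOLID code into its nine integer levels."""
    parts = [int(p) for p in code.split(".")]
    if len(parts) != len(LEVELS):
        raise ValueError(f"expected {len(LEVELS)} levels, got {len(parts)}: {code!r}")
    return parts


def first_difference(code_a: str, code_b: str) -> Optional[str]:
    """Read both codes left to right and return the first level letter
    at which they differ, or None if the codes are identical."""
    for level, a, b in zip(LEVELS, parse_code(code_a), parse_code(code_b)):
        if a != b:
            return level  # stop here: later levels belong to different branches
    return None


# The three domains from the reply (hypothetical variable name).
codes = {
    "2a8vA01": "1.10.720.10.2.1.2.1.3",
    "1a8vA01": "1.10.720.10.2.1.2.1.1",
    "1a62001": "1.10.720.10.2.1.1.1.1",
}

print(first_difference(codes["2a8vA01"], codes["1a8vA01"]))  # D
print(first_difference(codes["1a8vA01"], codes["1a62001"]))  # L
```

The first pair differs only at D, so both domains sit in the same 100% identity cluster. The second pair already differs at L, so the matching I and D numbers further right carry no meaning for that comparison.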
 +
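
The tree in the reply can also be built programmatically. The sketch below assumes a plain nested dictionary is an acceptable stand-in for the hierarchy; `build_tree` and `members` are hypothetical helper names, and the data are the three domains quoted in the reply.

```python
from typing import Dict, List

# Domain name -> CATHSOLID code, as quoted in the reply.
DOMAINS = {
    "2a8vA01": "1.10.720.10.2.1.2.1.3",
    "1a8vA01": "1.10.720.10.2.1.2.1.1",
    "1a62001": "1.10.720.10.2.1.1.1.1",
}


def build_tree(domains: Dict[str, str]) -> dict:
    """Nest the codes level by level; the final (D) level maps to the domain name."""
    tree: dict = {}
    for name, code in domains.items():
        parts = [int(p) for p in code.split(".")]
        node = tree
        for part in parts[:-1]:
            node = node.setdefault(part, {})  # descend, creating branches as needed
        node[parts[-1]] = name  # the D counter identifies one domain
    return tree


def leaves(node) -> List[str]:
    """Collect all domain names below a node, in ascending code order."""
    if isinstance(node, str):
        return [node]
    found: List[str] = []
    for key in sorted(node):
        found.extend(leaves(node[key]))
    return found


def members(tree: dict, prefix: str) -> List[str]:
    """List the domains that share the given code prefix (e.g. up to the L level)."""
    node = tree
    for part in prefix.split("."):
        node = node[int(part)]
    return leaves(node)


tree = build_tree(DOMAINS)
print(members(tree, "1.10.720.10.2.1.2"))  # S95 cluster with L = 2
print(members(tree, "1.10.720.10.2.1.1"))  # S95 cluster with L = 1
```

Querying by prefix mirrors how the hierarchy is meant to be read: domains belong to the same cluster only if every level up to that point matches.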