CATH Data
This page provides information on the data files that are available to download from the CATH FTP site:
ftp://ftp.biochem.ucl.ac.uk/cath/
Further information on these data files can be found in README.txt on the FTP site.
For information on the statistics from specific releases, see release notes.
Data related to the CATH classification
File name | Description |
---|---|
cath-chain-list-<version>.txt | Lists all of the PDB chain IDs in CATH, whether they are chopped into domains or not. |
cath-domain-boundaries-*-<version>.txt | Description of domain and segment boundaries for domains classified into CATH. |
cath-domain-description-file-<version>.txt | Description of each protein domain in CATH |
cath-domain-list-<S35|S60|S95|S100|all>-<version>.txt | Lists of domains classified into CATH |
cath-domain-pdb-*-<version>.txt | Description of each domain PDB classified into CATH |
cath-names-<version>.txt | Name description of each node in the CATH hierarchy, along with an example domain |
cath-superfamily-list-<version>.txt | List of all the superfamilies in the CATH hierarchy |
cath-unclassified-list-<version>.txt | List of all unclassified protein chains and domains that are still being processed |
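
The file-name templates in the table above can be turned into concrete names by substituting a release string for `<version>`. The following is a minimal sketch, not an official CATH tool: the helper names `expand_file_name` and `file_url` are made up for illustration, the version string `v4_1_0` is borrowed from the non-redundant dataset file names on this page, and the optional `subdir` argument is an assumption because this page does not state the directory layout below the FTP base URL.

```python
# Base URL exactly as given on this page.
CATH_FTP_BASE = "ftp://ftp.biochem.ucl.ac.uk/cath/"


def expand_file_name(template: str, version: str) -> str:
    """Replace the <version> placeholder in a file-name template."""
    if "<version>" not in template:
        raise ValueError(f"template has no <version> placeholder: {template!r}")
    return template.replace("<version>", version)


def file_url(template: str, version: str, subdir: str = "") -> str:
    """Build a download URL for a template.

    `subdir` is hypothetical: check README.txt on the FTP site for the
    real directory layout before relying on the result.
    """
    name = expand_file_name(template, version)
    path = f"{subdir.strip('/')}/{name}" if subdir else name
    return CATH_FTP_BASE + path


if __name__ == "__main__":
    print(file_url("cath-names-<version>.txt", "v4_1_0"))
```

Templates that contain `*` or a choice such as `<S35|S60|...>` need an extra substitution step; the sketch deliberately leaves those to the caller rather than guessing the valid values.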
Data related to non-redundant data sets
File name | Description |
---|---|
cath-dataset-nonredundant-S[20|40]-v4_1_0.atom.fa | The ATOM sequences of the domains in the dataset (which only contain residues that have ATOM records in the PDB file) |
cath-dataset-nonredundant-S[20|40]-v4_1_0.fa | The sequences of the domains in the dataset |
cath-dataset-nonredundant-S[20|40]-v4_1_0.list | A list of the domains in the dataset; one domain ID per line |
cath-dataset-nonredundant-S[20|40]-v4_1_0.pdb.tgz | A gzipped tar file containing the PDB files of the domains in the dataset |
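
Because the `.list` file holds one domain ID per line, reading it takes only a few lines of code. The sketch below assumes nothing beyond that layout; the function name `parse_domain_list` is made up for illustration, skipping lines that start with `#` is a defensive assumption (this page does not say whether the file contains comments), and the IDs in the usage example are invented placeholders rather than real entries.

```python
from __future__ import annotations


def parse_domain_list(text: str) -> list[str]:
    """Return the domain IDs found in the contents of a .list file.

    Blank lines are ignored. Lines starting with '#' are also ignored,
    which is an assumption and not something this page documents.
    """
    ids = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        ids.append(line)
    return ids


if __name__ == "__main__":
    # Placeholder IDs, not taken from a real CATH release.
    example = "1abcA01\n2xyzB02\n\n"
    print(parse_domain_list(example))
```

To read a downloaded file, pass its contents, for example `parse_domain_list(open(path).read())`, where `path` points at a local copy of the S20 or S40 list.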
Data related to sequence data
File name | Description |
---|---|
cath-domain-seqs-*-<version>.fa | Sequences for each CATH domain |
cath-S35-<version>-hmm3.lib.gz | HMMs for each CATH representative domain from the sequence clusters at 35% sequence identity |
funfam-hmm3-<version>.lib.gz | HMMs for each functional family (FunFam) |
cath-superfamily-seqs-<superfamily>-<version>.fa | Sequences for each CATH superfamily in FASTA format |
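
The `.fa` files listed on this page are FASTA files, so they can be read with any FASTA parser. The sketch below is a small dependency-free reader, offered as one possible approach rather than the CATH project's own tooling. It makes no assumption about how CATH formats the header line and returns everything after `>` unchanged; the function name `read_fasta` and the records in the usage example are invented for illustration.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a FASTA file.

    The header is returned without the leading '>' and is not parsed
    further. Sequence lines are concatenated; blank lines are skipped.
    """
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)


if __name__ == "__main__":
    # Invented records, not real CATH sequences.
    demo = ">dom1\nMKT\nAAG\n>dom2\nGG\n"
    for name, seq in read_fasta(demo.splitlines()):
        print(name, len(seq))
```

Since the function accepts any iterable of lines, an open file handle can be passed directly, which avoids loading a large sequence file into memory at once.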