Differences

This shows you the differences between two versions of the page.

data:index [2016/07/12 17:07]
nataliedawson [CATH Data]
data:index [2017/10/14 15:28] (current)
sillitoe
Line 1: Line 1:
  
-====== CATH Data ======+====== CATH Data Downloads ======
  
-This page provides information on the data files that are available to download from the CATH FTP site:+This page provides information on the data files that are available to download from the [[ftp://orengoftp.biochem.ucl.ac.uk/cath | CATH FTP site]].
  
-[[ftp://ftp.biochem.ucl.ac.uk]]+See [[:index#cath_releases|CATH Releases]] for more information on CATH and CATH-Plus.
  
-For further information on these data files can be found in README.txt on the FTP site.+===== CATH (daily snapshot) =====
  
-For information on the statistics from specific releases, see [[../release_notes|release notes]].+ftp://orengoftp.biochem.ucl.ac.uk/cath/releases/daily-release/newest/
  
 +^ File name ^ Description ^
 +| cath-b-newest-all.gz | Lists the latest domain boundaries and superfamily (C.A.T.H) annotations for all CATH domains |
 +| cath-b-newest-names.gz | Provides the names for each node in the CATH hierarchy |
 +| cath-b-newest-latest-release.gz | Lists the latest domain boundaries and superfamily annotations for CATH domains in the most recent release of CATH-Plus |
 +| cath-b-newest-putative.gz | Lists the latest domain boundaries and superfamily annotations for CATH domains released since the most recent release of CATH-Plus |
 +| cath-b-s35-newest.gz | Lists the latest domain boundaries and sequence family (C.A.T.H.S) annotations for all non-redundant sequence representatives |
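
The daily snapshot files listed above are gzip-compressed, so they can be read line by line without unpacking them to disk first. The following is a minimal sketch, not an official CATH client: the helper names ''daily_urls'' and ''read_gzip_rows'' are hypothetical, and the assumption that each file is whitespace-delimited text with optional ''#'' comment lines is not stated on this page and should be checked against the README on the FTP site.

```python
import gzip

# Base URL and file names are copied from the table above.
DAILY_BASE = "ftp://orengoftp.biochem.ucl.ac.uk/cath/releases/daily-release/newest/"
DAILY_FILES = [
    "cath-b-newest-all.gz",
    "cath-b-newest-names.gz",
    "cath-b-newest-latest-release.gz",
    "cath-b-newest-putative.gz",
    "cath-b-s35-newest.gz",
]


def daily_urls(base=DAILY_BASE, files=DAILY_FILES):
    """Return the full URL of every daily snapshot file."""
    return [base + name for name in files]


def read_gzip_rows(fileobj):
    """Yield the whitespace-separated fields of each data line.

    Assumption (hypothetical): blank lines and lines starting with '#'
    carry no data and are skipped.
    """
    with gzip.open(fileobj, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line and not line.startswith("#"):
                yield line.split()
```

''read_gzip_rows'' accepts either a path to a downloaded file or an open binary file object, so the download step can be handled by whichever FTP tool is preferred.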
  
-===== Data related to the CATH classification =====+===== CATH-Plus (full release) =====
  
 +ftp://orengoftp.biochem.ucl.ac.uk/cath/releases/latest-release/
 +
 +For information on the statistics for specific releases, see [[../release_notes|release notes]].
 +
 +==== CATH classification data ====
 +
 +ftp://orengoftp.biochem.ucl.ac.uk/cath/releases/latest-release/cath-classification-data/
  
 ^ File name ^ Description ^ ^ File name ^ Description ^
Line 24: Line 37:
 | cath-unclassified-list-<version>.txt | List of all unclassified protein chains and domains that are still being processed | | cath-unclassified-list-<version>.txt | List of all unclassified protein chains and domains that are still being processed |
  
-===== Data related to non-redundant data sets =====+==== Non-redundant data sets ==== 
 + 
 +ftp://orengoftp.biochem.ucl.ac.uk/cath/releases/latest-release/non-redundant-data-sets/ 
 ^ File name ^ Description ^ ^ File name ^ Description ^
-| cath-dataset-nonredundant-S[20%%|%%40]-v4_1_0.atom.fa | The ATOM sequences of the domains in the dataset (which only contain residues that have ATOM records in the PDB file) | +| cath-dataset-nonredundant-S[20%%|%%40].atom.fa | The ATOM sequences of the domains in the dataset (which only contain residues that have ATOM records in the PDB file) | 
-| cath-dataset-nonredundant-S[20%%|%%40]-v4_1_0.fa | The sequences of the domains in the dataset | +| cath-dataset-nonredundant-S[20%%|%%40].fa | The sequences of the domains in the dataset | 
-| cath-dataset-nonredundant-S[20%%|%%40]-v4_1_0.list | A list of the domains in the dataset; one domain ID per line | +| cath-dataset-nonredundant-S[20%%|%%40].list | A list of the domains in the dataset; one domain ID per line | 
-| cath-dataset-nonredundant-S[20%%|%%40]-v4_1_0.pdb.tgz | (A gzipped tar file containing) the PDB files of the domains in the data set |+| cath-dataset-nonredundant-S[20%%|%%40].pdb.tgz | (A gzipped tar file containing) the PDB files of the domains in the data set | 
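
The ''S[20|40]'' notation in the table above appears to be shorthand for two variants of each file, one named ''S20'' and one named ''S40''. A minimal sketch of expanding that shorthand into concrete file names and URLs follows; the function names are hypothetical, and the expansion itself is an assumption that should be verified against the FTP directory listing.

```python
# Base URL is copied from this section; suffixes are copied from the table.
NONREDUNDANT_BASE = (
    "ftp://orengoftp.biochem.ucl.ac.uk/cath/releases/latest-release/"
    "non-redundant-data-sets/"
)
LEVELS = ("20", "40")  # assumed expansion of the S[20|40] shorthand
SUFFIXES = (".atom.fa", ".fa", ".list", ".pdb.tgz")


def nonredundant_filenames(levels=LEVELS, suffixes=SUFFIXES):
    """Return every file name implied by the table, grouped by level."""
    return [
        f"cath-dataset-nonredundant-S{level}{suffix}"
        for level in levels
        for suffix in suffixes
    ]


def nonredundant_urls(base=NONREDUNDANT_BASE):
    """Return the full URL for each expanded file name."""
    return [base + name for name in nonredundant_filenames()]
```

Note that the newer revision drops the ''-v4_1_0'' version suffix from these names, so the same names should remain valid across releases under ''latest-release''.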
 + 
 +==== Sequence data ====
  
-===== Data related to sequence data =====+ftp://orengoftp.biochem.ucl.ac.uk/cath/releases/latest-release/sequence-data/
  
 ^ File name ^ Description ^ ^ File name ^ Description ^