Differences

This shows you the differences between two versions of the page.

data:index [2017/10/11 12:44] sillitoe
data:index [2017/10/14 15:28] (current) sillitoe
Line 4: Line 4:
 This page provides information on the data files that are available to download from the [[ftp://orengoftp.biochem.ucl.ac.uk/cath | CATH FTP site]].
  
 +See [[:index#cath_releases|CATH Releases]] for more information on CATH and CATH-Plus.
  
 ===== CATH (daily snapshot) =====
- 
-We provide a daily snapshot of the very latest classifications and annotations as they happen in our pipeline. This enables users to find the most up-to-date information about their particular structure of interest. The amount of data we provide at this stage is limited mainly to domain boundaries and superfamily classification. 
  
 ftp://orengoftp.biochem.ucl.ac.uk/cath/releases/daily-release/newest/
Line 22: Line 20:
  
 ftp://orengoftp.biochem.ucl.ac.uk/cath/releases/latest-release/
- 
-CATH-Plus adds a significant amount of data on top of the core classification information available in CATH. The CATH-Plus release process includes a number of manual annotation checks in addition to adding a huge amount of information combining protein structure, sequence and function. As a result, there is a greater depth of information available in CATH-Plus, though it may be missing information on the most recent structures.  
  
 For information on the statistics for specific releases, see [[../release_notes|release notes]].
Line 46: Line 42:
  
 ^ File name ^ Description ^
-| cath-dataset-nonredundant-S[20%%|%%40]-v4_1_0.atom.fa | The ATOM sequences of the domains in the dataset (which only contain residues that have ATOM records in the PDB file) |
-| cath-dataset-nonredundant-S[20%%|%%40]-v4_1_0.fa | The sequences of the domains in the dataset |
-| cath-dataset-nonredundant-S[20%%|%%40]-v4_1_0.list | A list of the domains in the dataset; one domain ID per line |
-| cath-dataset-nonredundant-S[20%%|%%40]-v4_1_0.pdb.tgz | (A gzipped tar file containing) the PDB files of the domains in the data set |
+| cath-dataset-nonredundant-S[20%%|%%40].atom.fa | The ATOM sequences of the domains in the dataset (which only contain residues that have ATOM records in the PDB file) |
+| cath-dataset-nonredundant-S[20%%|%%40].fa | The sequences of the domains in the dataset |
+| cath-dataset-nonredundant-S[20%%|%%40].list | A list of the domains in the dataset; one domain ID per line |
+| cath-dataset-nonredundant-S[20%%|%%40].pdb.tgz | (A gzipped tar file containing) the PDB files of the domains in the data set |
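
As an illustration of the new, unversioned naming in the table above, the following Python sketch expands the ''S[20|40]'' pattern into concrete file names and parses the contents of a ''.list'' file (one domain ID per line). The function names are hypothetical and not part of any CATH tooling. The directory that holds these files on the FTP site is not stated in this part of the page, so the sketch deliberately builds no URLs.

```python
# Minimal sketch; function names are hypothetical, not part of any CATH tool.

# File-name suffixes and sequence-identity cutoffs as listed in the table.
SUFFIXES = ("atom.fa", "fa", "list", "pdb.tgz")
CUTOFFS = (20, 40)


def nonredundant_filenames(cutoffs=CUTOFFS, suffixes=SUFFIXES):
    """Expand 'cath-dataset-nonredundant-S[20|40].<suffix>' into file names."""
    return [
        f"cath-dataset-nonredundant-S{cutoff}.{suffix}"
        for cutoff in cutoffs
        for suffix in suffixes
    ]


def parse_domain_list(text):
    """Return the domain IDs from the text of a .list file.

    The table says the file holds one domain ID per line; skipping blank
    lines is an assumption made here for robustness.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    for name in nonredundant_filenames():
        print(name)
```

The names are generated locally only; whether a given file exists in a particular release should be checked against the FTP listing.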
  
 ==== Sequence data ====