The following gzipped tar files are available for download:
These archives contain the following files, where ## is the target sequence identity of the clustering and yyyy_mm is the corresponding UniProt release. They will be updated every two months.
Representative (=seed) sequences of every cluster in FASTA format.
Consensus sequences of every cluster in FASTA format. The sequence header starts with the Uniclust cluster identifier uc##-yymm-<number>, followed by the UniProt accession code of the representative sequence, the size of the cluster, up to five of the best functional annotations from cluster members, and the UniProt identifiers of all cluster members.
Tab-separated list with two columns of UniProt accession codes, the first for the representative sequence of the cluster and the second for the member sequence.
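The two-column list described above can be turned into a cluster map with a few lines of Python. This is a minimal sketch, not an official parser: the function name `read_clusters` is hypothetical, the accessions in the example are made up, and it assumes that every line holds exactly one representative/member pair (whether the representative is also listed as its own member is an assumption of the sample data, not something stated here).

```python
import csv
import io
from collections import defaultdict


def read_clusters(handle):
    """Group member accessions by the accession of their representative."""
    clusters = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        representative, member = row[0], row[1]
        clusters[representative].append(member)
    return dict(clusters)


# Made-up accessions, for illustration only.
example = io.StringIO("REP_A\tREP_A\nREP_A\tMEM_1\nREP_B\tREP_B\n")
clusters = read_clusters(example)
for representative, members in clusters.items():
    print(representative, len(members), members)
```

For a real file, pass an open text-mode file handle instead of the `io.StringIO` object. The whole mapping is held in memory, which may not be practical for the largest clusterings.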
Archive containing three files with Pfam, SCOP and PDB annotations, each formatted as a tab-separated list with nine columns: (1, 2) identifiers of query and target; (3-5, 6-8) domain start position, end position and total sequence length for the UniProt sequence and the database sequence, respectively; (9) HHblits E-value.
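The nine-column annotation layout can be read into named records as sketched below. The names `DomainHit` and `read_hits` are hypothetical, the sample rows are invented, and the sketch assumes that columns 3 to 8 are plain integers and that column 9 is a number Python's `float` can parse; none of this is confirmed by the file description beyond the column order.

```python
import io
from typing import Iterator, NamedTuple


class DomainHit(NamedTuple):
    query: str          # column 1: query (UniProt) identifier
    target: str         # column 2: target (Pfam, SCOP or PDB) identifier
    query_start: int    # columns 3-5: domain start, end, total length (UniProt sequence)
    query_end: int
    query_length: int
    target_start: int   # columns 6-8: domain start, end, total length (database sequence)
    target_end: int
    target_length: int
    evalue: float       # column 9: HHblits E-value


def read_hits(handle) -> Iterator[DomainHit]:
    """Yield one DomainHit per well-formed nine-column line."""
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # skip blank or malformed lines
        numbers = [int(value) for value in fields[2:8]]
        yield DomainHit(fields[0], fields[1], *numbers, float(fields[8]))


# Invented rows, for illustration only.
example = io.StringIO(
    "Q_EXAMPLE\tDOM_1\t10\t120\t300\t1\t110\t115\t1e-20\n"
    "Q_EXAMPLE\tDOM_2\t150\t200\t300\t5\t60\t80\t0.5\n"
)
hits = list(read_hits(example))
# Keep only hits below an arbitrary, illustrative E-value cutoff.
significant = [hit for hit in hits if hit.evalue <= 1e-3]
print(len(hits), [hit.target for hit in significant])
```

Using a `NamedTuple` keeps the column order explicit while letting later code refer to fields by name rather than by index.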
Uniboost database files in compressed A3M alignment format, with additional support files for HH-suite version 3.
Archive containing Uniclust multiple sequence alignments for all clusters in a3m format, generated with Clustal Omega, and additional support files for use with legacy HH-suite version 2 and current version 3.
All files are available under a Creative Commons Attribution-ShareAlike 4.0 International License.