Channel: Gene News

Updates to the UniProtKB FTP file

The gene_refseq_uniprotkb_collab.gz file on the Gene FTP site reports matched pairs of NCBI RefSeq and UniProtKB accessions. Thanks to a new process for finding related UniProtKB and RefSeq proteins, the file now reports data for over 170 million RefSeqs. This update also adds three columns.

First, columns are being added for both the NCBI TaxID and the UniProtKB TaxID for each match.

Second, a column is being added to indicate the method used to source each match, with one of these three values:

  • uniprot – matches imported from UniProt.
  • identical – matches where the protein sequence and assigned organism of the two accessions are identical to each other.
  • similar – matches where both proteins have the same assigned organism and share more than 90% sequence identity with more than 80% coverage.
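The criteria for the two computed values can be sketched as a small function. This is only an illustration of the rules as stated above, not NCBI's actual pipeline: the function name and its parameters are hypothetical, and how sequence identity and coverage are measured is not specified here, so they are taken as precomputed inputs. The "uniprot" value is not covered because those matches are imported rather than computed.

```python
from typing import Optional


def classify_match(
    same_sequence: bool,
    same_organism: bool,
    identity_pct: float,
    coverage_pct: float,
) -> Optional[str]:
    """Return 'identical', 'similar', or None for a candidate pair.

    Hypothetical helper: identity_pct and coverage_pct are assumed to be
    percentages (0-100) computed elsewhere.
    """
    # Both categories require the same assigned organism.
    if not same_organism:
        return None
    if same_sequence:
        return "identical"
    # "similar" needs more than 90% identity and more than 80% coverage.
    if identity_pct > 90.0 and coverage_pct > 80.0:
        return "similar"
    return None
```

Both thresholds are strict here, so a pair at exactly 90% identity is not classified as similar.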

The new column layout is:

  1. NCBI protein accession
  2. UniProtKB protein accession
  3. NCBI tax id
  4. UniProtKB tax id
  5. method
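A minimal sketch of reading the five-column layout in Python follows. It assumes the file is tab-delimited and that any header or comment lines start with "#"; neither detail is stated above, so check them against the actual file. The names CollabMatch and parse_collab are illustrative.

```python
from typing import Iterable, Iterator, NamedTuple


class CollabMatch(NamedTuple):
    ncbi_accession: str
    uniprot_accession: str
    ncbi_tax_id: int
    uniprot_tax_id: int
    method: str  # expected: "uniprot", "identical", or "similar"


def parse_collab(lines: Iterable[str]) -> Iterator[CollabMatch]:
    """Yield one CollabMatch per data line.

    Assumes tab-separated fields and '#'-prefixed header lines.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and the assumed header
        fields = line.split("\t")
        if len(fields) != 5:
            raise ValueError(f"expected 5 columns, got {len(fields)}: {line!r}")
        ncbi_acc, uniprot_acc, ncbi_tax, uniprot_tax, method = fields
        yield CollabMatch(ncbi_acc, uniprot_acc, int(ncbi_tax), int(uniprot_tax), method)
```

To read the compressed file directly, the lines can come from `gzip.open("gene_refseq_uniprotkb_collab.gz", "rt")`. Streaming line by line avoids loading the whole file into memory, which matters for a file of this size.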

