Publication Date: Tuesday, April 6, 2021

Defining the unique protein features of SARS-CoV-2, the viral agent causing Coronavirus Disease 2019, may guide efforts to control this pathogen. We examined proteins encoded by the Sarbecoviruses closest to SARS-CoV-2 using profile Hidden Markov Model similarities to identify features unique to SARS-CoV-2. Consistent with previous reports, a small set of bat- and pangolin-derived Sarbecoviruses showed the greatest similarity to SARS-CoV-2. The analysis provided a measure of total proteome similarity and showed that a small subset of bat Sarbecoviruses is closely related to SARS-CoV-2 but unlikely to be its direct source. Spike analysis revealed that the current SARS-CoV-2 variants of concern have sampled only 36% of the possible spike changes that have occurred historically in Sarbecovirus evolution. New SARS-CoV-2 variants with changes in these regions are likely to be compatible with virus replication and are to be expected in the coming months unless global viral replication is severely reduced.

Publisher: medRxiv and bioRxiv
MRC/UVRI Authors: Prof. Matthew Cotten