CD4 is under positive selection in primates.
Previous studies have identified CD4 as an HIV-1 cofactor that is evolving under positive (diversifying) selection (Meyerson et al., 2014; Zhang et al., 2008). However, these studies were limited: one analyzed CD4 sequences from a narrow set of primate species (Zhang et al., 2008), and the other included a larger species panel but lacked the complete CD4 coding sequence (Meyerson et al., 2014). To extend these studies, we collected full-length CD4 sequences from 25 primate species and tested for evidence of site-specific selective pressures using the codeml program in the Phylogenetic Analysis by Maximum Likelihood (PAML) package (Yang, 2007a). (A) Cladogram of the primate species (n = 25) analyzed in this study. (B) The most amino-terminal extracellular domain of CD4 (domain 1, D1) is bound by the primate lentivirus (HIV/SIV) envelope glycoprotein (Env) during entry (Bour et al., 1995). We next sought to assess whether D1 alone is evolving under positive selection (presumably due to selective pressures exerted by SIVs), or whether other regions of CD4 are also experiencing selective pressures for diversification. Site-specific selective pressures in primate CD4 (full gene; top), the CD4 D1 domain alone (amino acids 26-123; middle), and CD4 minus the signal peptide and the D1 domain (amino acids 123-458; bottom) were detected using PAML (Yang, 2007a). Positive selection among amino acid sites was tested using two model comparisons, M7 vs. M8 and M8a vs. M8. In each of these comparisons, the null model (M7 or M8a) does not allow for sites under positive selection, while the alternative model (M8) does. Tables summarize the likelihood ratio tests for the M7 vs. M8 and M8a vs. M8 comparisons. The 2ΔlnL value (twice the difference in the natural log of the likelihoods of the two models) is shown, along with the p-value with which the null model (M7 or M8a) is rejected in favor of the model of positive selection (M8).
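The likelihood ratio test described above can be illustrated with a minimal sketch. It assumes the conventional degrees of freedom for these comparisons (2 for M7 vs. M8, 1 for M8a vs. M8) and a plain chi-square reference distribution; neither choice is stated in the text. The function names and the log-likelihood values in the usage note are hypothetical and are not results from this study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution.

    Only df = 1 and df = 2 are supported, because both have simple
    closed forms and are the values assumed here for the two comparisons.
    """
    if x < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or df = 2 are supported in this sketch")


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2*delta_lnL, p-value) for a null vs. alternative model.

    lnl_null: log-likelihood of the null model (M7 or M8a)
    lnl_alt:  log-likelihood of the alternative model (M8)
    df:       assumed degrees of freedom (2 for M7 vs. M8, 1 for M8a vs. M8)
    """
    # Clamp at zero: small negative values can arise from optimizer noise.
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return statistic, chi2_sf(statistic, df)
```

For example, with hypothetical log-likelihoods of -1000.0 for M7 and -990.0 for M8, `likelihood_ratio_test(-1000.0, -990.0, df=2)` gives a 2ΔlnL of 20.0 and a p-value of exp(-10), so the null model would be rejected at any conventional threshold.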
(C) To further identify codon sites in CD4 under positive selection, we calculated the posterior probability of ω > 1 (where ω is the dN [nonsynonymous]/dS [synonymous] rate ratio, and values > 1 in model M8 indicate sites under positive selection) using the Bayes empirical Bayes approach. Plot of posterior probabilities (ω > 1 under the maximum likelihood random-sites model M8) for all CD4 sites. Sites under positive selection (posterior probability Pω > 0.9) are shown in red. (D) The posterior mean of ω over a sliding window of 80 amino acids is shown (green line), along with the overall mean of ω across the entire gene (gray line). In both panels C and D, the amino acid positions are numbered relative to human CD4, and the D1 domain of CD4 is highlighted in orange. (E) The cryo-EM structure of an HIV-1 Env trimer in complex with human CD4 (PDB 5U1F) was visualized in ChimeraX (Goddard et al., 2017). Individual gp120 and gp41 subunits are colored in light and dark blue, respectively. The CD4 D1 domain (red) and D2-D4 domains (gray) are shown, with sites under positive selection (Pω > 0.9) shown on the human sequence as red spheres. Nine of the 12 sites passing this stringent cutoff map to the interface between Env and the CD4 D1 domain.
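The two per-site summaries in panels C and D can be sketched as follows. This is an illustration only: it assumes the per-site posterior probabilities and posterior means of ω have already been extracted from the codeml output into plain lists, and that the sliding window uses full windows only. The text does not specify how windows are positioned or how the ends of the gene are handled. The function names are hypothetical.

```python
def sites_above_cutoff(posteriors, cutoff=0.9):
    """Return 1-based site positions whose posterior probability of
    omega > 1 is strictly greater than the cutoff."""
    return [i + 1 for i, p in enumerate(posteriors) if p > cutoff]


def sliding_window_mean(values, window=80):
    """Mean of `values` over every full window of the given width.

    Returns len(values) - window + 1 means; entry k covers sites
    k+1 .. k+window (1-based). Partial windows at the ends are not
    reported, which is an assumption of this sketch.
    """
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide by one site: add the entering value, drop the leaving one.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means
```

The overall mean of ω across the gene (the gray line in panel D) corresponds to `sum(values) / len(values)` for the same list.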