Since a crystal structure is not available for mAb RV6-26, we built a homology model of the structure using the Web Antibody Modeling (WAM) server [15]. WAM uses a large number of known antibody structures as the knowledge database for homology modeling, and then applies ab initio molecular modeling to those parts of the antibody that are too variable for homology methods. For all docking runs, we included an alignment of our Fab with antibody binding subsequences of known antibody-antigen complexes, which allows RosettaDock to prevent the antibody from adopting unlikely orientations of its CDR loops [11].
Figure 1 illustrates the steps of the docking procedure. In the first step, we started with a complex obtained from a fit into a cryo-EM density with an approximate resolution of 22 Å [5] and performed 1000 low-resolution Monte Carlo simulations with RosettaDock, treating the antibody as a rigid body that diffuses toward the fixed VP6 trimer. In applications without a cryo-EM density to seed the simulations, it may be necessary to perform more than 1000 simulations. The starting structure was obtained by fitting the X-ray coordinates of the VP6 trimer into the three-dimensional cryo-EM reconstructions with the Situs suite of programs [16], and the coordinates of the RV6-26 homology model were fitted into the antibody portion of the density by visual inspection. Additionally, the CDR loops of the antibody were oriented toward VP6. The low-resolution, residue-scale interaction potentials include residue-environment and residue-residue interaction terms derived from a database of interfaces, a contact score to reward contacting residues, a bump score to penalize overlapping residues, and an empirical score that rewards interface CDR residues that are known to make contact with antigens based on known antibody-antigen complexes.
In the second step, we used a somatic mutation score (described below) based on binding affinity changes measured when naturally occurring RV6-26 antibody mutations were back-mutated to germline. This score was used to filter the 1000 low-resolution RosettaDock decoys. The motivation for using these data as a filter is that affinity is a measure of the evolutionary fitness of each mutation that occurs over the course of antibody evolution. The adult RV6-26 antibody contained 13 somatic mutations within the heavy chain. To identify which of these mutations affected binding to VP6, mutant antibodies were produced in which each somatic mutation was reverted to the germline amino acid. Somatic mutations also occur in the light chain, but we focused on the 13 heavy chain mutations because the VH1–46 is the known immunodominant region. Ref. [5] provides experimental details of the measured binding affinity changes between each mutant and the wild-type RV6-26 antibody. Briefly, each mutant antibody was created, expressed, and purified, and a detailed kinetic analysis was performed using surface plasmon resonance. Most of the mutants retained binding equivalent to that of the wild-type Fab; the exceptions were the six amino acids colored red in Fig. 2. For the somatic mutation filter in the current study, we discretized the equilibrium binding affinity changes for each mutation into active and neutral states (red and green, respectively, in Fig. 2).
Figure 2 shows the heavy chain amino acid sequences of the germline and adult RV6-26 antibodies, and summarizes the binding enhancement conferred by each amino acid. The numbering scheme used in Fig. 2 is derived from the immunoglobulin variable (V) gene database (VBASE), in which a unique antibody amino acid numbering system was introduced [17]. The first profile (germ) of the alignment in Fig. 2 shows the germline heavy-chain sequence, where the residues highlighted in blue are the CDR regions. The second profile (6–26) shows the somatic mutations of the RV6-26 antibody color coded in terms of their effect on VP6 binding. Amino acids highlighted in red were associated with enhanced antiviral activity of RV6-26, while amino acids highlighted in green had a neutral effect.
The third profile (TFN) of the alignment in Fig. 2 is the CDR scoring profile that is part of the low-resolution score in RosettaDock [11]. For our application, we defined True (T, orange) residues as CDR residues that are rewarded for being in the interface; False (F) residues as non-CDR residues that have not been observed to make antigen contact in known complexes and are penalized for being in the interface; and Neutral (N) interface residues as rarely occurring contact residues and non-CDR active residues, which make no contribution to the score. Even though the RV6-26 residues Gly73, Leu90, and Ser92 are not CDR residues, they were labeled as Neutral in the TFN profile because they were experimentally found to be active somatic mutations. This neutral labeling prevents the CDR score from excluding these non-CDR active somatic mutations from the interface.
For each complex, we created a matrix D of pair-wise distances between the Cα atoms of the 13 residue mutations of the RV6-26 antibody and a collection of VP6 residues from the interface, chosen based on visual inspection of the cryo-EM density. Virus interface residues were selected for the filter from two of the three VP6 chains: B-chain residues 197–213, 253–282, 287–301, 304, and 308; and C-chain residues 157–173, 236–247, and 351–374. The rows of the matrix D correspond to the VP6 residues at the interface and the columns to the antibody mutations highlighted in Fig. 2. For a given complex, we can determine the shortest distance of each antibody mutation to the VP6 interface from the matrix D. These distances yield a vector d whose elements are given by
$$d_j = \min_i D_{ij}, \qquad j = 1, \ldots, M, \tag{1}$$
and whose length is equal to the number of antibody mutations M.
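As a minimal sketch of this step, assuming the Cα coordinates of the selected VP6 interface residues and of the antibody mutations are already available as NumPy arrays, the matrix D and the vector d could be computed as follows. The function name, argument names, and toy coordinates are hypothetical and serve only as illustration.

```python
import numpy as np

def min_interface_distances(vp6_ca, mut_ca):
    """Return (D, d) for one complex.

    vp6_ca : (N, 3) C-alpha coordinates of the selected VP6 interface residues.
    mut_ca : (M, 3) C-alpha coordinates of the antibody somatic mutations.
    """
    vp6_ca = np.asarray(vp6_ca, dtype=float)
    mut_ca = np.asarray(mut_ca, dtype=float)
    # D[i, j] = distance between VP6 residue i (row) and antibody mutation j (column)
    diff = vp6_ca[:, None, :] - mut_ca[None, :, :]
    D = np.linalg.norm(diff, axis=-1)
    # d[j] = shortest distance from mutation j to any selected interface residue
    d = D.min(axis=0)
    return D, d

# Toy coordinates (illustrative only, not taken from the study)
vp6 = [[0, 0, 0], [10, 0, 0]]
muts = [[3, 4, 0], [10, 0, 5], [0, 0, 20]]
D, d = min_interface_distances(vp6, muts)
print(D.shape, d)
```

Taking the minimum over rows follows the convention in the text that rows index VP6 residues and columns index antibody mutations.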
The filter score F of a complex is given by the following sum over all M antibody mutations (M = 13 for RV6-26)
$$F = \sum_{j=1}^{M} f_j, \tag{2}$$
where $d_j$ is the shortest distance of mutation j to the VP6 interface (the j-th element of d) and the contribution from each residue is
$$f_j = \left\{ \begin{array}{lll} 1, & \text{if mutation } j \text{ is active and } d_j \le d_c, & \text{(3a)} \\ 1/2, & \text{if mutation } j \text{ is neutral and } d_j > d_c, & \text{(3b)} \\ 0, & \text{otherwise,} & \text{(3c)} \end{array} \right.$$
and $d_c$ is a distance cutoff in Angstroms that characterizes our measure of closeness between the antibody and antigen. We classify a somatic mutation as "active" if back-mutation to germline has a disruptive effect on binding to the antigen. "Neutral" somatic mutations are non-disruptive when mutated back to germline. In Eq. (3a), an active somatic mutation contributes 1 to the affinity filter score of a complex if this residue is within $d_c$ Angstroms of the antigen interface. In our application, we chose a relatively loose cutoff of $d_c$ = 12 Å to allow all active mutations the possibility of contributing to the score, including ones that may be more distant from the interface. It has been observed that many affinity-maturing mutations in single-chain Fv antibodies correspond to residues that are more distant from the interface [18, 19]. A smaller cutoff would exclude the contribution to the score of residues that are involved in affinity maturation yet may not make direct contact with the antigen. In Eq. (3b), we allow a neutral or negative somatic mutation to contribute 1/2 to the score if it is more distant from the interface than $d_c$. This essentially penalizes neutral somatic mutations that are closer to the interface than the cutoff distance. Of course, it is still possible for non-disruptive mutations to be near the interface, so this soft distance constraint penalizes but does not exclude neutral residues from contacting the antigen. In the final piece (Eq. 3c), all other somatic mutations do not contribute to the filter score of Eq. (2).
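The scoring rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used in the study: the function names are hypothetical, every mutation that is not active is treated as neutral, and a mutation exactly at the cutoff is counted as "within" it.

```python
import numpy as np

def filter_score(d, is_active, cutoff=12.0):
    """Somatic mutation filter score of one complex.

    d         : length-M distances (in Angstroms) of each mutation to the interface.
    is_active : length-M booleans, True for active mutations, False for neutral ones.
    cutoff    : distance cutoff d_c in Angstroms (12 in the text).
    """
    d = np.asarray(d, dtype=float)
    is_active = np.asarray(is_active, dtype=bool)
    near = d <= cutoff  # assumption: the boundary counts as "within" the cutoff
    # Active and near -> 1; neutral and far -> 1/2; everything else -> 0
    contrib = np.where(is_active & near, 1.0,
                       np.where(~is_active & ~near, 0.5, 0.0))
    return float(contrib.sum())

def rank_decoys(distance_vectors, is_active, cutoff=12.0):
    """Return decoy indices sorted from highest to lowest filter score."""
    scores = np.array([filter_score(d, is_active, cutoff) for d in distance_vectors])
    order = np.argsort(-scores, kind="stable")
    return order, scores[order]

# Toy example: 4 mutations (2 active, 2 neutral) in two hypothetical decoys
active = [True, True, False, False]
decoys = [[5.0, 30.0, 8.0, 25.0],    # 1 + 0 + 0 + 1/2 = 1.5
          [6.0, 10.0, 20.0, 15.0]]   # 1 + 1 + 1/2 + 1/2 = 3.0
order, scores = rank_decoys(decoys, active)
print(order, scores)
```

In this toy case the second decoy ranks first because both active mutations lie within the cutoff and both neutral mutations lie beyond it.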
In the third step of Fig. 1, we performed a high-resolution docking refinement of the top filtered complexes. A backbone-dependent rotamer packing algorithm was used for side-chain repacking [20]. From each of the best low-resolution complexes ranked by filter score, we created 200 high-resolution decoys using the perturbation triplet (2 Å, 2 Å, 20°). This perturbation triplet represents a search volume with respect to the line connecting the protein centers. The first number refers to translation along the line, the second refers to translation in the plane perpendicular to the line, and the third refers both to rotation around the axis defined by the line and to tilt relative to that axis.
The resulting high-resolution decoys were ranked according to their full-atom scores. Candidate complexes were determined by an additional round of refinement (1 Å, 1 Å, 10°) following k-medoids (k = 3) clustering of the top Rosetta-scoring complexes. Root-mean-square deviation (RMSD) of the alpha-carbon coordinates was used as the cluster metric, and the cluster location was given by the decoy with the lowest full-atom energy score within the cluster. We used k-medoids clustering because, unlike hierarchical clustering, it does not require linkage assumptions, and it is simpler than mixture model clustering. Unlike model-based clustering, whose statistical model allows it to estimate the number of clusters, k-medoids requires the number of clusters to be chosen a priori; however, this choice is easily validated by visual inspection of the 3D complexes. The candidate binding sites were determined from the final candidate complexes by finding the VP6 residues within a 5 Å radius of the RV6-26 Tyr66. This mutation was chosen as a computational probe because it had the largest effect on binding affinity: back-mutation of this residue resulted in an 83-fold decreased rate of dissociation [5].
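To make the clustering step concrete, the following is a minimal sketch of k-medoids on a precomputed pairwise distance matrix (for example, Cα RMSD between decoys), followed by selection of the lowest-energy decoy in each cluster. It is not the code used in the study: the alternating update scheme, the deterministic farthest-point initialization, and all function names are assumptions made for illustration, and the toy distances and energies are invented.

```python
import numpy as np

def k_medoids(dist, k=3, n_iter=100):
    """Alternating k-medoids on a symmetric distance matrix.

    Assumes k does not exceed the number of distinct decoys.
    Returns (medoid_indices, labels).
    """
    dist = np.asarray(dist, dtype=float)
    # Deterministic initialization (an assumption): start from the most central
    # point, then repeatedly add the point farthest from the current medoids.
    medoids = [int(np.argmin(dist.sum(axis=1)))]
    while len(medoids) < k:
        nearest = dist[:, medoids].min(axis=1)
        medoids.append(int(np.argmax(nearest)))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        # Assignment step: each decoy joins its nearest medoid
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid if a cluster is empty
            # Update step: member with the smallest total distance to the others
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels

def cluster_representatives(labels, energies, k):
    """For each non-empty cluster, index of the member with the lowest energy."""
    energies = np.asarray(energies, dtype=float)
    reps = {}
    for c in range(k):
        members = np.flatnonzero(labels == c)
        if members.size:
            reps[c] = int(members[np.argmin(energies[members])])
    return reps

# Toy data: nine "decoys" on a line stand in for an RMSD matrix
x = np.array([0, 1, 2, 50, 51, 52, 100, 101, 102], dtype=float)
rmsd = np.abs(x[:, None] - x[None, :])
energies = [-1, -3, -2, -5, -4, -6, -9, -8, -7]
medoids, labels = k_medoids(rmsd, k=3)
print(medoids, labels, cluster_representatives(labels, energies, 3))
```

Note that the medoid minimizes the within-cluster distance, whereas the representative reported per cluster is the lowest-energy member, matching the description of the cluster location in the text.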