Quantitative peptide binding motifs for 19 human and mouse MHC class I molecules derived using positional scanning combinatorial peptide libraries
© Sidney et al. 2008
Received: 20 October 2007
Accepted: 25 January 2008
Published: 25 January 2008
It has been previously shown that combinatorial peptide libraries are a useful tool to characterize the binding specificity of class I MHC molecules. Compared to other methodologies, such as pool sequencing or measuring the affinities of individual peptides, utilizing positional scanning combinatorial libraries provides a baseline characterization of MHC molecular specificity that is cost effective, quantitative and unbiased.
Here, we present a large-scale application of this technology to 19 different human and mouse class I alleles. These include very well characterized alleles (e.g. HLA A*0201), alleles with little previous data available (e.g. HLA A*3201), and alleles with conflicting previous reports on specificity (e.g. HLA A*3001). For all alleles, the positional scanning combinatorial libraries were able to elucidate distinct binding patterns defined with a uniform approach, which we make available here. We introduce a heuristic method to translate this data into classical definitions of main and secondary anchor positions and their preferred residues. Finally, we validate that these matrices can be used to identify candidate MHC binding peptides and T cell epitopes in the vaccinia virus and influenza virus systems, respectively.
These data confirm, on a large scale, including 15 human and 4 mouse class I alleles, the efficacy of the positional scanning combinatorial library approach for describing MHC class I binding specificity and identifying high affinity binding peptides. These libraries were shown to be useful for identifying specific primary and secondary anchor positions, and thereby simpler motifs, analogous to those described by other approaches. The present study also provides matrices useful for predicting high affinity binders for several alleles for which detailed quantitative descriptions of binding specificity were previously unavailable, including A*3001, A*3201, B*0801, B*1501 and B*1503.
T cells recognize a complex formed between a major histocompatibility complex (MHC) molecule and an antigenic peptide, or epitope. The identification of T cell epitopes is crucial to facilitate the study of the correlates of immunity. Different MHC molecules are associated with different peptide binding specificities, usually referred to as MHC peptide binding motifs. A large body of literature relates to the definition of MHC binding motifs for class I molecules of several different species, including humans, mice, chimpanzees and macaques (see, e.g., , for review). In general, class I MHC molecules recognize peptides of 9 to 10 residues in length and carrying residues with similar physiochemical specificity at main anchor positions. Typically, the main anchors are found in position 2 and at the C-terminus of the peptide ligand, although other anchor arrangements have been described for several alleles.
A variety of different methods are available to define MHC peptide binding motifs, each associated with its own advantages and disadvantages. The most common methods involve the pool sequencing of naturally presented MHC ligands or the evaluation of the binding capacity of individual peptide libraries. The pool sequencing approach is based on the bulk sequencing of peptides naturally bound to MHC following their elution with acidic buffers from the MHC peptide binding site. This is a remarkably simple and effective method, and has been applied with success in dozens of instances . It immediately and reliably identifies the most dominant binding requirements of an MHC molecule. An additional unique advantage of this approach is the fact that it is based on the characterization of physiologically processed ligands. Disadvantages associated with this method are that it is only semi-quantitative, and typically identifies only the most canonical (stringent) motifs. This can be a drawback in terms of utilizing this method for epitope predictions, since it has been shown that many dominant epitopes do not carry canonical pool sequencing defined motifs. For example, the prototypical dominant human leukocyte antigen (HLA) A*0201 restricted influenza matrix 58–66 epitope (sequence GILGFVFTL) [2, 3] does not contain the main anchor pattern associated with the HLA-A*0201 pool sequencing motif, which specifies the presence of L or M in position 2. Indeed, in a recent study we observed that 57% (8/14) of the HLA-A*0201 restricted vaccinia-derived epitopes identified did not conform with the A*0201 motif derived by pool sequencing analysis .
The most common alternative method for defining motifs is based on establishing quantitative MHC binding assays in vitro, and then testing series of individual peptides. These peptides are either single substitution analogs of high affinity binding epitopes or ligands, or large libraries of unrelated peptides. This method allows detailed probing of the relative role and chemical specificity of each position along the peptide sequence. A concern over this method when relying solely on single substitution analogs relates to the fact that it might reflect a binding mode specific to the particular parent peptide utilized as "wild type", although in practice the specificity patterns identified by single substitution analysis typically correspond well with those identified by other methodologies [5–19]. The same binding assay approach can be used to test large libraries of unrelated peptides (typically 100 or more) of a given size, and all carrying acceptable main anchor residues. As each peptide represents a unique sequence, this approach overcomes the concern associated with the single substitution approach that any pattern identified is dependent on the context of the specific "wild type" ligand.
Affinity data from individual peptides can be analyzed with different computational approaches to derive quantitative motifs that elucidate both primary and secondary influences on binding capacity with great detail (see, e.g., [1, 9, 20–38]). Predictions based on this type of data can give very accurate quantitative approximations of peptide binding, and can discriminate between candidate ligands bearing the same main anchor motifs. The most significant drawback of this approach is that it is dependent upon the availability of panels of several hundreds of allele specific peptides. As a result, this approach can be relatively labor intensive and expensive. Also, the selection of peptide sequences can introduce biases into the training data, for example by over or under representing residues at specific sequence positions.
An alternative approach to characterize the binding specificity of MHC molecules is based on the use of positional scanning combinatorial peptide libraries. Such libraries consists of combinatorial mixtures of large numbers of different peptides all sharing a single residue at a certain position. Measuring the affinity of such a library effectively evaluates the average influence of the shared residue on binding in a diverse set of surrounding sequences. Thus, an estimate of the binding contribution of all 20 residues in a 9-mer peptide can be derived by measuring the affinity of a set of 180 mixtures. This approach has been utilized successfully to determine specificity for several different applications, including analyses of the specificities associated with T cell receptor (TCR) recognition , proteosomal cleavage , and transporter associated with antigen processing (TAP) transport , as well as the identification of T cell epitopes [42, 43]. Their efficacy in characterizing MHC binding specificity was first explored in several studies starting over a decade ago [32, 44–46]. Matrices derived from analysis of combinatorial libraries have been found to perform well in the prediction of peptides with high MHC binding affinity [32, 47, 48]. Buus, in his visionary review of MHC studies, proposed the systematic use of combinatorial libraries a "Human MHC Project", directed at a complete mapping of human immune reactivities [49, 50].
Like the single substitution or peptide library approaches, data generated from positional scanning combinatorial library studies provide quantitative motifs. The unique advantage of using positional scanning combinatorial libraries is that they can be re-used for every allele, representing potentially very significant cost savings. Retesting the same probes for each allele also removes the risk of introducing bias into the set of tested ligands. These advantages have led us to systematically apply the use of combinatorial libraries to a set of 19 class I MHC alleles. In this large scale evaluation, we test if this approach works uniformly across different alleles. We compare its prediction performance to that of bioinformatics machine learning algorithms. We also developed a heuristic approach to convert the combinatorial library affinity data into a classical representation of primary and secondary anchor positions, which makes them directly comparable to those obtained in pool sequencing. Finally, we test the ability of these matrices in practical applications to identify MHC binding peptides and T cell epitopes.
Positional scanning combinatorial libraries and peptide synthesis
The combinatorial library was synthesized as previously described . Each pool in the library contains 9-mer peptides with one fixed residue at a single position. With each of the 20 naturally occurring residues represented at each position along the 9-mer backbone, the entire library consisted of 180 peptide mixtures.
Peptides utilized in screening studies were synthesized as described elsewhere , or purchased as crude material from Mimotopes (Minneapolis, MN/Clayton, Victoria, Australia), Pepscan Systems B.V. (Lelystad, Netherland) or A and A Labs (San Diego, CA). Peptides synthesized for use as radiolabeled ligands were synthesized by A and A Labs and purified to >95% homogeneity by reverse phase HPLC. Purity of these peptides was determined using analytical reverse-phase HPLC and amino acid analysis, sequencing, and/or mass spectrometry. Peptides were radiolabeled with the chloramine T method . Lyophilized peptides were re-suspended at 4–20 mg/ml in 100% DMSO, then diluted to required concentrations in PBS +0.05% (v/v) nonidet P40 (Fluka Biochemika, Buchs, Switzerland).
MHC purification and peptide binding assays
MHC purification and quantitative binding assays based on the inhibition of binding of a high affinity radiolabeled ligand were performed essentially as described elsewhere [18, 52]. HLA A*0201, A*6802, B*0702, B*0801, B*2705, B*3501, B*5101, B*5301, and B*5401 molecules were purified from EBV transformed homozygous B cell lines, as previously described [15, 16, 18, 52–55]. For A*3201, B*1501, B*5801 and B*5802, the WT47, SPACH, AP and 35841 cell lines were utilized, respectively. A*3001 molecules were obtained from the RSH cell line, or kindly provided by Dr. Soren Buus. B*1503 molecules were purchased from Pure Protein L.L.D. (Oklahoma City, OK), or kindly provided by Dr. Soren Buus. All HLA cell lines are from the IHWG cell bank (Fred Hutchinson Cancer Research Center). Mouse class I molecules were purified from P815 (H-2 Dd and Kd), CH27 (H-2 Kk), or EL-4 (H-2 Db) lines, as previously described [10, 52].
For the B*1501, B*1503, A*3201 and A*3001 assays, the artificial sequences AQIDNYNKF (peptide 3128.0001), YQAVVPLVY (peptide 3054.0065), RILHNFAYSL (peptide 1454.42) and KTKDYVNGL (peptide 1428.02) were utilized as the radiolabeled probes, respectively. Radiolabeled ligands for all other assays were as previously described [15, 16, 18, 52–55]. In competition assays, each mixture or individual peptide was tested in 3 or more independent experiments for its capacity to inhibit the binding of the radiolabeled peptide. The concentration of peptide yielding 50% inhibition of the binding of the radiolabeled peptide was calculated. Under the conditions utilized, where [label] < [MHC] and IC50 ≥ [MHC], the measured IC50 values are reasonable approximations of KD.
IC50 nM values for each mixture were standardized as a ratio to the geometric mean IC50 nM value of the entire set of 180 mixtures, and then normalized at each position as previously described [17, 18] so that the value associated with the optimal value at each position corresponds to 1. For each position, an average (geometric) relative binding affinity (ARB) was calculated, and then the ratio of the ARB for the entire library to the ARB for each position was derived. We have denominated this ratio, which describes the factor by which the normalized geometric average binding affinity associated with all 20 residues at a specified position differs from that of the average affinity of the entire library, as the specificity factor (SF). As calculated, positions with the highest specificity will have the highest SF value. Primary anchor positions were then defined as those associated with an SF > 2.4. This criterion identifies positions where the majority of residues are associated with significant decreases in binding capacity. Secondary anchors were identified based on the standard deviation of residue specific values at each position.
To identify predicted binders, all possible 9-mer peptides in vaccinia WR sequences were scored using the matrix values, where the final score for each peptide represents the product of the matrix value for the corresponding residue at each position. Algorithms derived by combining positional scanning combinatorial library and individual peptide data sets were generated using the stabilized matrix method (SMM) approach, as previously described .
Characteristics of the Study Population
Healthy males and females between 25 and 49 years of age were used in this study. Exclusion criteria were body weight of <45.4 kg and/or established pregnancy. Institutional Review Board approval and appropriate consent were obtained.
Peripheral Blood Mononuclear Cell (PBMC) Isolation and HLA Typing
PBMCs were isolated from heparinized blood by gradient centrifugation with a Histopaque-1077 (catalogue no. H8889, Sigma) , and the cells were cryopreserved in liquid nitrogen in 10% DMSO/FBS. Each donor's PBMCs were typed for HLA-A and -B by high-resolution PCR (Atria Genetics, San Francisco, CA).
Ex Vivo Primary ELISPOT Assay
Peptides were synthesized, and divided into groups according to their predicted HLA-A and HLA-B-restriction. PBMCs from individuals with the corresponding haplotype were incubated at 2 × 105 per well in the presence of individual peptides at 10 μg/ml, or a control pool with 24 peptides derived from commonly encountered pathogens (EBV, CMV, and influenza A virus) [58, 59]. The ELISPOT assays were performed as described previously . Responses against DMSO alone were subtracted from the experimental values. To assess statistical significance, a one-tailed Student t test was performed in which the triplicate values of each condition were compared with those of the negative controls. The criteria for positivity in a single experiment was set to ≥ 20 net spot-forming cells (SFCs)/106, a stimulation index (SI) ≥ 2.0, and p ≤ 0.05. Each experiment was performed twice. Epitopes were defined as peptides giving a positive response in 2/2 experiments using PBMC from a single donor.
Evaluation of the positional scanning combinatorial library approach for predicting HLA A*0201 binding peptides
Previous studies in other laboratories have demonstrated that the combinatorial approach performs well in predicting binders to several murine MHC class I molecules [32, 46, 47]. To verify that the same holds for human MHC molecules, we initially used the positional scanning combinatorial library with the best characterized human allele HLA A*0201, for which detailed primary and secondary anchor motifs have been described (see, e.g., [3, 9, 16, 34, 61, 62]). Also, several different predictive methods for this allele are widely available, and have been rigorously tested and compared (see, e.g., [36, 47]).
Performance of several methods for predicting A*0201 binders.
Positional scanning combinatorial library derived matrix describing 9-mer binding to HLA A*0201.
While information of training set size is unavailable for the remaining algorithms, given the general availability of A*0201 binding data, it is reasonable to assume that these algorithms have utilized training sets of similar size. Furthermore, the training set for the combinatorial approach does not overlap with the test set, and is thus completely unbiased, unlike the case for most of the tools utilized in the comparison . Taken together, the data presented in this section have provided further demonstration of the efficacy of using a positional scanning combinatorial library for identification of MHC class I binding peptides.
Generation and validation of positional scanning combinatorial library matrices for additional human and murine class I alleles
Encouraged by the results obtained in the context of A*0201, we derived, and make herein available to the scientific community, combinatorial library matrices for an additional 14 HLA (A*3001, A*3201, A*6802, B*0702, B*0801, B*1501, B*1503, B*2705, B*3501, B*5101, B*5301, B*5401, B*5801 and B*5802) and 4 mouse (H-2 Dd, Kd, Db and Kk) class I molecules. The measured IC50 values are provided in Additional file 1 [see Additional file 1], and will also be submitted to the Immune Epitope Database (IEDB) for hosting at the IEDB Analysis resource . The IC50 values for each mixture were normalized as described above. The resulting matrix values are tabulated in Additional file 2 [see Additional file 2]. For each allele, the matrices identified a reproducible, characteristic, binding pattern.
Identification of primary and secondary anchor positions by the positional scanning combinatorial library approach
To compare the results of combinatorial matrices with those from pool sequencing and single residue substitutions, and in order to meaningfully summarize the rather large amount of data in each scoring matrix, it is desirable to describe MHC binding in terms of simple motifs. The first step in defining such a motif is identifying the peptide positions that have the strongest influence on binding.
As before, A*0201 was first utilized as a model system. A*0201 binds peptides utilizing the peptide residues in position 2 and at the C-terminus as main anchors. At both main anchor positions hydrophobic or aliphatic residues are preferred or tolerated. Additional influences on binding capacity are contributed by residues at secondary positions, most prominently positions 1, 3, and 7, where both positive and deleterious influences can be noted .
Specificity factors derived from positional scanning combinatorial library matrices identify primary anchor positions.
Validation of the positional scanning combinatorial library approach: identifying primary anchor preferences
Comparison of main anchor motifs identified using positional scanning combinatorial libraries with those using other approaches.
Combinatorial library motif
The same criteria were then applied to the set of 18 additional alleles. Again, the patterns identified by the combinatorial libraries largely followed those previously described (Table 4). As was the case with A*0201, the 10-fold criteria applied to the combinatorial library data tended to identify a broader motif than identified by pool sequencing. However, when a more stringent threshold (e.g., 5-fold) is utilized, a narrower motif very similar to that described by pool sequencing is identified.
This analysis revealed several unexpected designations. The identification of position 3 as a main anchor for A*3001 binding, instead of position 2, is in disagreement with the published literature, but was not entirely unexpected based on analyses using single amino acid substitution peptides (Sidney and Sette, unpublished observations). The preference in position 2, identified as a dominant secondary anchor here, appears to be more towards small residues (V, T, and A) rather than aromatics, as indicated by pool sequencing, although these latter residues are still well tolerated. The preference at position 3 was found to be for basic residues. Pool sequencing had suggested a preference for hydrophobic residues at the C-terminus. While the combinatorial library generated motif is not in disagreement with this general specificity, the identification of an A3-supertype like preference for K was unexpected. However, subsequent MHC peptide binding studies by others (Harndahl and Buus et al, IEDB submission 1000945, ) and us (Sidney and Sette, unpublished observations) have confirmed this preference. Positions 1 and 2 were found to be dominant secondary anchors. This observation, in consideration with the discrepancy identified between the pool sequencing and positional scanning combinatorial libraries, suggests that A*3001 may be able to bind peptides using multiple different anchor arrangements.
In other cases, specifically B*1501, B*5802 and B*2705, no clear anchors were defined at either the N- or C-terminal end, where a more diffuse chemical specificity is apparent. A similar failure to identify dominant signals at more than one anchor residue has also been noted for several alleles when using pool sequencing methods (see e.g., ).
A detailed motif for A*3201 has not, to our knowledge, been previously made available. It has been suggested that this allele would be a member of the A1-supertype, and the motif identified herein is congruent with that association. However, peptide binding studies have not yet been able to confirm that this allele shares significant repertoire overlap with other alleles of this supertype.
Application of positional scanning combinatorial libraries: Predicting MHC binding peptides
Taken together, these data further validate the use of combinatorial libraries as a basis for predictive algorithms. Also, the present analysis has provided sets of high affinity binders derived from vaccinia WR for 5 relatively common HLA class I alleles.
Application of positional scanning combinatorial libraries: Predicting T cell epitope candidates
We initially wanted to test the sets of high affinity peptides identified from vaccinia virus in DryVax immunized donors, similar to a previous investigation with donors carrying HLA alleles from common supertypes . However, at the time of this study, we were unable to enroll a large enough number of newly vaccinated donors with the desired matching HLA alleles. We instead decided to validate the ability of combinatorial libraries to aid in the identification of T cell epitopes from influenza recognized in human donors for which we could enroll multiple donors with the HLA alleles A*3001, A*3201, and B*1501.
To make optimal use of both the combinatorial library data and the individual peptide binding data for these alleles, we utilized the SMM (stabilized matrix method) approach , which can combine these data to compute second generation matrices. These second generation matrices have been found to perform better than predictions based on either approach alone. All 9-mer peptides present in a representative set of Influenza A H1N1 and H3N2 strains were scored using the second generation matrices for A*3001, A*3201, and B*1501 as shown in Additional file 4 [see Additional file 4], and for each allele the top 100 scoring peptides were synthesized.
The predicted high-affinity-binding peptides were tested for their ability to elicit T cell responses from human donors with matching HLA. PBMCs from donors were isolated from leukopherisis or general blood donation volunteers, and HLA-typed by high-resolution PCR. In total, 13 healthy donors of 25–49 years of age were included in the study, including 3 donors for A*3001 and A*3201, and 8 for B*1501. Cryopreserved PBMCs were assayed with individual peptides from the set(s) corresponding to the donor's haplotype, and the reactivity was determined using IFNγ ELISPOT assays. Positive epitopes were defined as described in the Methods.
Because peptide binding to MHC is a requirement to elicit a T cell response, algorithm-based approaches predicting peptide binding are often utilized as a first screen to identify epitopes derived from large pathogens. In the present study, we have utilized 9-mer positional scanning combinatorial libraries to characterize the peptide binding specificities of several mouse and human class I alleles. When the corresponding positional scanning combinatorial library data were utilized to generate matrices for the prediction of binders derived from vaccinia, it was found that in all cases examined between 58 and 78% of the top 0.5% scoring peptides were high affinity binders, depending on the specific allele considered. The biological relevance of quantitative motifs derived from combinatorial library analyses were validated by identifying several epitopes derived from influenza A virus that were recognized by PBMCs from human donors. This study therefore provides a set of 19 uniformly generated matrices that can be directly applied to predict MHC peptide binding and T cell epitope candidates.
An implicit feature of the approach is that it provides a detailed quantitative motif for each MHC specificity examined. However, it is often useful to summarize MHC binding specificity in the more simple terms of primary anchor motifs. This "minimalist" approach dates back to the earliest studies of MHC binding, where specificities were defined using pool sequencing or single amino acid substitution analyses. These methods were very good at characterizing the most prominent features of allele specific motifs, and the resulting motifs have generally formed the syntax with which MHC binding is described. To extend the utility of the combinatorial approach, we have developed a heuristic approach to translate the matrix data generated by the combinatorial libraries into the more simple motifs that are the idiom of MHC studies. In the majority of cases, generalizable parameters could be defined that allowed the identification of main and secondary anchor positions congruent with those defined by other approaches.
The majority of HLA class I molecules whose binding specificity have been described by crystal structure, pool sequencing or peptide binding studies, the main anchor interactions of the peptide almost invariably involve the residues at position 2 and the C-terminus of the peptide. This pattern also appears to be true for most macaque and chimpanzee class I alleles studied to date. As evidenced by the cases of A*3001 and B*0801, the combinatorial library analysis suggests that the paradigm of position 2/C-terminus anchor spacing for MHC peptide binding is not always true. This has been reported previously in the case of B*0801 , where positions 3 and 5, in addition to the C-terminus, have been identified as primary anchors. Although this exact pattern was not duplicated by the combinatorial analysis, the present data do confirm the importance of positively charged residues in the middle of the peptide for conferring high affinity binding capacity. The ability to pick up unexpected binding patterns of MHC alleles is one of the key advantages of the combinatorial libraries, which have no prior expectations on which positions are likely to be important for MHC:peptide interactions.
In our previous HLA supertype classification study , B*0801 was considered an outlier on the basis of it's somewhat unique, for HLA, use of positions 3 and 5 as main anchor positions. This designation was also made by Lund  and Hertz . Others [74, 75] have classified it with alleles we  and others [72, 73] have assigned as members of the B7-supertype. In the present study, the combinatorial library analysis suggested that positions 5 and 6 are important (in addition to the C-terminus) for peptide binding. As such, the present analysis does not provide sufficient evidence to suggest that B*0801 should be assigned to a specific supertype, which are largely defined by position 2 (and C-terminus) specificity. We can note that proline, the B7-supertype associated position 2 specificity, does appear to be well tolerated in position 2 by B*0801. Also, our own unpublished binding data suggests that there may be some cross-reactivity between B*0702 and B*0801. However, this potential cross-reactivity has not been examined in enough detail at this point to draw any conclusions.
Previously, B*3501, B*5101, B*5301, and B*5401 were assigned to the B7-supertype [71–73], which describes a set of HLA alleles sharing a preference for proline as the position 2 main anchor. In the present study, the combinatorial library analysis confirmed this preference, but, surprisingly, also indicated that alanine was well tolerated by these alleles in position 2. This would suggest that overlap may exist in the repertoires of at least some B7-supertype alleles with alleles outside the B7-supertype, and in particular those associated with the B58- and or B62-supertypes. While the binding data we have to date suggest that the majority of instances of repertoire overlap for B7-supertype alleles will fall within the B7-supertype, and that proline is the most dominant preference in position 2, evidence for some cross-reactivity is also quite apparent. Indeed, a recent study  has found a high degree of cross-recognition of epitopes between alleles associated with different supertypes. Future studies will hopefully shed additional light on this issue.
In utilizing combinatorial libraries to characterize MHC specificity and identify binders, the approach we have implemented is computationally simple. We have largely utilized relative binding values for each residue/position coordinate. To predict binders, we have assumed the independent binding of peptide side chains, and represented the predicted binding propensity as a product of each coordinate. There are other ways to process the raw data for the purpose of generating prediction matrices, or to define anchor positions. To facilitate further investigation of prediction approaches by the bioinformatic community, we have here provided both the raw and processed data for over a dozen different HLA, and 4 H-2, class I alleles. We believe that this data will be of value to the community for the prediction of binders and epitopes, at least for several alleles not previously characterized in detail.
We compared the prediction performance of the combinatorial library with a set of 16 bioinformatic approaches for the best characterized human MHC allele, HLA A*0201. While several algorithms outperformed the combinatorial library, this has to be taken into perspective, as these algorithms are based on up to ten times more training data. Even more surprising, the combinatorial library nevertheless proved highly competitive, with a better prediction quality than 10 out of 16 algorithms. Taken together, the combinatorial libraries minimally provide a very solid baseline characterization of MHC binding specificity, which can be generated both quickly and with cost effectiveness.
Using combinatorial library based matrices to identify sets of candidate peptides, epitopes were successfully identified in patients typed as A*3001, A*3201 and B*1501. Notably, however, no peptide was recognized in more than one donor. This diversity of responses is similar to what was noted previously for mapping T cell responses to vaccinia-derived peptides in human donors. Because of the small number of A*3001 and A*3201 donors tested, it is possible we have under estimated the number of positive responses. Other factors may also be responsible for the lower response rates observed. The donor pool utilized represents an outbred population, and almost all of the donors were heterozygous at both the A and B loci. Thus, the diverse donor responses may reflect the different influences of other MHC alleles in shaping the overall T cell repertoire. Similarly, that dominant epitopes recognized in multiple donors were not identified may be due to the fact that the set of donors is representative of diverse histories of exposure to different viral strains.
The epitope identification aspect of the study was not pursued to the level and detail of our previous studies (e.g., ). Several factors are responsible for this, including the fact that while the alleles studied are not rare, neither are they prevalent, making it resource intensive to identify a sufficient number of additional donors. As a result, the identified peptides represent potential leads of a preliminary nature. At the same time, the data does help demonstrate that the matrices derived in the study are useful for epitope identification, even if the epitope identification study was not ideal. Furthermore, to the best of our knowledge, each of the epitopes identified in the present study represent novel epitopes, and together are the first set of influenza virus derived epitopes based on predictions for A*3001, A*3201 and B*1501.
The present study has extended observations from previous studies [32, 44, 46, 48, 49] showing the usefulness of positional scanning combinatorial libraries for identifying MHC class I binding peptides. Herein we have made available combinatorial library based matrices for 19 class I alleles of human and mouse origin, including several that have not previously been characterized in detail. These libraries have also been shown to be useful for identifying specific primary and secondary anchor positions, and thereby simpler motifs, analogous to those described by other approaches. For A*3001, A*3201, B*0801, B*1501 and B*1503, sets of vaccinia WR derived peptides that bind with high affinity have been identified. These peptides represent candidates for future studies towards the identification of epitopes derived from vaccinia, a virus of high interest for the development of viral vector based vaccines, in addition to its well-known use as a vaccine against smallpox. Finally, we have also identified several epitopes derived from influenza that are recognized in HLA A*3001, A*3201 and B*1501 donors.
The experiments described herein comply with the current laws of the United States of America. This work is supported by NIH NIAID contracts N01-AI-40023 (AS), N01-AI-40024 (AS), HHSN266200400006C (AS), and HHSN266200400080C/N01 AI40080 (CP). EA was supported by the Wenner-Gren Foundations.
- Rammensee H, Bachmann J, Emmerich NP, Bachor OA, Stevanovic S: SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics 1999, 50:213–9.View ArticlePubMedGoogle Scholar
- Gotch F, Rothbard J, Howland K, Townsend A, McMichael A: Cytotoxic T lymphocytes recognize a fragment of influenza virus matrix protein in association with HLA-A2. Nature 1987, 326:881–2.View ArticlePubMedGoogle Scholar
- Falk K, Rotzschke O, Stevanovic S, Jung G, Rammensee HG: Allele-specific motifs revealed by sequencing of self-peptides eluted from MHC molecules. Nature 1991, 351:290–6.View ArticlePubMedGoogle Scholar
- Pasquetto V, Bui HH, Giannino R, Mirza F, Sidney J, Oseroff C, Tscharke DC, Irvine K, Bennink JR, Peters B, et al.: HLA-A* HLA-A*1101, and HLA-B*0702 Transgenic Mice Recognize Numerous Poxvirus Determinants from a Wide Variety of Viral Gene Products. J Immunol 0201, 175:5504–15.Google Scholar
- Allen TM, Sidney J, del Guercio MF, Glickman RL, Lensmeyer GL, Wiebe DA, DeMars R, Pauza CD, Johnson RP, Sette A, et al.: Characterization of the peptide binding motif of a rhesus MHC class I molecule (Mamu-A*01) that binds an immunodominant CTL epitope from simian immunodeficiency virus. J Immunol 1998, 160:6062–71.PubMedGoogle Scholar
- Loffredo JT, Sidney J, Piaskowski S, Szymanski A, Furlott J, Rudersdorf R, Reed J, Peters B, Hickman-Miller HD, Bardet W, et al.: The High Frequency Indian Rhesus Macaque MHC Class I Molecule, Mamu-B*01, Does Not Appear to Be Involved in CD8+ T Lymphocyte Responses to SIVmac239. J Immunol 2005, 175:5986–97.PubMedGoogle Scholar
- Loffredo JT, Sidney J, Wojewoda C, Dodds E, Reynolds MR, Napoe G, Mothe BR, O'Connor DH, Wilson NA, Watkins DI, et al.: Identification of seventeen new simian immunodeficiency virus-derived CD8+ T cell epitopes restricted by the high frequency molecule, Mamu-A*02, and potential escape from CTL recognition. J Immunol 2004, 173:5064–76.PubMedGoogle Scholar
- Mothe BR, Sidney J, Dzuris JL, Liebl ME, Fuenger S, Watkins DI, Sette A: Characterization of the peptide-binding specificity of Mamu-B*17 and identification of Mamu-B*17-restricted epitopes derived from simian immunodeficiency virus proteins. J Immunol 2002, 169:210–9.PubMedGoogle Scholar
- Ruppert J, Sidney J, Celis E, Kubo RT, Grey HM, Sette A: Prominent role of secondary anchor residues in peptide binding to HLA-A2.1 molecules. Cell 1993, 74:929–37.View ArticlePubMedGoogle Scholar
- Sette A, Sidney J, Bui HH, Del Guercio MF, Alexander J, Loffredo J, Watkins DI, Mothe BR: Characterization of the peptide-binding specificity of Mamu-A*11 results in the identification of SIV-derived epitopes and interspecies cross-reactivity. Immunogenetics 2005, 57:53–68.View ArticlePubMedGoogle Scholar
- Sidney J, del Guercio MF, Southwood S, Hermanson G, Maewal A, Appella E, Sette A: The HLA-A*0207 peptide binding repertoire is limited to a subset of the A*0201 repertoire. Hum Immunol 1997, 58:12–20.View ArticlePubMedGoogle Scholar
- Sidney J, Del Guercio MF, Southwood S, Sette A: The HLA Molecules DQA1*0501/B1*0201 and DQA1*0301/B1*0302 Share an Extensive Overlap in Peptide Binding Specificity. J Immunol 2002, 169:5098–108.PubMedGoogle Scholar
- Sidney J, Dzuris JL, Newman MJ, Johnson RP, Kaur A, Amitinder K, Walker CM, Appella E, Mothe B, Watkins DI, et al.: Definition of the Mamu A*01 peptide binding specificity: application to the identification of wild-type and optimized ligands from simian immunodeficiency virus regulatory proteins. J Immunol 2000, 165:6387–99.PubMedGoogle Scholar
- Sidney J, Grey HM, Southwood S, Celis E, Wentworth PA, del Guercio MF, Kubo RT, Chesnut RW, Sette A: Definition of an HLA-A3-like supermotif demonstrates the overlapping peptide-binding repertoires of common HLA molecules. Hum Immunol 1996, 45:79–93.View ArticlePubMedGoogle Scholar
- Sidney J, Southwood S, del Guercio MF, Grey HM, Chesnut RW, Kubo RT, Sette A: Specificity and degeneracy in peptide binding to HLA-B7-like class I molecules. J Immunol 1996, 157:3480–90.PubMedGoogle Scholar
- Sidney J, Southwood S, Mann DL, Fernandez-Vina MA, Newman MJ, Sette A: Majority of peptides binding HLA-A*0201 with high affinity crossreact with other A2-supertype molecules. Hum Immunol 2001, 62:1200–16.View ArticlePubMedGoogle Scholar
- Sidney J, Southwood S, Pasquetto V, Sette A: Simultaneous prediction of binding capacity for multiple molecules of the HLA B44-supertype. J Immunol 2003, 171:5964–5974.PubMedGoogle Scholar
- Sidney J, Southwood S, Sette A: Classification of A1- and A24-supertype molecules by analysis of their MHC-peptide binding repertoires. Immunogenetics 2005, 57:393–408.View ArticlePubMedGoogle Scholar
- Kubo RT, Sette A, Grey HM, Appella E, Sakaguchi K, Zhu NZ, Arnott D, Sherman N, Shabanowitz J, Michel H, et al.: Definition of specific peptide motifs for four major HLA-A alleles. J Immunol 1994, 152:3913–24.PubMedGoogle Scholar
- Donnes P, Elofsson A: Prediction of MHC class I binding peptides, using SVMHC. BMC Bioinformatics 2002, 3:25.View ArticlePubMedGoogle Scholar
- Reche PA, Glutting JP, Zhang H, Reinherz EL: Enhancement to the RANKPEP resource for the prediction of peptide binding to MHC molecules using profiles. Immunogenetics 2004, 56:405–19.View ArticlePubMedGoogle Scholar
- Schueler-Furman O, Altuvia Y, Sette A, Margalit H: Structure-based prediction of binding peptides to MHC class I molecules: application to a broad range of MHC alleles. Protein Sci 2000, 9:1838–46.View ArticlePubMedGoogle Scholar
- Zhang GL, Srinivasan KN, Veeramani A, August JT, Brusic V: PREDBALB/c: a system for the prediction of peptide binding to H2d molecules, a haplotype of the BALB/c mouse. Nucleic Acids Res 2005, 33:W180–3.View ArticlePubMedGoogle Scholar
- Hertz T, Yanover C: PepDist: a new framework for protein-peptide binding prediction based on learning peptide distance functions. BMC Bioinformatics 2006,7(Suppl 1):S3.View ArticlePubMedGoogle Scholar
- Buus S, Lauemoller SL, Worning P, Kesmir C, Frimurer T, Corbet S, Fomsgaard A, Hilden J, Holm A, Brunak S: Sensitive quantitative predictions of peptide-MHC binding by a 'Query by Committee' artificial neural network approach. Tissue Antigens 2003, 62:378–84.View ArticlePubMedGoogle Scholar
- Nielsen M, Lundegaard C, Worning P, Hvid CS, Lamberth K, Buus S, Brunak S, Lund O: Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach. Bioinformatics 2004, 20:1388–97.View ArticlePubMedGoogle Scholar
- Nielsen M, Lundegaard C, Worning P, Lauemoller SL, Lamberth K, Buus S, Brunak S, Lund O: Reliable prediction of T-cell epitopes using neural networks with novel sequence representations. Protein Sci 2003, 12:1007–17.View ArticlePubMedGoogle Scholar
- Zhang GL, Khan AM, Srinivasan KN, August JT, Brusic V: MULTIPRED: a computational system for prediction of promiscuous HLA binding peptides. Nucleic Acids Res 2005, 33:W172–9.View ArticlePubMedGoogle Scholar
- Guan P, Doytchinova IA, Zygouri C, Flower DR: MHCPred: bringing a quantitative dimension to the online prediction of MHC binding. Appl Bioinformatics 2003, 2:63–6.PubMedGoogle Scholar
- Tenzer S, Peters B, Bulik S, Schoor O, Lemmel C, Schatz MM, Kloetzel PM, Rammensee HG, Schild H, Holzhutter HG: Modeling the MHC class I pathway by combining predictions of proteasomal cleavage, TAP transport and MHC class I binding. Cell Mol Life Sci 2005, 62:1025–37.View ArticlePubMedGoogle Scholar
- Hakenberg J, Nussbaum AK, Schild H, Rammensee HG, Kuttler C, Holzhutter HG, Kloetzel PM, Kaufmann SH, Mollenkopf HJ: MAPPP: MHC class I antigenic peptide processing prediction. Appl Bioinformatics 2003, 2:155–8.PubMedGoogle Scholar
- Udaka K, Wiesmuller KH, Kienle S, Jung G, Tamamura H, Yamagishi H, Okumura K, Walden P, Suto T, Kawasaki T: An automated prediction of MHC class I-binding peptides based on positional scanning with peptide libraries. Immunogenetics 2000, 51:816–28.View ArticlePubMedGoogle Scholar
- Sathiamurthy M, Hickman HD, Cavett JW, Zahoor A, Prilliman K, Metcalf S, Fernandez Vina M, Hildebrand WH: Population of the HLA ligand database. Tissue Antigens 2003, 61:12–9.View ArticlePubMedGoogle Scholar
- Parker KC, Bednarek MA, Coligan JE: Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains. J Immunol 1994, 152:163–75.PubMedGoogle Scholar
- Bui HH, Sidney J, Peters B, Sathiamurthy M, Sinichi A, Purton KA, Mothe BR, Chisari FV, Watkins DI, Sette A: Automated generation and evaluation of specific MHC binding predictive tools: ARB matrix applications. Immunogenetics 2005, 57:304–14.View ArticlePubMedGoogle Scholar
- Gulukota K, Sidney J, Sette A, DeLisi C: Two complementary methods for predicting peptides binding major histocompatibility complex molecules. J Mol Biol 1997, 267:1258–67.View ArticlePubMedGoogle Scholar
- Peters B, Bui HH, Sidney J, Weng Z, Loffredo JT, Watkins DI, Mothe BR, Sette A: A computational resource for the prediction of peptide binding to Indian rhesus macaque MHC class I molecules. Vaccine 2005, 23:5212–5224.View ArticlePubMedGoogle Scholar
- Peters B, Tong W, Sidney J, Sette A, Weng Z: Examining the independent binding assumption for binding of peptide epitopes to MHC-I molecules. Bioinformatics 2003, 19:1765–1772.View ArticlePubMedGoogle Scholar
- Pinilla C, Martin R, Gran B, Appel JR, Boggiano C, Wilson DB, Houghten RA: Exploring immunological specificity using synthetic peptide combinatorial libraries. Curr Opin Immunol 1999, 11:193–202.View ArticlePubMedGoogle Scholar
- Nazif T, Bogyo M: Global analysis of proteasomal substrate specificity using positional-scanning libraries of covalent inhibitors. Proc Natl Acad Sci USA 2001, 98:2967–72.View ArticlePubMedGoogle Scholar
- Uebel S, Kraas W, Kienle S, Wiesmuller KH, Jung G, Tampe R: Recognition principle of the TAP transporter disclosed by combinatorial peptide libraries. Proc Natl Acad Sci USA 1997, 94:8976–81.View ArticlePubMedGoogle Scholar
- Nino-Vasquez JJ, Allicotti G, Borras E, Wilson DB, Valmori D, Simon R, Martin R, Pinilla C: A powerful combination: the use of positional scanning libraries and biometrical analysis to identify cross-reactive T cell epitopes. Mol Immunol 2004, 40:1063–74.View ArticlePubMedGoogle Scholar
- Zhao Y, Gran B, Pinilla C, Markovic-Plese S, Hemmer B, Tzou A, Whitney LW, Biddison WE, Martin R, Simon R: Combinatorial peptide libraries and biometric score matrices permit the quantitative analysis of specific and degenerate interactions between clonotypic TCR and MHC peptide ligands. J Immunol 2001, 167:2130–41.PubMedGoogle Scholar
- Stryhn A, Pedersen LO, Romme T, Holm CB, Holm A, Buus S: Peptide binding specificity of major histocompatibility complex class I resolved into an array of apparently independent subspecificities: quantitation by peptide libraries and improved prediction of binding. Eur J Immunol 1996, 26:1911–8.View ArticlePubMedGoogle Scholar
- Udaka K, Wiesmuller KH, Kienle S, Jung G, Walden P: Decrypting the structure of major histocompatibility complex class I-restricted cytotoxic T lymphocyte epitopes with complex peptide libraries. J Exp Med 1995, 181:2097–108.View ArticlePubMedGoogle Scholar
- Lauemoller SL, Holm A, Hilden J, Brunak S, Holst Nissen M, Stryhn A, Ostergaard Pedersen L, Buus S: Quantitative predictions of peptide binding to MHC class I molecules using specificity matrices and anchor-stratified calibrations. Tissue Antigens 2001, 57:405–14.View ArticlePubMedGoogle Scholar
- Peters B, Bui HH, Frankild S, Nielson M, Lundegaard C, Kostem E, Basch D, Lamberth K, Harndahl M, Fleri W, et al.: A community resource benchmarking predictions of peptide binding to MHC-I molecules. PLoS Comput Biol 2006, 2:e65.View ArticlePubMedGoogle Scholar
- Sidney J, Peters B, Moore C, Pencille TJ, Ngo S, Masterman KA, Asabe S, Pinilla C, Chisari FV, Sette A: Characterization of the peptide-binding specificity of the chimpanzee class I alleles A*0301 and A*0401 using a combinatorial peptide library. Immunogenetics 2007.Google Scholar
- Lauemoller SL, Kesmir C, Corbet SL, Fomsgaard A, Holm A, Claesson MH, Brunak S, Buus S: Identifying cytotoxic T cell epitopes from genomic and proteomic information: "The human MHC project.". Rev Immunogenet 2000, 2:477–91.PubMedGoogle Scholar
- Buus S: Description and prediction of peptide-MHC binding: the 'human MHC project'. Curr Opin Immunol 1999, 11:209–13.View ArticlePubMedGoogle Scholar
- Pinilla C, Appel JR, Blanc P, Houghten RA: Rapid identification of high affinity peptide ligands using positional scanning synthetic peptide combinatorial libraries. Biotechniques 1992, 13:901–5.PubMedGoogle Scholar
- Sidney J, Southwood S, Oseroff C, Del Guercio MF, Sette A, Grey H: Measurement of MHC/Peptide Interactions by Gel Filtration. Current Protocols in Immunology John Wiley & Sons, Inc 1998, 18.3.1–18.3.19.Google Scholar
- Sette A, Sidney J, Livingston B, Dzuris J, Crimi C, Walker CM, Southwood S, Collins EJ, Hughes A: Class I molecules with similar peptide binding specificities are the result of both common ancestry and convergent evolution. Immunogenetics 2003, 54:830–841.PubMedGoogle Scholar
- van der Most RG, Concepcion RJ, Oseroff C, Alexander J, Southwood S, Sidney J, Chesnut RW, Ahmed R, Sette A: Uncovering subdominant cytotoxic T-lymphocyte responses in lymphocytic choriomeningitis virus-infected BALB/c mice. J Virol 1997, 71:5110–4.PubMedGoogle Scholar
- van der Most RG, Sette A, Oseroff C, Alexander J, Murali-Krishna K, Lau LL, Southwood S, Sidney J, Chesnut RW, Matloubian M, et al.: Analysis of cytotoxic T cell responses to dominant and subdominant epitopes during acute and chronic lymphocytic choriomeningitis virus infection. J Immunol 1996, 157:5543–54.PubMedGoogle Scholar
- Peters B, Sette A: Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method. BMC Bioinformatics 2005, 6:132.View ArticlePubMedGoogle Scholar
- Kanof ME, Smith PD, Zola H: Current Protocols in Immunology Wiley, San Diego 1997, 2:7.1.1–7.1.5.Google Scholar
- Rothbard JB, Lechler RI, Howland K, Bal V, Eckels DD, Sekaly R, Long EO, Taylor WR, Lamb JR: Structural model of HLA-DR1 restricted T cell antigen recognition. Cell 1988, 52:515–23.View ArticlePubMedGoogle Scholar
- Currier JR, Kuta EG, Turk E, Earhart LB, Loomis-Price L, Janetzki S, Ferrari G, Birx DL, Cox JH: A panel of MHC class I restricted viral peptides for use as a quality control for vaccine trial ELISPOT assays. J Immunol Methods 2002, 260:157–72.View ArticlePubMedGoogle Scholar
- Tangri S, Ishioka GY, Huang X, Sidney J, Southwood S, Fikes J, Sette A: Structural features of peptide analogs of human histocompatibility leukocyte antigen class I epitopes that are more potent and immunogenic than wild-type peptide. J Exp Med 2001, 194:833–46.View ArticlePubMedGoogle Scholar
- Kast WM, Brandt RM, Sidney J, Drijfhout JW, Kubo RT, Grey HM, Melief CJ, Sette A: Role of HLA-A motifs in identification of potential CTL epitopes in human papillomavirus type 16 E6 and E7 proteins. J Immunol 1994, 152:3904–12.PubMedGoogle Scholar
- Parker KC, Bednarek MA, Hull LK, Utz U, Cunningham B, Zweerink HJ, Biddison WE, Coligan JE: Sequence motifs important for peptide binding to the human MHC class I molecule, HLA-A2. J Immunol 1992, 149:3580–7.PubMedGoogle Scholar
- The Immune Epitope Database and Analysis Resource [http://www.immuneepitope.org]
- Peters B, Sette A: Integrating epitope data into the emerging web of biomedical knowledge resources. Nat Rev Immunol 2007, 7:485–90.View ArticlePubMedGoogle Scholar
- Peters B, Sidney J, Bourne P, Bui HH, Buus S, Doh G, Fleri W, Kronenberg M, Kubo R, Lund O, et al.: The immune epitope database and analysis resource: from vision to blueprint. PLoS Biol 2005, 3:e91.View ArticlePubMedGoogle Scholar
- The Immune Epitope Database MHC Class I Binding Prediction Resource [http://mhcbindingpredictions.immuneepitope.org/]
- Kondo A, Sidney J, Southwood S, del Guercio MF, Appella E, Sakamoto H, Celis E, Grey HM, Chesnut RW, Kubo RT, et al.: Prominent roles of secondary anchor residues in peptide binding to HLA-A24 human class I molecules. J Immunol 1995, 155:4307–12.PubMedGoogle Scholar
- Barber LD, Gillece-Castro B, Percival L, Li X, Clayberger C, Parham P: Overlap in the repertoires of peptides bound in vivo by a group of related class I HLA-B allotypes. Curr Biol 1995, 5:179–90.View ArticlePubMedGoogle Scholar
- Oseroff C, Kos F, Bui HH, Peters B, Pasquetto V, Glenn J, Palmore T, Sidney J, Tscharke DC, Bennink JR, et al.: HLA class I-restricted responses to vaccinia recognize a broad array of proteins mainly involved in virulence and viral gene regulation. Proc Natl Acad Sci USA 2005, 102:13980–5.View ArticlePubMedGoogle Scholar
- DiBrino M, Parker KC, Shiloach J, Turner RV, Tsuchida T, Garfield M, Biddison WE, Coligan JE: Endogenous peptides with distinct amino acid anchor residue motifs bind to HLA-A1 and HLA-B8. J Immunol 1994, 152:620–31.PubMedGoogle Scholar
- Sette A, Sidney J: Nine major HLA class I supertypes account for the vast preponderance of HLA-A and -B polymorphism. Immunogenetics 1999, 50:201–12.View ArticlePubMedGoogle Scholar
- Lund O, Nielsen M, Kesmir C, Petersen AG, Lundegaard C, Worning P, Sylvester-Hvid C, Lamberth K, Roder G, Justesen S, et al.: Definition of supertypes for HLA molecules using clustering of specificity matrices. Immunogenetics 2004, 55:797–810.View ArticlePubMedGoogle Scholar
- Hertz T, Yanover C: Identifying HLA supertypes by learning distance functions. Bioinformatics 2007, 23:e148–55.View ArticlePubMedGoogle Scholar
- Tong JC, Tan TW, Ranganathan S: In silico grouping of peptide/HLA class I complexes using structural interaction characteristics. Bioinformatics 2007, 23:177–83.View ArticlePubMedGoogle Scholar
- Doytchinova IA, Guan P, Flower DR: Identifiying human MHC supertypes using bioinformatic methods. J Immunol 2004, 172:4314–23.PubMedGoogle Scholar
- Frahm N, Yusim K, Suscovich TJ, Adams S, Sidney J, Hraber P, Hewitt HS, Linde CH, Kavanagh DG, Woodberry T, et al.: Extensive HLA class I allele promiscuity among viral CTL epitopes. Eur J Immunol 2007, 37:2419–33.View ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.