DeepMHCI: an anchor position-aware deep interaction model for accurate MHC-I peptide binding affinity prediction

Abstract

Motivation: Computationally predicting major histocompatibility complex class I (MHC-I) peptide binding affinity is an important problem in immunological bioinformatics and is crucial for identifying neoantigens for personalized therapeutic cancer vaccines. Recent cutting-edge deep learning-based methods for this problem cannot achieve satisfactory performance, especially for non-9-mer peptides. This is because such methods generate the input by simply concatenating the two given sequences, a peptide and (the pseudo sequence of) an MHC class I molecule, which cannot precisely capture the anchor positions of the MHC binding motif for peptides of variable lengths. We therefore developed an anchor position-aware and high-performance deep model, DeepMHCI, with a position-wise gated layer and a residual binding interaction convolution layer. This allows the model to control the information flow in peptides so as to be aware of anchor positions, and to model the interactions between peptides and the MHC pseudo (binding) sequence directly with multiple convolutional kernels.

Results: The performance of DeepMHCI has been thoroughly validated by extensive experiments on four benchmark datasets under various settings, including 5-fold cross-validation, validation on an independent test set, external HPV vaccine identification, and external CD8+ epitope identification. Experimental results, together with visualizations of binding motifs, demonstrate that DeepMHCI outperformed all competing methods, especially on binding prediction for non-9-mer peptides.

Availability and implementation: DeepMHCI is publicly available at https://github.com/ZhuLab-Fudan/DeepMHCI.
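The position-wise gated layer described above can be sketched roughly as follows. This is a minimal NumPy illustration under our own assumptions (per-position scalar gates computed from a single weight vector `W` and bias `b`); the shapes and gating form are illustrative, not the paper's exact implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def position_wise_gate(peptide_emb, W, b):
    """Scale each residue embedding by a scalar gate in (0, 1),
    so informative (anchor) positions can pass more signal downstream.
    peptide_emb: (L, d) residue embeddings; W: (d,) gate weights; b: bias.
    The scalar-gate form and shapes are illustrative assumptions."""
    gates = sigmoid(peptide_emb @ W + b)   # (L,) one gate per position
    return peptide_emb * gates[:, None]    # (L, d) gated embeddings
```

In a trained model, `W` and `b` would be learned jointly with the rest of the network, letting the gates suppress uninformative positions and pass anchor positions through.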

2 Experimental Results

Detailed Results of Five-Fold Cross-Validation over BD2017
Tables S3, S4, S5, S6 and S7 show the detailed five-fold cross-validation performance of DeepMHCI and competing methods for all MHC-I molecules under 8-mer, 9-mer, 10-mer, 11-mer and ≥12-mer peptides, respectively. DeepMHCI outperformed all competing methods in both AUC and PCC at all lengths.

Detailed Results over Independent Test Set ID2022
Table S8 shows the detailed results of DeepMHCI and competing methods over each benchmark of the independent test set ID2022. DeepMHCI achieved the highest average SRCC (0.569) and AUC (0.846). We also compared with DBTpred using a 5-model ensemble over ID2022. Table S2 reports the average AUC, SRCC and Ov(AUC, SRCC) of DBTpred and DeepMHCI on ID2022. DeepMHCI achieved a better average SRCC of 0.563, which was 23.5% higher than that of DBTpred (0.456). Detailed results are shown in Table S9.

Detailed Results of Epitope Classification over EP2017
Tables S14, S15, S16, S17, S18, S19 and S20 show the detailed results of DeepMHCI and competing methods over EP2017 for all MHC-I molecules under all lengths, 8-mer, 9-mer, 10-mer, 11-mer, 12-mer and 13-mer, respectively. DeepMHCI performed much better on the classification of longer epitopes.

Detailed motifs
Figure S2 shows the motifs of various lengths for the molecules on which DeepMHCI and competing methods have significantly different performance over HPV2019. Since the majority of MHC-I molecules are biased towards binding 9-mer peptides, the 9-mer binding motif can be used as a reference. As can be seen in Figure S2, all four methods found the common anchor positions of 9-mer binding peptides for different MHC-I molecules. However, there are significant differences among these methods for non-9-mer motifs. We summarize two main phenomena that may account for the low performance of DeepAttentionPan, TransPHLA and NetMHCpan-3.0 on these two molecules. 1) Incorrect anchor position discovery. In the motif of HLA-A11:01 under 8-mer, both NetMHCpan-3.0 and DeepMHCI discovered consistent anchor positions at sites 1, 2, 6 and 8, with consistent amino acid types. However, the anchor positions identified by TransPHLA and DeepAttentionPan were sites 1, 2, 3 and 8, which show somewhat different amino acid preferences. These differences can explain the experimental results in HPV2019 under 8-mer: both NetMHCpan-3.0 and DeepMHCI achieved extremely high performance (AUC 1.0 for both), much higher than TransPHLA (AUC 0.870) and DeepAttentionPan (AUC 0.704).
2) Shifted motif. In the motif of HLA-A24:02 under 11-mer, the amino acid preferences of NetMHCpan-3.0 at positions 3 and 4 are highly consistent with those at the second site, which is an anchor position of the 9-mer motif. The shifted motif arises from NetMHCpan-3.0's collate function for non-9-mer peptides, which generates pseudo 9-mer peptides by deleting two amino acids from 11-mer peptides.
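The deletion-based length normalisation described above can be illustrated with a short sketch. This simplified version enumerates every possible deletion site, whereas NetMHCpan-3.0's actual collate scheme restricts where residues may be removed and scores the resulting candidates:

```python
from itertools import combinations

def pseudo_9mers(peptide):
    """Enumerate candidate 9-mers obtained by deleting (len - 9) residues.
    Simplified illustration: all deletion positions are allowed, unlike
    NetMHCpan-3.0's actual collate scheme, which restricts them."""
    n_del = len(peptide) - 9
    if n_del <= 0:
        return [peptide]
    candidates = set()
    for removed in combinations(range(len(peptide)), n_del):
        kept = (aa for i, aa in enumerate(peptide) if i not in removed)
        candidates.add("".join(kept))
    return sorted(candidates)
```

Because the deleted residues can fall anywhere, the amino acid that occupied, say, position 2 of the original 11-mer may land at position 3 or 4 of a pseudo 9-mer, which is one way such a procedure can produce the shifted motifs observed above.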
We further selected three MHC-I molecules, HLA-A01:01, HLA-A29:02 and H-2-Db, for illustration, based on DeepMHCI's good performance under non-9-mer in the 5-fold cross-validation. Figure S3 presents the motifs of different methods on HLA-A01:01 under the 10-mer condition. It is evident that NetMHCpan-3.0 replicated the phenomenon of anchor site duplication. Specifically, the amino acid preference for D at position 4 duplicates the preference at position 3 (the actual anchor position), which should not occur. This phenomenon might explain why NetMHCpan-3.0 achieved the lowest performance under 10-mer on HLA-A01:01 (AUC 0.910 and PCC 0.762), compared with TransPHLA (AUC 0.914 and PCC 0.785) and DeepMHCI (AUC 0.930 and PCC 0.804). Figure S4 illustrates the motifs on HLA-A29:02 under the 10-mer condition. Notably, DeepMHCI shows an additional amino acid preference for S, whereas the other methods exhibit the same amino acid preferences for F, L, V, Y, T, I and M at position 2. Similarly, only DeepMHCI displays an additional amino acid preference for M at position 10. In addition, only TransPHLA exhibits an amino acid preference for Y at position 9, which is not considered an anchor position by DeepMHCI and NetMHCpan-3.0. These observed differences may account for the results in which DeepMHCI achieved the best performance under the 10-mer condition for HLA-A29:02 (AUC 0.881 and PCC 0.766), compared with NetMHCpan-3.0 (AUC 0.863 and PCC 0.735) and TransPHLA (AUC 0.862 and PCC 0.735). Figure S5 presents the motifs of different methods on H-2-Db under the 11-mer condition. Notably, TransPHLA and DeepAttentionPan do not exhibit a significant preference for amino acids at position 7, while DeepMHCI and NetMHCpan-3.0 show a preference for amino acids N and D at this position. Additionally, only NetMHCpan-3.0 displays a significant preference for amino acid N at position 6, which is not considered an anchor point by the other methods. Moreover, only DeepAttentionPan shows a preference for amino acid D at position 5, which is opposite to the preferences observed with the other methods. These differences may explain the results, with DeepMHCI achieving the best performance (AUC 0.885 and PCC 0.595), followed by TransPHLA (AUC 0.847 and PCC 0.576), NetMHCpan-3.0 (AUC 0.847 and PCC 0.556), and finally DeepAttentionPan (AUC 0.826 and PCC 0.556).

Motifs of HLA supertypes
We further conducted a comparative analysis of the motifs of HLA supertypes generated by DeepMHCI. We selected different HLA supertypes [4,1,5,2] to demonstrate two phenomena: 1) the binding motifs generated by DeepMHCI for HLA molecules in the same supertype are very similar; 2) the motifs generated for different supertypes are quite different. In particular, we randomly selected five HLA molecules each from HLA supertypes A2 and B58 for illustration. Figure S6 illustrates the motifs generated by DeepMHCI. It is evident that the major anchor points are shared within the same HLA supertype: for the five molecules in A2, there is a preference for amino acids L, M and I at the second position and for V, L, I and A at the ninth position; for the five molecules in B58, there is a preference for amino acids S, A and T at the second position and for W, F, I and L at the ninth position. Moreover, when comparing the two supertypes A2 and B58, significant differences can be observed in the generated motifs.

Ablation Study
We also examined the performance of DeepMHCI with individual kernel sizes. We selected models with a single kernel size k = 9, 11 or 13, named DeepMHCI_k, for comparison with DeepMHCI using all kernel sizes. Table S21 reports the average performance at each length over 10 runs of 5-fold CV for each model on BD2017. We found that DeepMHCI achieved the best performance, which suggests that mixing kernels of different sizes to cover and process peptides contributes to model robustness.
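The idea of mixing kernel sizes can be sketched as follows. This is a minimal NumPy illustration under our own assumptions (single filters, max-pooling, feature concatenation), not the exact convolutional block of DeepMHCI:

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Single-filter 'valid' 1D convolution over an (L, d) sequence."""
    k = kernel.shape[0]
    return np.array([np.sum(x[i:i + k] * kernel)
                     for i in range(x.shape[0] - k + 1)])

def multi_kernel_features(x, kernels):
    """Max-pool the response of each filter and concatenate the pooled
    values, so filters of different sizes (e.g. 9, 11, 13) each cover
    the peptide at their own scale. A sketch, not the exact model."""
    return np.array([conv1d_valid(x, kern).max() for kern in kernels])
```

With filters of widths 9, 11 and 13 applied to the same peptide representation, each width matches a different peptide length range, which is one plausible reason the mixed-kernel model is more robust than any single DeepMHCI_k variant.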
In addition, we explored the effect of different amino acid embedding options on performance. Table S22 reports the average performance at each length over one run of 5-fold CV for each option. The learnable embedding was much better than the BLOSUM50 matrix. Moreover, ESM-1 [3] did not achieve the expected high performance. This may be because the ESM-1 pre-training task lacks pMHC interaction information, and its high-dimensional embeddings become a bottleneck when training DeepMHCI.

Table S1 :
Data redundancy of BD2017

Table S2 :
Performance of DeepMHCI † and DBTpred † on ID2022. Both methods used only 5 models for the ensemble.

Detailed Results over HPV2019
Tables S10, S11, S12 and S13 show the detailed results of DeepMHCI and competing methods over HPV2019 for all MHC-I molecules under 8-mer, 9-mer, 10-mer and 11-mer, respectively. DeepMHCI achieved almost the best results at all lengths. Specifically, DeepMHCI achieved the best mean AUC under 10-mer and 11-mer and the second best mean AUC under 9-mer. Figure S1 plots the ROC curves of the remaining MHC-I molecule on the HPV2019 dataset under 11-mer for DeepMHCI and the comparison methods.

Table S3 :
Detailed five-fold cross-validation performance of DeepMHCI and competing methods under 8mer.

Table S4 :
Detailed five-fold cross-validation performance of DeepMHCI and competing methods under 9mer.

Table S8 :
Detailed results of DeepMHCI and competing methods on ID2022.

Table S14 :
Detailed results of DeepMHCI and competing methods on EP2017 with all lengths.

Table S15 :
Detailed results of DeepMHCI and competing methods on EP2017 under 8mer.

Table S16 :
Detailed results of DeepMHCI and competing methods on EP2017 under 9mer.

Table S17 :
Detailed results of DeepMHCI and competing methods on EP2017 under 10mer.

Table S18 :
Detailed results of DeepMHCI and competing methods on EP2017 under 11mer.

Table S19 :
Detailed results of DeepMHCI and competing methods on EP2017 under 12mer.

Table S20 :
Detailed results of DeepMHCI and competing methods on EP2017 under 13mer.

Table S21 :
Performance of DeepMHCI with different kernel sizes over BD2017.

Table S22 :
Performance of DeepMHCI with different embedding options over BD2017.