In-situ CLMS to study interferon stimulated proteins
Flo-1 cells are one of the most established in vitro esophageal adenocarcinoma models, as they recapitulate key characteristics of tumours in the esophageal tube22,23. However, not all tumours are immunogenic. To determine whether Flo-1 cells respond to interferon, we treated them with 10 ng/ml IFNα for up to 72 h. Flo-1 cells showed early induction of pSTAT1 and IRF1 starting at 2 h after treatment, which was sustained over the 72 h time-course with a time-dependent decrease in IRF1 steady-state levels (Fig. 1A). The ISGs (MX1, IFITM1, OAS1/2, and ISG15) were highly induced after 6 h, mimicking classic intermediate and late responses to IFNα (Fig. 1A). Together, these data indicate that this cell model can be used to study the interferon response.
To capture the protein interaction landscape in-situ, we used DSS, a widely used cross-linking reagent, for its high membrane permeability and relatively short reaction time. The short reaction time helps to prevent the formation of large, cross-linked protein aggregates, thereby maintaining the stability of the cross-linker. To determine the optimum DSS concentration and to avoid over-cross-linking, we first treated the cells with 5, 2.5 and 1 mM DSS for 5, 10, 5 and 30 min each, and analysed the lysates by Coomassie-stained SDS-PAGE (data not shown). The cell lysate appeared to be highly cross-linked even at the lowest concentration and shortest time point. DSS was therefore titrated to 1, 0.5 and 0.1 mM for 5 min (Fig. 1B). Optimal cross-linking was observed with 0.5 mM DSS for 5 min, and these conditions were selected for IFNα treated cells. Additionally, Fig. 1C shows an immunoblot probed with the p53 (DO-1) antibody to assess the degree of protein cross-linking.
The Flo-1 cells were treated with 10 ng/ml IFNα for 24 h prior to cross-linker addition. Cross-linked cells were subsequently lysed using a two-step protein solubilization method and proteins were processed by the FASP method (Fig. 2)24,25. Cross-linked tryptic peptides were analyzed by mass spectrometry (Fig. 2). Next, the MS/MS spectra were aligned to protein sequences and quantitative analysis was carried out using MaxQuant26,27. The cross-linked peptides were identified from the obtained spectra using the SIM-XL program, and individual linkages were merged into a complex network using the open-source computational software pipeline xQuest28 along with SIM-XL29 (Fig. 2). SIM-XL identifies protein–protein interactions, intra-links and mono-links in either simple or complex protein mixtures and provides scripts to visualize the interactions in the protein structure. In addition, it assigns each cross-link an ID score that depends on the quality of the MS/MS spectra29. Several high-confidence protein–protein interactions and complexes were identified. A subset of the novel interactions was further investigated using co-immunoprecipitation, and the conformational changes of the complexes were studied using molecular dynamics (MD) simulation (Fig. 2)30,31.
Identification of cross-linked interferon induced proteins
A total of ~ 30,500 and ~ 28,500 peptides were detected using MaxQuant in the unstimulated and IFNα stimulated samples, respectively (Supplementary Table S1, Fig. 3A). The peptide length distribution for both conditions showed a higher proportion of larger peptides, which suggests the presence of cross-linked peptides (Fig. 3B,C). Moreover, in the IFNα treated samples, a higher proportion of larger peptides was present in the range of 40–55 (Fig. 3C). Mapping proteins against log2 intensities showed that classic interferon-stimulated proteins were the most enriched compared to untreated samples; these included MX1, IFIT1/3, OAS2/3, DDX58, and HLA-F (Fig. 3D). Pathway analysis of proteins that were enriched more than threefold in response to IFNα treatment, performed with the Reactome pathway database, showed MHC-I mediated antigen presentation and processing as the most dominant pathway (Fig. 3E). Consistent with earlier reports, OAS and ISG15 mediated antiviral response as well as IFNα/β and cytokine signalling were among the upregulated pathways. Further, lysine- and serine-specific cross-links of proteins were identified from the initially obtained MS/MS spectra using SIM-XL. A recent study has reported 104 ISGs by conducting a meta-analysis of single ISG overexpression studies performed in 5 cell types, covering 20 viruses from 9 virus classes9. However, to overcome the computational limitation of screening a large dataset, we started with a smaller dataset and explored possible interactions among the proteins of the IRDS gene list reported in Padariya et al.28, the majority of which are ISGs.
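The threefold enrichment criterion used before pathway analysis can be illustrated with a minimal sketch. The function names, the pseudocount, and the intensity values below are hypothetical and are not taken from the dataset; the sketch only shows how a log2 fold change on summed intensities could be thresholded at log2(3).

```python
import math


def log2_fold_change(treated, untreated, pseudocount=1.0):
    """log2 ratio of treated over untreated intensity.

    The pseudocount is an assumption of this sketch; it avoids division
    by zero for proteins that are undetected in one condition.
    """
    return math.log2((treated + pseudocount) / (untreated + pseudocount))


def enriched_proteins(intensities, min_fold=3.0, pseudocount=1.0):
    """Return proteins enriched more than `min_fold`, sorted by log2 fold change.

    `intensities` maps a protein name to (untreated, treated) intensities.
    """
    threshold = math.log2(min_fold)
    hits = {}
    for protein, (untreated, treated) in intensities.items():
        lfc = log2_fold_change(treated, untreated, pseudocount)
        if lfc > threshold:
            hits[protein] = lfc
    # Most strongly enriched proteins first.
    return dict(sorted(hits.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical intensities, for illustration only (not measured values).
example = {
    "MX1": (1.0e6, 6.4e7),
    "OAS3": (2.0e6, 3.2e7),
    "ACTB": (5.0e8, 5.5e8),
}
hits = enriched_proteins(example)
```

In this toy input, only the two interferon-stimulated proteins pass the threshold, while the housekeeping protein, which changes by about 10%, is excluded.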
Identification of a novel interferon-stimulated protein network based on in-situ cross-link
Interferon-mediated stimulation of ISGs is well documented, but how these proteins give rise to a wide range of biological functions at the molecular level is poorly understood. We looked at the high-confidence protein interactions between known ISGs. Interestingly, we identified a network involving MX1, USP18, ROBO1, OAS3, and STAT1 proteins that form a large complex in response to IFNα treatment (Fig. 4, Table S2)32,33,34. Most importantly, these interactions were detected in all three IFNα treated replicates and were undetectable in untreated samples, suggesting that they form specifically in response to IFNα treatment. STAT1 is known to transcriptionally regulate expression of these ISGs; however, its interaction with ISGs at the protein level has not been studied. The crystal structure of STAT1 reveals that its coiled-coil domain (CCD) is not involved in interactions with DNA or with the protomer when it forms a dimer35. These α-helices form a coiled-coil structure that provides a predominantly hydrophilic surface area for interactions to take place35. In our CLMS data, we observed that most of the interactions with STAT1 are in the CCD, the linker domain or the SH2 domain prior to the C-terminal tail segment (residues 700–708) (Fig. 4A). A previous study reported that USP18 bound to the CCD and DNA binding domains (DBD) of STAT2 and was recruited to the type I IFN receptor subunit IFNAR2 to mediate suppression of type-I IFN signaling24. Our data also indicate that the catalytic domain of USP18 interacts with the DBD of STAT1 (Fig. 4A,D), suggesting that both STAT1 and STAT2 may have a role in recruiting USP18 to IFNAR2.
Two USP18 isoforms have been described in humans: the full-length protein, which is mainly located in the nucleus, and an isoform lacking the N-terminal domain, USP18-sf, which is evenly distributed in the cytoplasm and nucleus36. In addition, the N-terminus has been predicted to be unstructured and is not required for isopeptidase activity or ISG15 binding37. Most of the interactions identified in our study are situated in the N-terminus of the protein, which suggests that these interactions involve full-length USP18 (Fig. 4A,D) and therefore have a high probability of occurring in the nucleus. Moreover, our data also imply that the N-terminus is used exclusively for protein–protein interactions. The IFNAR2 binding site is located between residues 312–368, and it is interesting to note that none of the proteins in the complex bind to this region (Fig. 4A)37,38. Together, the data suggest that the IFNAR2 binding region is used exclusively by the receptor protein. Additionally, only OAS3 and ROBO1 were found to be associated with both the N-terminus and the domain before the IFNAR2 binding site (Fig. 4A).
ROBO1 belongs to the immunoglobulin (Ig) superfamily of transmembrane signalling molecules and consists of five Ig and three fibronectin (Fn) domains in the extracellular region. These extracellular domains are followed by a membrane-proximal region and a single transmembrane helix39. An unstructured intracellular region lies at the C-terminus, containing conserved sequence motifs that mediate the binding of effector proteins39. The region stretching from amino acids ~ 1100 to 1600 is mostly disordered. We found that MX1 interacted with ROBO1 via its Ig, Fn, and intracellular domains, while most of the interactions with STAT1 were between its CCD and linker domain and the C-terminus of ROBO1 (Fig. 4A,E). On the other hand, interactions with DI, DIII, and the linker region of OAS3 were dispersed throughout the ROBO1 protein (Fig. 4A).
The oligoadenylate synthetase (OAS) family of proteins sense and bind to intracellular double-stranded RNA (dsRNA), undergo a conformational change, and synthesize 2ʹ,5ʹ-linked oligoadenylates (2–5 As)40. Out of the three OASs, OAS3 has been found to display higher affinity for dsRNA and to synthesize minimal 2–5 As, which can activate RNase L and thereby restrict viral replication41. The OAS family consists of polymerase beta (pol-β)-like nucleotidyl transferase domains. A previous study showed that the catalytic activity of the C-terminal domain (DIII) is dependent on the dsRNA-binding domain (DI), which is essential for activation of OAS342. We observed that the DI and DII domains of OAS3 interacted with the CCD and a small linker region between the SH2 domain and TAD of STAT1 (Fig. 4A,F). Overlay of the different cross-linked sites on the protein structures shows interaction between the β-sheets and loops of the STAT1 DBD and the exposed pocket or cavity formed by residues 60–75 in the DI domain of OAS3 (Fig. 4G). The orientation of the protein in the complex also showed that none of the interactions with OAS3 interfered with the dsRNA-binding ability of its DI domain (Fig. S1A). In addition, the N-terminal GTPase domain of MX1 interacts extensively with both the DI and DIII domains of OAS3 (Fig. 4A). We also observed an interaction between OAS1 and MX1 in all three IFNα treated replicates, where the only domain of OAS1, which is also catalytically active, interacts with all three domains of MX1 (Fig. S2A,B).
MX proteins are part of the large dynamin-like GTPase family, which contains an N-terminal GTPase domain that binds and hydrolyses GTP, a self-assembly-mediating middle domain, and a C-terminal leucine zipper (LZ) domain that acts as a GTPase effector domain25,43. MX1 associates with subunits of the viral polymerase to block viral gene transcription43. A previously reported yeast two-hybrid screen revealed that MX1 bound to PIAS1 inhibits STAT1-mediated gene activation by blocking the DNA binding activity, and also has SUMO E3-ligase activity44,45. Here we demonstrate that MX1 binds to STAT1 (Fig. 4C,D); however, how this interaction affects STAT1-mediated gene activation in response to IFNα needs further investigation. Additionally, we found MX1 interacting with IFIT3 and DDX60 in all three IFNα treated replicates (Fig. S2C).
DDX60, an IFN-inducible cytoplasmic helicase, has previously been reported to play a role in RIG-I-independent viral RNA degradation46. It interacts with RIG-I and activates its signalling in a ligand-specific manner46. DDX60 consists of a DEXD/H-Box helicase domain, which binds to viral RNA and DNA, and a C-terminal helicase domain47. Most of its interactions with MX1 and IFIT3 were within the long N- and C-terminal regions that have no typical domains or motifs (Fig. S2E,F). However, MX1 was also linked with the DEXD/H-Box helicase domain (Fig. S2E). IFIT family proteins have distinctive tandem copies of helix-turn-helix motifs called tetratrico-peptide repeats (TPRs). IFIT3 was found to be a positive modulator of RIG-I signaling and, hence, a component of the MAVS complex48. Together, our data suggest that IFIT3 and DDX60 interact with each other mainly in the region between TPR 3–6 of IFIT3, and may have a role in RIG-I/MAVS signalling (Fig. S2F).
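The replicate criterion applied throughout this section (an interaction present in all three IFNα treated replicates and undetectable in untreated samples) amounts to a simple set operation. The sketch below is a hypothetical illustration, not the pipeline used in the study; the function names and the example replicate contents are assumptions chosen to mirror the cases described in the text.

```python
def pair(a, b):
    """Order-independent key for a protein-protein cross-link."""
    return frozenset((a, b))


def treatment_specific_links(treated_replicates, untreated_replicates):
    """Links found in every treated replicate and in no untreated replicate."""
    if not treated_replicates:
        return set()
    in_all_treated = set.intersection(*(set(r) for r in treated_replicates))
    in_any_untreated = set().union(*(set(r) for r in untreated_replicates))
    return in_all_treated - in_any_untreated


# Illustrative replicate contents (assumed for this example only).
treated = [
    {pair("MX1", "STAT1"), pair("HLA-A", "H2B"), pair("MX1", "OAS3")},
    {pair("MX1", "STAT1"), pair("HLA-A", "H2B")},
    {pair("STAT1", "MX1"), pair("HLA-A", "H2B")},
]
untreated = [{pair("HLA-A", "H2B")}, set(), set()]

specific = treatment_specific_links(treated, untreated)
```

Using `frozenset` keys makes the pair MX1/STAT1 identical to STAT1/MX1, so the order in which a search engine reports the two proteins does not affect the comparison. A link seen in even one untreated replicate is excluded by this strict definition.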
Antigen presenting MHC-I molecule and its interaction network
Because proteome-wide screening is computationally demanding, we next screened the entire human UniProt database for only one of the IFNα treated replicates. We found some high-confidence interaction networks for HLA-A in that replicate. The pathway analysis of proteins identified from the MS/MS spectra revealed MHC-I based antigen processing and presentation as the dominant pathway induced by interferon (Fig. 3E). Therefore, we focused on exploring high-confidence protein interactions of MHC-I molecules across all the cross-linked samples. HLA consists of α1, α2, and α3 domains and a light chain, β2 microglobulin (β2m), which is a constant protein partner49. HLA is unstable in the absence of peptide ligand following its assembly in the endoplasmic reticulum50. The peptide binding groove is formed by the α1 and α2 domains, which are highly polymorphic and unstructured in the peptide-free form, whereas the α3 domain is comparatively less polymorphic51. In the presence of IFNα, we found two HLA-A complexes: one in which it interacts with HMGA1 and H2B (Fig. 5, Table S3) and another in which it interacts with MDN1 and LRCH4 along with H2B (Fig. 6).
Apart from maintaining genome integrity, histone H2B is involved in transcriptional regulation. The H2B protein is composed of a central histone-fold domain (HFD), formed by three alpha helices separated by loops, and a C-terminal tail41,52. The majority of interactions with H2B were in the α1-helix that mediates trimerization with the HFD heterodimer (Fig. 5A,B). Although lysines are involved in DNA binding, some of them are also sites for alternative acetylation or methylation. For instance, residues K43, K46 and K57 of H2B are not involved in direct DNA binding but are targets of different post-translational modifications53. Similarly, residues K44, K47 and K57 in H2B may have an alternative role in the presence of IFNα, which includes interacting with other proteins (Fig. 5A,B). Moreover, extrachromosomal histone H2B activates immune responses in various cell types, acting as a cytosolic sensor to detect double-stranded DNA (dsDNA) fragments derived from infectious agents or damaged cells54. Depletion of H2B suppressed IFN-β production and STAT1 phosphorylation in the presence of DNA viruses54. H2B is also known to travel in and out of the nucleus more rapidly than other core histones54. H2B interactions with MDN1 and with LRCH4 were also observed in individual untreated samples. We found HLA-A interacting with H2B in all three IFNα treated samples and one untreated replicate. These data reflect the role of H2B in alternative physiological functions, independent of transcriptional regulation.
HMGA1 (High Mobility Group AT-Hook 1), a small nuclear protein enriched in disorder-promoting amino acids, was identified in complex with HLA-A. It has an acidic C-terminal tail and three differentially spaced DBDs, called AT-hooks because they bind to minor grooves of AT-rich regions in dsDNA55,56. This binding induces bending or straightening of DNA, allowing canonical transcription factors access to their consensus sequences. The C-terminal tail is assumed to be involved in protein–protein interaction and recruitment of transcription factors, since a C-terminal deletion mutant fails to initiate transcription57. Moreover, this domain contains several conserved phosphorylation sites that are known kinase substrates58. We observed the interactions of HLA-A and H2B with HMGA1 outside the C-terminal domain, suggesting that the C-terminal domain is mainly used for transcription factor binding (Fig. 5A,C). HMGA proteins compete with histone H1 for binding to linker DNA, thereby resulting in increased accessibility57. Similarly, it is plausible that HMGA interacts with histone H2B along linker DNA while competing with histone H1. HMGB1 induces HLA-A, -B, and -C expression in dendritic cells, leading to their activation59; nevertheless, an interaction between HMGs and HLAs has not been reported previously. We found that HMGA1 interacts with the α1 and α3 domains of HLA-A, with most of the interactions outside its 3 DBDs (Fig. 5A,C). In our hands, HLA-A was found to be localized in the nucleus (data not shown), and given that H2B and HMGA1 also reside in the nucleus, there is a high probability that this interaction occurs in the nucleus. The specific adducts measured between H2B, HLA-A, and HMGA1 are described in Fig. 5D.
The majority of HLA-A interactions with other proteins are localized at its α1 and α2 domains as well as within the disordered C-terminal domain (Fig. 6). In one of these examples, we found that HLA-A interacts with the disordered N-terminal tail of LRCH4 (Fig. 6A,D). LRCH4 regulates TLR4 activation and cytokine induction by LPS, and therefore regulates innate immune responses60,61. It is a membrane protein with nine leucine-rich repeats (LRRs) and a calponin homology (CH) motif in its ectodomain, followed by a transmembrane domain (TMD)60,62. The CH domain is reported to mediate protein–protein interactions60. A stretch of around 300 amino acids between the LRR and CH domains is relatively accessible but disordered. In line with the function of disordered regions as mediators of protein–protein networks and vesicle trafficking63, we found most of the protein interactions in the disordered region. Interactions with MDN1 were dispersed throughout the protein’s length, including LRR1, LRR6, the CH domain, and the disordered region, while H2B was bound mostly to the CH domain (Fig. 6A,B). It is noteworthy that none of the interactions involved the TMD, which shows the specificity of the CLMS method (Fig. 6A,B).
MDN1 was also identified as a part of the HLA-A protein network (Fig. 6A). It belongs to the AAA protein family (ATPase associated with various activities). Its identical N-terminal AAA domains orchestrate into hexameric rings and remove assembly factors from the ribosomal 60S subunit64. Cryo-EM studies in yeast reveal that AAA domains are followed by six non-equivalent AAA domains linked in a single polypeptide, which appears similar to dynein64,65,66. Further, a stretch of Asp/Glu-rich region is followed by a MIDAS (metal ion-dependent adhesion site) domain. Owing to MDN1’s large size (~ 5600 amino acids) and its limited homology to well-studied proteins, little is known about its structure and function in humans. We identified HLA-A, H2B, and LRCH4 as binding partners of MDN1, and their orientation as a protein complex was visualized in PyMOL (Fig. 6A,B). These three proteins interact with the AAA domains, the dynein-like linker domain and the probable MIDAS domain of MDN1. In a previous report, affinity purification of the bait proteins identified MDN1 as a protein associated with histone H2B67. Moreover, a recent study has also reported an interaction between MDN1 and HLA-B using affinity-purification mass spectrometry in HCT116 cells, which supports our findings68. Identification of this complex in IFNα treated samples implies that MDN1 has a role in interferon signalling.
As HLA genes are highly polymorphic, we extracted sequencing reads mapping to HLA-A, -B and -C from the RNA-seq data of the Flo-1 cells (data not shown). Peptide sequences corresponding to the sequencing reads showed a significant difference among HLA-A, -B and -C in the regions where the cross-linked peptides reside in HLA-A (Fig. S3). Moreover, we did not observe protein–protein cross-links for the HLA-B/C molecules with any of the H2B/HMGA1/MDN1/LRCH4 proteins. This suggests that the protein interaction found between HLA-A, MDN1, LRCH4, and HMGA1 is specific to HLA-A. Additionally, the proteomics analysis of non-cross-linked samples (Table S4) suggests that HLA-A is enriched, with higher sequence coverage than HLA-B or HLA-C. The peptides identified for HLA-A have high intensities in both IFNα treated and untreated samples.
Validation of novel HLA-A binding proteins
To ensure that the interactions identified here are not due to non-specific cross-linking of two proteins in close spatial proximity, we further validated two novel interactors of HLA-A by performing co-immunoprecipitation assays. Interactions of HLA-A with endogenous MDN1 and H2B were detected in IFNα treated and untreated Flo-1 cells (Fig. 7, Fig. S4). We confirmed that HLA-A was captured with H2B in immunoprecipitates, and this association was induced by IFNα treatment, as HLA-A was absent in immunoprecipitated samples from untreated cells (Fig. 7A). However, our data demonstrate that IFNα differentially regulates HLA-A binding to H2B and MDN1: IFNα induced the association between H2B and HLA-A but decreased the binding of HLA-A to MDN1. We found that MDN1 associates with HLA-A in control samples, while the addition of IFNα reduces this interaction irrespective of MDN1 induction by IFNα (Fig. 7B,C). Moreover, immunoprecipitation of HLA-A captured H2B in A549 cells (Fig. S4), suggesting that this interaction is not cell type-dependent. Together, these results confirm interferon-mediated interaction of HLA-A with H2B and MDN1.
The conformational dynamics of the HLA-A binding proteins
The structural properties of one of the high-confidence cross-linked networks induced by interferon, H2B-HLA-A-HMGA1, were investigated. We took advantage of molecular dynamics simulation as an alternative approach to understand the conformational dynamics of the proteins involved in this complex (Fig. 8). Findings from the CLMS data suggest that different conformations are possible between the H2B, HLA-A, and HMGA1 proteins. Therefore, the following potential complexes were simulated in the solvent environment: H2B-HLA-A, HMGA1-HLA-A, and H2B-HLA-A-HMGA1. Initial protein–protein docking screening using the MOE package (Molecular Operating Environment; Chemical Computing Group Inc., Montreal, QC, Canada) proposed different possible conformations between these proteins (Fig. 8A). Visualization of the docked protein complexes revealed several interactions and possible conformations (Figs. 5A, 8). One of the possible conformations is represented in Fig. 8A (with labeled cross-links) and was further evaluated with the MD simulation pipeline. In addition, the binding energies of H2B or HMGA1 with HLA-A indicate that H2B has the higher affinity for HLA-A (Fig. 8A).
The stability of the HLA-A molecule over time, assessed by root-mean-square deviation (RMSD) and root-mean-square fluctuation (RMSF), suggests that the presence of H2B or HMGA1 in the complex stabilizes HLA-A (Fig. 8B, Fig. S5). HMGA1, which binds close to the B2M site of HLA-A, stabilizes HLA-A residues in both the HLA-A-HMGA1 and H2B-HLA-A-HMGA1 complexes (Fig. 8B, Fig. S5). In particular, HLA-A residues ~ 60–90 and ~ 180–210 were found to exhibit less flexibility in the presence of H2B (Fig. 8B). Both H2B and HMGA1 display better binding to HLA-A in the H2B-HLA-A-HMGA1 complex than when HLA-A binds H2B or HMGA1 alone (Fig. 8C,D; Table S5). Residues involved in hydrogen bond formation (MD simulation high occupancy ≥ 10 ns) coincide with the CLMS cross-link (K or S residues) interaction sites in the complex, which supports high confidence in the interactions identified through the CLMS method (Fig. 8E). HLA-A residues ~ 190–210 and ~ 200–220 were found to bind H2B and HMGA1, respectively, in both the CLMS and MD simulation data (Fig. 8E).
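For readers less familiar with the two stability measures used above, the following minimal sketch shows how RMSD and per-atom RMSF are conventionally computed from a trajectory. It is an illustration only and not the analysis code used in this study; the function names and the toy coordinates are hypothetical, and the sketch assumes that all frames have already been superposed onto a common reference.

```python
import math


def rmsd(frame, reference):
    """Root-mean-square deviation of one frame from a reference structure.

    Both arguments are lists of (x, y, z) tuples in the same atom order.
    """
    if len(frame) != len(reference):
        raise ValueError("frame and reference must contain the same atoms")
    squared = sum(
        (a - b) ** 2 for p, q in zip(frame, reference) for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(frame))


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation around the time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in trajectory) / n_frames for k in range(3)]
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy two-atom trajectory (hypothetical coordinates): atom 0 is rigid,
# atom 1 oscillates along x between -1 and +1.
toy = [
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
fluctuations = rmsf(toy)
deviation = rmsd(toy[1], toy[0])
```

RMSD summarizes how far a whole structure has drifted from a reference at a given time, whereas RMSF reports the flexibility of each atom or residue over the trajectory. A residue range with reduced RMSF in a complex corresponds to the stabilization described above.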