“Transexpression” or delivering proteins into mammalian cells
In contrast to RNA and DNA, which have been delivered into mammalian cells in different amounts by standardized methods over the last decades, the delivery of exogenous proteins into mammalian cells has not yet been studied in detail, either mechanistically or quantitatively. Because the process of introducing proteins into mammalian cells has not yet been semantically defined, and as a first step towards its standardization and improvement, we here term it “transexpression”. Transexpression therefore refers to the introduction of an “exogenous” protein into mammalian cells by any of the experimental methods available for this purpose (Supplementary Fig. 1a). The method can be electroporation, cell-penetrating peptides, cell membrane permeabilization by toxins, lipoparticles, or any other experimental method used to deliver proteins into mammalian cells. The term “exogenous protein” means that the delivered protein must be obtained from an external source, which in most cases is a heterologous system. Upon intracellular delivery, the exogenous protein is “transexpressed” in the recipient cell and forms part of its proteome. Because recipient cells do not express the exogenous protein but acquire it artificially, the term transexpression fulfills the semantic requirements for this experimental method.
Development of a reporter system to evaluate protein delivery into mammalian cells termed “transexpression”
To establish standardized methods for controlled and efficient transexpression, we first developed a reporter system to quantitatively study how proteins can be introduced into mammalian cells. The system has two components: (i) a functional protein that can be efficiently expressed and purified in heterologous systems, and (ii) a reporter gene for quantifying the activity of the protein selected in (i) (Fig. 1a). We chose the chimeric transcription factor Gal4-VP16 (Sadowski et al., 1988) as the target protein because it has several favorable properties: it has been extensively used in mammalian cells, its activity can be easily assayed in vivo and in cell lysates, it has a molecular weight suitable for solution-state NMR (∼19 kDa), it has a globular 3D structure like most mammalian proteins, and it is absent in mammals, which precludes evolutionarily conserved interactions within the mammalian cell that might interfere with the readouts used to evaluate delivery efficiency18. For (ii), we developed a DNA vector containing a reporter gene, called pGal4-5XRE-eGFP, in which the mutant gene of the Aequorea victoria enhanced green fluorescent protein (eGFP) was cloned downstream of a minimal promoter containing five GAL4 binding sites or responsive elements (RE) (Fig. 1a). The reporter system works as follows: first, pGal4-5XRE-eGFP is stably or transiently introduced into mammalian cells by standard transfection methods. In a second step, the recombinant protein Gal4-VP16 (rGal4-VP16) is delivered into these cells by any of the different methods of transexpression. Because eGFP expression is induced by the exogenously added rGal4-VP16, the transexpression efficiency correlates positively with the eGFP fluorescence signal intensity in these cells.
As proof of principle of our reporter system, we first investigated whether eGFP could be induced by transexpressing rGal4-VP16 into mammalian Cos7 cells. To this end, rGal4-VP16 was expressed in and purified from bacteria, and different amounts of it were delivered into cells that had previously been transfected with pGal4-5XRE-eGFP. The transexpression method in this first experiment was electroporation: rGal4-VP16 was first mixed with a cell pellet, and an electric pulse was then applied to the mixture. After electroporation, the Cos7 cells were washed twice with fresh media to remove rGal4-VP16 that had not been internalized. The cells were then plated, grown at 37 °C for 24 h, and harvested, and the resulting cell suspensions were subjected to eGFP signal intensity quantification. We found a substantial increase of eGFP-derived fluorescence in Cos7 cells electroporated with rGal4-VP16 (Fig. 1b). The signal intensity increased proportionally with the amount of transcription factor used for electroporation, validating the Gal4 system for the quantitative determination of protein delivery efficiency. The dynamic range in this experiment was from 30 to 150 μg, and no saturation of rGal4-VP16 activity was observed within this range. Very low fluorescence levels were observed when only buffer was used in the electroporation step, presumably due to cell autofluorescence and uncontrolled expression of pGal4-5XRE-eGFP in these cells. Confocal microscopy analyses confirmed the robust intracellular induction of eGFP in cells electroporated with rGal4-VP16 (Fig. 1b).
We next assayed whether this reporter system can be used to evaluate other transexpression methods, such as those based on cell-penetrating peptides9 and cell membrane permeabilization by pore-forming toxins14. To this end, we fused the HIV TAT cell-penetrating peptide to the C-terminus of rGal4-VP16, as described previously9. The resulting protein, rGal4-VP16-TAT, was then produced in bacteria and used to treat Cos7 cells previously transfected with pGal4-5XRE-eGFP. We found that eGFP was induced in cells treated with this chimeric transcription factor but not with vehicle (Fig. 1c). Compared to the electroporation-based method, the signal intensity obtained was significantly lower, and about one order of magnitude more recombinant transcription factor was needed to obtain the values reached with the electroporation procedure. The intracellular localization of eGFP was also confirmed in this experiment (Fig. 1c). Likewise, eGFP fluorescence was observed in pGal4-5XRE-eGFP-transfected Cos7 cells treated with the pore-forming toxin streptolysin-O (SLO) and rGal4-VP16, whereas only basal levels of fluorescence were found in cells treated with the toxin and vehicle only (Fig. 1d). The fluorescence obtained with the SLO method was about a factor of five lower than with electroporation, and higher amounts of rGal4-VP16 were required in this experiment. Altered cell morphology (Fig. 1d) and considerable cell death were observed when using SLO, presumably due to its intrinsic cytotoxic effect on mammalian cells.
Compared to the TAT- and SLO-based methods, electroporation thus gave a more efficient intracellular delivery of rGal4-VP16. Electroporation requires neither peptide fusion tags such as TAT, which might affect protein structure and function, nor toxins such as SLO, which might trigger cell responses aimed at counteracting cytotoxicity. Moreover, the delivery of proteins by electroporation is fast, allowing the immediate analysis of the delivered protein by NMR. Because of these major advantages, we decided to concentrate on electroporation as the transexpression method for subsequent experiments.
The reporter system was next challenged on three different cell lines (Cos7, A2780, and HeLa) using different electroporation protocols, of which four are shown in Fig. 1e. These four protocols differed only in the composition of the buffer used in the electroporation step. In this case, we analyzed eGFP signal intensity by fluorescence-activated cell sorting (FACS), as this methodology allowed us to visualize the number of cells expressing eGFP as well as the signal intensity distribution. We found that, among the cell lines assayed, A2780 and Cos7 are the most suitable for rGal4-VP16 transexpression using these electroporation parameters (Fig. 1e and Supplementary Fig. 1b, c). Protocol 1 resulted in Cos7 cells with relatively low eGFP expression, while the majority of the Cos7 cells subjected to protocols 3 and 4 expressed high amounts of this protein. With protocol 2, relatively similar numbers of Cos7 cells were found at all levels of eGFP expression. Surprisingly, the eGFP signal intensity showed two peaks in A2780 cells, indicating that two populations of cells were obtained: cells with moderate eGFP levels and cells with high levels of this protein. The population of cells expressing high amounts of eGFP was favored with protocol 1. The electroporation conditions used in this experiment were less efficient for HeLa cells, which showed only low eGFP expression. Changing the electroporation parameters (e.g., pulse shape, length and voltage), however, allowed us to achieve efficient transexpression for HeLa cells as well. Thus, we conclude that the reporter based on Gal4-VP16 is a valuable tool for finding the experimental conditions that deliver the desired amount of a given protein into mammalian cells.
Towards establishing a standardized method of transexpression for in-cell NMR studies, we next investigated how the amount of rGal4-VP16 and the number of cells used in the electroporation step affect transexpression efficiency. To this end, we electroporated different amounts of rGal4-VP16 into samples containing increasing quantities of Cos7 cells. In this experiment, the cells had previously been transfected with the plasmid pG5-Luc (Promega), which contains five binding sites for Gal4 that drive the expression of firefly luciferase (Luc) in response to rGal4-VP16. The advantage of using luciferase is that its activity can be determined in cell lysates rather than in intact cells, as was done for the eGFP signal intensity determinations (Fig. 1b–e). In agreement with the eGFP values shown in Fig. 1b, we found that luciferase transcriptional activity correlated positively with both the amount of rGal4-VP16 and the number of cells used for electroporation (Fig. 1f and Supplementary Fig. 1d). Interestingly, the relationship of transcriptional activity to protein amount and to cell number was positive in all cases and appears to be steeper than linear. Because the contribution of cell number to transexpression efficiency was steeper than linear and appears to be complex, we then investigated the impact of cell size on transexpression efficiency by comparing the intracellular delivery of rGal4-VP16 in four cell lines of different sizes: Cos7 and U2OS, with an average size of 30–40 μm, and the two smaller cell types Hek-293 and A2780, with an average size of 10–15 μm (Supplementary Fig. 1e). Based on the requirements of in-cell NMR (see Fig. 2), after electroporation the cells were collected and packed into glass tubes of 3 or 5 mm diameter until they filled the “NMR active” region of the tubes. This height was approximately 20 mm for the cryoprobe of our NMR spectrometers (Supplementary Fig. 1f).
The cells were packed by a gentle centrifugation step (300 × g for 2 min) that preserves >95% of cell viability (see Fig. 3). Because these cells are of different sizes, the number of Hek-293 and A2780 cells packed in these two fixed sample volumes was higher than for Cos7 and U2OS cells. In 5 mm tubes, for example, it was >2 times higher (Supplementary Fig. 1g). We found that, at a fixed sample volume, the smaller cells displayed higher levels of luciferase activity than Cos7 and U2OS cells (Supplementary Fig. 1h). This is presumably due to the higher surface-to-volume ratio of smaller cells, which would favor protein entry by electroporation. Thus, small cells are a better option for methods with a limited sample volume, such as NMR. These differences are lost, however, when luciferase activity levels are normalized by cell number.
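The surface-to-volume argument can be made concrete with a minimal sketch that treats cells as ideal spheres, for which the ratio is 6/d. The spherical shape and the use of the midpoints of the quoted size ranges (35 μm and 12.5 μm) are simplifying assumptions made here for illustration; they are not measurements from the study, and real packed cells deviate from this idealization.

```python
def sphere_surface_to_volume(diameter_um: float) -> float:
    """Surface-to-volume ratio (in 1/um) of an ideal sphere.

    Surface = pi * d^2 and volume = pi * d^3 / 6, so the ratio is 6 / d.
    """
    return 6.0 / diameter_um


# Assumed representative diameters: midpoints of the size ranges in the text.
LARGE_CELL_DIAMETER_UM = 35.0   # Cos7 and U2OS (30-40 um)
SMALL_CELL_DIAMETER_UM = 12.5   # Hek-293 and A2780 (10-15 um)

sv_large = sphere_surface_to_volume(LARGE_CELL_DIAMETER_UM)
sv_small = sphere_surface_to_volume(SMALL_CELL_DIAMETER_UM)

# How much more membrane area per unit cell volume the small cells offer.
sv_advantage = sv_small / sv_large

print(f"S/V large cells: {sv_large:.3f} per um")
print(f"S/V small cells: {sv_small:.3f} per um")
print(f"Small-cell advantage: {sv_advantage:.1f}-fold")
```

Under these assumptions the smaller cells expose roughly 2.8 times more membrane area per unit of packed cell volume, which is in line with the qualitative explanation given above for their higher luciferase activity at a fixed sample volume.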
Development of a transexpression method for highly sensitive in-cell NMR
The reporter system based on rGal4-VP16 allowed us to establish a highly efficient transexpression method for the analysis of proteins by in-cell NMR. The method consists of protein delivery by electroporation, followed by a washing step for the efficient removal of the non-internalized protein from the electroporated cells, a recovery phase in which cells are re-plated and dead cells are discarded, and a final step in which the cells are harvested and packed into the NMR tube (Fig. 2a).
Aiming at carrying out long (>16 h) in-cell NMR experiments, we investigated first how the temperature and the incubation time affect the viability of packed cells. To this end, mock-electroporated Hek-293 and A2780 cells were packed, incubated at different temperatures, and finally recovered at different time points to determine cell viability using the trypan blue exclusion test. We found that ∼ 95% of the cells that stayed in the tube for < 8 h were viable, regardless of the temperature used (Fig. 2b). For the two cell lines assayed, at 16 h and later time points, only cells incubated at 37 °C showed a small but significant reduction in cell viability, whereas after 24 h this effect was also observed in cells kept at 30 °C. For longer times, high viability was only observed at 10 °C and 25 °C, where ∼ 80% and ∼ 70% of the cells remained alive at 24 and 48 h, respectively.
We next analyzed the loss of plasma membrane integrity, an important parameter for in-cell NMR experiments, because plasma membrane leakage might result in the release of the transexpressed protein into the media during the NMR measurements. Using the mammalian protein α-synuclein (αSyn), which was delivered using the transexpression protocol described above, we confirmed that at 10 °C the majority of the electroporated protein remained in the cells even 48 h after electroporation (Fig. 2c and Supplementary Fig. 2c). In agreement with the aforementioned trypan blue test, higher temperatures led to a faster release of this protein into the extracellular media. Thus, the data indicate that, whereas high cell viability is obtained at 10 °C for both short and long incubation times, caution is needed when higher temperatures and incubation times longer than 16 h are used. In particular, incubation at 37 °C is limited to ≤ 8 h.
We next carried out in-cell NMR experiments on three proteins using the protocol described in Fig. 2a. We included recently published protocols19,20 for comparative purposes. The main differences between the protocol described in Fig. 2a and the previously published ones are the amounts of cells and recombinant protein needed for electroporation, the electroporation buffers, the electroporation parameters and the electroporation devices (see Materials and Methods). In this comparative experiment, however, the amounts of cells and recombinant protein were the same for the previously published protocols and the method described in Fig. 2a. The proteins of choice were the intrinsically disordered proteins (IDPs) αSyn, prothymosin-α (PTMA) and K18, a fragment of the tau protein21,22. IDPs were selected for the initial experiments because their fast tumbling usually results in sharp, strong NMR signals. These three proteins were produced in and purified from bacteria as 15N-labeled proteins, delivered into mammalian cells by the two electroporation protocols, and measured by two-dimensional [15N,1H]-HMQC NMR experiments. Compared to the previously published protocols, our optimized protocol yielded on average 3 times higher signal-to-noise ratios (Fig. 2d compared to Supplementary Fig. 2b, Supplementary Fig. 2a and c). We confirmed this by comparing two electroporation buffers and devices in two cell lines (Supplementary Fig. 2d). While the Nucleofector IIb (AMAXA) yields good transexpression efficiency with the buffer “R”, the Neon (Invitrogen) is superior when PBS is used. These differences might be due to the fact that the two devices use different electroporation units (cuvettes versus tips). Thus, we conclude that a higher efficiency of transexpression is obtained with the new protocol.
The NMR spectra also revealed that PTMA and K18 remain disordered inside mammalian cells (Fig. 2d), as previously shown13 for αSyn. Compared to the spectra obtained with the protein in buffer (Fig. 2d and Supplementary Fig. 2e, black), the in-cell [15N,1H]-HMQC spectra of the three proteins showed line broadening, signal attenuation, and chemical shift perturbation. For instance, quantifying the peak intensities of the spectra obtained in cells and in buffer revealed several regions of αSyn with peak attenuations in the in-cell spectrum, in addition to the previously reported N-terminal acetylation23,24. In αSyn, the N- and C-termini as well as the region around tyrosine 39 are strongly affected by the intracellular milieu (Supplementary Fig. 2f). These changes were recently attributed to transient interactions with cellular partners such as chaperones20. Likewise, the spectra of K18 tau displayed alterations in several NMR resonances in the C-terminal regions of exons 1-2, and in exon 3 (Supplementary Fig. 2f). These effects might be due to the interaction of K18 with microtubules and lipids21,22,25. In agreement with previous work20, the electroporated αSyn was found in association with lysosomes in normal cells, as shown here by its co-localization with the lysosome-specific dye Lysotracker (Fig. 2e). Likewise, K18 co-localized with microtubules25, as shown by double immunofluorescence using anti-tau and anti-β3-tubulin antibodies, and PTMA was found to be localized in the nucleus and associated with histone 2B, as previously reported26 (Fig. 2e). Altogether, the data indicate that the electroporated proteins reach different destinations within the cell, where they might play functional roles. Next, we analyzed the structural stability of the electroporated αSyn over time by carrying out in-cell NMR experiments at 10 °C, where cell survival and protein stability were maximal (Fig. 2b and Supplementary Fig. 2g, h).
No significant chemical shift changes were observed over 16 h while some signal decay was detected. The data indicated that the IDP structure of αSyn remains unaltered in the NMR tube for at least 16 h (Supplementary Fig. 2e, g, h), allowing long NMR measurements under these experimental conditions.
In-cell NMR of folded proteins at physiological conditions
Anecdotal observations indicated that transexpression often fails for folded proteins. To address this problem, we used our improved transexpression protocol to carry out in-cell NMR experiments on five different folded proteins. We selected the β1 immunoglobulin binding domain (GB1) and the third IgG-binding domain from streptococcal protein G (GB3), the PDZ2 domain of the human tyrosine phosphatase 1E (PDZ), phosphoglycerate kinase 1 (PGK1), and wild-type ubiquitin (Ub) because they represent a broad spectrum of folded proteins that includes non-mammalian proteins (GB1 and GB3), a protein with enzymatic activity (PGK1), a protein-protein interaction domain (PDZ), and a protein involved in signal transduction and post-translational modification of proteins (ubiquitin). To analyze the proteins under physiological conditions, most experiments were conducted at 37 °C as relatively short NMR experiments. PDZ, in contrast, was analyzed at 25 °C because at 37 °C the effect of the intracellular milieu on this protein was immediate on the time scale of the NMR experiment (< 1 h). With the transexpression protocol shown in Fig. 2a, we were able to obtain high-quality spectra for all five proteins, as shown in Fig. 3a. Overall, the in-cell [15N,1H]-HMQC spectrum of each of the five proteins is similar to its in vitro reference (Fig. 3a). Since the [15N,1H]-HMQC spectrum is a fingerprint of the protein structure, it can be concluded that all five proteins have the same overall 3D structure in cells as in vitro. However, compared to the in vitro reference, substantial line broadening, signal attenuation, and chemical shift perturbations of cross-peaks were observed in the [15N,1H]-HMQC spectra of the five proteins. Compared to the three IDPs analyzed above, the peak intensity alterations were much more pronounced and widespread in the folded proteins, as shown in Fig. 3a.
In all cases, the two spectra (in buffer and in cells) show similar cross-peak linewidths because the in-cell samples contain significantly more protein than the samples in buffer and because the signal-to-noise was adjusted to illustrate the superposition of the spectra. At similar protein concentrations, the in-cell spectra show substantial line broadening compared to those in buffer, as shown for GB3 in Supplementary Fig. 3a.
The [15N,1H]-HMQC fingerprints of all eight proteins studied suggest functional integrity of the transexpressed proteins. It could be, however, that a significant amount of the transexpressed protein is damaged in the electroporation step and that the NMR spectra collected originate from a small remaining soluble and functional fraction of the total delivered protein. We addressed this question by quantifying how much of the delivered protein is functional. To this end, the protein levels and transcriptional activity of the transexpressed rGal4-VP16 were determined and compared to those of Gal4-VP16 that is produced by the same host cells and is therefore fully functional. We first generated Cos7 cell clones stably transfected with pGal4-5XRE-eGFP. These cells were then either electroporated with rGal4-VP16 or transiently transfected with a mammalian expression vector encoding Gal4-VP16. Both cell populations were finally harvested, and the eGFP signal intensity as well as the protein levels of the transexpressed rGal4-VP16 and the episomally expressed Gal4-VP16 were determined in whole cell lysates. Protein quantification was carried out by a targeted proteomics approach called Parallel Reaction Monitoring (PRM)-mass spectrometry27,28, which was used to quantify three different peptides of this chimeric transcription factor as a surrogate for the total amount of this protein in these cells. We found a linear correlation between the amount of Gal4-VP16 expressed by the cells and the eGFP signal intensity, confirming that the vast majority of the protein in the cells is functional (Fig. 3b, c). Importantly, the protein levels and transcriptional activity of the transexpressed rGal4-VP16 fit the curves of the transcription factor produced by the cells, indicating that the electroporated protein is also functional.
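The comparison described above, in which the transexpressed protein is checked against the activity curve of the cell-produced protein, can be sketched as an ordinary least-squares line fit followed by a residual check. This is only an illustration of the reasoning, not the analysis pipeline used in the study: the function names are hypothetical and the numbers are arbitrary placeholders in arbitrary units, not data from Fig. 3b, c.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def relative_deviation(x, y, slope, intercept):
    """Relative distance of a point (x, y) from the fitted line."""
    predicted = slope * x + intercept
    return (y - predicted) / predicted


# Placeholder calibration points for the cell-produced transcription factor:
# protein level (e.g. from PRM) versus eGFP signal, both in arbitrary units.
expressed_protein = [1.0, 2.0, 3.0, 4.0]
expressed_egfp = [10.0, 20.0, 30.0, 40.0]

slope, intercept = fit_line(expressed_protein, expressed_egfp)

# Placeholder measurement for the electroporated (transexpressed) protein.
deviation = relative_deviation(2.5, 26.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, deviation={deviation:+.1%}")
```

A small relative deviation means that the transexpressed protein produces about as much reporter signal per unit of protein as the cell-produced protein, which is the sense in which the electroporated protein is judged functional.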
We next used hydrogen-deuterium (H/D) exchange experiments to investigate whether the transexpression process unfolds the electroporated protein. To this end, GB1 was expressed in bacteria as a 15N-labeled protein, then unfolded and incubated in D2O buffer to replace the exchangeable 1H protons by deuterium, and subsequently refolded. The protein was then transexpressed into mammalian cells by electroporation using a buffer prepared with PBS salts dissolved in D2O, and the H/D exchange was recorded by in-cell NMR experiments on the same sample 4 and 12 h thereafter. If the protein unfolded during electroporation, all amide deuterons would be expected to exchange fully back to protons before the NMR measurements, resulting in equivalent relative signal intensities at 4 and 12 h after accounting for protein loss over time. In buffer and in cells, most resonances of deuterated GB1 were already visible after 4 h, with some of them at the levels of non-deuterated GB1. This indicated a fast H/D exchange for the latter residues, which comprised threonine 17 (T17), glutamate 28 (E28), and glutamate 16 (E16), among others (Fig. 3d and Supplementary Fig. 3b, c). Of note, all these residues are solvent-exposed and thus expected to exchange fast. In contrast, some residues, such as threonines 19, 26, 45, and 54 (T19, T26, T45 and T54, respectively), displayed amides that had not exchanged fully either in vitro or in cells. These findings show that GB1 did not unfold during electroporation and was thus incorporated into the cells as a folded protein, answering the critical question of whether transexpression by electroporation harms the structural integrity of the delivered protein. As expected for longer incubation times, some of these amide moieties showed increased exchange after 12 h (Fig. 3d). Interestingly, H/D exchange appeared to be faster in cells than in vitro.
For instance, the intensities of the resonances corresponding to T19, T26, T45 and T54 were lower in vitro than in cells (Fig. 3d and Supplementary Fig. 3b, c). This is more evident upon normalization to the cross peaks of the fast-exchanging signals (i.e., E16, T17 and E28), which corrects for the time-related signal loss in cells at 12 h (Supplementary Fig. 3c). These data indicate that the protein structure is less stable in cells than in vitro, a phenomenon previously reported for ubiquitin9. This could result from chaperone interactions with GB1, which remain to be explored9.
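The normalization to fast-exchanging signals can be sketched as follows: the mean intensity of the fully exchanged reference peaks (E16, T17 and E28) serves as an internal standard, so that dividing every peak by it removes the overall loss of protein signal over time. The function name and the intensity values below are hypothetical placeholders for illustration, not values from Supplementary Fig. 3c.

```python
def normalize_to_reference(intensities, reference_residues):
    """Divide every peak intensity by the mean intensity of the reference peaks.

    intensities: mapping of residue label -> measured peak intensity.
    reference_residues: labels of fast-exchanging residues, assumed to be
    fully back-exchanged and therefore to report only on total protein signal.
    """
    reference = sum(intensities[r] for r in reference_residues) / len(reference_residues)
    return {residue: value / reference for residue, value in intensities.items()}


FAST_EXCHANGERS = ("E16", "T17", "E28")

# Placeholder peak intensities (arbitrary units) for one time point.
peaks = {"E16": 2.0, "T17": 2.0, "E28": 2.0, "T19": 1.0, "T26": 0.5}

normalized = normalize_to_reference(peaks, FAST_EXCHANGERS)
for residue, value in normalized.items():
    print(f"{residue}: {value:.2f}")
```

After this step, a slowly exchanging amide with a normalized value near 1 has fully back-exchanged, while lower values indicate amides that are still partly deuterated, so the in-cell and in vitro samples can be compared despite different amounts of protein.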
Structure determination in mammalian cells by NMR
Next, we wanted to test whether structure determination by in-cell NMR is also feasible in our system of mammalian cell lines, as documented for insect cells29 and for E. coli30. For this purpose, GB1 was selected because of its outstanding spectral properties and transexpressed in Hek-293 cells. These cells contained electroporated 13C,15N-labeled GB1 at a final concentration of 50 μM, estimated by comparison to a reference in vitro sample (Supplementary Fig. 4a). A 3D [15N,13C]-combined [1H,1H]-NOESY experiment with a mixing time of 200 ms was measured for 1 day at 10 °C to collect the NOE-derived distance restraints. The low temperature was used to preserve cell integrity (Fig. 2). Time-dependent spectral changes were minor, as indicated by a comparison of the [15N,1H]-HMQC spectra at various time points (Supplementary Fig. 4b). In addition, a 3D [15N,13C]-combined [1H,1H]-NOESY of 13C,15N-labeled GB1 at a concentration of 0.5 mM was measured in PBS buffer (pH 7.4) with the same setup. The sequential assignment and the distance restraints of this positive control were obtained starting from the available chemical shift list30. We transferred the NOESY assignment (including the sequential assignment) to the NOESY spectrum measured in cells. This required minor adjustments of the chemical shifts and was accompanied by the loss of many NOE cross peaks due to the 10-fold lower concentration, yielding 592 meaningful NOE-derived distance restraints (Fig. 4a), which is 678 distance restraints fewer than detected in the in vitro control sample. The loss of cross peaks attributed to the lower concentration is apparent in a comparison of Fig. 4e with Fig. 4f, which show the 15N-1H strips of residues D23-V30 in vitro and in cells, respectively.
Nevertheless, these NOE-derived distances were sufficient to determine the 3D structure (PDB 7QTR, BMRB 34700) with a backbone RMSD to the mean of 0.9 Å for residues 2–57 and an RMSD of 1.1 Å to the reference structure (Fig. 4a and Supplementary Table 1, PDB 2N9K). Also, the 25% most buried side chains superimpose well with the published in vitro structure, as shown in Fig. 4a–d. Only around the N-terminus does the structure not superimpose well with the reference structure and is less well defined, which is attributed to the low number of distance restraints in this area (Fig. 4a–d). The CYANA target function, which is a measure of experimental distance restraint violations31, has a small value of 1.44 Å², indicating that the experimental data are self-consistent (Supplementary Table 1).
It is evident that a concentration of 50 μM is well above the natural concentration of most mammalian proteins. Towards a more physiological situation, we prepared a set of 4 identical samples, each containing 10 μM 13C,15N-labeled GB1 transexpressed in Hek-293 cells (Supplementary Fig. 4a). For each sample, we measured the same NOESY experiment with a duration of 1 day at 10 °C. The individual spectra were summed, and the same analysis as described above was performed. This yielded 187 distance restraints, as shown in Fig. 4g. The structure calculation was only successful when torsion angle restraints were also used; these were derived by TALOS-N32 from the secondary chemical shifts, which could likewise be extracted from the [15N,13C]-combined [1H,1H]-NOESY spectrum. The structure is of low quality but still of atomic resolution, with an RMSD of 1.5 Å to the mean and 2.2 Å to the reference structure (Supplementary Table 1; PDB 7QTS, BMRB 34701). A comparison of this in-cell structure with the reference structure shows that it is less compact, which is attributed to the low number of restraints, which hold the structure together less tightly than required. In summary, for GB1 at near-physiological concentration, a 3D structure at atomic resolution with an accuracy of ca. 2 Å was determined, which shows the correct fold as well as the side chain arrangement of the buried core residues.
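The effect of summing the four 10 μM spectra, and the restraint counts quoted above, can be put side by side in a short worked calculation. The sketch assumes that the four samples are identical and that their noise is uncorrelated, so that the signal adds linearly while the noise adds as a square root; the in vitro restraint count is derived here from the two numbers given in the text (592 in-cell restraints, 678 fewer than in vitro) rather than quoted directly.

```python
import math


def snr_gain_from_summing(n_spectra: int) -> float:
    """Signal-to-noise gain from summing n spectra with uncorrelated noise.

    Signal grows as n, noise as sqrt(n), so the gain is sqrt(n).
    """
    return n_spectra / math.sqrt(n_spectra)


# Numbers taken from the text.
N_SAMPLES = 4
SAMPLE_CONCENTRATION_UM = 10.0
RESTRAINTS_IN_CELL_50UM = 592
RESTRAINTS_FEWER_THAN_IN_VITRO = 678
RESTRAINTS_IN_CELL_10UM = 187

# Derived quantities (assumptions stated in the lead-in).
gain = snr_gain_from_summing(N_SAMPLES)
equivalent_concentration_um = SAMPLE_CONCENTRATION_UM * gain
restraints_in_vitro = RESTRAINTS_IN_CELL_50UM + RESTRAINTS_FEWER_THAN_IN_VITRO

print(f"S/N gain from summing {N_SAMPLES} spectra: {gain:.1f}x")
print(f"Equivalent single-sample concentration: {equivalent_concentration_um:.0f} uM")
print(f"In vitro restraints (derived): {restraints_in_vitro}")
print(f"Retained at 50 uM: {RESTRAINTS_IN_CELL_50UM / restraints_in_vitro:.0%}")
print(f"Retained at 10 uM: {RESTRAINTS_IN_CELL_10UM / restraints_in_vitro:.0%}")
```

Under these assumptions, four summed 10 μM samples behave roughly like a single 20 μM sample measured for one day, which is consistent with the much smaller number of restraints recovered compared to the 50 μM sample.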