Physicochemical characterization of recombinant proteins is an important aspect of interdisciplinary research in the drug discovery field. The use of sophisticated physicochemical techniques allows us to describe and characterize modified proteins, which are the products of modern genetic engineering methods.
Because the biotechnology industry has developed rapidly in recent years, recombinant proteins have gained major significance in various fields, with much attention turning to the exploitation of proteins as both targets and therapeutic agents. Proteins produced by molecular biology techniques often have considerable advantages over native proteins purified from natural sources. Their special features include, for example, reduced immunogenicity and a high level of homogeneity. Notably, almost half of the new drugs recently approved by the United States Food and Drug Administration (FDA) are therapeutic proteins, and the drug-pipeline landscape is shifting even further toward this class of drugs [1].
Because the development of biopharmaceuticals and biosimilars is a fairly complex undertaking, regulatory bodies such as the FDA and the European Medicines Agency (EMA) require comprehensive drug substance characterization, including lot-to-lot and batch-to-batch comparisons, stability studies, impurity profiling, and protein aggregate elucidation [2,3].
Quality control of heterologously produced and subsequently purified recombinant proteins is the final and critical checkpoint of any protein production process. The successful downstream application of a recombinant protein depends heavily on its quality. Besides the influence of the production process, which is determined mostly by the host and the cell cultivation conditions, the quality of a recombinant protein product relies mainly on the purification procedure. The purification strategy must therefore be carefully designed on the basis of the information available at the molecular level. A poor purification design may result in misfolded or heterogeneous protein samples because of interference from sequence additions, such as tags or extra amino acids resulting from the cloning procedure, and/or the absence of refining steps in the purification procedure. Even when the protein shows the expected activity, characterizing the purified protein in some detail is a means to assess and ensure its quality and biological activity [4].
Following the production and purification of recombinant proteins, and apart from their biological activity, three aspects must be carefully evaluated: purity, homogeneity, and folding state. A number of physicochemical techniques are available for protein characterization. The most important of them, size-exclusion chromatography (SEC), dynamic light scattering (DLS), and differential scanning fluorimetry (DSF), are presented below with a focus on their main advantages and disadvantages.
SEC (Size Exclusion Chromatography)
One of the most common protein analyses after production is analytical size-exclusion chromatography (SEC), which is used to determine whether a protein has the proper oligomeric state, for example whether it exists in a monomeric form and maintains that structure throughout the manufacturing and formulation processes. The method is useful for assessing protein homogeneity (monodispersity) and the propensity for fragmentation and aggregation, as well as for quantifying non-covalent aggregates and non-aggregated protein populations on the basis of molecular size (hydrodynamic radius). One of the main advantages of SEC is that it is a fairly inexpensive technique with potentially high sample throughput that relies on a simple physical separation mechanism. SEC separates biomolecules according to their hydrodynamic radius. The stationary phase (column) consists of spherical porous particles with a carefully controlled pore size, into which the biomolecules diffuse to an extent that depends on their molecular size, with an aqueous buffer as the mobile phase. In essence, SEC is an entropically controlled separation process in which molecules are separated on the basis of molecular size differences rather than their chemical properties [1].
The elution order typically follows molecular mass: molecules with the highest molecular mass are eluted first because they are excluded from the pores. Smaller molecules, which are able to access the pores within the resin particles, permeate a larger accessible volume within the column and are eluted later.
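Because aggregates elute before the monomer, their relative amounts can be estimated from peak areas. The following is a minimal sketch of that idea, not a validated integration method: the chromatogram is synthetic (two Gaussian peaks), and the elution windows, peak positions and function names are all hypothetical choices made for illustration.

```python
import numpy as np

def gaussian(volume, center, width, height):
    """Gaussian peak, used here only to build a synthetic chromatogram."""
    return height * np.exp(-0.5 * ((volume - center) / width) ** 2)

def trapezoid_area(x, y):
    """Area under y(x) by the trapezoidal rule."""
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

def percent_area(volume, signal, window):
    """Share of the total peak area (in %) that falls inside an elution window."""
    low, high = window
    mask = (volume >= low) & (volume <= high)
    return 100.0 * trapezoid_area(volume[mask], signal[mask]) / trapezoid_area(volume, signal)

# Synthetic UV trace (assumed values): a small early-eluting aggregate peak
# at 9 mL and a large monomer peak at 13 mL.
volume = np.linspace(5.0, 20.0, 1501)  # elution volume, mL
signal = gaussian(volume, 9.0, 0.25, 5.0) + gaussian(volume, 13.0, 0.30, 80.0)

# Hypothetical integration windows; real windows depend on the column and sample.
aggregate_percent = percent_area(volume, signal, (7.0, 11.0))
monomer_percent = percent_area(volume, signal, (11.0, 15.0))

print(f"aggregates: {aggregate_percent:.1f} %, monomer: {monomer_percent:.1f} %")
```

With real data, baseline correction and properly resolved peaks are needed before area percentages are meaningful.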
Because the separation is size-based, a calibration curve can be constructed from a set of known analytes, typically proteins or polymers of known molecular mass, and used to estimate the molecular mass of an unknown analyte. Plotting log M against the retention volume typically yields a third-order polynomial curve whose linear region provides the highest resolution and molecular mass accuracy. Although SEC is widely used, it has some limitations. Because protein shapes vary (e.g., globular, rod-like or flexible chains), Stokes radii do not correlate exactly with molecular masses, which may lead to erroneous assessments of the oligomeric state. Another source of error in the calibration curve is non-ideal adsorption, which may alter the retention volume [1].
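The calibration step can be sketched as follows. This is an illustrative example only: the standards and their retention volumes are invented, and a straight-line fit of log M against retention volume is used on the assumption that all points lie in the linear region of the curve (a full calibration would use the third-order polynomial mentioned above).

```python
import numpy as np

# Hypothetical calibration standards: (molecular mass in kDa, retention volume in mL).
# The retention volumes are made up for illustration; measure them on your own column.
standards = [(670.0, 9.0), (158.0, 11.5), (44.0, 13.8), (17.0, 15.5)]

masses_kda = np.array([mass for mass, _ in standards])
volumes_ml = np.array([volume for _, volume in standards])

# Linear fit of log10(M) versus retention volume (linear region only).
slope, intercept = np.polyfit(volumes_ml, np.log10(masses_kda), 1)

def estimate_mass_kda(retention_volume_ml):
    """Estimate molecular mass (kDa) from a retention volume within the calibrated range."""
    if not volumes_ml.min() <= retention_volume_ml <= volumes_ml.max():
        raise ValueError("retention volume lies outside the calibrated range")
    return float(10 ** (slope * retention_volume_ml + intercept))

unknown_mass = estimate_mass_kda(12.5)
print(f"estimated mass at 12.5 mL: {unknown_mass:.0f} kDa")
```

The estimate is an apparent mass for a globular protein; as noted above, elongated or flexible proteins and non-ideal adsorption shift the retention volume and bias the result.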
SEC plays an important role in advancing the characterization of recombinant proteins, including biopharmaceuticals, and in reducing some of the risks associated with their use both as therapeutic agents and as crucial reagents in the drug discovery process.
DSF (Differential Scanning Fluorimetry)
A protein starts off in the cell as a long chain of amino acids. There are 20 standard types of natural amino acids (22 if selenocysteine and pyrrolysine are counted), and their order determines how the protein chain will fold. During folding, two types of structures usually form first: “alpha helices” and “beta sheets”. These two structures can interact to form more complex structures that allow proteins to perform their diverse jobs in the cell; for example, enzymes form shapes with pockets called “active sites” that are shaped to bind their substrates. Unfortunately, protein folding is a complex process that is prone to errors and can be affected by many factors, especially during a multi-stage purification process [5].
Differential scanning fluorimetry (DSF) is used to determine the folding state and thermal stability of purified proteins and to identify undesired aggregation. It is a fairly cost-effective and accessible method, and one of the highest-throughput biophysical techniques available to researchers involved in the characterization of recombinant proteins. The analysis can be performed in small volumes, using a relatively small amount of protein (~2 μg per reaction), in any quantitative PCR instrument with melt-curve protocol software [6].
In DSF analysis, a protein sample is heated in the presence of a fluorescent dye (to date, the most favored dye is SYPRO Orange) whose fluorescence changes upon binding to hydrophobic amino acids. As the protein unfolds on heating, the dye binds to the exposed hydrophobic core, which causes a significant increase in fluorescence emission. The maximal fluorescence signal is observed when the protein is completely unfolded; the signal then decreases because of massive protein aggregation and dye-protein dissociation (Figure 1A). The fluorescence signal is detected in real time, and the exposure of hydrophobic amino acids during unfolding of the target protein produces a characteristic pattern of fluorescence as a function of temperature (the melting curve). The inflection point of the fluorescence plot corresponds to the Boltzmann melting temperature (Tm), at which 50% of the target protein is unfolded (Figure 1B) [6,7]. Changes in Tm can be correlated with changes in protein stability (typically, the higher the Tm, the higher the protein stability).
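As a rough illustration of how Tm can be read from a melting curve, the sketch below locates the inflection point as the maximum of the first derivative of fluorescence with respect to temperature. The curve is synthetic, generated from a Boltzmann sigmoid with assumed parameters (Tm of 55 °C), and the function names are hypothetical; instrument software may instead fit the Boltzmann equation directly.

```python
import numpy as np

def boltzmann(temperature, f_min, f_max, tm, slope):
    """Boltzmann sigmoid describing the unfolding transition of a melt curve."""
    return f_min + (f_max - f_min) / (1.0 + np.exp((tm - temperature) / slope))

def tm_from_first_derivative(temperature, fluorescence):
    """Return the temperature at which dF/dT is maximal (the inflection point)."""
    d_fluorescence = np.gradient(fluorescence, temperature)
    return float(temperature[np.argmax(d_fluorescence)])

# Synthetic, noise-free melt curve (assumed values): 25-95 °C in 0.5 °C steps.
temperature = np.linspace(25.0, 95.0, 141)
fluorescence = boltzmann(temperature, f_min=1000.0, f_max=9000.0, tm=55.0, slope=2.0)

tm = tm_from_first_derivative(temperature, fluorescence)
print(f"Tm = {tm:.1f} °C")
```

Real traces are noisy and fall off after the maximum, so smoothing the data or restricting the analysis to the rising transition is usually needed before taking the derivative.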

Beyond the measurement of protein stability, the DSF method is being adopted for additional applications, such as the study of protein–ligand interactions in drug discovery and the optimization of protein crystallization conditions [7,8]. It should be noted, however, that DSF analysis has its limitations: the dye may interact with some sample buffer components (especially detergents), and DSF analysis is not possible for proteins with a relatively low percentage of hydrophobic amino acid residues. In such cases, the nanoDSF technique can be used as an alternative to determine protein melting temperatures. As in standard DSF, a protein in solution is exposed to a temperature gradient that causes it to unfold, but no dye is added. The measured intrinsic fluorescence of the protein comes mainly from the aromatic side chains of tyrosine (Tyr) and tryptophan (Trp) residues, which are usually located within the hydrophobic protein core. Upon denaturation these residues become exposed to the solvent and their fluorescence changes in intensity and emission wavelength. As before, the Tm of the tested proteins can be calculated, but with a dye-free approach (Figure 2) [9,10].

DLS (Dynamic Light Scattering)
Once the oligomeric and folding state of the protein has been assessed, one has to ensure that it is homogeneous. A protein sample is homogeneous if all molecules present have the same size and are fully folded in the native state, and if the sample is devoid of aggregates. The aggregation of therapeutic proteins is a fairly common and unwanted phenomenon that can lead to many problems during manufacturing, storage and delivery. In addition, because even small amounts of aggregates can cause immunogenicity in humans, it is of great importance to ensure that recombinantly produced biotherapeutics are aggregate-free. Since the separation and UV-based detection in SEC systems are in many cases insufficient to detect soluble protein aggregates in solution, dynamic light scattering (DLS) is often used for this type of analysis.
The DLS technique, also known as photon correlation spectroscopy, combines sensitivity, reliability and broad applicability. Because it is rapid and consumes little sample, DLS is a very convenient method for simultaneously determining the homogeneity of the proteins of interest, detecting soluble higher-order assemblies and aggregates, and studying protein interactions with other proteins, nucleic acids and other molecules (e.g. endotoxins) [11,12].
DLS measures Brownian motion, which is related to the size of the particles and is defined as “the random movement of particles in a liquid due to the bombardment by the molecules that surround them”. The particles in a liquid move randomly, and the velocity of the Brownian motion is described by the translational diffusion coefficient, which can be used to calculate the hydrodynamic radius (size), i.e. the radius of a sphere that would diffuse at the same rate as the molecule of interest. This is done by measuring, with an autocorrelator, the rate at which the intensity of the light scattered by the sample fluctuates [13]. The rate of fluctuation is directly related to the diffusion coefficient of the particles via a mathematical transform. The diffusion coefficient, in turn, may be converted to a measure of size known as the hydrodynamic radius rh via the Stokes-Einstein equation [14]:
$$ r_h = \frac{k_B T}{6 \pi \eta D_t} \qquad (1) $$
In Equation 1, kB is Boltzmann’s constant, T the absolute temperature, η the solution viscosity, and Dt the translational diffusion coefficient determined by DLS. The hydrodynamic radius rh represents the radius of a sphere with the measured Dt. Using this relationship between diffusion speed and size, the size of proteins can be determined.
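The conversion from diffusion coefficient to hydrodynamic radius described above, rh = kB·T / (6π·η·Dt), can be sketched in a few lines. The input values are assumptions chosen for illustration (a diffusion coefficient of 6.0e-11 m²/s, in the range typical of a mid-sized globular protein, in water at 20 °C); the function name is hypothetical.

```python
import math

BOLTZMANN_CONSTANT = 1.380649e-23  # J/K (exact value in the SI)

def hydrodynamic_radius_m(diffusion_coefficient_m2_s, temperature_k, viscosity_pa_s):
    """Stokes-Einstein relation: r_h = k_B * T / (6 * pi * eta * D_t), in metres."""
    if min(diffusion_coefficient_m2_s, temperature_k, viscosity_pa_s) <= 0:
        raise ValueError("all inputs must be positive")
    return BOLTZMANN_CONSTANT * temperature_k / (
        6.0 * math.pi * viscosity_pa_s * diffusion_coefficient_m2_s
    )

# Assumed example inputs: D_t = 6.0e-11 m^2/s, T = 293.15 K (20 °C),
# eta = 1.002e-3 Pa·s (approximate viscosity of water at 20 °C).
radius_m = hydrodynamic_radius_m(6.0e-11, 293.15, 1.002e-3)
print(f"r_h = {radius_m * 1e9:.2f} nm")
```

Because T and η both enter the formula directly, an error in either one carries straight through to the reported radius.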
It should be noted that DLS also has a number of limitations. For example, the measurements are very sensitive to temperature and solvent viscosity: the temperature must be kept constant and the solvent viscosity must be known for a reliable experiment [11]. Moreover, DLS is a low-resolution method that often cannot distinguish between similar quaternary structures (e.g. a monomer from a dimer); this distinction should be made with the analytical SEC method described above. However, the technique is very well suited to qualitative studies and can be performed over time and/or at different temperatures to test the stability of the protein preparation in different buffers.
Verifying protein quality and key biophysical properties is essential in taking proteins from the final purification step to their ultimate use (e.g. as therapeutic drugs). In addition, a fundamental requirement of good laboratory practice is that protein production be highly reproducible. Determining the robustness of the production/purification process and its capacity to reproducibly deliver samples of equivalent quality is therefore of paramount importance. The methods described here are thus essential and should be used routinely for quality control of each preparation of recombinantly produced biotherapeutics.
AUTHORS
Marta Kujda-Kruk
Senior Scientist I
Izabela Rajchel
Scientist I
LITERATURE
[1] Fekete Sz., Veuthey JL., Guillarme D., New trends in reversed-phase liquid chromatographic separations of therapeutic peptides and proteins: Theory and applications, Journal of Pharmaceutical and Biomedical Analysis 69 (2012) 9 – 27.
[2] Beck A., Reichert JM., Approval of the first biosimilar antibodies in Europe, mAbs 5 (2013) 621 – 623.
[3] Krull I.S., Rathore A., Wheat TE., Current Applications of UHPLC in Biotechnology Part I: Peptide mapping and amino acid analysis, LCGC North America 29 (2011) 838 – 848.
[4] Oliveira C., Domingues L., Guidelines to reach high-quality purified recombinant proteins, Applied Microbiology and Biotechnology 102 (2018) 81 – 92.
[5] https://sitn.hms.harvard.edu/flash/2010/issue65/
[6] Vivoli, M., Novak, H.R., Littlechild, J.A., Harmer, N.J. Determination of Protein-ligand Interactions Using Differential Scanning Fluorimetry. J. Vis. Exp. 91 (2014).
[7] Gao K., Oerleman R., Groves MR., Theory and applications of differential scanning fluorimetry in early-stage drug discovery, Biophysical Reviews 12 (2020) 85 – 104.
[8] Sun Ch., Li Y., Yates EA., Fernig DG., Simple DSF viewer: A tool to analyze and view differential scanning fluorimetry data for characterizing protein thermal stability and interactions, The Protein Society, (1) (2019) 19 – 27.
[9] Misetic V., Reiners O., Krauss U., Jaeger KE., nanoDSF Thermal Unfolding Analysis of Proteins Without Tryptophan Residues, NanoTemper Technologies GmbH
[10] Magnusson AO., Szekrenyi A., Joosten HJ., Finnigan J., Charnock S., and Fessner WD., nanoDSF as screening tool for enzyme libraries and biotechnology development, The FEBS Journal 286 (2019) 184 – 204.
[11] Stetefeld J., McKenna S.A., & Patel T.R., Dynamic light scattering: a practical guide and applications in biomedical sciences, Biophys Rev 8 (2016) 409 – 427.
[12] Minton AP., Recent applications of light scattering measurement in the biological and biopharmaceutical sciences, Anal Biochem 501 (2016) 4 – 22.
[13] Zetasizer Nano user manual. Malvern Panalytical
[14] Some D., Razinkov V., High-throughput Analytical Light Scattering for Protein Quality Control and Characterization, Chapter in Methods in molecular biology (Clifton, N.J.) 2019.