Assessment of Interaction of Human OCT 1-3 Proteins and Metformin Using Silico Analyses

Metformin, a drug frequently used by diabetic patients as the first-line treatment worldwide, is positively charged and is transported into the cell through human organic cation transporter (hOCT 1-3) proteins. We aimed to mimic the cellular uptake of metformin by hOCT1-3 with various bioinformatics methods and tools. 3D structure of OCT1-3 proteins was predicted by considering the structures and function of these proteins. We predicted functional regions (active and ligand binding sites) of OCT1-3 and performed comparative bioinformatics analysis. The predicted structure of hOCT1-3 was then analyzed in the Blind Docking server and the results were confirmed with predicted binding site residues and conserved domain regions. We simulated the OCT1-3 and metformin docking and also validated the docking procedure with other substrates of HOCT1-3 proteins. We selected the best poses of metformin docking simulations as per binding energy (–5.27 to –4.60 kcal/mol). Lastly, we validated the static description of protein-ligand (OCT-Metformin) interactions by performing molecular dynamics simulation. Eventually, we obtained stable simulation of OCT-metformin interaction.


Introduction
Membrane proteins are associated with the prominent functions in the cell, approximately responsible for 30% genes in the human genome 1 and currently possessing 50% of pharmaceutical drug discovery. 2 Membrane proteins perform a broad variety of particular roles during cellular events. 3 Due to its structural and physicochemical properties, the plasma membrane has a selective permeability for organic and inorganic substances including cation and anion compounds. Hence, it assists to sustain the unique content both inside and outside of the cell.
One of the protein families that provide translocation of cationic organic and inorganic compounds that localized in the cell membrane is SLC (Solute carrier) family from the MFS superfamily. The SLC family is a 22-membered cell membrane transporter. A subfamily of the SLC family is SLC22A1,2 and 3 (cd17379: MFS_SLC22A1_2_3). 4 Besides many essential cation molecules for the cell, the SLC transporters are the target of drugs with high pharmacological value. The human body constitutes more than 400 important SLC transporters for a broad range of tasks including drug metabolism as absorption, distribution, and excretion. Hence, there is a growing interest in the effects of the drug on the development and progression of interactions with these transporters. 5 Metformin, categorized as an anti-diabetic medication, is uptaken by the cell via SLC transporter proteins encoded by SLC22A1-3 (also named OCT1-3) genes. 6 OCT1-3 membrane proteins are expressed at different levels in several tissues, to name a few, the renal expression level of OCT-2 is high, whereas OCT-3 is most commonly expressed in skeletal muscle, 7 and OCT-1 is expressed primarily in hepatocytes. 8 Metformin is hydrophilic (logD -6.13 pH 6.0) and its pKa (physiological pH) is 12.4. 9 Functional elimination of OCT-1 in pri-mary mouse hepatocyte culture and OCT-1 has been demonstrated to play an important role in metformin response in vivo mouse model. 10 Following the entry into the cell via HOCT1-3, metformin exhibits its anti-diabetic properties in several ways. It is broadly believed that the blood-glucose-lowering impact of metformin is mediated chiefly through the repression of hepatic glucose production by decreasing gluconeogenesis and blocking gluca-gon-mediated signaling in the liver. 11,12 Furthermore, some mechanisms have been suggested that metformin can activate AMPK (AMP-activated protein kinase) by the upstream liver kinase B1, 11 enhanced AMP/ATP rate hereby the restraint of mitochondrial respiratory chain complex I. 13 Metformin improves the activity of the insulin receptor and IRS-2 (insulin receptor substrate 2) and boosts glucose uptake through enhanced translocation of Figure 1. The identification of the molecular modeling of OCT1-3 and metformin. There are four main steps in the workflow. The first step is the prediction of the 3D structure of OCT1-3 proteins and quality control of the model protein structures. The second one is the identification of the template proteins with the VAST. The third one is the analysis of the sequence data by Jalview 2.11.0 while the fourth is the molecular docking by Achilles Blind Docking Server.
Akçeşme et al.: Assessment of Interaction of Human OCT 1-3 ... glucose transporters, such as GLUT-1, to the plasma membrane. 14 In the current study, we aimed to predict the three-dimensional structure of human OCT1-3 using various computational approaches through consideration of the current authenticated/trusted bioinformatics tools. The molecular docking was also performed under the inspiration of the in vitro and in vivo foundings. We also performed Molecular dynamics (MD) simulation of metformin and hOCT1-3 dockings at the atomic level for validation. The computational approaches to OCT1-3 proteins, the particular structure prediction of the proteins and the simulations using MD, have been broadly implemented for investigating their dynamic actions.
Given the pharmacological importance of SLC22A1,2 and 3 proteins in humans, determination of the structure of these proteins, the estimation of their active sites, and the definition of how the transport mechanism works have aroused great interest. In the present study, we have reported and defined the ligand-dependent interactions of hOCT1-3 with metformin and the other ligands utilizing computational approaches and explored the found interactions through comparative analysis, homology modeling, and molecular dynamic studies. This study is the first attempt to demonstrate OCT1-3 interaction with metformin. This interaction has characterized by docking analysis and the results were validated with MD simulations. This is an important study that uses the predicted structure of OCT proteins to stimulate the interaction which stays highly stable throughout the MD analysis.

1. 1. Prediction of Secondary and Tertiary Structure of OCT1-3 Proteins
We retrieved OCT1, 2, and 3 (Accession no: AAI26365.1, NP_003049.2, and NP_068812.1, respectively) from GenBank in FASTA format, predicted the secondary structure of the proteins using JPred4,1 5 which is the latest version of the JPred online prediction server supplying by the JNet algorithm.
Each of OCT1-3 protein structures was predicted on PHYRE 2 , 16 Robetta 17,18 and I-TASSER 19,21 (protein structure prediction servers). In these prediction tools, homology modeling (or comparative modeling) was used to compare experimentally determined proteins as templates.
To control the quality of the model proteins, we performed the local structural quality of transmembrane protein models analysis using (QMEANBrane) 22 and ProSAweb. 23 We compared each of the obtained models to all PDB proteins in MMDB (Molecular Modeling Database) to find 3D similar structures in VAST (Vector Alignment Search Tool). 24 The VAST analysis was contributed to the following proteins with the highest scores, PDB Id: 4zw9_A, 4zwc_A, and 5c65_A. Each of the OCT1-3 protein structures obtained from the Robetta server was then selected as a model since its neighboring proteins have the highest VAST scores and %Ids, compared to the other prediction tools (See Fig1).
The Computed Atlas of Surface Topography of proteins (CASTp) 25 3.0 was utilized to predict the surface of the binding pocket of the model proteins for the interaction with their substrates.

1. Sequence Analysis
We performed a pairwise in BLASTP 26 and multiple sequence alignment in Clustal OMEGA 27 for each of the OCT1, 2, and 3 proteins with the selected template proteins (4zw9, 4zwc, and 5c65) obtained from the VAST. Parameters for the alignment with the Clustal OMEGA were set as -GAPEXT :0.1, ENDGAPS: 0.5, GAPDİST: 1, GA-POPEN:10, and MATRIX: BLOSUM62. We analyzed and interpreted the results in the Jalview 2.11. 28 To analyze the feature of the sequences, functional annotations of template proteins were retrieved from the PDBe-KB database 29 , followed by the comparison of these proteins with the model proteins to identify the conserved regions-the sequence features. In this way, we assigned the predicted functional sites, predicted PTM sites, and predicted ligand binding sites, and ligand binding sites, and interaction interfaces for our model proteins.

1. 3. Visualization
The visualization of the primary and secondary structures of the proteins was performed using the Jalview 2.11. The PyMOL 30 software was utilized to represent and analyze the atomic structure of proteins.

1. 4. Molecular Docking Simulations
One of the most essential steps in this study is the molecular simulation as given in the workflow in Fig1. For the preparation of the docking process, hOct1-3 proteins were downloaded from the Robetta server in PDB format. All ligands (Metformin, Phenformin, and Norepinephrine) of hOct1-3 were retrieved within the SDF format from PubChem. 31 We removed the water, added the polar hydrogen to the model proteins. Then, we charged the model proteins and the ligands by the computation of Gasteiger before converting to pdbqt format using the AutoDock Vina. 32 Prior to the docking process, the 3D model protein structure was subjected to the energy minimization method in chimera 1.14 33 with the default parameters: the steepest descent:100 with 0.02 step sizes, without fixing any atoms, followed by 10 steps of conjugate gradient steps with 0.02 step size (Å) minimization.

1. 5. Molecular Dynamics (MD) Simulations
To evaluate the structural constancy and validate the static description of the protein-ligand (hOCT-metformin) interactions, we ran an MD simulation using the Desmond Software. 35 The dynamic nature of protein-ligand interactions has been studied and atomic-level interactions were investigated.

1. Alignments
We aligned a range of 146-445 aa of OCT-1 with 85-397 aa of 4ZW9 and 4ZWC as explained previously. The OCT-1 sequence has shown a 22.77% sequence identity with 4zw9 and 4zwc. We aligned a range of 146-540 aa of OCT-1 with of 63-477 aa of 5c65 by 22.51% of identity.
In the case of OCT-2, we aligned a range of 24-546 aa of OCT-2 with of 93-510 aa of 4ZW9 and 4ZWC, and 71-438 aa of 5c65 by the same percentage of identity ( 26.57% ).  In OCT-3, we aligned a range 60-353 aa of OCT-3 with of 86-513 aa of 4zw9 and 4zwc by 24.53% of identity, whereas we aligned a range of 84-353 aa of OCT-3 with of 64-473 aa of 5c65 by 24.53% of identity.

2. Analysis of Conserved Domain and Sequence Features of OCT1-3
After subjecting OCT1-3 and template proteins sequences to multiple alignments, we detected the conserved regions through the comparative analysis in the Jalview 2.11.0 (See Fig2). We summarized the results of the comparative analysis as a list in Table1.
The cellular and biological functions of a protein are highly related to its 3D structure. The pharmacodynamics of a drug on the cell decreases or has no effect if the functional parts of these proteins are mutated in the genome. On the other hand, defining protein-ligand binding sites and explaining functional parts of the protein are critical approaches for drug discovery. 36 Regarding the pharmaco-  Recent studies and meta-analyses have shown that patients with T2DM have a lower incidence of tumor development than healthy controls and cancer patient that use metformin has a lower risk of mortality. 37 Metformin takes more attention after the discovery of its role in cancer prevention and treatment has been revealed. Improving or managing cellular uptake of therapeutic entities is mostly related to the understanding of the molecular mechanism of the interaction with the components of the cell membrane and therapeutic entities. This paper aimed to predict the 3D structure of OCT1-3 protein and identify its role in the uptake of metformin into the cells that have been studied by in vitro and in vivo studies previously. 38,39 Sequence and structure analysis of proteins of unknown function with those of proteins of known function enable us to discover and deduce the function of the unknown proteins. Characterization of protein function by in vivo and in vitro studies is both time and labor-consuming. Furthermore, some proteins, especially membrane proteins are exceedingly difficult to be crystallized by experimental tools. In the modern genomic and proteomic era, a protein is mostly identified before its function is determined, therefore the role of in silico studies in structural analyses of proteins becomes more important in recent years.
The structure of OCT1-3 proteins has not been uncovered yet by any experimental tools although some of the protein`s structures have already known in the same protein family. This paper is important for being the first attempt to study and predict the 3D structure of OCTs to reveal the information about how these proteins facilitate the uptake of metformin into the cells. Even though our analysis indicates no significant similarity between OCTs and the proteins of the database at a sequence level, the predicted OCTs are similar with its conservative regions to some carrier proteins that share a similar function.
It is known that 30 percent of all sequences are membrane proteins. Unlike globular proteins, a 3D model for membrane proteins can hardly be computed. Another important aspect of this paper is presenting a new pipeline to stimulate the docking of protein molecules in the absence of a similar sequence in the database. The recent algorithms in 3D structure prediction of proteins enable us to predict the structure of proteins in high accuracy even in the absence of sequence similarity. In silico analyses helped us to stimulate this biological process and propose the uptake of metformin by OCTs as it is shown in Fig 3-5.
Dakal et al. modeled the 3D structures of hOCTs by only one tool-I-TASSER in 2017. 40 In Fig 1, four key steps of this pipeline have been shown as the workflow. One of the very critical points, the prediction of the 3D structure of the protein, was performed by three different tools; Iterative Threading ASSEmbly Refinement, Phyre2 that uses protein homology, and Robetta. The output model proteins were then exposed to all proteins in the PDB by the calculation in the VAST. This approach is reflected in our results through an elevation in the accuracy in the protein structure prediction. We were eager to increase the accuracy of the prediction through the validation of these structures using the experimentally determined proteins as templates. After obtaining the structure of the OCTs, the orientation of these molecules in the plasma membrane was predicted using the QMEANBrane scoring function.
Transmembrane proteins play vital roles in a diverse range of essentially biological processes. Knowing about the protein position within the lipid bilayer is important and requires a computational approach, since identifying the correct orientation is possible by defining the relationship between sequence, structure, and the lipid environment. One of the commonly used tools to localize the structure of proteins within the lipid player by knowledge-based statistical potential, QMEANBrane was used and the predicted position as exhibited in Fig 3-5. As a result, all model proteins are within the expected range of transmembrane structures.
Models obtained from the other tools were determined to be inapplicable for the docking process. The Robetta is continuously evaluated with CAMEO (Continuous Automated Model EvaluatiOn), which constantly assesses the accuracy and reliability of the prediction. Among other prediction tools, CAMEO, Robetta, and QMEANBrane are the first-line with time-based statistical confidence and they show reliable performance. We also used the ProSAweb to verify the quality of the model protein structures. The Z-score designates the entire model-quality for OCT1-3, (Z-score:-8.59, -7.04, and -5.95, respectively) as shown in Figure 6.
To analyze sequence features, functional annotations of template proteins were retrieved from the PDBe-KB database. The recently released database, PDBe-KB, give us a great opportunity to analyze and visualize sequence features of the similar proteins that are used as the template to assign a novel function to our sequence of interest. Even though the sequence similarity is low, as shown in the results, there are significantly conserved regions. In this way, we assigned the predicted functional sites, predicted PTM sites, and predicted ligand binding sites, and ligand binding sites, and interaction interfaces to OCTs.
Representation of molecular modeling of OCTs and metformin was performed using the Blind Docking server. The server mainly utilizes a customized version of Autodock Vina 32 for the blind docking calculations. We obtained binding energy plots, and, in this way, the most energetically favorable dockings have been selected as the first best pose according to binding energy frequencies (See supplementary Fig1). Taking into account the model protein uncertainty as well as the small size of the metformin molecule, it is not surprising that many different ligands pose with similar scores. To cope with this, we used the CASTp bioinformatics tool and compared the predicted active sites of model proteins with the first best poses as the docking results. Interestingly, both output results from two servers were similar. Besides, for the results to be more meaningful, the pharmacologically important phenformin from metformin analogs was validated by the docking study of OCT-1 and OCT-2, whereas OCT-3 by the norepinephrine compound. We also combined these outputs with outcomes from exported functional annotations of the template proteins from PDBe-KB. We visualized the interaction of metformin, phenformin, and norepinephrine, and OCTs in PyMOL to better examine the poses and extract our images.
As listed in Table 1, OCT-1 forms hydrogen bonds with docked ligand molecules with the residue number of PRO481, ARG488, and GLN152, and ASN156, and GLN362. The other four residues in the predicted site (LYS214, TRP354, and ASP357, and ILE446) interacted by hydrophobic and salt-bridge bonds. Chen et al. 41 have re-ported that OCT-1 interacts with its ligands by hydrogen binding and non-covalent interaction through the ASP357, TRP354, and ASN156, and ILE446 residues among their predicted residues. OCT-2 interacted with both metformin and phenformin through ASN157, CYS474, and ASP475 residues with noncovalent interactions such as hydrogen bond, salt bridge, and hydrophobic interaction (See Figure 4). OCT-3 protein contacts with norepinephrine and metformin in the same residue (ASN162 and ARG212) by hydrogen bonds. Given the extensive hydrogen bonding motif of metformin, water may be involved. This may significantly impact and alter the results and conclusions. Thus, we have considered performing the classic MD simulation for the docked complexes.
One of the residues that OCT-1 interacts with phenformin is GLN152 but OCT-3 interacts with norepinephrine through GLN158 as the same residue. The difference in the number of the residue is due to the setting of the sequenced reference. Our results suggest that human OCT proteins are predominantly expressed in different tissues of the human body and the active binding sites of these proteins also vary.
Although the methodology, including template definition, comparative protein modeling, and structure analysis, and molecular docking, seems pretty standard and employed in hundreds of research projects as in our workflow, there is a validation such as the quality control of the model proteins using Web services at almost every stage to increase reliability in achieving and evaluating meaningful results. Thus, the described pipeline is highly useful due to its ability to integrate the ligand-binding site and interaction interface information that is obtained from the PD-Be-KB database to the information that is derived from similarity analysis and prediction tools. This pipeline is also promising to assign a function to predict the 3D structure even in the absence of any sequence similarity.

MD Simulations
Root mean square deviation (RMD) of protein and ligand was calculated during the MDS concerning their initial structure. RMSD of the OCTs shows its stable conformation throughout the simulation which indicates the stability of the interaction with metformin and phenformin. Besides OCT3 was stable with norepinephrine throughout the simulation.
Each OCT proteins attained equilibrium in a few nsec and remain stable throughout the simulation time up to 100 nces. Initially, the RMSD plot for metformin attained equilibrium in a few nsec as well and remain stable throughout to stimulation. Some deviations observed but no bigger changes of the order of 1-3 Å are seen in our analysis. Similar RMSD scores were recorded with the phenformin interactions as well. OCT1-3 interactions with the ligands were monitored throughout the simulation. The interactions are rich with H bonds these play significant roles in ligand binding.
Even though the involvement of human OCT in the uptake/transport of metformin is already mentioned in the literature, this study is demonstrating the hOCTs-metformin interaction at the atomic level for the first time and describing how and where the binding occurs. Mimicking this binding with the absence of the structural information of the protein was possible with the unique approach that was described in the pipeline in Figure 1.

Conclusion
The three-dimensional structure of a protein is a direct association with its comprehensive cellular and biological function. To investigate the tertiary atomic structure of OCT1-3 proteins and their localization in the cell membrane, it is significant to evaluate the pharmacodynamics of metformin, frequently preferred in Type 2 Diabetes Mellitus medication, through the determination of the residues which interact with metformin in the case of cell translocation of these proteins. One of the important limitations of mimicking protein-ligand interactions is the absence of the protein 3D structure. The presented new pipeline is promising especially for the interaction simulation studies that are conducted with proteins with unknown structures. To determine the therapeutic effect of Metformin or other life-saving drugs into the cells, further studies are needed to examine genetic variants of human OCTs in specific patient populations. Analyzing insertions, deletions, and other genetic variants effects on hOCTs in structure level are important to explore the role of these proteins in metformin pharmacokinetics and response. Our study could be a front preparation and inspiration for future positively charged drug discoveries and development by examining the atomic level of OCT proteins.