Assessment of the Interaction of Aggregatin Protein with Amyloid-Beta (Aβ) at the Molecular Level via In Silico Analysis.

Alzheimer's disease is a major neurodegenerative illness whose prevalence is increasing worldwide, but its molecular mechanism remains unclear. There is scientific evidence that the molecular complexity of Alzheimer's pathophysiology is associated with the formation of extracellular amyloid-beta plaques in the brain. A novel cross-phenotype association analysis of imaging genetics reported a brain atrophy susceptibility gene, FAM222A. The protein encoded by FAM222A, Aggregatin, interacts with the amyloid-beta (Aβ) peptide (1-42) through its N-terminal Aβ binding domain and facilitates Aβ aggregation. The function of Aggregatin is unknown, and its three-dimensional structure has not yet been determined experimentally. Our goal was to investigate the interaction of Aggregatin with Aβ in detail by in silico analysis, including prediction of the 3D structure of Aggregatin by homology modeling. Our analysis supported an interaction between the C-terminal domain of the model protein and the N-terminal domain of Aβ. This is the first attempt to demonstrate the interaction of Aggregatin with Aβ. These results are consistent with in vitro and in vivo reports that FAM222A facilitates the aggregation of the Aβ peptide.


Introduction
Alzheimer's disease (AD) is a widespread and incurable disorder whose prevalence has risen with the increase in average human lifespan in recent years. Although a longer lifespan and aging are the main described risk factors, classic Mendelian inheritance with an autosomal dominant pattern and multifactorial features have also been suggested as possible risk factors. The growing number of AD patients has become a social and economic problem, as an AD patient becomes dependent on another person. 1,2 The global number of people with AD dementia was estimated at 43.8 million, 3 with over 5 million in the USA. By the middle of the century, the number of people with Alzheimer's dementia in the USA may rise dramatically to 13.8 million. 4 The neuropathological hallmarks of AD, which in turn lead to dementia, are neurofibrillary tangles inside the cell and deposits of amyloid-beta (Aβ) peptides outside the cell, resulting in cerebral atrophy. 5 One of these main pathological characteristics, Aβ aggregation, results from oligomeric Aβ produced by aberrant proteolytic processing of the amyloid precursor protein (APP), which finally aggregates into Aβ fibrils and plaques as an extracellular lesion. 6 The exact function of APP is unclear; however, its role in cell growth and in biological activities including signal transduction and neuronal development has been shown in several studies. 7 Understanding the origin of Aβ pathology depends on working out how the monomers that build up Aβ aggregates are formed and how oligomeric clusters form the lesions. Many aberrant peptides formed by proteolytic processing of APP might be the major basis of neuronal dysfunction in AD. These peptides are mostly encountered in the hippocampal region of the brain. 8
The cleavage of APP may proceed through two alternative pathways, non-amyloidogenic and amyloidogenic. In the non-amyloidogenic pathway, APP is processed by α-secretase and γ-secretase, producing soluble peptides, whereas in the amyloidogenic pathway it yields insoluble amyloid-beta peptides that aggregate. 6,9 Further investigations are required to clarify the mechanisms by which these peptides aggregate after proteolytic processing in AD.
The amino acid (aa) sequence of Aβ42 from amyloid plaques was first determined in the 1980s. 10 Aβ is commonly thought to be intrinsically unstructured and therefore cannot be crystallized by standard techniques. Hence, various studies have focused on optimizing conditions that preserve the structure of Aβ peptides. The 3D structures of various Aβ peptides have been determined experimentally by nuclear magnetic resonance (NMR; PDB: 1AMC, 1AMB, 1BA4, 1IYT, 1QWP, 1Z0Q) and X-ray crystallography (PDB: 2Y29, 4M1C, 4MVI, 4MVK, 4MVL). Not surprisingly, most of the information about the structure of Aβ has come from NMR and molecular dynamics. 9 NMR models of the Aβ (1-28) peptide indicated that a conversion of α-helix into β-sheet takes place during the early stages of amyloid deposition in AD. 10 The Aβ peptide is the main component of the amyloid plaques in AD, and the side chains of histidine-13 and lysine-16 lie on the same face of the helix. In addition, the Aβ (1-40) peptide adopts an α-helical structure under physiological conditions, whereas the amyloid fibrils formed by these peptides adopt β-sheet structures. This structural transition from α-helix to β-sheet is considered the critical step in aggregation. 9 Most in vivo and in vitro studies carried out so far have focused on elucidating the molecular complexity of AD with respect to the accumulation of amyloid-beta and the hyperphosphorylated microtubule protein tau. In a very recent study, Yan et al. reported that the product of a brain atrophy susceptibility gene, FAM222A (Family 222 member A), accumulates in amyloid deposits and interacts with amyloid-β (Aβ) via its N-terminal Aβ binding domain. The encoded protein, termed Aggregatin, was detected mainly in the brain and spinal cord of the central nervous system (CNS) using a specific antibody in vitro.
Aggregatin is 452 aa long, and its function is still uncertain. In that study, the authors showed that accumulated FAM222A interacts physically with amyloid-β via its N-terminal Aβ binding domain. It is one of the most critical studies performed in patients with AD and in an AD mouse model, and it sheds light on the pathophysiology of AD. 5 In the current study, we aimed to predict the three-dimensional structure of Aggregatin using several approaches, taking into account the findings that show the interaction of the two molecules. Molecular docking was also performed, motivated by this finding. We used homology modeling to obtain model proteins and then performed a protein-peptide docking study with several approaches. Protein structure prediction by homology modeling has been applied broadly to folded proteins, whereas its application to misfolded proteins and aggregates is limited. As summarized in Figure 1, we used well-established bioinformatics tools to predict the three-dimensional structure of Aggregatin. To increase the accuracy and docking performance of the model protein, we subjected the models from each bioinformatics server to two different structural-quality analysis tools and selected the higher-quality one. In addition, we performed domain analysis on the primary structure of Aggregatin and predicted which of its amino acids are related to localization in the plasma membrane. We also searched for genes that are functionally similar to FAM222A.
More mechanistic studies are necessary to obtain sufficient information about the 3D structure of the FAM222A gene product and to characterize its association with Alzheimer's pathology. Considering the critical roles of Aggregatin, it is essential to determine its physicochemical characteristics at the atomic level. Our computational results will broaden our knowledge of the pathogenesis of AD and help to clarify whether this candidate protein plays a critical role in amyloidosis.

1. Prediction of the 3D Structure of Aggregatin by Homology Modeling
The entire workflow is summarized in Fig. 1. The protein sequences encoded by human FAM222A (Reference Sequence: NP_116218.2) and FAM222B (Family 222 member B) (UniProtKB/Swiss-Prot: Q8WU58.1) were fetched from NCBI in FASTA format. PSIPRED 11 was used to predict the secondary structure of Aggregatin. The amino acid sequence was submitted to I-TASSER, 12 PHYRE2, 13 and Robetta, 14,15 common tools for homology-based modeling, to obtain a tertiary structure model of Aggregatin. The Qualitative Model Energy Analysis (QMEANDisCo) 16 and ProSA-web 17 tools were used for quality control of the model proteins.

Sequence Analysis
PSI-BLAST (Position-Specific Iterated BLAST) 18 was performed for pairwise sequence alignment. All general and scoring parameters, including the BLOSUM62 matrix and the threshold value (0.005), were left at their default settings. Domains in the primary structure of the sequence were predicted using DomPred 19 (Protein Domain Prediction), and the outputs were interpreted in Jalview 2.11.0. 20

3. Visualization of Molecular Docking
The primary structure of Aggregatin was visualized using Jalview 2.11, and PyMOL 21 was used to represent the tertiary structures of the protein and peptides and to analyze the docking results at the atomic level. The NMR monomer structures of amyloid-beta peptide (1-28 Aβ) (PDB ID: 1AMB) and (1-42 Aβ) (PDB ID: 1IYT) were retrieved from the Protein Data Bank (PDB) at http://www.rcsb.org/ for the protein-peptide interaction study. Fig. 3 was retrieved from the Protein Data Bank in Europe (PDBe), available at https://www.ebi.ac.uk/pdbe/, to visualize the N-terminal and C-terminal domains of the amyloid-beta 42 peptide. All docking runs were performed on the InterEvDock2 22-24 online server using FRODOCK2 25 and SOAP_PP. 26

1. Domain Prediction Analysis
In the domain prediction analysis with DomPred, the segments predicted to localize in the transmembrane region were identified as stretches of hydrophobic amino acids on the Kyte-Doolittle scale. According to the primary structure of Aggregatin, residues 147-299 form a proline-rich region that is bound to the membrane in a helical form, and the remaining part extends outside the cell. In addition, the N-terminal part is in the cytoplasm, whereas the C-terminus is outside the cell, and residues 244-259, the pore-lining segment, lie in the membrane (see Fig. 2).
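As an illustration of how a Kyte-Doolittle hydropathy scan flags candidate membrane-spanning segments, a minimal Python sketch is given here. It is not the DomPred procedure used in this study. The function names, the 19-residue window, the 1.6 cutoff (a commonly used convention for this scale), and the toy sequence are assumptions chosen for illustration only.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return the mean hydropathy of every full window along seq."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    scores = [KD[aa] for aa in seq]  # KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_windows(seq, window=19, threshold=1.6):
    """Return 1-based start positions of windows with mean above threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, mean in enumerate(profile) if mean > threshold]


# Hypothetical toy sequence (NOT Aggregatin): a 19-residue hydrophobic
# stretch flanked by charged residues.
toy = "DDKKRR" + "LLIVALLIVALLIVALLIV" + "DDKKRR"
print(hydrophobic_windows(toy))
```

Windows that start inside the charged flanks score lower than the window that covers the hydrophobic stretch exactly, which is the behavior a transmembrane scan relies on.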
We used the GeneMANIA 29 online server to search whether FAM222A is functionally related to any other gene and found that FAM222A shares a domain with FAM222B. Thus, we also performed domain prediction on the FAM222B protein sequence, which we think may provide a clue for in vitro studies (see Fig. 3). FAM222A and FAM222B are localized in the nucleoplasm, share the same domain, and belong to the same protein family, and their blastp alignment shows 49.04% identity. Although FAM222B has been confirmed in vitro to be localized only in the nucleoplasm, its transmembrane and extracellular parts, glycine-rich region, and signal peptide were predicted to be in the cell membrane (not shown).

2. The PDB sequence viewer of Amyloid-Beta 42
To determine which residues belong to the N- or C-terminal domain, and thus to interpret the interaction between the Aβ 42 peptide and Aggregatin at the atomic level, we used the Aβ 42 sequence chart retrieved from PDBe.
According to this chart, which combines data from several databases such as CATH 31 and SCOPe 32, the N-terminal domain of the Aβ 42 sequence comprises the first 27 residues (see Fig. 4).

The Selection of Model Aggregatin Protein
The QMEAN tool was used to check the quality of the model proteins before docking and to compare the accuracy of the three bioinformatics tools used to model Aggregatin. As a result, the model from Robetta was selected for protein-peptide docking. Models retrieved from the other tools were judged unsuitable for docking.

4. Alignment
Although the sequence identity obtained from PSI-BLAST for the homology modeling is 26-31% (see Table 1), the docking results are acceptable, as shown in Figs. 6-10. Moreover, sequence similarity below 30% does not necessarily mean that a model obtained by comparative modeling has low reliability. Some regions of the primary sequence may be conserved, and homology modeling can be as accurate as low-resolution experimental structures. 34
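To show what a percent-identity figure measures, a minimal Python sketch is given here. It computes identity from two already aligned sequences and does not reproduce the PSI-BLAST calculation. The function name, the `include_gaps` switch, and the toy alignments are assumptions for illustration, and tools differ in whether gap columns count toward the denominator.

```python
def percent_identity(aligned_a, aligned_b, gap="-", include_gaps=True):
    """Percent of alignment columns in which both sequences match.

    With include_gaps=True the denominator is the full alignment length;
    with include_gaps=False, columns containing a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        has_gap = a == gap or b == gap
        if has_gap and not include_gaps:
            continue  # ignore gap columns entirely
        compared += 1
        if not has_gap and a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignments (not taken from Table 1).
print(percent_identity("ACDE", "ACDF"))
print(percent_identity("AC-E", "ACDE", include_gaps=False))
```

The choice of denominator matters most for short or gap-rich alignments, so identity values from different tools are only comparable when the same convention is used.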

5. Docking Outcomes of Aggregatin and Amyloid-Beta Peptides
The docking scores and interaction outcomes are listed in Tables 2 and 3. Among the representative models from the 8 best clusters in Table 2, IES1_A and FRODOCK1_C, with higher scores, and SOAP_A, with a lower energy score, are the top consensus complexes. According to the docking server, in the IES6_B docking complex the residues GLN442 and HIS443 (Aggregatin chain) and HIS13 and PHE20 (Aβ peptide) are among the top 5 residues on each chain predicted to be involved in contacts, based on the consensus of the top 10 models from each method (see Figure 6A, B, C). Overall, Aggregatin contacted the N-terminal region of amyloid-beta 42 through residues in its C-terminal region.
Representative models from the 7 best clusters are shown in Table 3; the FRODOCK2_A and IES2_B 1 docking complexes and SOAP_A 2 are the top consensus complexes. According to the docking server, in the IES6_B docking complex the residues ARG447 and GLY429 (Aggregatin chain, C-terminal region) and HIS13 and LYS16 (Aβ peptide, N-terminal region) are among the top 5 residues on each chain predicted to be involved in contacts, based on the consensus of the top 10 models from each method (see Fig. 9B and Fig. 10 B2).

Figure 5. A) Overall 3D structure of the predicted Aggregatin model. The Robetta server predicted the 3D structure of Aggregatin by homology modeling, and the model was then subjected to protein-peptide docking. The cartoon representation and image were generated with Chimera 1.14. 33 Structures are shown as colored ribbons to indicate strand and helix forms. B) ProSA-web analysis of Aggregatin. Left: the black dot marks the Z-score of the input Aggregatin model relative to the Z-scores of experimental structures, plotted against the number of amino acid residues; the Z-score, or overall model quality, was -6.15, within the X-ray region of the plot. Right: the local model quality plotted against sequence position.

Besli and Yenmis: Assessment of the Interaction of Aggregatin ...
In the docking of both Aβ42 and Aβ28 with Aggregatin, the residues ARG5, HIS6, TYR10, and HIS13 are commonly involved in the interaction. Binding modes of Aggregatin with Aβ 42 and Aβ 28 by molecular docking simulation (see Table 2).

Discussion
This research addresses the association of Aggregatin, a protein whose function and three-dimensional structure have not yet been characterized, with Alzheimer's pathology using in silico analysis. The interaction of Aggregatin with amyloid-beta and the lesion-forming complex of the two were examined, since extracellular aggregate formation is observed in Alzheimer's pathology. 5 Even though Aggregatin is expressed in nerve cells, it remains uncertain from which cellular compartment it is released and how it forms a core with extracellular amyloid-beta plaques. In vitro studies carried out so far have shown that Aggregatin is located in the nucleoplasm and plasma membrane as well as in mitochondria and focal adhesions. 27 Protein structures may comprise multiple compact, independently foldable units called domains. These domains typically contain hydrophobic cores, can fold independently of each other, and are frequently combined to establish distinct functions. 28 For this reason, our first approach in this study was to predict the domains in the aa sequence of Aggregatin and to find out whether it contains a potentially significant segment, such as a signal peptide, that could serve as a protein-binding interface.
Domain prediction from the actual sequence may have important consequences for linking theoretical knowledge to experimental studies. In structural studies of proteins by experimental methods such as NMR or X-ray crystallography, success is largely limited to single-domain proteins rather than full multi-domain proteins. 30 Thus, for structural biologists, it is meaningful to analyze the primary structure of proteins, as it is more logical to treat single-domain proteins as a category distinct from multi-domain ones. In our results, the computational analysis of the membrane localization of Aggregatin agrees with in vitro studies (see Fig. 2).
Robetta is a protein structure prediction service that is continuously evaluated by CAMEO (Continuous Automated Model EvaluatiOn), which constantly assesses the accuracy and reliability of its predictions. Among the tools evaluated by CAMEO, Robetta and QMEAN rank in the first line in terms of time-based statistical confidence and reliable performance. In addition to these tools, ProSA-web was used to verify the quality of the model protein. The overall model quality, or Z-score, was -6.15, as shown in Figure 5. The Z-score indicates overall model quality and measures the deviation of the total energy of the structure from the energy distribution derived from random conformations (see Figure 5A, B). Consequently, the Aggregatin model is reliable enough to be subjected to the molecular docking procedure.
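To make this Z-score definition concrete, a minimal Python sketch is given here. It only illustrates the arithmetic of comparing one energy against a reference distribution. ProSA-web's knowledge-based energy function and its reference ensemble are not reproduced, and the function name, the use of the population standard deviation, and the toy energies are assumptions.

```python
from statistics import mean, pstdev


def energy_z_score(model_energy, reference_energies):
    """Deviation of model_energy from the reference mean, in SD units."""
    mu = mean(reference_energies)
    sigma = pstdev(reference_energies)  # assumption: population SD
    if sigma == 0:
        raise ValueError("reference energies have zero spread")
    return (model_energy - mu) / sigma


# Hypothetical energies for random conformations (arbitrary units).
random_energies = [-2.0, 2.0, -2.0, 2.0]
print(energy_z_score(-6.0, random_energies))
```

A more negative value means that the model energy lies further below the mean of the reference conformations.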
Structural simulations of protein-protein interactions are fundamental to explaining how the cellular machinery assembles at the molecular level. These simulations may help to assess multiple sequence alignments and the corresponding structures, and thus to reveal binding interfaces when the interacting proteins have homologous sequences. In the docking procedure of the current study, the free online tool InterEvDock2 was used for protein-protein docking, and the InterEvScore potential was computed to incorporate evolutionary information. InterEvScore evaluates heteromeric protein interfaces by integrating the evolutionary information retrieved from the multiple sequence alignments of each protein with a residue-based multi-body statistical potential.
On this server, a systematic docking search is performed using FRODOCK2, and the results are re-scored with InterEvScore 24 and the SOAP_PP atom-based statistical potential to increase confidence in the predictions.
As mentioned above, we obtained acceptable clusters using the InterEvDock2 server. This server predicts the top 10 consensus complexes for 239 out of 812 tested cases. The selected clusters for Aggregatin with the Aβ 42 peptide and with the Aβ 28 peptide are presented in Tables 2 and 3, respectively. In addition, the InterEvDock2 server predicts the top 5 residues interacting at the interface of a complex, using a scoring system applied to the 30 models formed by the top 10 clusters from each of InterEvScore, FRODOCK2, and SOAP_PP.
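The idea of a residue consensus over a set of top models can be sketched in Python as a simple frequency vote. This is not the scoring system implemented in InterEvDock2, which is not reproduced here. The function name, the input format (one set of contacting residue labels per model), and the placeholder labels are assumptions for illustration.

```python
from collections import Counter


def consensus_residues(models, top_k=5):
    """Rank residues by the number of models in which they make contact.

    models: iterable of collections of residue labels, one per model.
    Returns up to top_k (label, count) pairs, highest count first;
    ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter()
    for contacts in models:
        counts.update(set(contacts))  # count each residue once per model
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical contact sets with placeholder labels (not docking output).
toy_models = [{"R1", "R2"}, {"R1", "R3"}, {"R1", "R2"}]
print(consensus_residues(toy_models, top_k=2))
```

In practice the input would hold one contact set per model, for example 30 sets when the top 10 models of three scoring methods are pooled.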
In our study, we docked Aggregatin with both the Aβ 42 and Aβ 28 peptides, since both peptides are involved in the amyloidosis process. The Aβ 28 peptide is a major contributor, as it is deposited in the early phase of amyloidosis in AD. Here, we observed that the side chains of HIS13 and LYS16 in Aβ 28 are localized on the same face of the helix. Interestingly, the docking poses involving the HIS13 and LYS16 residues of the Aβ 28 peptide are among those with the highest binding energy scores, as shown in Table 3. The interaction of the Aβ 42 peptide with Aggregatin through the LYS16 and HIS13 residues, obtained from the FRODOCK2 results, also has a high binding energy score (see Table 2). Our aim was to find out whether our docking results were compatible with the in vitro analysis performed by Yan et al. 5 and whether Aggregatin binds to the N-terminal region of amyloid-beta.
Taken together, we obtained satisfactory docking results, as presented in Figures 6-10. As shown in Figure 4, the N-terminal domain of the Aβ 42 peptide is composed of amino acids 1-26, and the 3D model of Aggregatin shows both its N- and C-terminal ends. When comparing the docking results for amyloid beta-42 and -28, the residues commonly involved in the interaction with Aggregatin are ARG5, HIS6, TYR10, and HIS13. These residues are in the N-terminal domain of Aβ42. The interactions of Aggregatin with ARG5 and TYR10 and with HIS6 and TYR10 in Aβ28 have higher binding energy scores than the others (see Table 3). The binding mode of TYR303 in Aggregatin with residue HIS6 of Aβ42 has the highest energy score (see Fig. 6A). In addition, the docking of Aggregatin ARG447 with TYR10 and of HIS443 with VAL18 of Aβ42 gave better docking energy scores (see Fig. 7B). The critical point is which residue in the N-terminal domain of Aβ42 interacts with the highest binding energy in the calculation. Our results show that Aβ42 generally interacts with residues in the C-terminal region of Aggregatin. Since the three scoring programs used in the docking process agreed with each other, we concluded that the models with the highest binding energy are the complexes that interact through residues in the N-terminal region of amyloid-beta.

Conclusion
Alzheimer's disease has no known specific remedy yet, and treatment is limited to slowing the progression of the disease while improving quality of life. Unfortunately, the course of the disease is not predictable, since some patients experience only mild cognitive problems whereas others undergo a quicker onset of symptoms with faster disease progression. Strategies for implementing therapies based on molecular and cellular approaches have therefore gained importance. We show that amyloid-beta peptides of two different lengths with known NMR structures dock to the Aggregatin model and that the computational calculations identify critical interactions between residues; however, the fact that the two proteins interact is not enough to link this process to amyloid plaque formation. This interaction only suggests a possible role of Aggregatin in the amyloid-beta pathway, and in vivo and in vitro experiments should be performed to explain its actual role, if any, in plaque formation in Alzheimer's pathology. The domain analysis of Aggregatin supports its localization within the cell as confirmed in vitro. This study will help to understand the possible conformational changes in the three-dimensional structure of Aggregatin, which might be examined experimentally using mutations such as deletions and single nucleotide polymorphisms.

Declarations
Funding: Not applicable.
Conflict of Interest: We have no conflict of interest to declare.
Availability of Data: Analysis data are available upon request.
Code Availability: Not applicable.
Authors' Contributions: Both authors hypothesized the subject. Besli performed the analysis. Both authors evaluated the results and wrote the manuscript.