Characteristics, Structure, and Biological Role of Stefins (Type-1 Cystatins) of Human, Other Mammals, and Parasite Origin

The majority of lysosomal cysteine cathepsins are ubiquitously expressed enzymes. However, some of them differ in their specific cell or tissue distribution and substrate specificity, suggesting their involvement in determining normal cellular processes, as well as pathologies. Their proteolytic activities are potentially harmful if uncontrolled. Therefore, living organisms have developed several regulatory mechanisms such as endogenous protein inhibitors of the cystatin family, including the group of small cytosolic proteins, the stefins. The main focus of this review is stefins of various origins and their properties, structure, and mechanism of interaction with their target enzymes. Furthermore, oligomerization and fibrillogenesis in stefins and/or cystatins provide insights into conformational diseases. The present status of the knowledge in this field and current trends might contribute to identifying novel therapeutic targets and approaches to treat various diseases.


Introduction
The discovery of the lysosome was crucial for understanding intracellular protein degradation processes. 1 This finding subsequently contributed to the rapid progress in studies on lysosomal proteases and led to the discovery of a great variety of their endogenous protein inhibitors and small-molecule inhibitors. Among lysosomal proteases, the cysteine cathepsins are the most thoroughly studied enzymes; they require a slightly acidic and reducing lysosomal environment. There are 11 human cysteine cathepsins (cathepsins B, C, F, H, K, L, O, S, V, X, and W), which were identified at the sequence level and later confirmed by bioinformatic analysis of the genome sequence and mRNA expression levels. 2 Some of the cathepsins, such as cathepsins B, H, L, C, and O, are ubiquitously expressed in a wide variety of cells and tissues, whereas cathepsins F, K, S, V, X, and W show a more restricted cell- or tissue-specific distribution and expression. [2][3][4] Most of the cathepsins exhibit predominantly endopeptidase activity, while cathepsins B, C, H, and X are exopeptidases. Lysosomal cysteine cathepsins resemble the papain family of cysteine peptidases (C1A). The crystal structure of papain includes two adjacent structural domains separated by a V-shaped active site cleft, with Cys25, His159, and Asn175 residues essential for catalysis. Compared to the crystal structures of true endopeptidases such as cathepsins L 5 , S 6 , and K 7 , the crystal structures of exopeptidases contain additional features that modify the active-site cleft and enable exopeptidase activity, such as an occluding loop in cathepsins B and X, an additional exclusion domain in cathepsin C, and an octapeptide in cathepsin H. [8][9][10][11] The determined crystal structures of cathepsins and their substrate specificities [12][13][14] can provide clues about the biological function of these enzymes.
Turk et al.: Characteristics, Structure, and Biological Role ...
Human cysteine cathepsins, in addition to intracellular protein degradation, participate in and control many important physiological processes such as antigen presentation, 3 aging, 15 bone remodeling, 16 apoptosis, 17,18 prohormone activation, 19 and cell signaling. 13,20 However, more recent studies demonstrated the presence of the lysosomal cathepsins in the extracellular environment, nucleus, nuclear and plasma membrane, and cytosol, where they play a crucial role in the pathogenesis of cardiovascular diseases, 21 cancer, 22,23 neurodegeneration, 15,24 and other diseases.
Cathepsins are synthesized as inactive precursors; however, once activated, these enzymes are potentially hazardous to their environment. 25 Therefore, their proteolytic activities in vivo must be strictly regulated at multiple levels by various control mechanisms, including pH, zymogen activation, and endogenous protein inhibitors, to prevent improper cleavage of signaling molecules. 26,27 The nature of zymogen activation was elucidated from procathepsin structures, 28,29 which revealed that the propeptide folds on the surface of the enzyme and runs through the active site cleft, thus blocking the access of the substrate. In the final step, the propeptide unfolds at acidic pH and opens the catalytic site of the mature enzyme. 30 Propeptides differ in their length. Cathepsin X propeptide contains only 38 residues, 31 while cathepsin C and F propeptides contain 206 and 251 residues, respectively. 32,33 In most cathepsins, the N-terminal propeptide is removed proteolytically by various proteases 34 or autocatalytically under acidic conditions. 30,35,36 Very recently, it was demonstrated that procathepsin H is not autoactivated but requires other proteases, such as the endopeptidase cathepsin L, for its activation. 37 It was found that glycosaminoglycans (GAGs) can accelerate the autocatalytic removal of the propeptide and subsequent activation of cathepsin B 38 and some other cathepsins. 28 The propeptides released from endopeptidases exhibit a limited selectivity of inhibition against their cognate cathepsins in vitro, 39 whereas the true exopeptidases cathepsins C and X are not processed autocatalytically but require endopeptidases, such as cathepsins L and S, for their activation. 40 The main regulators of cysteine cathepsins and other papain-like enzymes are their endogenous protein inhibitors, cystatins.
The main function of cystatins is to protect the organism against its endogenous enzymes when they are released from the lysosomes to the extracellular environment, as well as to serve as a defense mechanism against proteases of invading pathogens. In the immune system, parasite stefins and cystatins modulate the host's cysteine cathepsin activities by inhibiting the processing of exogenous antigens and of the MHC class II-associated invariant chain (Ii), carried out by lysosomal cysteine cathepsins and legumain. [41][42][43] Stefins and cystatins upregulate nitric oxide (NO) production by interferon γ-activated murine macrophages. NO inhibits cysteine proteases, particularly those from parasitic protozoa. [44][45][46] Parasite inhibitors contribute to the innate and adaptive immunity by targeting the host's cysteine peptidases. It is evident that cystatins thus exert several immunomodulatory functions. 41 The cystatins are generally non-selective, competitive, reversible, and tight-binding inhibitors. 28 They are found in all living organisms, including humans, other animals, plants, parasites, bacteria, and archaea. 47 Based on their protein sequences and tertiary structure, the cystatin family (clan IH) is divided into three inhibitory subfamilies: the stefins (type-1 cystatins or I25A), cystatins (type-2 cystatins or I25B), and kininogens (type-3 cystatins or I25C), as seen in the MEROPS database (http://merops.sanger.ac.uk). However, the classification of protein peptidase inhibitors, including the cystatin family, is continually being revised. 48 Stefins are primarily intracellular single-chain proteins of about 100 amino acid residues that lack carbohydrate and disulfide bonds. Cystatins are extracellular single-chain proteins of about 115 amino acid residues and contain a signal peptide for secretion and two intrachain disulfide bridges, with the exception of human cystatin F, which contains an additional disulfide bridge. The most well-studied member is human cystatin C. 49
The third subfamily of inhibitors is the kininogens, also known as kinin precursor proteins. 50 They are large multifunctional and multi-domain proteins and are predominantly found in the blood plasma, with different biological functions attributable to each domain. In humans, there are two types of kininogens: high-molecular-weight kininogen (HK) and low-molecular-weight kininogen (LK). Both HK and LK are composed of three tandemly repeated type-2 cystatin domains (designated 1, 2, and 3) containing eight disulfide bridges. Only domains 2 and 3 of HK and LK bind and inhibit various cysteine proteases, including cathepsins and cruzipain. [51][52][53] Additional information about cystatins can be found in a recent review 28 and in several older review papers. [54][55][56][57] In addition to the cystatins, there are other known protein inhibitors of papain-like enzymes. Structurally unrelated to cystatins are thyropins, which are assigned according to the MEROPS database to the family I31 of clan IX 48 and show significant homology to thyroglobulin type-1 domains. 58 The main representatives are the p41 fragment of the invariant chain of MHC class II molecules 59,60 and equistatin from the sea anemone Actinia equina. 61,62 Equistatin is composed of three structurally related domains; the N-terminal domain inhibits cysteine cathepsins, 63 whereas the second domain inhibits lysosomal cathepsin D. 61,63 The p41 fragment strongly inhibits various cysteine cathepsins 64 and cruzipain. 65 The crystal structure of the cathepsin L-p41 inhibitory fragment complex revealed a novel fold of p41, which confers specificity for its target enzymes, in contrast to the rather non-selective cystatins. 66 It was demonstrated that mammalian serpins are involved in cross-class inhibition of cysteine proteases.
Thus, the serpin endopin 2C demonstrates selective inhibition of cathepsin L and elastase-like serine protease, 67 while the serpin squamous cell carcinoma antigen (SCCA) inhibits cathepsins K, L, and S, 68 suggesting a novel inhibitory pathway.
Many small-molecule protease inhibitors of clinical significance and applicability were synthesized using a number of reactive groups that interact with enzymes. For example, the pioneering group of Elliott Shaw exploited, among others, the diazomethyl ketone functional group to irreversibly inhibit cysteine proteases, including cathepsins. 69,70 The discovery of the epoxysuccinyl-based inhibitor E-64 71 as a non-selective irreversible inhibitor of cysteine cathepsins led to its wide use in a variety of biological studies and as a diagnostic tool to assess the proteolytic activity of cysteine cathepsins and some other related enzymes. E-64 does not inhibit aspartic, serine, or metallo-proteases. Katunuma's group systematically synthesized many E-64 derivatives targeting various cysteine cathepsins, such as CA-030, CA-074, and several CLIK inhibitors (reviewed in 72 ). It was reported that CA-074, a specific inhibitor of cathepsin B, suppressed the degradation of collagen in rheumatoid arthritis fluid. 73 The crystal structure of cathepsin B in complex with CA-030 revealed for the first time a substrate-like interaction in the S1' and S2' sites of the active site cleft of the enzyme. 74 The binding geometry of the double-headed inhibitors was confirmed by the crystal structure of the papain-CLIK complex 75 and the cathepsin B-NS-134 complex. 76 More details about small-molecule inhibitors can be found in previous reviews 28,77,78 and in numerous original publications. Recent advances in the field of cysteine cathepsins as suitable drug targets are providing valuable research avenues for the treatment of various diseases that result from uncontrolled elevated cathepsin activity. 79
In this review, after the introduction to the lysosomal cysteine cathepsins and the regulation of their activities by various protein and chemically synthesized inhibitors, we discuss the current knowledge of the properties, structural characteristics, and oligomerization of protein inhibitors belonging to the stefin subfamily (type-1 cystatins) of the cystatin family.

1. Human and Other Mammalian Stefins
The first protein inhibitor of papain-like cysteine protease was isolated and characterized from chicken egg white, and later, the name "cystatin" was proposed to designate its function. 80 The first intracellular protein inhibitors were isolated and partially characterized from pig leucocytes, 81 human epidermis, 82 and human spleen. 83 The stefin inhibitor (later named stefin A) was isolated from human polymorphonuclear granulocytes, and its amino-acid sequence was determined, 84,85 as was that of chicken cystatin from egg white. 86 Both sequences confirmed structural differences between these two homologous protein inhibitors. In addition, a protein inhibitor of cysteine cathepsins was isolated from the sera of patients with Balkan endemic nephropathy; the first 47 residues of its N-terminal sequence 86 were identical to those of human γ-trace, 87 and the name human cystatin was proposed. 86,88 Soon after, it was renamed human cystatin C. 89 These and other accumulated data were of great importance for the nomenclature and classification of the cystatin superfamily, comprising three families. 90 The inhibitor cystatin B/stefin B was isolated from human liver 91 and human spleen, 92 and the resulting sequences of the first 65 residues were identical, thus strongly suggesting that both inhibitors, isolated from different tissues, are structurally identical proteins. 92 The stefin B dimer was confirmed for the first time from human spleen. 92 Structurally homologous inhibitors to human stefins A and B have been identified and characterized in mammals, such as rats 93,94 and mice. 95 Stefin A, 96 stefin B, 97 and stefin C 98 are found in bovines. Interestingly, bovine stefin C was identified as the first tryptophan-containing stefin with a prolonged N-terminus. Four different porcine stefin-type inhibitors, namely A, B, D1, and D2, have been isolated and characterized. 99
Porcine stefins A, B, and D1 were sequenced, revealing that porcine D1 and the previously characterized pig leukocyte cysteine proteinase inhibitor (PLCPI) 100 were identical proteins. Most of the stefins occur in multiple isoelectric forms in the acidic to near-neutral pH range and are mostly stable, without denaturation, in the pH range 3-10 and at temperatures up to 80 °C. 54 Among mammals, human stefins A and B are clearly the main representatives and most studied protein inhibitors of the stefin subfamily. However, homologues of both human stefins have been found in various mammals, as mentioned above. Human stefins are intracellular proteins that are present in the cytosol of many cell types and tissues, but they also appear extracellularly in body fluids. 101 They are synthesized without signal peptides. It seems that stefin B is generally more widely distributed in various cell types and tissues than stefin A. Stefins are the smallest among the members of the cystatin family of inhibitors.

2. Stefins from Parasite Origin
Little is known about stefins, cystatins, and other protease inhibitors in parasites and their role in protecting the parasites from degradation by host proteases and in manipulating the host response to the parasite. 102 Stefins have been identified and characterized in a wide range of organisms. 47,103,104 Currently, about 700 members of the stefin subfamily can be found in the MEROPS database. Parasite stefins are involved in the regulation of the parasites' own proteolytic activities and in the processing of host proteins. Two inhibitors were isolated from the liver fluke Clonorchis sinensis, CsStefin-1 and CsStefin-2, which have sequence similarities to human stefins. 105,106 It was suggested that both inhibitors share functionally redundant regulatory functions to modulate activity and processing of CsCathepsin F. In addition, two inhibitors were isolated from the tropical liver fluke Fasciola gigantica, FgStefin-1 107 and FgStefin-2, the latter containing a signal peptide. 107,108 The cystatin B homologue SmCytB from turbot Scophthalmus maximus enhances macrophage bactericidal activity. 109 Three different stefins, designated rFhStf-1, rFhStf-2, and rFhStf-3, expressed by the trematode Fasciola hepatica exhibited differences in their inhibition profile against various tested enzymes. 110 The immunomodulatory properties of FhStefins could be used to evaluate their therapeutic potential against inflammatory diseases. The inhibitors FhStf-2 and FhStf-3 fall into an atypical subgroup of stefins due to the presence of a signal peptide, similar to the previously mentioned FgStefin-2. 108 The cysteine protease inhibitor AcStefin was identified and characterized from Acanthamoeba, the causative agent of granulomatous amoebic encephalitis and amoebic keratitis. 111 The human stefin homolog SmCys, expressed by Schistosoma mansoni, is involved in hemoglobin degradation and its regulation. 112
Very recently, the novel stefin-type inhibitor EnStef was found in the sanguinivorous fish parasite Eudiplozoon nipponicum, and it was found to inhibit endogenous cathepsins and, surprisingly, legumain (asparaginyl endopeptidase, family C13) from the Ixodes ricinus tick. 113 Notably, only limited knowledge about the characteristics and roles of fish stefins and other endogenous inhibitors is available. [114][115][116] It is well known that fish and shellfish quality depends on the meat texture, which is mainly controlled by proteolysis, autolysis, and storage conditions. Endogenous proteases and their inhibitors play crucial roles in these processes, as do fish parasite proteases and their inhibitors. Therefore, more biochemical and molecular biology studies in this direction are of great economic importance in order to improve and ensure the quality of fish and their products. 117

1. Mammalian Stefins
Members of the stefin subfamily are rather non-specific inhibitors of mammalian cysteine cathepsins. They are competitive, reversible inhibitors that form tight, equimolar complexes with their target enzymes. 54,55 However, they are able to differentiate between endopeptidases and exopeptidases because of the differences in the structures of the interacting regions of the enzymes. Human and other mammalian stefins mostly act as fast and tight-binding inhibitors of typical endopeptidases, cathepsins L and S, papain, and cruzipain, inhibiting with Ki values in the pM to nM range. 28,51 However, human stefin B is generally a weaker inhibitor than stefin A. In contrast, the exopeptidases cathepsins B, X, C, and H possess structural features that restrain the binding of the inhibitors to parts of the active site cleft. 118 In mice, there are at least three variants of stefin A (Stfa1, Stfa2, and Stfa3); the first two are a result of polymorphisms. 95 Two variants, Stfa1 and Stfa2, act as fast and tight-binding inhibitors of endopeptidases such as cathepsins L and S (Ki values ranging from 0.07 to 0.16 nM); however, their interaction with the exopeptidases cathepsins B, C, and H is several orders of magnitude weaker compared to that of human, porcine, and bovine stefins, suggesting that in mice, stefin A variants are involved predominantly in the regulation of endopeptidases. Bovine stefin A binds tightly and rapidly to cathepsin L (Ki = 0.03 nM), binds more weakly to cathepsin H (Ki = 0.4 nM), and binds to cathepsin B more slowly but still tightly (Ki = 1.9 nM), indicating different mechanisms of inhibition of various cathepsins by stefin A. 96 Bovine stefin B strongly inhibits cathepsin S (Ki = 8.0 pM) as a tight-binding inhibitor. 97 Similar to bovine stefins A and B, bovine stefin C strongly inhibits cathepsin L and papain (Ki of about 0.18 nM) and weakly inhibits exopeptidase cathepsin B. 98
Interestingly, porcine stefins A and B bind tightly and rapidly to the exopeptidase cathepsin H (Ki = 0.02 and 0.07 nM, respectively), whereas stefins D1 and D2 are poorer inhibitors of the same enzyme (Ki = 102-125 nM) and weak inhibitors of cathepsin B (Ki = 335 and 195 nM, respectively). As expected, all four stefins (A, B, D1, and D2) are fast-acting and tight-binding inhibitors of the endopeptidases cathepsins L and S and papain (Ki values ranging from 0.01 to 0.19 nM). 99 These results suggest that in addition to the differences in the enzyme active sites, which are used to classify proteases as endo- and exopeptidases, minor specific structural features of the porcine stefins, in particular, play an important role in binding.
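Because these Ki values are comparable to typical assay enzyme concentrations, the tight-binding (Morrison) treatment applies rather than the simple Michaelis-Menten approximation. The following minimal Python sketch illustrates how the Ki values quoted above for bovine stefin A translate into residual enzyme activity. The enzyme and inhibitor concentrations (1 nM each), the substrate term, and all function names are assumptions chosen for illustration; they are not taken from the cited studies.

```python
import math

def fractional_activity(e_total_nM, i_total_nM, ki_nM, s_over_km=0.0):
    """Residual activity v_i/v_0 from the Morrison equation for a reversible,
    competitive, tight-binding inhibitor forming a 1:1 complex.

    s_over_km is the ratio [S]/Km; for competitive inhibition the apparent
    Ki is Ki * (1 + [S]/Km). All concentrations are in nM.
    """
    if e_total_nM <= 0:
        raise ValueError("enzyme concentration must be positive")
    ki_app = ki_nM * (1.0 + s_over_km)
    b = e_total_nM + i_total_nM + ki_app
    # Concentration of the enzyme-inhibitor complex (physical root of the quadratic).
    complex_nM = (b - math.sqrt(b * b - 4.0 * e_total_nM * i_total_nM)) / 2.0
    return 1.0 - complex_nM / e_total_nM

# Ki values (nM) quoted in the text for bovine stefin A.
BOVINE_STEFIN_A_KI_NM = {
    "cathepsin L": 0.03,
    "cathepsin H": 0.4,
    "cathepsin B": 1.9,
}

def activity_table(e_total_nM=1.0, i_total_nM=1.0):
    """Residual activity per enzyme at assumed (hypothetical) concentrations."""
    return {
        enzyme: fractional_activity(e_total_nM, i_total_nM, ki)
        for enzyme, ki in BOVINE_STEFIN_A_KI_NM.items()
    }

if __name__ == "__main__":
    for enzyme, activity in activity_table().items():
        print(f"{enzyme}: {activity:.2f} of uninhibited activity")
```

Under these assumed conditions, an equimolar amount of inhibitor leaves progressively more residual activity as Ki increases from cathepsin L to cathepsin B, which matches the qualitative ranking given in the text.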

2. Stefins of Parasite Origin
In non-mammalian species, there are some important differences in potency and selectivity toward target enzymes compared to human and other mammalian stefins. Several examples illustrate this. Two stefins (CsStefin-1 and CsStefin-2) from the parasite Clonorchis sinensis almost equally inhibit the plant endopeptidase papain, human cathepsin L, two endogenous cathepsin F variants (CsCF-4 and CsCF-4-6), and, surprisingly, human cathepsin B. All enzymes are inhibited with Ki values in the range of 0.03-0.06 nM. 105,106 Nanomolar inhibition of bovine cathepsins B and L, human cathepsin S, and the released cysteine protease of the parasite was observed with the fluke Fasciola gigantica inhibitors FgStefin-1 and FgStefin-2. 107,108 The Fasciola hepatica recombinant stefin inhibitors rFhStf-1, rFhStf-2, and rFhStf-3 strongly inhibit two variants of endogenous cathepsin L (FhCL-1,-3) and human cathepsin L (Ki 1.52-52 nM); variants rFhStf-1 and rFhStf-2 inhibit human cathepsin C (Ki 35-57 nM); and human cathepsin B is inhibited only by rFhStf-2 (Ki = 15 nM). 108 The S. mansoni inhibitor SmCys strongly inhibits papain (Ki = 0.065 nM). 112 However, the N-terminally truncated forms of SmCys with deletions of 10 and 20 amino acid residues showed much weaker papain inhibition (Ki = 0.739 nM and 4.915 nM, respectively). A similar effect was observed for human cystatin C lacking the first ten residues 119 and for chicken cystatin upon deletion of the first eight residues preceding Gly9. 120 In summary, it is evident that some stefin-type parasite inhibitors are strong and tight-binding inhibitors of their endogenous cysteine cathepsins as well as of human and other mammalian cathepsins, suggesting their involvement in immune regulation and inflammatory diseases. [121][122][123][124][125]
Furthermore, they demonstrate different inhibitory potencies against their endogenous cathepsins with endo- and exopeptidase activities compared to those of human and other mammalian cathepsins. This might be of importance for the successful accommodation and reproduction of parasites in their host organisms.
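The effect of N-terminal truncation on SmCys can be put on a common scale by converting the quoted Ki values into fold changes and approximate binding free-energy differences. The sketch below is illustrative only: it treats Ki as a dissociation constant and uses ΔΔG = RT ln(Ki,variant / Ki,reference), and the temperature of 298.15 K is an assumption, since the assay temperature is not given in the text. Function and variable names are hypothetical.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # kJ/(mol*K)

# Ki values (nM) for papain inhibition by SmCys variants, as quoted in the text.
SMCYS_KI_NM = {
    "full-length": 0.065,
    "10-residue deletion": 0.739,
    "20-residue deletion": 4.915,
}

def fold_change(ki_variant_nM, ki_reference_nM):
    """How many times weaker (Ki ratio) the variant binds than the reference."""
    return ki_variant_nM / ki_reference_nM

def ddg_binding_kj_per_mol(ki_variant_nM, ki_reference_nM, temperature_K=298.15):
    """Approximate loss of binding free energy, RT*ln(Ki_variant/Ki_reference).

    Assumes Ki behaves as an equilibrium dissociation constant and that the
    temperature (an assumed value here) is the same for both measurements.
    """
    ratio = fold_change(ki_variant_nM, ki_reference_nM)
    return GAS_CONSTANT_KJ * temperature_K * math.log(ratio)

if __name__ == "__main__":
    reference = SMCYS_KI_NM["full-length"]
    for name, ki in SMCYS_KI_NM.items():
        print(f"{name}: {fold_change(ki, reference):.1f}-fold, "
              f"ddG = {ddg_binding_kj_per_mol(ki, reference):.1f} kJ/mol")
```

On this scale, the 10-residue deletion weakens binding roughly 11-fold and the 20-residue deletion roughly 76-fold relative to full-length SmCys, consistent with the importance of the N-terminal segment described in the text.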

1. Interaction Between Stefins and Cathepsin Endopeptidases
Based on the known 3D structures of the chicken egg white cystatin 126 and human stefin B-papain complex, 127 the amino acid sequences of several stefins of mammalian and parasite origin have been aligned. The conserved residues in equivalent positions confirmed the relationships between stefin and cystatin subfamilies, although some differences are evident (Figure 1). Moreover, the correct alignment of the stefins and the cystatins revealed that the previous sequence alignments were partly incorrect because of the deletion of the shorter α-helical segment in the stefins.
The first and most important step in the elucidation of the mechanism of inhibition of cysteine proteases was the determination of the crystal structure of chicken cystatin. 126 The chicken cystatin molecule consists mainly of a five-stranded antiparallel β-pleated sheet that is twisted and wrapped around a long central α-helix and an appending shorter α-helical segment. The partially flexible N-terminal segment with the highly conserved GG residues, an exposed first hairpin loop with the sequence QLVSG (the prototype of the highly conserved QVVAG sequence in almost all stefins), and a second hairpin loop with PW residues form a wedge-shaped hydrophobic tripartite edge that has high complementarity to the V-shaped active site cleft of papain, as shown in a docking experiment. 126 Based on this docking model, the mechanism of interaction between cysteine proteases and their cystatin-like inhibitors was proposed 126 and later essentially confirmed by the crystal structure of the recombinant human stefin B-papain complex. 127 The main-chain interactions are provided by the N-terminal segment, which occupies the non-primed subsites S3 to S1 of the enzyme in a substrate-like manner; however, the peptide segment then turns away from the active site at P1, preventing cleavage. The two hairpin loops bind to the primed sites (S1' to S4') of the enzyme (Figure 2). In stefin B, there are only minor contributions from the second hairpin loop, but the carboxyl terminus provides an additional interaction region compared to chicken cystatin. These results provide firm evidence that the inhibition by the protein inhibitors of cysteine proteases is fundamentally different from that obtained with serine protease inhibitors. 129
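The first hairpin loop motif described above (QLVSG in chicken cystatin, QVVAG in most stefins) can be located in a sequence with a simple pattern search. The sketch below is an illustration only: the pattern Q-x-V-x-G is a generalization of the two sequences named in the text, and the example sequence is a made-up toy string, not a real stefin or cystatin sequence.

```python
import re

# Generalized first-hairpin-loop motif: Q, any residue, V, any residue, G.
# This covers both QLVSG (chicken cystatin) and QVVAG (stefins).
HAIRPIN_MOTIF = re.compile(r"Q[A-Z]V[A-Z]G")

def find_hairpin_motifs(sequence):
    """Return (0-based start index, matched pentapeptide) for each motif hit
    in a one-letter amino acid sequence."""
    return [(m.start(), m.group()) for m in HAIRPIN_MOTIF.finditer(sequence.upper())]

if __name__ == "__main__":
    # Hypothetical toy sequence for illustration; NOT a real protein.
    toy_sequence = "MAGGTTQVVAGKLPWEE"
    for start, motif in find_hairpin_motifs(toy_sequence):
        print(f"motif {motif} at index {start}")
```

A real analysis would use curated sequences and an alignment-based approach such as the one underlying Figure 1; a pattern search of this kind only flags candidate loop positions.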

2. Interaction Between Stefins and Cathepsin Exopeptidases
Binding of the cystatin-type inhibitors to cathepsin exopeptidases cannot be explained by the stefin B-papain complex. 127 Cathepsin H acts as both an aminopeptidase and an endopeptidase; however, its aminopeptidase activity predominates, and it is inhibited by various cystatins, including the tight-binding inhibitor stefin A, with Ki = 0.31 nM. 28 The crystal structure of native porcine cathepsin H shows a typical papain fold. 11 In addition, it contains the octapeptide EPQNCSAT derived from the propeptide, called a mini-chain, which is covalently attached to the main body of the enzyme by a disulfide bond and occupies the narrowed active site cleft in the substrate-binding direction, in the non-primed binding sites from S2 backwards (Figure 3). The major reason for the narrowing is a unique insertion loop of four residues. The carbohydrate moiety attached to the main body of the enzyme participates in the positioning of the mini-chain in the active-site cleft.
The displacement of the residues in the active site cleft results in the exopeptidase activity of cathepsin H. From the crystal structure of the stefin A-cathepsin H complex, 127 it is evident that stefin A binds to the active site cleft of the enzyme. However, the N-terminal residues of stefin A adopt the form of a hook, which pushes away the cathepsin H mini-chain residues and distorts the structure of an insertion loop that is unique to cathepsin H (Figure 4).
The crucial role of the human cathepsin H mini-chain was further confirmed by the expression of recombinant cathepsin H in Escherichia coli as a nonglycosylated protein lacking the mini-chain after autocatalytic processing. 132 Removal of the mini-chain resulted in endopeptidase activity only. The recombinant cathepsin H was inhibited by human stefins A and B with Ki values in the range of 0.05-0.1 nM, which is stronger than the inhibition of native cathepsin H. Another enzyme that possesses both exopeptidase and endopeptidase activities is human cathepsin B 8 (Figure 5).
Although its overall structure and the arrangement of the active site residues are similar to those of the endopeptidase papain, there are several insertion loops on the surface of the molecule that modify its properties. Some of the primed subsites are occluded by a novel 20-residue peptide segment, termed the occluding loop, with two histidine residues (H110 and H111) that provide positively charged anchors for the C-terminal carboxylate group of polypeptide substrates. The occluding loop restricts access to the active site cleft of cathepsin B by occupying part of the cleft on the primed side and blocking access beyond the S2' substrate binding site. 8,12 These structural features explain the unique peptidyl-dipeptidase activity of the exopeptidase cathepsin B. Deletion of the occluding loop by site-directed mutagenesis resulted in an enzyme with endopeptidase activity but completely lacking exopeptidase activity. 134 The crystal structure of the human stefin A-human cathepsin B complex revealed that occluding loop residues are displaced, thus allowing the interaction with inhibitors in the binding region 133 and indicating that the occluding loop flexibility must be responsible for the cathepsin B endopeptidase activity.
Most of the protein structures were determined by X-ray crystallography with comparisons to NMR spectroscopy. Structures determined by both techniques, in the solid state and in solution, are usually very similar. However, two NMR structures of chicken cystatin, the native phosphorylated and recombinant non-phosphorylated variants, 135,136 showed the same overall fold and the flexible N-terminal part, but there were also some significant differences in the structurally variable segments of the polypeptide chain compared to the crystal structure. 126 The NMR analysis revealed that the second α-helix determined in the crystal is not present in the solution. Similarly, the solution structure of human stefin A 137 showed similarity to the homologous protein stefin B in complex with papain, 127 but some important differences in the binding regions such as in the mobile N-terminal region and the second binding loop were observed. The crystal structure of the stefin B type inhibitor CsStefin-1 from the liver fluke C. sinensis was just reported, indicating some minor structural differences to human stefin B such as a four-stranded antiparallel β-pleated sheet and an additional short α-helix not present in human stefin B. 138

3. Oligomerization and Fibrillogenesis of Stefins and Cystatins
Small-sized proteins, also termed mini-proteins, represent a useful and relatively simple model for studies on oligomeric proteins. Oligomeric proteins are composed of two or more subunits, and most of them are symmetrical homo-oligomers, which are on average tetramers. 139 Oligomerization results from a variety of mechanisms and can provide insights into the evolution of proteins. Suitable examples are stefins (I25A) and cystatins (I25B), members of the cystatin family of inhibitors. The phyletic distribution of the cystatin family indicates the presence of only two ancestral lineages, stefins and cystatins, in eukaryotes and prokaryotes. 47 Stefins are present as single-copy genes or small multigene families throughout the eukaryotes and underwent small changes in function during evolution. In contrast to stefins, the cystatins went through a more complex evolution involving numerous gene and domain duplications. Stefins and cystatins share a rather high sequence similarity and nearly the same fold, as already discussed. The early finding that human stefin B 92 and rat TPI-2 (cystatin β/stefin B) 140 form dimers indicated for the first time that these proteins may oligomerize. Later, it was found that the native stefin-type inhibitor CsStefin-2 from the parasitic trematode C. sinensis exists in monomer, dimer, and tetramer forms, which are not the result of interchain disulfide bond interactions. 105 Similarly, the oligomerization from monomers (10 kDa) to oligomers of various sizes (over 100 kDa) in F. hepatica stefin-type inhibitors (rFhStf-1, rFhStf-2, and rFhStf-3) was reported very recently. 110 An important step in elucidating the oligomerization of cystatins was determining the crystal structure of dimerized domain-swapped human cystatin C 141 and of chicken cystatin and human stefin A in solution. 142
Then, it was demonstrated that the domain-swapped dimer of chicken cystatin oligomerizes to a tetramer as a transient intermediate prior to further oligomerization. 143 Furthermore, it was shown that human cystatin C oligomers are intermediates in fibrillogenesis, indicating that the propagation of three-dimensional domain swapping is crucial to oligomerization processes. 144 A variant of human cystatin C (L68Q mutant) found in patients with hereditary cystatin C amyloid angiopathy (HCCAA) causes massive amyloidosis as a result of amyloid fibrils in the cerebral arteries, with fatal consequences for young adults. 145,146 It has recently been reported that the conformational destabilization of human cystatin E (I25B) results in a domain-swapped dimer that can convert to amyloid fibrils. 147 This dimer inhibits legumain by forming a trimeric complex but does not inhibit papain and human cathepsin S. Furthermore, it was shown that recombinant human stefin B, in contrast to stefin A, dimerizes, oligomerizes, and forms amyloid fibrils under in vitro conditions. 148,149 Soon afterwards, the crystal structures of the tetrameric human stefin B and of stefin B in solution were determined. 150 The structures revealed that the formation of the stefin B tetramer is not a further domain swapping process, as was proposed earlier for cystatins, 151,152 but a new mechanism, termed hand shaking, through which 3D domain-swapped dimers become entwined as a consequence of concurrent trans to cis isomerization of proline 74, 150 as can be seen in Figure 6.
This proline residue is widely conserved throughout the stefins and cystatins. It was found that the tetrameric structure of stefin B in solution correlates with that of the crystal. These and other experimental data suggest that the isomerization of proline residues is a crucial component in tetramerization and very likely involved in other steps of amyloid formation. Taken together, the similarities in structure, sequence, and oligomerization processes between stefins and cystatins suggest that, in addition to domain swapping, there is a further mechanism, called hand shaking, in which the trans to cis isomerization of proline 74 may be on the path to the formation of mature fibrils. Additional information about oligomerization and amyloid formation can be found in previous reports. [151][152][153] Recent progress in the preparation of polymorphically pure samples, together with solid-state NMR and cryo-EM methods, has provided insight into high-resolution 3D structures of amyloids. 154,155

Conclusions and Future Trends
Lysosomal cysteine cathepsins and the precise regulation of their harmful proteolytic activities are of crucial importance to prevent improper cleavage(s) of signaling molecules. 55,156 There are several means for this regulation, one of which is the use of endogenous protein inhibitors, such as stefins and cystatins. We understand a great deal about the mechanisms of interaction with their target enzymes. However, low specificity of inhibitors for their target proteases indicates that we still do not understand their exact individual physiological roles.
On the other hand, most of the cathepsins are ubiquitously expressed, exhibit relatively wide specificity, and have multiple functions. Therefore, it is crucial to understand the diseases in which cathepsins play critical roles and the roles of individual cathepsins in these diseases. Interestingly, mutations in two endogenous protein inhibitors of cysteine cathepsins, stefin B and cystatin C, are critical for the development of two neurological disorders, Unverricht-Lundborg disease (EPM1) 157,158 and hereditary cystatin C amyloid angiopathy (HCCAA). 159,143,160 Insight into the interplay of stefins and cathepsins may encourage the development of selective cathepsin inhibitors as candidates for clinical studies and eventually new drugs. 79 Furthermore, studies on parasite-induced immune regulation and inflammatory diseases should also be encouraged in order to develop new therapeutic drugs. [161][162][163] Another important area is the identification of physiological substrates using proteomic strategies and chemical tools. 164,165 Although the complexity of the numerous vital biological processes, both physiological and pathological, is best illustrated by the current trends in a number of ongoing research projects, it is likely that studies on the regulation of proteolysis in the light of the structure-function relationship will reveal valuable information in the near future.

Acknowledgments
Funding was provided by the Slovenian Research Agency to research programs P1-0140 (B. Turk) and P1-0048 (D. Turk), and to the Infrastructural Funds to the Centre of Excellence CIPKeBiP IO-0048 (D. Turk).