Causal association of cathepsins with female infertility: a bidirectional Mendelian randomization analysis

Article information

Obstet Gynecol Sci. 2025;68(3):237-243
Publication date (electronic): 2025 April 3
doi: https://doi.org/10.5468/ogs.24254
Guangxi Reproductive Medical Center, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
Corresponding author: Lidan Liu, PhD, Guangxi Reproductive Medical Center, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, China. E-mail: liulidan2022@126.com
Received 2024 September 21; Revised 2024 December 24; Accepted 2025 March 21.

Abstract

Objective

This study aimed to systematically evaluate potential causal relationships between nine cathepsins and female infertility using Mendelian randomization (MR) methods.

Methods

A bidirectional MR analysis was conducted utilizing single nucleotide polymorphisms as instrumental variables to investigate the potential causal effects between nine cathepsins and female infertility. Genetic data on female infertility were sourced from the FinnGen study, and cathepsin-related data were obtained from genome-wide association studies datasets of European ancestry.

Results

Elevated levels of cathepsin E were significantly and inversely associated with the risk of female infertility, suggesting a potential protective role. This finding was further supported by multivariable MR analysis. However, no significant associations were observed between the other eight cathepsins and female infertility.

Conclusion

This study represents the first systematic MR analysis to identify a potential protective effect of cathepsin E on female infertility.

Introduction

Infertility is defined as the inability to achieve a clinical pregnancy after 12 months or more of regular, unprotected intercourse [1]. Globally, the prevalence of infertility among couples of reproductive age ranges from 8% to 12%, with approximately one in eight couples seeking medical treatment after failing to conceive within a year [2]. Multiple risk factors are associated with infertility, including advanced maternal age, lifestyle factors (such as substance use, smoking, and alcohol consumption), sexually transmitted diseases, pelvic inflammatory disease, obesity, polycystic ovary syndrome, and diabetes. Additionally, disorders affecting the fallopian tubes, ovaries, or uterus can also contribute to female infertility [3-5]. Enhancing our understanding of the etiological factors underlying female infertility is crucial for developing preventive strategies and improving reproductive health outcomes.

Cathepsins are a class of lysosomal proteases that play a vital role in maintaining cellular homeostasis and regulating various biological processes [6]. Based on their catalytic mechanisms, cathepsins are classified into several families, including serine proteases (such as cathepsins A and G), cysteine proteases (such as cathepsins B, C, F, H, K, L, O, S, V, W, and X), and aspartic proteases (such as cathepsins D and E) [7]. Within cells, cathepsins contribute to a wide range of functions, including protein and lipid metabolism, autophagy, antigen presentation, growth factor receptor recycling, cellular stress signaling, extracellular matrix degradation, and lysosome-mediated programmed cell death [8]. Given their involvement in these essential physiological and pathological processes, cathepsins are recognized as key regulatory enzymes. Dysregulation of cathepsin activity has been implicated in various pathological conditions, including cancer, neurodegenerative diseases, cardiovascular diseases, and metabolic disorders [9]. Therefore, elucidating the molecular mechanisms underlying cathepsin activity and regulation in pathological contexts is critical for advancing our understanding of complex cellular networks and may offer promising therapeutic targets for novel treatment strategies.

One study comparing protein expression in the endometrium of patients with polycystic ovary syndrome (PCOS) and healthy women reported abnormal cathepsin expression in patients with PCOS. These abnormalities may lead to endometrial dysfunction, affecting its receptivity to embryos and thereby contributing to infertility [10]. Another study discovered that cathepsin B levels in the follicular fluid of pregnant patients were significantly higher than those in non-pregnant patients, suggesting that elevated cathepsin B levels are closely associated with improved pregnancy outcomes. Additionally, cathepsin B levels showed a certain degree of accuracy in predicting pregnancy outcomes. These findings suggest that cathepsins may play an important role in promoting the reproductive process in humans [11].

However, research on the role of cathepsins in reproductive health remains limited, particularly regarding the relationship between cathepsins and female infertility, where robust evidence is lacking. Therefore, there is a critical need to investigate this relationship using Mendelian randomization (MR). This method leverages genetic variants as instrumental variables (IVs) to assess potential causal relationships, thereby addressing confounding and reverse causation that commonly affect traditional observational studies. This approach could provide more reliable evidence in clarifying whether and how cathepsins influence female infertility.

Materials and methods

1. Study design

We conducted a bidirectional MR analysis to investigate potential causal relationships between nine cathepsin traits and female infertility. Single nucleotide polymorphisms (SNPs) strongly associated with the exposure factors were selected as IVs. To ensure analytical validity, SNPs in linkage disequilibrium (LD) or identified as weak instruments were excluded. For robust causal inference in MR, three key assumptions must be met: 1) the IVs must be strongly associated with the exposure; 2) they must be independent of confounding factors; and 3) they must influence the outcome only through the exposure, without exerting a direct effect on the outcome. To avoid sample overlap, we used independent genome-wide association studies (GWAS) datasets from different populations. As the original studies had obtained ethical approval and all data used were publicly available, no additional ethical approval was required for this analysis.

2. GWAS summary data sources

The GWAS summary data on female infertility were obtained from the FinnGen study, a large-scale human genome resource, using data from the R11 release (https://finngen.gitbook.io/) [12]. This dataset included genomic information from 136,188 Finnish individuals (16,720 cases and 119,468 controls), covering 20,423,066 genetic variants. In this study, female infertility was defined as the inability to conceive after a specified period of unprotected intercourse.

GWAS summary data for the cathepsins were obtained from the publicly accessible IEU OpenGWAS database (https://gwas.mrcieu.ac.uk), which provides summary-level data from GWASs conducted in cohorts of European ancestry [13,14]. The cathepsins included in this analysis were cathepsins B, E, F, G, H, O, S, Z, and L2.

3. Instrumental variable selection

IVs (SNPs) were selected using established MR guidelines. SNPs significantly associated with each exposure were identified using a genome-wide significance threshold of P<1×10⁻⁵. To ensure independence, a stringent LD threshold was applied (R²<0.001), and clumping was performed over a 10,000 kb window. SNPs with ambiguous strand alignment or allele frequency discrepancies were excluded. The strength of the selected instruments was assessed using R² and F-statistics, ensuring that they met standard thresholds to minimize the risk of weak instrument bias [15,16].

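$$F = \frac{R^2 (N - 2)}{1 - R^2}$$

$$R^2 = \frac{2\beta^2 \cdot \mathrm{EAF}(1 - \mathrm{EAF})}{2N \cdot \mathrm{EAF}(1 - \mathrm{EAF}) \cdot \mathrm{SE}^2}$$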
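As an illustration only, the instrument-strength calculation can be sketched in Python, reading the formulas as F = R²(N−2)/(1−R²) and R² = 2β²·EAF·(1−EAF) / (2N·EAF·(1−EAF)·SE²). The function name and the input values are hypothetical and are not taken from the study data. The F>10 cut-off in the last line is the conventional rule of thumb and is an assumption here, since the text refers only to "standard thresholds".

```python
def instrument_strength(beta, se, eaf, n):
    """Return (R2, F) for one SNP under the formulas as read above.

    beta: SNP-exposure effect estimate
    se:   standard error of beta
    eaf:  effect allele frequency
    n:    sample size of the exposure GWAS
    """
    numerator = 2 * beta**2 * eaf * (1 - eaf)
    denominator = 2 * n * eaf * (1 - eaf) * se**2
    r2 = numerator / denominator
    f_stat = r2 * (n - 2) / (1 - r2)
    return r2, f_stat


# Hypothetical summary statistics for a single SNP (illustrative only).
r2, f_stat = instrument_strength(beta=0.10, se=0.02, eaf=0.30, n=3000)
print(f"R2 = {r2:.4f}, F = {f_stat:.1f}")

# Assumed rule of thumb: treat F > 10 as a sufficiently strong instrument.
print("strong instrument" if f_stat > 10 else "weak instrument")
```

Under this reading the allele-frequency terms cancel, so R² reduces to β²/(N·SE²).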

4. Statistical analysis

Causal relationships between cathepsins and female infertility were assessed using the TwoSampleMR package in R (R Foundation for Statistical Computing, Vienna, Austria). Several MR methods were employed, including inverse-variance weighted (IVW), MR-Egger, weighted median, simple mode, and weighted mode approaches. The IVW method served as the primary analytical approach, while the other methods were conducted as sensitivity analyses to evaluate the robustness of the findings. Heterogeneity among IVs was assessed using Cochran’s Q test. Horizontal pleiotropy, in which an instrumental variable affects the outcome through pathways unrelated to the exposure, was evaluated using the MR-Egger intercept. Outlier SNPs were identified and excluded using the MR-PRESSO test. Finally, a leave-one-out analysis was conducted to evaluate the influence of individual SNPs.
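To make the primary method concrete, the following is a minimal Python sketch of a fixed-effect IVW estimate computed from per-SNP summary statistics. It is a simplified illustration and not the TwoSampleMR implementation used in the study. The function name and all input values are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    Equivalent to a weighted mean of the Wald ratios (beta_outcome /
    beta_exposure) with weights beta_exposure**2 / se_outcome**2.
    """
    num = sum(bx * by / sy**2
              for bx, by, sy in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx**2 / sy**2
              for bx, sy in zip(beta_exposure, se_outcome))
    beta = num / den
    se = math.sqrt(1 / den)
    return beta, se


# Hypothetical SNP-exposure and SNP-outcome effects (illustrative only).
bx = [0.10, 0.15, 0.20, 0.12]
by = [-0.006, -0.010, -0.013, -0.008]
sy = [0.004, 0.005, 0.006, 0.004]

beta_ivw, se_ivw = ivw_estimate(bx, by, sy)
print(f"IVW beta = {beta_ivw:.4f} (SE {se_ivw:.4f}), OR = {math.exp(beta_ivw):.3f}")
```

For a binary outcome such as infertility, the estimate is on the log-odds scale, so exponentiating it gives an odds ratio per unit increase in the exposure.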

Multivariable MR analysis was performed to assess the independent effects of multiple cathepsins simultaneously. This approach provided a comprehensive understanding of their individual contributions to female infertility.
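The idea behind multivariable MR can be sketched as a weighted regression of the SNP-outcome effects on the SNP-exposure effects of several exposures at once, with no intercept. The Python example below is a simplified illustration restricted to two exposures for brevity, whereas the study considered nine cathepsins. The function name and the data are hypothetical, and the outcome effects are constructed without noise so that the known coefficients are recovered.

```python
def mvmr_ivw_two_exposures(bx1, bx2, by, se_y):
    """Weighted least squares of by on (bx1, bx2) without an intercept.

    Weights are 1 / se_y**2. The 2x2 normal equations are solved with
    Cramer's rule, so only the standard library is needed.
    """
    w = [1 / s**2 for s in se_y]
    s11 = sum(wi * a * a for wi, a in zip(w, bx1))
    s12 = sum(wi * a * b for wi, a, b in zip(w, bx1, bx2))
    s22 = sum(wi * b * b for wi, b in zip(w, bx2))
    t1 = sum(wi * a * y for wi, a, y in zip(w, bx1, by))
    t2 = sum(wi * b * y for wi, b, y in zip(w, bx2, by))
    det = s11 * s22 - s12**2
    if det == 0:
        raise ValueError("exposure effects are collinear")
    b1 = (t1 * s22 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    return b1, b2


# Hypothetical SNP effects on two exposures (illustrative only).
bx1 = [0.10, 0.05, 0.20, 0.00, 0.15]
bx2 = [0.02, 0.12, 0.05, 0.18, 0.00]
# Noise-free outcome effects built from assumed direct effects -0.075 and 0.02.
by = [-0.075 * a + 0.02 * b for a, b in zip(bx1, bx2)]
se_y = [0.01, 0.012, 0.009, 0.011, 0.010]

b1, b2 = mvmr_ivw_two_exposures(bx1, bx2, by, se_y)
print(f"direct effect of exposure 1 = {b1:.3f}, exposure 2 = {b2:.3f}")
```

Each coefficient is the effect of one exposure on the outcome conditional on the other, which is what allows the independent contribution of a single cathepsin to be separated from the rest.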

Results

Employing stringent criteria for selecting IVs, we focused on five MR methods to explore the potential links between nine cathepsins and female infertility. Among these, the IVW method was designated as the primary analytical approach due to its statistical efficiency under the assumption of no horizontal pleiotropy. In addition, MR-Egger regression, weighted median, weighted mode, and simple mode methods were implemented to provide complementary sensitivity analyses.

1. Effects of cathepsins on female infertility

The IVW analysis identified a single statistically significant association. Analysis of nine SNPs associated with cathepsin E revealed a significant inverse relationship, suggesting that genetically predicted higher cathepsin E levels are associated with a reduced risk of female infertility (odds ratio, 0.938; 95% confidence interval, 0.897 to 0.982; P=0.006; Table 1). No significant associations were observed between the other eight cathepsins and female infertility (all P>0.05; Table 1).
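As a rough consistency check, the reported odds ratio and confidence interval for cathepsin E can be converted back to a log-odds estimate, a standard error, and an approximate two-sided P-value. The Python sketch below assumes a normal approximation and a confidence interval that is symmetric on the log scale. Because the inputs are rounded to three decimals, the result is approximate. The function name is hypothetical.

```python
import math


def p_from_or_ci(or_point, ci_low, ci_high, z_crit=1.96):
    """Recover beta, SE, z and a two-sided P-value from an OR and its 95% CI."""
    beta = math.log(or_point)
    # The CI spans 2 * z_crit standard errors on the log-odds scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# Values reported for cathepsin E in the forward IVW analysis.
beta, se, z, p = p_from_or_ci(0.938, 0.897, 0.982)
print(f"beta = {beta:.4f}, SE = {se:.4f}, z = {z:.2f}, P = {p:.4f}")
```

With these inputs the recovered P-value lands close to the reported 0.006, which suggests that the odds ratio, interval, and P-value are mutually consistent.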

For cathepsin E, Cochran’s Q test under the IVW and MR-Egger models yielded P-values of 0.525 and 0.435, respectively, indicating no statistically significant heterogeneity. The MR-Egger intercept was 0.004 (P=0.692), providing no evidence of directional pleiotropy. The MR-PRESSO global test (P=0.543) likewise showed no evidence of pleiotropic bias, supporting the robustness of the causal inference.
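For readers unfamiliar with the heterogeneity test, the following Python sketch shows how Cochran's Q can be computed around a fixed-effect IVW estimate and converted to a P-value. It is an illustration under simplifying assumptions (first-order weights, integer degrees of freedom), not the routine used in the study. The function names and the data are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df % 2 == 1:
        sf, nu = math.erfc(math.sqrt(x / 2)), 1
    else:
        sf, nu = math.exp(-x / 2), 2
    # Recurrence: sf(df + 2) = sf(df) + (x/2)**(df/2) * exp(-x/2) / Gamma(df/2 + 1)
    while nu < df:
        sf += (x / 2) ** (nu / 2) * math.exp(-x / 2) / math.gamma(nu / 2 + 1)
        nu += 2
    return sf


def cochran_q(beta_exposure, beta_outcome, se_outcome):
    """Cochran's Q for per-SNP Wald ratios around the IVW estimate."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [bx**2 / sy**2 for bx, sy in zip(beta_exposure, se_outcome)]
    ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - ivw) ** 2 for w, r in zip(weights, ratios))
    df = len(ratios) - 1
    return q, df, chi2_sf(q, df)


# Hypothetical summary statistics (illustrative only).
bx = [0.10, 0.15, 0.20, 0.12]
by = [-0.006, -0.010, -0.013, -0.008]
sy = [0.004, 0.005, 0.006, 0.004]

q, df, p_het = cochran_q(bx, by, sy)
print(f"Q = {q:.3f} on {df} df, P = {p_het:.3f}")
```

A large P-value, as reported for cathepsin E, indicates that the per-SNP estimates scatter around the pooled estimate no more than sampling error would predict.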

The leave-one-out sensitivity analysis, which systematically excludes each SNP and re-estimates the effect, showed no significant alterations in the outcomes, further underscoring the robustness of the findings (Supplementary Fig. 1). Funnel plots revealed no evidence of bias in the effects of the nine cathepsins (Supplementary Fig. 2). Scatter plots and forest plots are provided in Supplementary Figs. 3 and 4, respectively.

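2. Causal impact of female infertility on cathepsins

In the reverse MR analysis, which treated female infertility as the exposure (27 SNPs) and each cathepsin as the outcome, no evidence of reverse causality was observed for any of the nine cathepsins (all P>0.05; Table 1).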

3. Multivariable MR

In the multivariable MR analysis, a random-effects model was employed using the IVW method, MR-Egger regression, and the weighted median method to evaluate the causal effects of multiple exposures. Across all three methods, cathepsin E exhibited a statistically significant negative association, suggesting a protective role against female infertility (IVW, β=-0.075; P-value=0.004; MR-Egger, β=-0.073; P-value=0.006; weighted median, β=-0.100; P-value=0.004). The remaining exposures did not consistently show significant effects across the methods (Fig. 1). Additionally, the heterogeneity test results from the IVW and MR-Egger methods indicated the presence of heterogeneity in SNP-specific effects (IVW heterogeneity, P-value=0.0114; MR-Egger heterogeneity, P-value=0.0112). However, the MR-Egger intercept was not statistically significant (P-value=0.452), indicating no evidence of directional pleiotropy. These findings suggest that cathepsin E may exert a protective effect, while the effects of other exposures require further investigation.

Fig. 1.

Forest plot of multivariable Mendelian randomization analysis showing the causal effects of cathepsins on female infertility. OR, odds ratio; CI, confidence interval. *Significant associations (P<0.05) are indicated with an asterisk.

Discussion

Female infertility is a significant global public health concern, characterized by complex etiologies encompassing a wide range of physiological and pathological factors. Despite extensive research into the various causes of female infertility, proteomic biomarkers potentially associated with this condition remain relatively underexplored. In recent years, cathepsins, a crucial class of lysosomal proteases, have garnered considerable attention due to their pivotal roles in numerous cellular biological processes. However, there is a notable lack of research examining the potential causal relationships between cathepsins and female infertility, particularly regarding how aberrant protein expression might impact female reproductive health. Utilizing the MR approach, this study aimed to elucidate the role of cathepsins in the pathogenesis of female infertility, thereby providing novel insights into the pathological mechanisms underlying infertility and potentially helping to identify promising targets for future preventive and therapeutic strategies.

This study, using a bidirectional MR analysis, represents the first systematic evaluation of potential causal relationships between nine cathepsins and female infertility. The findings indicate that elevated levels of cathepsin E are significantly and inversely associated with the risk of female infertility, suggesting a potential protective role in female reproductive function. This finding was further supported by multivariable MR analysis, reinforcing the hypothesis that cathepsin E may exert protective effects against the development of female infertility. In contrast, no significant associations were observed for the remaining cathepsins, underscoring the unique role of cathepsin E in this context. Several explanations may account for these null findings. First, the biological functions of other cathepsins may not be directly implicated in reproductive processes, or their effects may be subtle and offset by compensatory mechanisms within the cathepsin family. Second, although the statistical power of this study was substantial, it may still have been insufficient to detect smaller effect sizes associated with other cathepsins. Third, the GWAS datasets used may have inherent limitations, such as sample size constraints or population-specific genetic effects, which could influence the detection of significant associations. Future studies involving larger, multi-ethnic cohorts and functional validation experiments are warranted to further investigate these possibilities.

Compared to previous studies, the present research provides novel insights into the role of cathepsins, particularly cathepsin E, in female infertility. Prior investigations have primarily focused on the broader biological functions of cathepsins in cellular homeostasis and their involvement in diverse pathological conditions, such as cancer, neurodegenerative diseases, and metabolic disorders [9]. However, limited attention has been given to their specific impact on reproductive health. While earlier studies suggested that abnormal expression of cathepsins could influence endometrial function and embryo receptivity, thereby contributing to infertility, they lacked robust evidence establishing a direct causal relationship [10,11]. This study addresses that gap by employing MR to establish a significant inverse association between cathepsin E levels and the risk of female infertility, thereby providing stronger and more reliable evidence for a potential protective role of cathepsin E in reproductive health. Additionally, the absence of significant associations for the other cathepsins reinforces the specificity of cathepsin E’s effect, a distinction that was not fully recognized in previous research. Collectively, these findings highlight cathepsin E as a unique biomarker and potential therapeutic target in the context of female infertility, offering a foundation for further research in this domain.

The protective effects of cathepsin E in female infertility are likely mediated by its involvement in key biological processes essential for reproductive health. Cathepsin E, an aspartic protease, plays a critical role in lysosomal function by contributing to protein turnover and maintaining cellular homeostasis [17]. Dysregulation of protein homeostasis has been implicated in impaired ovarian function, adversely affecting oocyte quality and endometrial receptivity. Furthermore, cathepsin E contributes to immune modulation by regulating antigen processing and presentation, processes that are essential for establishing a receptive endometrium and facilitating successful embryo implantation [18,19]. Its ability to modulate oxidative stress and inflammatory pathways may also protect reproductive tissues from damage, thereby fostering a physiological environment conducive to fertility [20]. Together, these mechanisms suggest that elevated cathepsin E levels may support female reproductive function and reduce infertility risk.

The absence of significant associations between other cathepsins and female infertility observed in this study can be attributed to several factors. First, the biological functions of these cathepsins may be less directly relevant to reproductive processes, or their effects may be masked by compensatory mechanisms within the cathepsin family. Second, variations in gene expression or enzymatic activity may not reach biologically meaningful thresholds capable of influencing infertility within the studied population. Finally, unmeasured environmental or epigenetic factors may interact with genetic predispositions, thereby obscuring the contributions of these cathepsins.

These findings align with previous studies that highlighted the roles of cathepsins in reproductive health, particularly in endometrial function and embryo implantation. For example, elevated cathepsin B levels have been linked to improved pregnancy outcomes [21]. However, the present study uniquely underscores the specificity of cathepsin E’s protective effects on female fertility, a relationship that had not been explicitly identified in prior research. Earlier studies primarily focused on broader proteomic profiles without establishing direct causal links [22]. In contrast, this MR analysis strengthens the evidence base by minimizing confounding and reverse causation, thereby offering a novel, targeted perspective on cathepsin E’s role in female reproductive health.

This study holds promising clinical implications. First, the significant inverse association between cathepsin E levels and female infertility supports its potential utility as a biomarker and therapeutic target. Second, given the protective role suggested for cathepsin E, drug development efforts could focus on modulating its activity as a novel therapeutic strategy to enhance reproductive health. Finally, this research lays the groundwork for further investigation into the roles of other cathepsins in female infertility and related reproductive disorders.

This study provides valuable insights but is subject to two significant limitations. First, the genetic data utilized in this analysis were predominantly derived from individuals of European ancestry, which may limit the generalizability of the findings to other ethnic groups and geographic regions. This limited diversity highlights the need for future studies to prioritize more inclusive sampling strategies, incorporating underrepresented populations to improve the applicability of genetic findings across diverse groups. Second, the constraints of existing GWAS datasets prevented comprehensive analysis of all cathepsins potentially implicated in female infertility.

As a result, some relevant genetic associations may remain unexplored, warranting further investigations using more extensive datasets that encompass a broader range of candidate genes. Future research should also consider integrating multi-omics approaches, such as transcriptomics and proteomics, to provide a more holistic understanding of cathepsin-related pathways and their roles in female infertility across different populations.

Notes

Conflict of interest

The authors declare no competing interests.

Ethical approval

Ethical approval was not required for this study as it involved publicly available data from previously published studies. All analyses were conducted in accordance with the ethical guidelines and regulations of the source data.

Patient consent

The study used public GWAS data from FinnGen (R11 release) and IEU OpenGWAS. All original studies obtained participant consent and ethical approval. No additional consent was required for this analysis of anonymized data.

Funding information

No specific funding was received for this study.

References

1. Cucinella G, Gullo G, Catania E, Perino A, Billone V, Marinelli S, et al. Stem cells and infertility: a review of clinical applications and legal frameworks. J Pers Med 2024;14:135.
2. Anyanwu M, Touray A, Kujabi T, Suwareh K, Sumbunu A, Drammeh R, et al. The prevalence and etiology of infertility in a tertiary specialist hospital in the Gambia. Glob Reprod Health 2024;9:e0090.
3. De Oliveira Trigo BR, Trigo A, Khan Sullivan F, Roshan D, Isazad M, Izquierdo H, et al. P-491 preliminary analysis of 943 patients using the fertility risk detection tool. Hum Reprod 2022;37:deac107–458.
4. Kadour-Peero E, Feferkorn I, Hadad-Liven S, Dahan MH. Does it affect the live birth rates to have a maximum endometrial thickness of 7, 8, or 9 mm in in-vitro fertilization-embryo transfer cycles? Obstet Gynecol Sci 2024;67:497–505.
5. Nori W, Helmi ZR. Can follicular fluid 8-oxo-2’-deoxyguanosine predict the clinical outcomes in ICSI cycle among couples with normospermia male? Obstet Gynecol Sci 2023;66:430–40.
6. Gallwitz L, Bleibaum F, Voss M, Schweizer M, Spengler K, Winter D, et al. Cellular depletion of major cathepsin proteases reveals their concerted activities for lysosomal proteolysis. Cell Mol Life Sci 2024;81:227.
7. Patel S, Homaei A, El-Seedi HR, Akhtar N. Cathepsins: proteases that are vital for survival but can also be fatal. Biomed Pharmacother 2018;105:526–32.
8. Cheng XW, Narisawa M, Wang H, Piao L. Overview of multifunctional cysteinyl cathepsins in atherosclerosis-based cardiovascular disease: from insights into molecular functions to clinical implications. Cell Biosci 2023;13:91.
9. Stoka V, Vasiljeva O, Nakanishi H, Turk V. The role of cysteine protease cathepsins B, H, C, and X/Z in neurodegenerative diseases and cancer. Int J Mol Sci 2023;24:15613.
10. Amjadi F, Mehdizadeh M, Ashrafi M, Nasrabadi D, Taleahmad S, Mirzaei M, et al. Distinct changes in the proteome profile of endometrial tissues in polycystic ovary syndrome compared with healthy fertile women. Reprod Biomed Online 2018;37:184–200.
11. Bastu E, Gokulu SG, Dural O, Yasa C, Bulgurcuoglu S, Karamustafaoglu Balci B, et al. The association between follicular fluid levels of cathepsin B, relaxin or AMH with clinical pregnancy rates in infertile patients. Eur J Obstet Gynecol Reprod Biol 2015;187:30–4.
12. Kurki MI, Karjalainen J, Palta P, Sipilä TP, Kristiansson K, Donner KM, et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 2023;613:508–18.
13. Folkersen L, Fauman E, Sabater-Lleal M, Strawbridge RJ, Frånberg M, Sennblad B, et al. Mapping of 79 loci for 83 plasma protein biomarkers in cardiovascular disease. PLoS Genet 2017;13:e1006706.
14. Sun BB, Maranville JC, Peters JE, Stacey D, Staley JR, Blackshaw J, et al. Genomic atlas of the human plasma proteome. Nature 2018;558:73–9.
15. Pierce BL, Ahsan H, VanderWeele TJ. Power and instrument strength requirements for Mendelian randomization studies using multiple genetic variants. Int J Epidemiol 2011;40:740–52.
16. Palmer TM, Lawlor DA, Harbord RM, Sheehan NA, Tobias JH, Timpson NJ, et al. Using multiple genetic variants as instrumental variables for modifiable risk factors. Stat Methods Med Res 2012;21:223–42.
17. Scarcella M, d’Angelo D, Ciampa M, Tafuri S, Avallone L, Pavone LM, et al. The key role of lysosomal protease cathepsins in viral infections. Int J Mol Sci 2022;23:9089.
18. Abdulkhalikova D, Sustarsic A, Vrtačnik Bokal E, Jancar N, Jensterle M, Burnik Papler T. The lifestyle modifications and endometrial proteome changes of women with polycystic ovary syndrome and obesity. Front Endocrinol (Lausanne) 2022;13:888460.
19. Scantamburlo VM, Linsingen RV, Centa LJR, Toso KFD, Scaraboto D, Araujo Júnior E, et al. Association between decreased ovarian reserve and poor oocyte quality. Obstet Gynecol Sci 2021;64:532–9.
20. Song H, Zhang R, Liu Y, Wu J, Fan W, Wu J, et al. Menstrual blood-derived endometrial stem cells ameliorate ovarian senescence by relieving oxidative stress-induced inflammation. Reprod Sci 2024;Nov. 5. [Epub]. https://link.springer.com/10.1007/s43032-024-01739-w.
21. Tanrıverdi Kılıç G, Yenigül NN, Dinçgez B, Yüce Bilgin E, Kılıç ÜK. The role of pentraxin 3 and cathepsin B levels in pregnancies complicated by preeclampsia. Biomarkers 2024;29:518–27.
22. Alikhani M, Amjadi F, Mirzaei M, Wu Y, Shekari F, Ashrafi M, et al. Proteome analysis of endometrial tissue from patients with PCOS reveals proteins predicted to impact the disease. Mol Biol Rep 2020;47:8763–74.

Table 1.

Bidirectional two-sample Mendelian randomization analysis of cathepsins and female infertility

| Trait | IEU GWAS ID | MR: SNPs | MR: OR (95% CI) | MR: P-value | Reverse MR: SNPs | Reverse MR: OR (95% CI) | Reverse MR: P-value |
|---|---|---|---|---|---|---|---|
| Cathepsin S | Prot-a-727 | 23 | 0.982 (0.936 to 1.031) | 0.468 | 27 | 0.956 (0.812 to 1.127) | 0.594 |
| Cathepsin F | Prot-a-722 | 12 | 0.997 (0.927 to 1.073) | 0.938 | 27 | 0.877 (0.752 to 1.023) | 0.095 |
| Cathepsin G | Prot-a-723 | 12 | 1.032 (0.983 to 1.084) | 0.208 | 27 | 0.967 (0.808 to 1.158) | 0.717 |
| Cathepsin H | Prot-a-725 | 11 | 1.013 (0.984 to 1.043) | 0.371 | 27 | 1.060 (0.898 to 1.251) | 0.493 |
| Cathepsin B | Prot-a-718 | 20 | 0.992 (0.955 to 1.031) | 0.681 | 27 | 1.045 (0.866 to 1.262) | 0.645 |
| Cathepsin O | Prot-a-726 | 12 | 0.962 (0.910 to 1.017) | 0.169 | 27 | 0.922 (0.769 to 1.106) | 0.383 |
| Cathepsin E | Prot-a-720 | 9 | 0.938 (0.897 to 0.982) | 0.006 | 27 | 0.937 (0.811 to 1.083) | 0.381 |
| Cathepsin Z | Prot-a-729 | 13 | 0.981 (0.945 to 1.019) | 0.321 | 27 | 1.057 (0.914 to 1.223) | 0.453 |
| Cathepsin L2 | Prot-a-728 | 12 | 0.952 (0.884 to 1.024) | 0.189 | 27 | 0.945 (0.817 to 1.092) | 0.445 |

MR, Mendelian randomization; IEU GWAS ID, integrative epidemiology unit GWAS database identifier; SNPs, single nucleotide polymorphisms; OR, odds ratio; CI, confidence interval; Prot, protein; GWAS, genome-wide association study.