Ultrasonographic ovarian mass scoring system for predicting malignancy in pregnant women with ovarian mass
Abstract
During routine antenatal ultrasound examinations, an ovarian mass can be found incidentally. In clinical practice, the differential diagnosis between benign and malignant ovarian masses is essential for planning further management. With recent advances in ultrasonography, ultrasound imaging has become the most widely used diagnostic tool during pregnancy. In non-pregnant women, several methods have been used to predict malignant ovarian masses before surgery. The International Ovarian Tumor Analysis (IOTA) group reported several scoring systems, such as the IOTA simple rules, IOTA logistic regression models, and IOTA assessment of different NEoplasias in the adneXa. Other researchers have also evaluated the malignancy of ovarian masses before surgery using scoring systems such as the Sassone score, pelvic mass score, DePriest score, Lerner score, and Ovarian-Adnexal Reporting and Data System. These researchers suggested specific features of ovarian masses that can be used for differential diagnosis, including size, proportion of solid tissue, papillary projections, inner wall structure, locules, wall thickness, septa, echogenicity, acoustic shadows, and presence of ascites. Although these factors can also be measured in pregnant women using ultrasound, only a few studies have applied ovarian scoring systems in pregnant women. In this article, we review various scoring systems for predicting malignant tumors of the ovary and discuss whether they can be applied to pregnant women.
Introduction
Malignant ovarian tumors are the seventh most common malignancy in women worldwide, accounting for 3.7% of all female cancer cases. These tumors have a high mortality rate owing to late detection, with 67% of the patients already having progressive disease at the time of diagnosis [1]. Recent studies have reported that ovarian tumors are found in 2% to 10% of pregnant women [2–4]. Most ovarian tumors detected during pregnancy are reported to disappear by late pregnancy, and those smaller than 5 cm are reported to disappear at a rate of 71% to 89% [3,5,6]. Nevertheless, clinicians should still consider the possibility of malignant ovarian tumors to ensure proper management.
Although there have been no definite ultrasound-based diagnostic criteria for ovarian cancer, many efforts have been made to distinguish benign from malignant tumors using ultrasonographic findings. Various ovarian scoring systems have been proposed for non-pregnant gynecological patients. With recent advances in ultrasonography, some studies have reported the detailed features of tissue composition using shades of colors and attempted to diagnose the histological and clinical stages of ovarian cancer before surgery [7–9].
Ultrasonography is the most commonly used tool for evaluating ovarian tumors during pregnancy because of its relative safety. As other diagnostic modalities, such as computed tomography (CT) or magnetic resonance angiography with contrast, cannot be used during pregnancy, the application of an ovarian scoring system based on the findings of ultrasonography, which can be safely performed during pregnancy, may help determine the timing of surgery during or after pregnancy.
In this article, we reviewed several scoring systems for predicting malignant ovarian tumors and discussed the applicability of the scoring systems during pregnancy.
Review of ovarian mass scoring systems
1. International Ovarian Tumor Analysis (IOTA)
The IOTA study is the largest study on the accuracy of ultrasound for the diagnosis of ovarian tumors, which has brought many benefits to the field of transvaginal ultrasonography (TVS). Studies have been conducted to explain the morphological and Doppler ultrasound characteristics of these tumors [10,11].
The IOTA group was started 15 years ago to create an appropriate “evidence-based” algorithm for all types of ovarian tumors. IOTA includes several scoring systems, including the IOTA simple rules, IOTA logistic regression (LR) models, and IOTA assessment of different NEoplasias in the adneXa (ADNEX) model.
The first scoring system is the IOTA simple rules. This is based on a set of five ultrasound features indicative of a benign tumor (B-features) and five ultrasound features indicative of a malignant tumor (M-features). If only B-features are observed, the ovarian tumor is classified as benign. If the ovarian tumor shows only M-features, it is classified as malignant. However, this scoring system has a disadvantage when both B- and M-features or features that do not correspond to these criteria are observed [12].
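The decision logic of the simple rules can be sketched in a few lines. The following is a minimal illustration, assuming the examiner has already counted how many B-features and M-features are present; the function name and the label "inconclusive" are illustrative choices, not part of the IOTA publications.

```python
def classify_simple_rules(n_benign_features: int, n_malignant_features: int) -> str:
    """Sketch of the IOTA simple rules decision logic described in the text.

    Inputs are the numbers of observed B-features and M-features (0 to 5 each).
    """
    if not (0 <= n_benign_features <= 5 and 0 <= n_malignant_features <= 5):
        raise ValueError("each feature count must be between 0 and 5")
    if n_benign_features > 0 and n_malignant_features == 0:
        return "benign"
    if n_malignant_features > 0 and n_benign_features == 0:
        return "malignant"
    # Both kinds of features are present, or none applies:
    # the rules cannot classify the tumor.
    return "inconclusive"


print(classify_simple_rules(2, 0))
print(classify_simple_rules(1, 1))
```

The third return value corresponds to the disadvantage noted above: tumors with mixed or no applicable features are left unclassified and need another method of assessment.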
The simple rules system was reported to have a sensitivity of 93% and a specificity of 90% [13]. Thereafter, several validation papers reported a sensitivity of 86% to 93% and a specificity of 88% to 94%. However, when classifying inconclusive cases as malignant, the sensitivity was 91% to 96% and the specificity was 65% to 87% [10].
In 2005, more universally useful LR models, IOTA LR1 and LR2, were developed to distinguish between malignant and benign adnexal tumors before surgery [14]. The IOTA LR1 is calculated using a total of 12 factors according to the following criteria: (1) history of ovarian cancer (yes=1, no=0), (2) current hormonal treatment (yes=1, no=0), (3) patient’s age (in years), (4) maximum diameter of the ovarian mass (in millimeters), (5) presence of pain during the examination (yes=1, no=0), (6) presence of ascites (yes=1, no=0), (7) presence of blood flow within a solid papillary projection (yes=1, no=0), (8) presence of a completely solid tumor (yes=1, no=0), (9) maximal diameter of the solid component (but with no increase >50 mm), (10) irregular internal cyst walls (yes=1, no=0), (11) presence of acoustic shadows (yes=1, no=0), and (12) color score (1, 2, 3, or 4). The formula is $y = 1/(1 + e^{-z})$, where z=−6.7468+1.5985 (1) −0.9983 (2) +0.0326 (3) +0.00841 (4) −0.8577 (5) +1.5513 (6) +1.1737 (7) +0.9281 (8) +0.0496 (9) +1.1421 (10) −2.3550 (11) +0.4916 (12), each number in parentheses stands for the value of the corresponding factor, and e is the base of the natural logarithm [14].
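As a worked illustration of the LR1 formula, the sketch below evaluates z and the resulting probability y using the coefficients quoted above. The dictionary keys are hypothetical names chosen for readability, and capping the solid component at 50 mm is our reading of "no increase >50 mm"; neither is taken from the original publication.

```python
import math

# Intercept and coefficients for factors (1) to (12) as quoted in the text.
LR1_INTERCEPT = -6.7468
LR1_COEFFICIENTS = {
    "history_ovarian_cancer": 1.5985,   # (1) yes=1, no=0
    "hormonal_treatment": -0.9983,      # (2) yes=1, no=0
    "age_years": 0.0326,                # (3)
    "max_diameter_mm": 0.00841,         # (4)
    "pain": -0.8577,                    # (5) yes=1, no=0
    "ascites": 1.5513,                  # (6) yes=1, no=0
    "papillary_flow": 1.1737,           # (7) yes=1, no=0
    "purely_solid": 0.9281,             # (8) yes=1, no=0
    "max_solid_diameter_mm": 0.0496,    # (9) assumed to be capped at 50 mm
    "irregular_walls": 1.1421,          # (10) yes=1, no=0
    "acoustic_shadows": -2.3550,        # (11) yes=1, no=0
    "color_score": 0.4916,              # (12) 1, 2, 3, or 4
}


def lr1_probability(factors: dict) -> float:
    """Return y = 1 / (1 + e^(-z)) for a dict holding all 12 factors."""
    values = dict(factors)
    # Assumption: solid-component diameters above 50 mm are treated as 50 mm.
    values["max_solid_diameter_mm"] = min(values["max_solid_diameter_mm"], 50)
    z = LR1_INTERCEPT + sum(
        coef * values[name] for name, coef in LR1_COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical patient: 50 years old, 80 mm cyst, color score 2, no other findings.
example = {name: 0 for name in LR1_COEFFICIENTS}
example.update(age_years=50, max_diameter_mm=80, color_score=2)
print(round(lr1_probability(example), 3))
```

The negative coefficients for pain, hormonal treatment, and acoustic shadows mean that these findings lower the estimated probability of malignancy, whereas all other factors raise it.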
The IOTA LR2 is calculated based on six of the above criteria: (3), (6), (7), (9), (10), and (11). The formula used to determine the probability of malignancy is as follows: $y = 1/(1 + e^{-z})$, where z=−5.3718+0.0354 (3) +1.6159 (6) +1.1768 (7) +0.0697 (9) +0.9586 (10) −2.9486 (11). As with LR1, the probability y is dichotomized at a cutoff of 0.1 to make a predictive diagnosis of cancer [14,15]. In the original article, the area under the receiver operating characteristic curve (AUROC) of LR1 was 0.936, the sensitivity was 92.7%, and the specificity was 74.3%. The AUROC of LR2 was 0.916, the sensitivity was 89.9%, and the specificity was 70.7%. Thereafter, in several subsequently published validation papers, when the LR2 cutoff value was 10%, the sensitivity was 88% to 95% and the specificity was approximately 80% to 90% [10].
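The LR2 calculation and its dichotomization can be sketched in the same way. This is an illustration only: the parameter names are hypothetical, the 50 mm cap on the solid component follows our reading of criterion (9), and treating a probability equal to the cutoff as positive is an assumption because the text does not specify the boundary case.

```python
import math


def lr2_probability(age_years, ascites, papillary_flow,
                    max_solid_diameter_mm, irregular_walls, acoustic_shadows):
    """Return y = 1 / (1 + e^(-z)) with the LR2 coefficients quoted in the text.

    Binary findings are passed as 1 (yes) or 0 (no).
    """
    # Assumption: solid-component diameters above 50 mm are treated as 50 mm.
    solid = min(max_solid_diameter_mm, 50)
    z = (-5.3718
         + 0.0354 * age_years
         + 1.6159 * ascites
         + 1.1768 * papillary_flow
         + 0.0697 * solid
         + 0.9586 * irregular_walls
         - 2.9486 * acoustic_shadows)
    return 1.0 / (1.0 + math.exp(-z))


def lr2_predicts_malignancy(probability, cutoff=0.1):
    """Dichotomize at the 0.1 cutoff (>= is an assumption for the boundary)."""
    return probability >= cutoff


# Two hypothetical patients.
low = lr2_probability(30, 0, 0, 0, 0, 0)
high = lr2_probability(60, 1, 0, 40, 1, 0)
print(lr2_predicts_malignancy(low), lr2_predicts_malignancy(high))
```

Because the cutoff is low, the model favors sensitivity over specificity, which matches the performance figures reported for LR2.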
The IOTA ADNEX is the first risk model to differentiate between benign and four types of malignant ovarian tumors (borderline, stage I cancer, stage II–IV cancer, and secondary metastatic cancers). The ADNEX model includes three clinical factors and six ultrasound factors. The clinical factors are age (years), serum CA125 level (U/mL), and type of the center where the patient underwent ultrasound. The six ultrasound factors are as follows: maximal diameter of the lesion (mm), percentage of solid tissue (%), number of papillary projections (0, 1, 2, 3, and >3), presence of more than 10 cyst locules (yes/no), acoustic shadows (yes/no), and presence of ascites (yes/no) [16]. In the original article, the area under the curve (AUC) of the ADNEX model was 0.954 (95% confidence interval, 0.947 to 0.961) for the development data and 0.943 (0.934 to 0.952) for the validation data [16]. Using a previously proposed cutoff of 10% [14], the sensitivity for malignancy was 96.5% and the specificity was 71.3% for the validation data. The model discriminated well between benign tumors and each of the four types of malignancy, with AUCs between 0.85 (benign versus borderline) and 0.99 (benign versus stage II–IV cancer) [16]. Table 1-1 summarizes the details of each IOTA scoring system.
2. Sassone score
The Sassone score was developed by Sassone et al. [17]. This scoring system evaluates four parameters: inner wall structure, wall thickness, septa, and echogenicity [17]. First, the inner wall structure is divided into four categories: smooth, irregularities (≤3 mm), papillarities (>3 mm), and not applicable (mostly solid), with scores of 1, 2, 3, and 4, respectively. Second, the wall thickness is divided into three categories: thin (≤3 mm), thick (>3 mm), and not applicable (mostly solid), with scores of 1, 2, and 3, respectively. Third, the septa are classified into three categories: none, thin (≤3 mm), and thick (>3 mm), with scores of 1, 2, and 3, respectively. Fourth, echogenicity is divided into five categories: sonolucent, low echogenicity, low echogenicity with echogenic core, mixed echogenicity, and high echogenicity, with scores of 1, 2, 3, 4, and 5, respectively (Table 1-2). The maximum Sassone score is 15, and the minimum score is 4. In this scoring system, the cutoff score is 9. In the original article, the sensitivity and specificity were 100% and 83%, respectively [17].
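The Sassone score is a simple sum of four category scores, as the following sketch shows. The category keys are hypothetical labels for the categories listed above, and treating a total of 9 or more as suspicious is an assumption, since the text gives only the cutoff value and not the direction of the boundary.

```python
# Points per category, in the order given in the text.
INNER_WALL = {"smooth": 1, "irregularities": 2, "papillarities": 3, "mostly_solid": 4}
WALL_THICKNESS = {"thin": 1, "thick": 2, "mostly_solid": 3}
SEPTA = {"none": 1, "thin": 2, "thick": 3}
ECHOGENICITY = {
    "sonolucent": 1,
    "low": 2,
    "low_with_echogenic_core": 3,
    "mixed": 4,
    "high": 5,
}


def sassone_score(inner_wall, wall_thickness, septa, echogenicity):
    """Sum the four category scores (possible range: 4 to 15)."""
    return (INNER_WALL[inner_wall]
            + WALL_THICKNESS[wall_thickness]
            + SEPTA[septa]
            + ECHOGENICITY[echogenicity])


def sassone_suspicious(score, cutoff=9):
    """Assumption: a total at or above the cutoff is read as suspicious."""
    return score >= cutoff


score = sassone_score("papillarities", "thick", "thin", "mixed")
print(score, sassone_suspicious(score))
```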
3. Pelvic mass score (PMS)
PMS is a scoring system proposed by Rossi et al. [8], which includes the sonomorphological index-Sassone score, logarithmic value of CA125 level, type of vascularity, menopausal status, and resistive index of the adnexal mass. The formula for the PMS was proposed as follows:
In this formula, SASS is the numeric value of the Sassone score, log(CA125) is the base 10 logarithm of the CA125 level, VAS is the type of vascularization (peripheral=1, central/septal=2), MS is the menopausal state (pre-menopausal=1, post-menopausal=2), and RI is the numeric value of the resistance index of the pelvic mass (Table 1-2). The ROC curve method recommended a cutoff value of 29 for PMS analysis. The sensitivity of PMS was reported to be 93%, and the specificity was approximately 88% [8].
4. DePriest score
DePriest et al. [18] proposed a morphology index based on sonographic findings of ovarian cancer. It included the ovarian volume, wall structure, and septal structure. Scores of 0 to 4 were developed for each category according to specific criteria. First, ovarian volume was divided into five categories: <10 cm3, 10–50 cm3, >50–200 cm3, >200–500 cm3, and >500 cm3. The inclusion of the ovarian volume was considered important. DePriest et al. [18] also noted that no malignant ovarian tumor had a volume of <10 cm3. In another study, no malignancies were observed in postmenopausal women with unilocular ovarian cysts <3 cm in diameter. Second, the wall structure was characterized as follows: smooth (<3 mm thickness), smooth (≥3 mm thickness), papillary projection (<3 mm), papillary projection (≥3 mm), and predominantly solid. They noted that the most consistent sonographic characteristic of malignant ovarian tumors was abnormality of the wall structure. In their study, all malignant ovarian tumors had a papillary projection or solid component protruding from the inner wall of the tumor. Third, the septal structure was divided into five categories: no septa, thin septa (<3 mm), thick septa (3 mm to 1 cm), solid area (≥1 cm), and predominantly solid (Table 1-3). Papers published after the study by DePriest et al. [18] showed that the shape of the septa is an important component of malignancy evaluation [19]. In the original article, the cutoff value of the scoring system was 5. The sensitivity of the DePriest score was 100%, and the specificity was 61.2% [19].
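The morphology index can be sketched as the sum of three sub-scores. The sketch below assumes that the five categories of each parameter, in the order listed above, correspond to 0, 1, 2, 3, and 4 points; the category keys are hypothetical labels, and reading a total of 5 or more as suspicious is likewise an assumption about the boundary of the stated cutoff.

```python
def volume_points(volume_cm3: float) -> int:
    """Map ovarian volume (cm3) to 0-4 points using the five bands in the text."""
    if volume_cm3 < 10:
        return 0
    if volume_cm3 <= 50:
        return 1
    if volume_cm3 <= 200:
        return 2
    if volume_cm3 <= 500:
        return 3
    return 4


# Assumed mapping: categories in the listed order score 0 to 4 points.
WALL_STRUCTURE = {
    "smooth_under_3mm": 0,
    "smooth_3mm_or_more": 1,
    "papillary_under_3mm": 2,
    "papillary_3mm_or_more": 3,
    "predominantly_solid": 4,
}
SEPTAL_STRUCTURE = {
    "none": 0,
    "thin": 1,                # <3 mm
    "thick": 2,               # 3 mm to 1 cm
    "solid_area": 3,          # >=1 cm
    "predominantly_solid": 4,
}


def depriest_index(volume_cm3, wall_structure, septal_structure):
    """Sum the three sub-scores (possible range: 0 to 12)."""
    return (volume_points(volume_cm3)
            + WALL_STRUCTURE[wall_structure]
            + SEPTAL_STRUCTURE[septal_structure])


index = depriest_index(120, "papillary_3mm_or_more", "thin")
print(index, index >= 5)  # >= 5 as the positive reading is an assumption
```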
5. Lerner score
In 1993, the Lerner score was developed. It included four parameters: wall structure, shadowing, septa, and echogenicity. The modified scoring system proposed by Lerner et al. [20] was primarily based on the Sassone scoring system [17]. The score assignments previously used in the Sassone scoring system were changed according to the results of a computer-based multiple linear regression analysis, and two other modifications were included. On the basis of this analysis, the variable wall thickness was discarded as insignificant, and a new category, “shadowing,” defined as the loss of acoustic echoes behind a sound-absorbing structure, was added. This new category allows for a more accurate identification of benign cystic teratoma, which was the cause of many false-positive results in previous studies.
Scores of 0 to 3 were developed for each category according to specific criteria. First, the wall structure was divided into three categories: smooth or small irregularities <3 mm (0 points), solid or non-applicable (2 points), and papillarities ≥3 mm (3 points). Second, the shadowing score consisted of “yes” (0 points) and “no” (1 point) categories. Third, the septa were divided into two categories: none or thin (0 points) and thick (1 point). Fourth, echogenicity was also divided into two categories: sonolucent or low-level echo or echogenic core (0 points) and mixed or high (3 points) (Table 1-3). In the original article, the cutoff value of the scoring system was 3. The sensitivity and specificity of the Lerner score were 96.8% and 77%, respectively [20].
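The weighted Lerner score can be sketched as follows. The point values are those given above; the argument names and category keys are hypothetical, and reading a total of 3 or more as suspicious is an assumption about the boundary of the stated cutoff.

```python
# Points for wall structure as listed in the text.
WALL_STRUCTURE = {
    "smooth_or_small_irregularities": 0,  # <3 mm
    "solid_or_not_applicable": 2,
    "papillarities": 3,                   # >=3 mm
}


def lerner_score(wall_structure, shadowing, thick_septa, mixed_or_high_echogenicity):
    """Sum the four weighted parameters (possible range: 0 to 8).

    shadowing, thick_septa, and mixed_or_high_echogenicity are booleans.
    """
    score = WALL_STRUCTURE[wall_structure]
    score += 0 if shadowing else 1          # absence of shadowing adds 1 point
    score += 1 if thick_septa else 0        # none or thin septa add nothing
    score += 3 if mixed_or_high_echogenicity else 0
    return score


# Hypothetical mass with mixed echogenicity and acoustic shadowing.
score = lerner_score("smooth_or_small_irregularities", True, False, True)
print(score, score >= 3)  # >= 3 as the positive reading is an assumption
```

Note that shadowing lowers the score, which reflects its purpose of separating benign cystic teratomas from malignant masses.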
6. Ovarian-Adnexal Reporting and Data System (O-RADS)
In 2018, the American College of Radiology divided ovarian and adnexal tumors into six categories to construct a unified imaging report for communication and quality improvement. The six categories were as follows: 1) description of major categories, physiological category or not; 2) description of size; 3) description of solid lesions; 4) description of cystic lesions; 5) description of vascularity; and 6) general and extraovarian findings. First, the major categories were divided into two categories: physiological category (follicle and corpus luteum) and lesion category (unilocular, unilocular cyst with solid component, multilocular cyst without solid elements, multilocular cyst with solid component, and solid or solid appearance). Second, size was defined as the maximum diameter. Third, solid lesions were divided into two categories: external contour (smooth and irregular) and internal content (acoustic shadowing). Fourth, cystic lesions included the inner margin (papillary projection or nodule, and smooth/irregular) and internal content (anechoic fluid and hyperechoic components). Fifth, vascularity was divided into four color scores (CS): 1 (no flow), 2 (minimal flow), 3 (moderate flow), and 4 (very strong flow). Sixth, the general and extraovarian findings were as follows: cul-de-sac fluid, ascites, and peritoneal thickening or nodules.
With reference to these descriptions, risk classification was performed for tumors and six O-RADS scores were created. The meaning of each score was as follows: O-RADS 0, incomplete evaluation; O-RADS 1, physiological category (normal premenopausal ovary); O-RADS 2, almost certainly benign category (<1% risk of malignancy) [21] (including simple cysts, unilocular cysts with smooth walls, and ovarian mass with maximal size less than 10 cm); O-RADS 3, lesions with a low risk of malignancy (1% to 10%) (including unilocular cysts [≥10 cm], typical dermoid cysts, endometriomas, and hemorrhagic cysts [≥10 cm]; unilocular cysts of any size with an irregular inner wall [<3 mm height]; multilocular cysts [<10 cm] with a smooth inner wall and CS=1–3; and solid smooth mass of any size with CS=1); O-RADS 4, lesions with an intermediate risk of malignancy (10% to 50%) (including multilocular cysts without a solid component, unilocular cysts with a solid component, multilocular cysts with a solid component, and solid tumors); O-RADS 5, lesions with a high risk of malignancy (>50%) (including unilocular cysts of any size, with ≥4 papillary projections and CS=any; multilocular cysts of any size with a solid component and CS=3–4; solid smooth mass of any size with CS=4; solid irregular mass of any size with CS=any; and ascites or peritoneal nodules).
In the validation study of O-RADS, the optimal cutoff value for predicting malignancy was O-RADS >3. The sensitivity of O-RADS was 98.7%, and the specificity of this system was 83.2% [22]. Table 1-4 summarizes the details of O-RADS.
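The dichotomization used in the validation study can be expressed as a short sketch. The labels paraphrase the category descriptions given above; the function name is hypothetical, and rejecting O-RADS 0 is an assumption, on the grounds that an incomplete evaluation cannot be assigned to either group.

```python
ORADS_LABELS = {
    0: "incomplete evaluation",
    1: "physiological category",
    2: "almost certainly benign",
    3: "low risk of malignancy",
    4: "intermediate risk of malignancy",
    5: "high risk of malignancy",
}


def orads_predicts_malignancy(category: int) -> bool:
    """Apply the O-RADS >3 cutoff reported in the validation study."""
    if category not in ORADS_LABELS:
        raise ValueError("O-RADS category must be an integer from 0 to 5")
    if category == 0:
        # Assumption: an incomplete evaluation is not dichotomized.
        raise ValueError("O-RADS 0 (incomplete evaluation) cannot be classified")
    return category > 3


for category in range(1, 6):
    print(category, ORADS_LABELS[category], orads_predicts_malignancy(category))
```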
Previous studies on ovarian masses in pregnant women
Several authors have reported that 51% to 92% of adnexal masses resolve during pregnancy [6,22,23], and the predictors of persistence are mass size greater than 5 cm and a “complex” structure on TVS [24]. The incidence of acute complications is known to be less than 2% [25]. The American College of Obstetricians and Gynecologists has published guidelines outlining methods for the diagnosis and management of adnexal masses that occur outside of pregnancy [25]. However, clinical guidelines for women with ovarian masses during pregnancy remain elusive.
Previous studies evaluated only a few ultrasound features during pregnancy, and most studies focused on outcomes in the presence of an ovarian mass (Table 2).
Bernhard et al. [6] investigated the risk factors for persistent ovarian masses during pregnancy in the largest study on ovarian masses during pregnancy. They divided ovarian masses into four sonographic categories: simple cysts with an average diameter of <5 cm measured on ultrasound, simple cysts with an average diameter of ≥5 cm, polycysts (mass containing one or more simple cysts), and complex masses. A cyst with smooth walls and no internal echoes was defined as a simple cyst. Masses that did not meet the criteria for hard masses or simple cysts were defined as complex masses. All simple cysts with an average diameter of >1 cm, all polycystic masses, and all complex masses of any size were included in the analysis. Most ovarian masses identified on ultrasound during pregnancy were small simple cysts that did not pose a pregnancy risk. Most large or ultrasonically complex masses spontaneously resolved. In this study, the factors that best predicted the persistence of an ovarian mass were shape and size on ultrasound [6].
The most recent study (published in 2021) compared the accuracy of malignancy prediction between ovarian scoring systems and subjective assessment [26]. It was a retrospective multicenter study conducted in Poland. The authors concluded that subjective evaluation is the best predictor of malignancy in complex adnexal masses found on antenatal ultrasonography in pregnant women. For inexperienced sonographers, the IOTA simple rules risk and ADNEX scoring systems can also be used for the characterization of these tumors; however, the serum tumor markers CA125 and human epididymis protein 4 (HE4) and the Risk of Ovarian Malignancy Algorithm (ROMA) seem less accurate [26].
Recently, we reported a multicenter retrospective cohort study, involving 11 referral hospitals, on ovarian mass scoring systems during pregnancy [27]. We compared ultrasonographic ovarian mass scoring systems (Sassone score, Lerner score, and IOTA ADNEX) and evaluated the factors that could predict the malignancy risk in pregnant women. The main findings of the study were as follows: among pregnant women with ovarian masses, the ovarian mass score of patients with malignant ovarian masses was significantly higher than that of patients with benign masses in all three scoring systems. The Sassone scoring system had the highest AUROC (0.831 for Sassone, 0.710 for Lerner, and 0.709 for IOTA ADNEX; P<0.05 for Sassone vs. Lerner and IOTA ADNEX). Among the ultrasound characteristics, six factors showed statistically significant differences (maximal diameter of the ovarian mass, maximal diameter of an ovarian solid mass, inner wall structure, wall thickness, thickness of septation, and papillarity). A combined model was developed with these six components, which showed similar accuracy to the Sassone scoring system. We concluded that malignant ovarian tumors in pregnant women can be predicted with high accuracy using either the Sassone scoring system or a combined model [27].
Discussion
1. Factors constituting the ovarian mass scoring systems
The factors of representative ovarian scoring systems studied thus far (IOTA, Sassone score, PMS, DePriest score, Lerner score, and O-RADS) are compared in Table 3. The common features of each ovarian scoring system are size, proportion of solid tissue, papillary projections, inner wall structure, locules, wall thickness, septa, echogenicity, acoustic shadows, and presence of ascites.
In the Sassone scoring system, high echogenicity had the highest score, and the larger the solid portion, the higher the probability of malignancy. The DePriest and Lerner scores also increased as the size of the solid portion increased. In the Lerner score, high scores were assigned to mixed or high echogenicity and papillarity in the wall structure. In addition, only the IOTA ADNEX and PMS included CA125 level.
In the analysis using the Lerner scoring system, it was also found that papillary masses within the inner wall structure category had a greater percentage of malignancy than masses that were mostly solid. For this reason, papillary masses were assigned a higher weight. Significant values were obtained in the regression analysis using these four variables, and regression coefficients were used to calculate the final weight of each variable [20].
2. Ovarian mass scoring systems in pregnant women
In pregnant women, the incidence of adnexal masses is approximately 0.05% to 3.2% [6,23]. Mature teratomas and para-ovarian or luteal cysts are the most commonly reported pathological diagnoses [28–30]. Approximately 1.2% to 6.8% of pregnant women with persistent adnexal masses are diagnosed with malignancy [23,31–33].
Most ovarian masses are discovered incidentally during routine ultrasound examinations in pregnant women [4,34–36]. Previously, the detection rate of such masses was low because of the lack of techniques for early detection [34]. However, the incidence and detection rate of ovarian masses significantly increased with the application of ultrasonography in antenatal care [4,34,37].
With the developments in ultrasound technology, many studies have been conducted on scoring systems for malignancy evaluation using ultrasound characteristics in non-pregnant women; however, only a few studies have applied these scoring systems to pregnant women.
When making clinical decisions in pregnant women with ovarian masses, both the mother and the fetus should be considered, which makes the clinical decision more complex. Pregnancy complications and malignancy are the most important factors in the decision making process.
If the ovarian scoring systems mentioned above are applied to pregnant women, the following should be noted. First, as gestation advances, ascites becomes difficult to detect with ultrasound in pregnant women because it is hidden by the enlarged uterus. Second, the diagnostic utility of serum tumor markers in women with ovarian masses found during pregnancy remains controversial. The CA125 level is elevated during pregnancy. It peaks in the first trimester (range, 7–251 units/mL) and steadily decreases thereafter [38]. It is similar between pregnant women in the second and third trimesters and control women [39]. In general, slight elevations in CA125 level during pregnancy are not associated with malignancy [24]. CA125 and another widely used tumor marker, HE4, have been analyzed in pregnant women; however, the prognostic value of these markers alone or in combination with the ROMA is unknown in pregnancy [40–42].
Ultrasound is helpful in evaluating pregnant women with ovarian masses. Various ultrasound features of ovarian masses of various etiologies can be evaluated in pregnant women in actual clinical practice. Some ultrasound features may raise the suspicion of malignancy, including but not limited to the presence of solid components, increased wall thickness, large multilocular tumors with a maximum diameter of >6 cm, and internal septa thicker than 2–3 mm. The presence of papillary protrusions, increased vascularity during Doppler examination, and ascites can also be clues to suspect malignancy [37,43,44].
Other imaging tests, such as CT, are inappropriate for assessing ovarian masses during pregnancy because of fetal radiation effects. Pelvic ultrasonography is considered the modality of choice for evaluating ovarian masses found during pregnancy and is suitable for guiding surgical intervention, if necessary [34,43]. Ultrasound may also be used to monitor the progression or regression of the ovarian mass, in terms of both size and characteristics, as the gestational age progresses [34].
Conclusion
In this review, we investigated the usefulness of existing ovarian scoring systems in pregnant women with ovarian masses. Unlike for general gynecological patients, the available imaging modalities for pregnant women are limited. Ultrasonography can be the optimal tool considering that pregnant women undergo ultrasound examinations several times during prenatal care. However, few research results have been published on whether the prediction of malignancy using ovarian scoring systems in pregnant women is accurate. Hence, more studies are needed on the application of scoring systems for the evaluation of ovarian mass malignancy in pregnant women.
Acknowledgments
We are grateful to the patients included in this study. We thank the medical staff for their assistance.
Notes
Han Sung Hwang has been an Editorial Board member of Obstetrics & Gynecology Science; however, he was not involved in the peer reviewer selection, evaluation, or decision process of this article. Otherwise, no other potential conflicts of interest relevant to this article were reported.
Conflict of interest
No potential conflict of interest relevant to this article was reported.
Ethical approval
This study does not require approval of the Institutional Review Board because no patient data is contained in this article.
Patient consent
Not applicable.
Funding information
None.