Given the inherent risks of surgery and the potential consequences of postoperative complications, preoperative risk assessment is paramount. Since esophageal cancer is often diagnosed late and carries a poor prognosis, careful evaluation is necessary to determine when surgery is justified. Equally important is identifying cases where the risks outweigh the benefits, warranting alternative treatment options. A reliable means of risk–benefit assessment is therefore desirable, and various scoring systems are available for this purpose. One such system, recommended by the German S3 guideline for esophageal and gastric interventions, is the physiological and operative severity score for the enumeration of mortality adjusted for esophagogastric surgery (O‑POSSUM score) []. The literature presents differing perspectives on the O‑POSSUM score. While some authors consider it a precise and useful tool [], others believe it overpredicts mortality []. Other scoring systems also exist, such as the International Esodata Study Group (IESG) risk prediction model and the Fuchs score, which offer simple preoperative risk scales for counseling prior to surgery. However, none of these alternative scores is currently recommended at the national level. Therefore, the aim of this study is to validate and assess the predictive power of the O‑POSSUM score regarding mortality and morbidity following esophagectomy.

As per the German national treatment guideline for esophageal cancer (S3 guideline), surgical resection stands as a standardized treatment for tumors in the middle and distal thirds of the esophagus, aiming for a cure []. Minimally invasive techniques are predominantly recommended, with the guideline favoring either a minimally invasive approach or a hybrid technique combining minimally invasive and open procedures []. Nonetheless, even with minimally invasive techniques, surgical resection of esophageal cancer places a burden on patients [], often leading to postoperative complications []. Complications can arise due to accompanying comorbidities and the invasiveness of surgical therapy. Hence, early identification or, preferably, prevention of complications is crucial []. The Esophagectomy Complication Consensus Group (ECCG) has identified common complications, with pulmonary issues like pneumonia (14.6%) and cardiac complications such as atrial arrhythmias (14.5%) being frequent, along with gastrointestinal complications, notably anastomotic insufficiencies (11.4%) []. These complications, particularly anastomotic insufficiencies, can lead to severe outcomes such as mediastinitis, peritonitis, and sepsis, thereby increasing morbidity and mortality [].

Esophageal cancer ranks among the leading causes of cancer-related deaths globally []. Prognostically unfavorable, it carries a high mortality rate []. The incidence varies significantly by region, with squamous cell carcinomas being more prevalent worldwide. Esophageal squamous cell carcinoma (ESCC) particularly affects the “Asian esophageal cancer belt.” However, adenocarcinomas have seen a significant rise, particularly in industrialized countries like the USA and parts of Europe [].

To investigate morbidity, complications were categorized according to the morbidity scale of Dindo and colleagues (Clavien–Dindo classification). Kendall’s tau was employed to assess the relationship between the points allotted by the O‑POSSUM score and the Clavien–Dindo grade. Clavien–Dindo grades were further divided into minor complications (grades I–IIIa) and major complications (grades IIIb–V), and Kendall’s tau was again used to test for an association between this minor/major grouping and the O‑POSSUM score [].
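The rank correlation described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: it assumes the tie-corrected tau-b variant (the text does not say which variant was used), it codes patients without complications as a hypothetical grade "0", and the scores and grades in the usage lines are invented. It computes the coefficient only, not the p-value.

```python
from itertools import combinations
from math import sqrt

# Clavien-Dindo grades in ascending severity; only the ordering matters.
# Grade "0" (no complication) is an assumption made for this sketch.
GRADE_RANK = {"0": 0, "I": 1, "II": 2, "IIIa": 3, "IIIb": 4, "IVa": 5, "IVb": 6, "V": 7}
MAJOR_FROM = GRADE_RANK["IIIb"]  # grades IIIb-V count as major complications


def is_major(grade: str) -> int:
    """Return 1 for a major complication (IIIb-V), 0 otherwise."""
    return int(GRADE_RANK[grade] >= MAJOR_FROM)


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences, with tie correction."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables; counted in neither term
        if dx == 0:
            ties_x += 1  # tied in x only
        elif dy == 0:
            ties_y += 1  # tied in y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + ties_x) * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical example data (not taken from the study).
scores = [12, 18, 21, 21, 30, 47]
grades = ["I", "II", "I", "IIIb", "II", "V"]

tau_grade = kendall_tau_b(scores, [GRADE_RANK[g] for g in grades])
tau_major = kendall_tau_b(scores, [is_major(g) for g in grades])
print(f"tau (score vs. grade): {tau_grade:.3f}")
print(f"tau (score vs. minor/major): {tau_major:.3f}")
```

Because both the score and the grade contain many ties in a small cohort, a tie-corrected coefficient is the natural choice here; an uncorrected tau-a would be pulled toward zero.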

Demographic variables such as gender, age, and tumor histology were collected, along with the surgical procedure, type of surgery, TNM stage, and the Clavien–Dindo grade. The O‑POSSUM score was calculated for the 71 patients included in the study, and mortality probability was determined based on the score using a calculator []. The primary endpoint considered was 30-day mortality. Binary logistic regression was used to compare mortality predicted by the O‑POSSUM score with the observed mortality. Goodness of fit was assessed using the Hosmer–Lemeshow test. A receiver operating characteristic curve (ROC curve) was constructed, and the area under the curve (AUC) was determined.
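To make the AUC concrete, the sketch below computes it from its pairwise definition: the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function name `roc_auc` and the example data are hypothetical and are not taken from the study.

```python
def roc_auc(scores, died):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, d in zip(scores, died) if d]      # predicted risk of non-survivors
    neg = [s for s, d in zip(scores, died) if not d]  # predicted risk of survivors
    if not pos or not neg:
        raise ValueError("AUC needs at least one death and one survivor")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and outcomes (1 = died within 30 days).
risks = [0.002, 0.004, 0.004, 0.010, 0.031, 0.003]
died = [0, 0, 1, 0, 0, 1]

auc = roc_auc(risks, died)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to chance. A value below 0.5 means that survivors tended to receive higher predicted risks than non-survivors, which is why such a ROC curve runs below the diagonal.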

The O‑POSSUM score is a scoring system used for preoperative risk analysis in esophageal surgery. The score comprises physiological and operative components. The physiological variables include age, age group, pre-existing cardiac conditions, and pre-existing respiratory conditions, as well as electrocardiogram (ECG) changes, systolic blood pressure, pulse rate, and the Glasgow Coma Scale. Laboratory values are also included: hemoglobin, leukocyte count, urea, sodium, and potassium. The operative variables are the surgical procedure, the urgency of the operation (classified as emergency or elective), and malignancy [].

All histological types of esophageal cancer were included in the analysis (squamous cell carcinoma, adenocarcinoma, and undifferentiated carcinoma). Patients whose data were too incomplete for the O‑POSSUM score to be calculated were excluded. Values collected more than 6 months before the operation were also excluded, as the most recent values should be used wherever possible.

Kendall’s tau, which examined the relationship between the Clavien–Dindo grade and the O‑POSSUM score, yielded a correlation of 0.089. The correlation is thus positive but very weak, and it is not statistically significant (p = 0.339). Panel a of the figure illustrates the correlation between the O‑POSSUM score and the Clavien–Dindo grade. Kendall’s tau for the association between minor and major complications and the O‑POSSUM score yielded a correlation of 0.093. This is also a very weak correlation, and the result is not statistically significant (p = 0.360). Panel b of the figure presents the relationship between minor and major complications and the O‑POSSUM score.

a Correlation between the O‑POSSUM score (physiological and operative severity score for the enumeration of mortality adjusted for esophagogastric surgery) and the Clavien–Dindo grade for the examined patients; b Correlation between the O‑POSSUM score and minor or major complications for the examined patients

The ROC curve of the examined patient group lies below the diagonal. The AUC is 0.358. Consequently, the O‑POSSUM score performed worse than chance at predicting mortality in this cohort, although the result is not statistically significant (95% confidence interval 0.055–0.660; p = 0.291). The ROC curve is shown in the corresponding figure.

Mortality predicted by the O‑POSSUM score averaged 0.503%. The minimum calculated mortality was 0.2% and the maximum was 3.1% (median 0.4%). The actual mortality was 7% (5 out of 71 patients). Thus, the observed mortality was significantly higher (about 14 times) than the predicted mortality. The corresponding figure illustrates the percentage of patients who died and who survived in relation to the calculated mortality value. It should be noted that the calculated mortality rates lie in a very low range, with a highest calculated value of 3.1%. The O‑POSSUM score therefore did not predict a high mortality probability for any patient. Binary logistic regression analysis indicated that the model is not significant (p = 0.599; χ² = 0.276). The Hosmer–Lemeshow test yielded a significance of 0.303, indicating no significant lack of fit. The model assigned 93% of the patients in accordance with their observed outcome. All 66 patients who did not die were correctly classified, whereas all five patients who died were misclassified.
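The arithmetic behind these figures can be retraced with a short calculation. The sketch below uses only the summary numbers reported above (71 patients, 5 deaths, mean predicted mortality 0.503%). The per-patient predictions are not available here, so the variable names and the observed-to-expected framing are illustrative assumptions rather than part of the original analysis.

```python
n_patients = 71
n_deaths = 5
mean_predicted = 0.00503  # mean O-POSSUM predicted mortality (0.503%)

observed = n_deaths / n_patients               # observed 30-day mortality
expected_deaths = mean_predicted * n_patients  # deaths implied by the mean prediction
oe_ratio = observed / mean_predicted           # observed-to-expected ratio

# The regression model classified every patient as a survivor.
true_negatives = n_patients - n_deaths  # 66 survivors, all classified correctly
accuracy = true_negatives / n_patients
sensitivity = 0 / n_deaths              # none of the five deaths was identified

print(f"observed mortality: {observed:.1%}")
print(f"expected deaths:    {expected_deaths:.2f}")
print(f"observed/expected:  {oe_ratio:.1f}")
print(f"accuracy:           {accuracy:.1%}")
print(f"sensitivity:        {sensitivity:.1%}")
```

The calculation shows why the 93% figure is not reassuring: with only five deaths among 71 patients, a model that labels everyone a survivor reaches that accuracy while identifying none of the patients who died.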

The frequencies of the individual parameters of the O‑POSSUM score in the study cohort are shown in one table for the physiological parameters and in another for the operative parameters. A large proportion of patients (41; 57.8%) had previous cardiac disease, 23 patients (32.4%) showed abnormal electrocardiography (ECG) findings, and 22 patients (31%) had a respiratory history. Systolic blood pressure and pulse rate were within the normal range in most patients. Hemoglobin levels, leukocyte count, and sodium and potassium levels were also within the normal ranges in most patients, while urea levels were slightly elevated in the majority. The mean O‑POSSUM score was 22.3 points (standard deviation 7.094), the minimum was 12 points, and the maximum was 47 points (median 21 points). The distribution of the O‑POSSUM score in the study cohort is shown in the corresponding figure.

The patient population consisted of 56 men (78.9%) and 15 women (21.1%). The average age was 62 years (range 41–83 years). A large proportion of patients (59; 83.1%) had adenocarcinoma, 11 (15.5%) had squamous cell carcinoma, and in one case (1.4%), the tumor histology was undifferentiated. While 41 patients underwent open surgery, 29 underwent laparoscopic-assisted surgery and one case was converted from laparoscopic to open surgery. All surgical procedures were elective; there were no emergencies. Most patients had stage T2 or T3 disease according to the TNM classification. Lymph nodes were affected in 39 patients, and metastases were present in one patient who underwent surgery for recurrent tumor bleeding. Most of the patients had an R0 resection. While 24 patients had no neoadjuvant therapy, 31 received chemotherapy and 16 received combined radiochemotherapy. The characteristics of the 71 patients included in the study are shown in the accompanying table.

Discussion

Several studies have concluded that the O‑POSSUM score overpredicts mortality []. However, in this study, the actual mortality was significantly higher (14 times higher) than that predicted by the O‑POSSUM score. Consequently, the O‑POSSUM score underpredicted mortality for the patient population examined in this study. Overall, 93% of patients were correctly assigned by the regression model according to their observed mortality: all those who did not die were correctly classified, and those who did die were all misclassified. Thus, the O‑POSSUM score failed to identify any of the deceased patients. These findings are consistent with the study conducted by Lagarde et al., who also concluded that the O‑POSSUM score could not identify the deceased []. The ROC curve generated in this analysis indicates that the O‑POSSUM score was worse than chance at predicting mortality. In the literature, ROC curves usually have an AUC >0.5, indicating that the prediction of risk is better than chance []. The fact that the ROC curve in this analysis is below the diagonal may be related, among other factors, to the limited number of patients.

When examining the correlation between the Clavien–Dindo grade and the O‑POSSUM score as well as between minor and major complications and the O‑POSSUM score, a slightly positive but statistically nonsignificant correlation was found. Lagarde et al. were also unable to demonstrate differences between the individual risk groups and the occurrence of minor or major complications in their study [].

This study found that the O‑POSSUM score is not a reliable means of predicting morbidity and mortality in patients undergoing esophagectomy. Several other preoperative risk scoring systems are now available, such as the preoperative esophagectomy risk score (PER score), the risk score developed by Schröder et al., the International Esodata Study Group (IESG) risk prediction model, the index developed by Hodari et al., and the risk scale developed by Fuchs et al. However, some of them have yet to be independently validated [19–23]. It seems likely that one of these scoring systems will prove more suitable for preoperative risk prediction in esophagogastric surgery than the O‑POSSUM score. In particular, the risk scale developed by Fuchs et al. seems promising. However, it should be checked whether it is also valid in Western European countries, as the study by Fuchs et al. primarily reflects a patient population from the United States [23].

Limitations of the study include the fact that it is a retrospective data analysis. In addition, there could be an information bias. Moreover, the study is limited to a specific time period (2010–2022) and was conducted at a single center, thus allowing for only a limited number of patients (n = 71) to be studied.

It can be concluded that the O‑POSSUM score was not found to be a reliable tool for preoperative prediction of morbidity and mortality in this study. The score failed to identify deceased patients and should be modified. Alternatively, another preoperative risk assessment, such as the risk scale according to Fuchs et al., could undergo validation studies, or a new score could be developed. There is still no scoring system that has been validated, proven reliable in its predictive power, and tested in clinical practice. A reliable scoring system would be an asset for the treatment team and also for patients. Further research is needed in this area, and the aim of developing such a scoring system and establishing it in clinical practice should be pursued.