The cornerstone of therapeutic management of rectal cancer is surgical resection by total mesorectal excision (TME). This can be performed using several surgical approaches: open, laparoscopic (L-TME), robot-assisted (R-TME) and transanal TME (TaTME) [1,2,3,4]. Whereas the first TME was performed using open surgery, minimally invasive approaches have been increasingly used since the introduction of laparoscopic rectal resections in the mid-1990s. The R-TME and TaTME techniques were introduced in the early 2000s and early 2010s, respectively, in order to overcome technical limitations of the L-TME procedure.

With the introduction and implementation of a new surgical approach, surgeons need to climb a learning curve. This is the number of procedures required to achieve adequate surgical performance with regard to safety, efficacy and efficiency [5]. The ideal minimally invasive procedure has a short learning curve and is therefore easy to master. In addition, the period in which the surgeon ‘climbs’ the learning curve should not result in additional morbidity, worsened oncological outcomes or mortality for the patient [6, 7].

It is suggested that L-TME and TaTME have a relatively long learning curve of around 50–90 procedures per surgeon [8,9,10,11,12], while R-TME is suggested to have a shorter learning curve [13,14,15]. Despite the number of papers reporting on learning curves of these approaches, the quality of evidence is limited. Patient populations are heterogeneous, as they include both benign and malignant diseases. Experience with previous techniques is mostly not taken into account, and some studies do not make a clear distinction between colonic and rectal resections. Additionally, multiple designs and statistical methods are used to assess the learning curve. Finally, although systematic reviews are available, some are outdated or not restricted to rectal cancer surgery, while others do not evaluate the learning curve of all three minimally invasive techniques [16,17,18,19,20].

The aim of this systematic review is twofold. First, we aim to create an overview of the currently available literature regarding the learning curve of L-TME, R-TME and TaTME for patients with rectal carcinoma. Second, we aim to explore the impact of the learning curve on clinical outcomes in L-TME, R-TME and TaTME.

Materials and methods

This systematic review was conducted and reported according to the PRISMA 2020 statement [21]. Approval of the Institutional Review Board (IRB) was deemed unnecessary, due to the nature of the study. Inclusion and exclusion criteria, search strategies, the critical appraisal tool and outcomes of interest were prespecified. We did not register a review protocol in advance.

Eligibility criteria

In order to create an overview of studies regarding the learning curve of L-TME, R-TME, and TaTME, studies were deemed eligible if: (1) the studies included patients with primary rectal cancer, or patients with colorectal cancer in which rectal cancer patients could be distinguished, (2) the patients underwent a TME, and (3) the primary or secondary aim of the paper was to obtain the learning curve of either L-TME, R-TME or TaTME. Studies were excluded if they (1) were written in a language other than English, German, French or Dutch, or (2) were not original articles.

Literature search and study selection

Two researchers (TAB and DJS) independently conducted a systematic search in PubMed, Embase and Cochrane Library on August 10, 2021. The following search terms were used: (rectum cancer OR colorectal cancer OR rectal OR colorectal) AND (learning curve OR learning), without limiting the search (for example to year of publication). After removal of duplicates, the titles and abstracts of all studies were screened for inclusion, and full-text reading of the remaining studies was performed by two researchers independently. Finally, the reference lists of included studies were screened for possibly eligible studies. Systematic reviews emerging in the literature search were excluded, but their reference lists were screened for possibly eligible studies. Disagreement between the two independent researchers was resolved through discussion until consensus was reached.

Data collection

The primary outcome was length of the learning curve for L-TME, R-TME, and TaTME. Secondary outcomes included intraoperative, postoperative and oncological outcomes of patients operated on during the learning curve, compared with patients operated on after completion of the learning curve. In addition, the statistical methods and the outcome variables used to obtain the learning curve were recorded. A prespecified form was used to capture data from the studies. This form contained the following data: author, year, country, study design, surgical technique, number of participating centers and surgeons, number of patients included, exclusion criteria and aim of the study. Additionally, the following were registered: surgeon-based or institute-based learning curve analysis; prior experience with the surgical technique; length of the learning curve based on intraoperative complications, postoperative complications, positive pathological circumferential resection margin (CRM), operative time, and other variables or a compound variable; and the statistical methods used for learning curve analysis. Finally, if a comparison was performed between patients operated on during the learning curve and after the learning curve was achieved, the following outcomes were compared: intraoperative complications, postoperative complications, positive CRM rate and operative time. All data were extracted by two researchers independently and disagreement was resolved through discussion.

Outcomes

Length of the learning curve was specified as the number of procedures necessary to reach proficiency as identified by the specific study. Since studies used different clinical outcomes and statistical methods to assess proficiency in the surgical technique, length of the learning curve was reported per clinical outcome and statistical method used. The clinical outcomes used were: intraoperative complications; postoperative complications within 30 days; positive CRM rate, defined as a margin ≤ 1 mm; operative time, defined as time from incision to skin closure; or a composite of multiple clinical outcomes (i.e., conversion, local recurrence and postoperative complications). We registered length of the learning curve for each specific statistical method, and for CUSUM or RA-CUSUM analyses we differentiated between deflection of the graph and stabilization of the graph as the point at which the learning curve was achieved. Furthermore, the final conclusion regarding length of the learning curve per technique was based only on the lengths estimated by RA-CUSUM analyses.

Risk of bias

The MINORS tool [22] was used to assess the quality of the studies. Both researchers (TAB and DJS) recorded the data independently. Disagreement was resolved through discussion until consensus was reached.

Results

Study selection

PubMed, Embase and Cochrane Library were searched on August 10, 2021, which yielded 3701 records. After removal of duplicates, 2851 records remained. Screening of titles and abstracts for eligibility left 298 records. After full-text screening, an additional 253 records were excluded, leaving 45 records for inclusion in this systematic review (Fig. 1). Studies were too heterogeneous, both clinically and methodologically, to perform a meta-analysis.

Fig. 1 Flow diagram of study selection. Lap: studies involving laparoscopic total mesorectal excision; Robot: studies involving robot-assisted total mesorectal excision; TaTME: studies involving transanal total mesorectal excision

Study characteristics

The characteristics of included studies are presented in Table 1. Studies were published between 2009 and 2021, with a total of six prospective studies [11, 23,24,25,26,27], 34 retrospective studies [9, 10, 12, 14, 15, 28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56], and five studies in which the design was not clearly described [8, 13, 57,58,59]. Thirteen studies reported on the learning curve of L-TME [10,11,12, 26, 27, 39,40,41,42,43,44, 58, 59], twenty on the learning curve of R-TME [13,14,15, 23,24,25, 28,29,30,31,32,33,34,35, 52,53,54,55,56,57], eight on the learning curve of TaTME [8, 9, 36,37,38, 48,49,50], and four reported on the comparison of the learning curve of two approaches [45,46,47, 51].

Table 1 Study characteristics of included studies

In total 7562 patients were included in this systematic review. The average number of included patients was 150 for R-TME studies, 168 for TaTME studies and 205 for L-TME studies. Most studies’ primary aim was to define the learning curve, though for nine studies it was a secondary aim [26, 27, 29,30,31, 42, 43, 46, 59]. Thirteen studies reported on institutional learning curves [8, 9, 11, 13, 31,32,33, 39, 41,42,43, 49, 59], while the others reported on surgeons’ individual learning curves. Previous experience with colorectal surgery was mentioned in twenty-one studies [8,9,10, 13, 15, 24, 25, 28, 29, 31, 35, 37, 39,40,41,42, 45,46,47, 51,52,53,54,55,56]. The majority of studies defined exclusion criteria, while seventeen did not exclude patients during the learning curve [8, 13, 14, 23, 27, 29,30,31,32, 37, 46, 48, 49, 53,54,55,56,57].

Risk of bias

None of the studies scored high on all criteria of the MINORS tool. Nineteen out of 41 non-comparative studies adequately reported more than half of the required criteria [9,10,11,12, 15, 24, 25, 27, 33, 35, 37, 38, 48,49,50, 53, 55]. Study quality was highest among the TaTME studies, and varied most among the R-TME and L-TME studies. All comparative studies adequately reported more than half of the MINORS criteria [45,46,47]. One study prospectively calculated the study size [9] and seventeen used adequate statistical analyses [8,9,10, 12,13,14, 24, 35, 37, 38, 44, 45, 47, 50, 52, 55]. Regarding the use of adequate definitions of clinical outcome variables, nineteen studies adequately reported unbiased assessment of endpoints [8, 10, 15, 24, 25, 27, 33, 35, 38, 39, 47,48,49,50,51,52,53,54,55,56,57, 60] (Table 2).

Table 2 Risk of bias assessment according to MINORS tool

Statistical methods of learning curve analyses

Most studies used a combination of different learning curve analyses. No clear learning curve analysis was used in three studies [23, 36, 54], eleven studies used split group analyses (SGA) or sequence analysis for one or more clinical outcome variables [12, 25,26,27,28, 39, 41,42,43, 49, 58, 59] and twelve studies used the moving average analysis (MAA) [10, 12, 14, 30, 40, 44, 45, 47, 58, 60]. Eighteen studies used the CUSUM analysis based on operative time [8, 9, 13, 15, 24, 29, 31,32,33,34, 37, 44, 46, 47, 50,51,52,53, 55,56,57]. Two studies used the CUSUM analysis based on intraoperative complications [12, 37], six studies used the CUSUM analysis based on postoperative complications [11, 12, 32, 37, 45, 48], one study based the CUSUM analysis on positive CRM rate [45] and seven studies used the CUSUM analysis based on a composite outcome [9, 11, 13,14,15, 34, 35].

One or more risk-adjusted CUSUM analyses (RA-CUSUM) were used in eight studies: three studies used postoperative morbidity [8, 9, 35, 38], two studies used positive CRM rate [10, 35], one study used local recurrence [10], one study used conversion [12] and three studies used a composite outcome [14, 15, 35]. Finally, some studies used the first deflection in the (RA-)CUSUM or MAA graph as the point at which proficiency was reached [8, 10, 12, 14, 15, 27, 29, 31, 33, 34, 44, 46, 47, 52, 53, 55,56,57], while others defined proficiency as the point at which stabilization was reached [9, 13, 24, 30, 35, 37, 38, 40, 58] (Tables 3 and 4).

Table 3 Results of individual studies regarding statistical analysis and learning curve
Table 4 Results of individual studies regarding statistical analysis and learning curve

Length of the learning curve

Although all studies assessed the learning curve as their primary or secondary outcome, only 31 studies defined the number of procedures necessary to complete the learning curve based on their results [8, 9, 11, 13,14,15, 24, 25, 29, 30, 32, 34, 35, 37, 38, 40, 48, 50,51,52,53, 55, 56]. In CUSUM analyses based on operative time, length of the learning curve ranged from 19 to 128 procedures for R-TME, from 51 to 95 for TaTME and from 36 to 42 for L-TME. The only study using RA-CUSUM based on operative time found a learning curve of 87 procedures for TaTME [38].

Length of the learning curve based on specific clinical outcomes differed widely. Two studies used intraoperative complications as the variable for the calculation of the learning curve: a TaTME study and a L-TME study estimated the learning curve to be 40 and 243 patients, respectively, using the CUSUM method [12, 37]. Additionally, two studies used positive CRM as the oncological variable for the analysis of the learning curve, both using RA-CUSUM analyses: length of the learning curve was 418 in a R-TME study [35] and 50–70 in a L-TME study [10]. Most studies calculated the learning curve based on postoperative morbidity: using CUSUM analyses, lengths ranged from 45 to 79 for L-TME studies [11, 12], from 40 to 191 for R-TME studies [32, 35], and from 21 to 108 for TaTME studies [8, 9, 37, 38]. When only RA-CUSUM analyses were taken into account, lengths were 191 for R-TME [35] and between 24 and 54 for TaTME [8, 9, 38]. No RA-CUSUM analysis based on postoperative morbidity was conducted for L-TME.

Lengths of the learning curve using (RA-)CUSUM analyses based on a compound outcome of clinical variables were 11, 32, 75 and 177 for four R-TME studies and 36 for a TaTME study. A CUSUM analysis based on compound outcomes showed a length of 50 procedures in a L-TME study [9, 11, 35, 61]. When only RA-CUSUM analyses based on a compound outcome were taken into account, length of the learning curve was between 32 and 177 for R-TME and 36 for TaTME, while no such analysis was performed for L-TME.

Finally, taking into account all RA-CUSUM analyses of clinical outcomes only, length of the learning curve was between 50 and 70 for L-TME, 32–418 for R-TME and 36–54 for TaTME.

Before-after learning curve comparison

After establishing a learning curve, 23 studies compared outcomes between patients operated on during the learning curve and patients operated on after completion of the learning curve [8, 9, 11, 13,14,15, 24, 25, 29, 30, 32, 35, 37, 38, 43, 46, 47, 51,52,53, 56,57,58]. Bege et al., who used postoperative complications to assess the learning curve, showed a decline in postoperative morbidity after the learning curve for L-TME was reached [11]. Rubinkiewicz et al., who used postoperative morbidity, intraoperative morbidity, operative time and a composite outcome to assess the learning curve of TaTME, showed a significant decline in postoperative morbidity and intraoperative morbidity after the learning curve was reached [37]. Operative times were significantly reduced in thirteen studies after the learning curve was reached [8, 14, 15, 24, 33, 37, 38, 46, 47, 51,52,53, 56, 57] (Table 5). Eight of these studies used operative time to assess the learning curve, while in three R-TME studies and two TaTME studies the learning curve was based on clinical outcomes [8, 14, 15, 35, 38].

Table 5 Comparison of outcomes during the learning curve and after the learning curve

Discussion

This systematic review aimed to provide an overview of the current literature regarding the learning curve of L-TME, R-TME and TaTME, and reveals the paucity of high-quality studies. The few available studies using a high-quality RA-CUSUM analysis based on intraoperative complications, postoperative complications or oncological outcomes show similar lengths of the learning curve for L-TME, R-TME, and TaTME. Additionally, although length of the learning curve is suggested to be similar, L-TME and TaTME might bear the risk of additional morbidity while obtaining the learning curve.

Only one L-TME study, three R-TME studies and three TaTME studies used the RA-CUSUM analysis based on clinically relevant outcomes such as intraoperative morbidity, postoperative morbidity or oncological outcomes [8, 9, 12, 14, 15, 35, 38]. Length of the learning curve was 50–70 for L-TME, 32–418 for R-TME and 36–54 for TaTME. This might suggest that the learning curve for R-TME is considerably longer than the learning curve of L-TME and TaTME. However, the results are influenced by the study of Lee et al., who found a learning curve of 177–418 procedures for R-TME [35]. As the authors state in their discussion, the substantial length of the learning curve might be due to the high number of examined cases: with an increasing number of consecutive cases, length of the learning curve increases as well [5, 35, 62]. Taking this into account, the learning curve shows similar lengths between techniques: 50–70 procedures for L-TME, 32–75 procedures for R-TME and 36–54 procedures for TaTME [9, 11, 13,14,15]. This is in line with other systematic reviews evaluating the learning curve of minimally invasive techniques. A systematic review estimated the learning curve to be between 30 and 50 procedures in TaTME [16], and another systematic review estimated the learning curve of R-TME to be 37 procedures [17]. Furthermore, two systematic reviews compared length of the learning curve between L-TME and R-TME. One included studies with colorectal patients with both benign and malignant disease and reported a length between 5 and 310 for L-TME and 15–30 for R-TME [19]. A more recent systematic review only included studies with surgeons without laparoscopic experience and showed equal length of the learning curve: 44–55 for L-TME, and 41–55 for R-TME [20].

Although the length of the learning curve might not differ between the three techniques, L-TME and TaTME might bear the risk of additional morbidity while obtaining the learning curve. A L-TME and a TaTME study show higher rates of intraoperative and postoperative complications before reaching the learning curve, while no R-TME study shows a difference between these two phases [11, 18, 37]. Additionally, a systematic review comparing outcomes before and after the learning curve of TaTME showed fewer intraoperative complications, fewer anastomotic leakages and better quality of the TME specimen after the learning curve was obtained [16]. The evidence is scarce, but this might be in line with recently published data showing additional morbidity and higher local recurrence rates during the learning curve of TaTME [49, 50, 63,64,65]. This has also been suggested in a study assessing the learning curve of L-TME [10]. Perhaps the learning curves of L-TME and TaTME bear the risk of worsened oncological outcomes because these techniques differ significantly from the preceding ‘standard’ technique, while R-TME shows a high degree of similarity with the preceding L-TME technique. Moreover, most surgeons starting with R-TME have prior experience with the L-TME technique, which influences the learning curve, whereas surgeons starting with L-TME or TaTME begin with a completely new technique, which might cause the additional morbidity during the learning curve.

The statements regarding length of the learning curve and additional morbidity during the learning curve should be interpreted cautiously, since only a limited amount of high-quality evidence exists, with a lack of comparative studies and a large amount of heterogeneity among studies. This heterogeneity is mainly caused by differences in patient-related factors, surgeon-related factors and statistical methods. First, regarding patient-related factors, inclusion and exclusion criteria differ among studies, resulting in selection bias between studies. Furthermore, case-mix changes over the course of the learning curve: mostly, an overrepresentation of “easy” patients is seen while climbing the learning curve, and more “difficult” patients are operated on at the middle of the learning curve [14, 15]. Although case-mix can be controlled for with a risk-adjusted analysis such as the RA-CUSUM, this was performed in only a small number of studies.

Secondly, heterogeneity due to surgeon-related factors among studies exists as well: while some studies report on learning curves for individual surgeons, others report on institutional learning curves. Although institutional learning curves might indicate the experience of the whole surgical team, they fail to address differences between individual surgeons. In addition, it is known that the first surgeon mastering the technique within an institution has a longer learning curve than the ones following, due to the institutional experience [66]. Furthermore, as experience with the minimally invasive technique and TME in general influences the learning curve, it is important to describe this. Although most studies reported the experience of the surgeon with the minimally invasive technique, details were lacking. Young surgeons at the start of their career might have a longer learning curve than senior surgeons mastering minimally invasive surgery, since the latter might have experience in performing open TME or L-TME [67]. Additionally, as R-TME and TaTME were introduced 10–15 years later than L-TME, most studies addressing the learning curve of R-TME and TaTME included surgeons who already had experience with L-TME. This might be an important confounder when comparing the learning curve of L-TME with that of R-TME or TaTME, but it is inherent to clinical practice. Finally, since TaTME is generally not used for an abdominoperineal resection, while this is performed using L-TME or R-TME, differences regarding the indication of the technique complicate the comparability of these techniques.

Thirdly, regarding heterogeneity among studies caused by the statistical analyses, differences could be due to the outcome measure used to establish the learning curve, the statistical technique used and the cut-off point used. Regarding the outcome measure, operative time is often used to establish the learning curve. However, this outcome is said to be a poor surrogate for clinical outcomes, and merely a reflection of efficiency [5, 68]. Instead, clinical outcomes that are of interest for patients should be used to assess the learning curve [5, 68], for example intraoperative complications, major postoperative complications [69], positive CRM rate and, for the long term, local recurrence rate [70]. Additionally, in order to provide comparable outcomes, clear definitions according to international standards should be used [71].

Regarding the statistical technique, several methods are used for the analyses: split group analysis (SGA), moving average analysis (MAA), CUSUM and RA-CUSUM. For SGA, patients are arbitrarily divided into two or more groups, based on chronological order. Since these learning curves depend on how the groups were divided, it could be doubted whether SGA is suitable for analyzing learning curves [5, 28, 39, 41]. MAA learning curves are based on operative time alone. As operative time might not be an adequate indicator of proficiency, this technique might not be suitable either [72]. CUSUM and RA-CUSUM analyses are more complex methods used to continuously monitor outcomes. The CUSUM is a chronologically ordered cumulative sum of the difference between the outcome of the procedure and the average of the studied cohort or a predefined cut-off point based on literature [14, 15]. The RA-CUSUM analysis is the more sophisticated method, correcting for case-mix that may influence the risk of an event [14, 15, 35, 73]. However, both methods have been developed to monitor processes known to be adequate, while signaling inadequacy. For surgeons carrying out a procedure they have not yet performed regularly, the learning curve CUSUM (LC-CUSUM) might be more suitable. This analysis assumes inadequacy of the surgeon, while signaling adequacy [62]. This method could be used when the surgeon has no experience with the procedure, as is the case with young surgeons starting with L-TME or R-TME. It can also be used for describing the learning curve of an experienced colorectal surgeon starting with TaTME, since this procedure is to a large extent different from the “top-down” approach used in open TME, L-TME and R-TME.
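As a minimal illustration of the difference between the two methods for a binary outcome (for example, a postoperative complication), the sketch below computes an unadjusted CUSUM against a fixed reference rate and a risk-adjusted CUSUM in which each patient's own expected risk is subtracted. The observed-minus-expected formulation shown here is one common variant and is not necessarily the one used in the included studies. The function names, the reference rate of 0.25 and the per-patient risks are hypothetical values chosen for illustration; in practice the expected risks would come from a fitted risk model.

```python
from itertools import accumulate


def cusum(outcomes, reference_rate):
    """Unadjusted CUSUM: running sum of (observed event - fixed reference rate)."""
    return list(accumulate(x - reference_rate for x in outcomes))


def ra_cusum(outcomes, expected_risks):
    """Risk-adjusted CUSUM: running sum of (observed event - patient-specific expected risk)."""
    if len(outcomes) != len(expected_risks):
        raise ValueError("outcomes and expected_risks must have the same length")
    return list(accumulate(x - p for x, p in zip(outcomes, expected_risks)))


# Hypothetical consecutive cases: 1 = complication, 0 = no complication.
outcomes = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical predicted risk per patient (high-risk = 0.5, low-risk = 0.1).
expected_risks = [0.5, 0.1, 0.5, 0.1, 0.1, 0.5, 0.1, 0.1]

# The reference rate is set to the mean expected risk (0.25) of this toy cohort.
unadjusted = cusum(outcomes, reference_rate=0.25)
adjusted = ra_cusum(outcomes, expected_risks)

for case, (u, a) in enumerate(zip(unadjusted, adjusted), start=1):
    print(f"case {case}: CUSUM={u:+.2f}  RA-CUSUM={a:+.2f}")
```

In this toy series both curves end at the same value, because the reference rate equals the mean expected risk, but the risk-adjusted curve rises less steeply after a complication in a high-risk patient. This is the sense in which the RA-CUSUM corrects for case-mix.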

Finally, regarding the cut-off points, all CUSUM methods can be performed using limits based on averages of the cohort or using literature-based limits. Using averages of the cohort complicates comparison with other studies. Moreover, as mentioned earlier, using averages causes the length of the learning curve to increase with larger cohort size [5, 62]. Therefore, literature-based limits are preferred. Furthermore, the point at which ‘proficiency’ is reached influences the length of the learning curve as well. Studies used two different points to identify proficiency: the point at which the graph deflects or the point at which stabilization occurs. Both methods are used, although they produce different outcomes [13, 24, 52]. Therefore, it has been proposed that a learning plateau (i.e., stabilization) should reach a predefined competency level, based on estimates available in literature [5]. As not all studies included in this systematic review provided the point of stabilization, while the point of deflection was provided in every study, the point of deflection was used in our analysis for assessing the length of the learning curve.
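The dependence on cohort size can be made concrete with a small synthetic example. The sketch below assumes a hypothetical series in which operative time falls steadily by 0.5 min per case from 300 min, and takes the case at which the CUSUM first peaks as the point of deflection. The function names, the synthetic series and the 270-min target are illustrative assumptions, not data from the included studies.

```python
from itertools import accumulate


def deflection_case(times, reference=None):
    """Return the 1-based case number at which the CUSUM of (time - reference) first peaks.

    If no reference is given, the cohort mean is used as the reference value.
    """
    if reference is None:
        reference = sum(times) / len(times)
    running = list(accumulate(t - reference for t in times))
    # max() returns the first index with the maximal running sum in case of ties.
    peak_index = max(range(len(running)), key=running.__getitem__)
    return peak_index + 1


def synthetic_times(n_cases):
    """Hypothetical operative times (min): 300 for the first case, 0.5 less per case."""
    return [300 - 0.5 * i for i in range(n_cases)]


for n in (100, 200):
    times = synthetic_times(n)
    cohort_based = deflection_case(times)                  # reference = cohort mean
    target_based = deflection_case(times, reference=270)   # fixed, predefined target
    print(f"n={n}: cohort-mean deflection at case {cohort_based}, "
          f"fixed-target deflection at case {target_based}")
```

Under these assumptions, the cohort-mean reference places the deflection at case 50 of 100 and at case 100 of 200, whereas the fixed target places it at case 60 in both series. The apparent learning curve therefore lengthens with cohort size only when the cohort average is used as the reference.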

Although this is the first systematic review to provide an overview of the literature regarding the learning curves of all three minimally invasive techniques of TME and the methods used to establish them, it cannot draw a definite conclusion regarding differences in length of the learning curves and differences in additional morbidity during the learning curve of L-TME, R-TME and TaTME. Clearly, more high-quality studies are necessary to shed light on the learning curve of minimally invasive techniques for rectal resections. We suggest that this should preferably be performed with comparative studies, while controlling for patient-related factors (i.e., risk-adjusted analysis), and surgeon-related factors such as experience with TME in general and experience with the specific minimally invasive technique. In addition, if former experience with the TME procedure is limited (i.e., a beginning surgeon or adoption of a new technique like TaTME), the LC-CUSUM should be used. Furthermore, we propose that learning curves should be established for individual surgeons, based on the following clinically relevant outcome variables: intraoperative morbidity, (major) postoperative morbidity and positive CRM. Additionally, clear outcome definitions should be reported and learning curves should be estimated using literature-based limits. Finally, comparison of outcomes during and after the learning curve should be performed, to investigate whether the learning curve is associated with additional morbidity.