Main

Breast cancer is a heterogeneous disease that may recur soon after initial diagnosis or after a follow-up period of >10 years. The recurrence risk varies over time according to molecular and clinical risk factors (Jatoi et al, 2011). ER-negative tumours and HER2-positive tumours display an increased annual rate of recurrences and deaths after a short period of time (1–3 years). In contrast, ER-positive (ER+)/HER2-negative (HER2−) patients have a considerably lower annual rate in the first years, but the annual recurrence rates persist after the first 5 years (Jatoi et al, 2011).

Despite the initial and sustained benefit of tamoxifen treatment in ER+/HER2− tumours, >50% of all relapses and more than two-thirds of deaths occur more than 5 years after diagnosis (Saphner et al, 1996; EBCTCG, 2005). The persistent risk of ER+/HER2− breast cancer over time is currently being addressed therapeutically in several large phase III randomised trials of extended endocrine therapy.

The National Cancer Institute of Canada Clinical Trials Group (NCIC CTG) MA-17, a randomised, double-blind, placebo-controlled trial, showed that switching from tamoxifen to letrozole after 5 years of tamoxifen treatment improved disease-free survival (Goss et al, 2003; Goss et al, 2005). These results were confirmed by the ABCSG6a trial (Jakesz et al, 2007), which used anastrozole for 3 years of extended therapy, and by the clinical data available from NSABP-33 (Mamounas et al, 2008). More recently, the ATLAS trial reported that 10 years of tamoxifen (in comparison to 5 years) halved mortality rates a decade after diagnosis (Davies et al, 2012). Importantly, close to 20 000 patients are currently being treated in phase III clinical trials investigating endocrine treatment beyond 5 years.

Based on published findings, women with ER+ disease who have completed 5 years of tamoxifen treatment should be considered for longer duration therapy using aromatase inhibitor (AI) or extended tamoxifen treatment. However, the expected benefit of prolonged anti-hormonal treatment has to be assessed based on toxicity and the individual likelihood of a late recurrence.

A first step towards individualised extended endocrine treatment to prevent late metastases is therefore to identify women at risk and to understand the underlying biology. Clinical factors such as increased tumour size and nodal positivity have been shown to be associated with late relapse (Kennecke et al, 2007). So far, several prognostic multigene tests have been developed for ER-positive breast cancer patients to identify patients with a high risk of relapse (Paik et al, 2004; Parker et al, 2009; Nielsen et al, 2010; Filipits et al, 2011; Dubsky et al, 2012a). However, while these molecular tests are valuable for predicting early metastasis, they commonly fail to identify late events (Esserman et al, 2011). This suggests that the molecular mechanisms that initiate early and late relapse may differ.

In summary, individualised estimates of late metastatic risk are a largely unmet medical need with regard to potential benefits of extended anti-hormonal treatment. Here, we assess whether the prognostic EndoPredict (EP) score, a multigene score that combines the expression levels of proliferative and ESR1 signalling/differentiation-associated genes, identifies late relapse events in ER+/HER2− breast cancer patients.

Materials and methods

Patients and samples

Patients included in this study participated in the ABCSG6 (tamoxifen-only arm) or ABCSG8 trial (Schmid et al, 2003; Jakesz et al, 2005; Dubsky et al, 2012b) and received either tamoxifen for 5 years or tamoxifen for 2 years followed by anastrozole for 3 years. None of the patients received adjuvant chemotherapy. Inclusion criteria and clinico-pathological assessment were recently reported by Filipits et al (2011). A total of 378 formalin-fixed, paraffin-embedded (FFPE) samples from ABCSG6 and 1324 FFPE samples from ABCSG8 were combined for the analysis (patient characteristics – Supplementary Table S1). The chosen cohorts represent the validation but not the training sets for the EP test.

Some of the patients from the ABCSG6 and ABCSG8 trials were enrolled into extended endocrine therapy trials (ABCSG6a (Jakesz et al, 2007) and ABCSG16). To avoid any potential bias, all patients with extended therapy were censored at the time of enrolment in ABCSG6a and ABCSG16. Thus, all patients included in this study were treated with 5 years of endocrine therapy only. To ensure that the selective censoring did not lead to any additional selection bias, we compared all molecular and clinical characteristics between patients who were enrolled in extended endocrine trials and those who were not (Supplementary Table S2). Age and treatment arm (tamoxifen vs tamoxifen for 2 years followed by anastrozole for 3 years) were significantly different between the compared subsets. Except for these two variables, there were no significant differences in any other molecular or clinico-pathological parameter.
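The censoring rule described above can be illustrated with a minimal sketch. It assumes each patient record carries a follow-up time, an event flag and an optional enrolment time in an extended-therapy trial; the `Patient` class and all field names are hypothetical and are not taken from the study database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # Hypothetical record layout; all times are years from diagnosis.
    followup_years: float                         # time to distant metastasis or last contact
    metastasis: bool                              # True if a distant metastasis was observed
    extended_entry_years: Optional[float] = None  # enrolment in an extended-therapy trial, if any


def censor_at_extended_entry(p: Patient) -> Patient:
    """Censor a patient at enrolment in an extended endocrine therapy trial."""
    t = p.extended_entry_years
    if t is not None and t < p.followup_years:
        # Observation stops at enrolment, so any later event is not counted.
        return Patient(followup_years=t, metastasis=False, extended_entry_years=t)
    # Not enrolled, or the event/last contact came before enrolment: unchanged.
    return p
```

A patient with a metastasis at 8 years who entered an extended trial at 5.2 years would thus contribute 5.2 event-free years and then leave the risk set.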

After censoring patients with extended endocrine therapy, 998 patients were at risk after 5 years with a median follow-up of 7.12 years. Approval for genetic expression analyses and retrospective analyses was obtained from institutional review boards.

EP and EPclin

The EP assay is based on the quantification of eight genes of interest and three normalisation genes in FFPE tissue sections by quantitative RT–PCR (Filipits et al, 2011). The combination of the EP with the two clinical risk factors nodal status and tumour size results in the EPclin. EP and EPclin low-risk and high-risk categories were those pre-specified before the validation in the ABCSG6 and ABCSG8 studies, as recently described (Filipits et al, 2011). Patients with an EP score <5 (EPclin score <3.3) were classified as low risk for distant recurrence, whereas patients with an EP score ≥5 (EPclin score ≥3.3) were stratified as high risk.
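As a small illustration of the pre-specified cutoffs, the following sketch assigns a risk group from a score. It assumes that scores exactly at the threshold fall into the high-risk group; the function and constant names are hypothetical.

```python
EP_CUTOFF = 5.0      # pre-specified EP threshold stated in the text
EPCLIN_CUTOFF = 3.3  # pre-specified EPclin threshold stated in the text


def classify(score: float, cutoff: float) -> str:
    """Return 'low' for scores below the cutoff, otherwise 'high'."""
    return "low" if score < cutoff else "high"


print(classify(4.2, EP_CUTOFF))      # EP score below 5
print(classify(3.3, EPCLIN_CUTOFF))  # EPclin score at the threshold
```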

Exploratory analysis – biological signatures/modules associated with late recurrence

The gene expression levels of the EP signature were analysed according to the two time periods (0–5 years, early recurrence and >5 years, late recurrence). The analysis was carried out using the eight EP genes, and there was no comprehensive review of other predictive markers for late recurrence in high-throughput data sets. Therefore, the analyses should be regarded as hypothesis-generating.

The gene expression data were used to study the most common biological signatures that are combined in the EP test. Relative expression levels of the eight genes of interest were calculated as dCt values with regard to the three reference genes: dCt = 20 − Ct(gene of interest) + Ct(mean of reference genes). Two different molecular modules were defined. The dCt levels of BIRC5, UBE2C and DHCR7 were used as a surrogate marker for proliferation/cell cycle: the linear combination of the three dCt values was calculated for all analysed samples using the same coefficients as in the EP risk score. As with the proliferation metagene, the linear combination of the dCt values of RBBP8, IL6ST, AZGP1, MGP and STC2 was used as a surrogate marker for ESR1-related signalling/cell differentiation.
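The dCt normalisation and the module scores can be sketched as follows. The `delta_ct` function follows the formula given above. The weights passed to `module_score` in the example are placeholders set to 1.0 purely for illustration; they are not the EP coefficients, which are not reproduced here.

```python
from statistics import mean
from typing import Mapping, Sequence


def delta_ct(ct_gene: float, ct_refs: Sequence[float]) -> float:
    """dCt = 20 - Ct(gene of interest) + mean Ct of the reference genes."""
    return 20.0 - ct_gene + mean(ct_refs)


def module_score(dct: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Linear combination of dCt values over the genes listed in `weights`."""
    return sum(weights[gene] * dct[gene] for gene in weights)


# Hypothetical Ct values for one sample and three reference genes.
refs = [22.0, 23.0, 24.0]
dct = {
    "BIRC5": delta_ct(25.0, refs),
    "UBE2C": delta_ct(26.0, refs),
    "DHCR7": delta_ct(24.0, refs),
}
# Placeholder weights only; NOT the coefficients of the EP risk score.
placeholder_weights = {"BIRC5": 1.0, "UBE2C": 1.0, "DHCR7": 1.0}
proliferation = module_score(dct, placeholder_weights)
```

Because dCt rises as Ct falls, a higher dCt corresponds to higher relative expression of the gene of interest.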

Statistical analysis

The primary end point of the statistical analysis was distant metastasis. Metastasis rates were estimated using the Kaplan–Meier method. All reported P-values are results of two-sided tests. P-values <0.05 were considered statistically significant. EP/EPclin was calculated using MATLAB software, version R2011b (The MathWorks, Inc., Natick, MA, USA). P-values and hazard rates were assessed in two different time intervals (0–5 years, >5 years) according to the different parameters. The c-index was used to assess the prognostic performance of the EP signature and clinico-pathological parameters. Its unbiased estimation for a combination of variables, and its use to determine whether a variable adds significant information to a set of other variables, were carried out as recently described. Clinico-pathological variables were used for multivariate analysis and c-index analysis as recently described (Filipits et al, 2011).
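To make the c-index concrete, the following is a minimal sketch of the standard pairwise (Harrell) concordance index for right-censored data. It is a generic textbook estimator shown for illustration, not the unbiased estimation procedure used in this study, and the function name and argument layout are assumptions.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk: Sequence[float],
) -> float:
    """Harrell's c-index: fraction of comparable pairs ranked correctly.

    A pair is comparable when the subject with the shorter time had an
    observed event. It is concordant when that subject also has the
    higher risk score; ties in risk count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to a score with no prognostic information and 1.0 to perfect ranking of event times.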

Results

The EP score identifies early and late relapse events

To assess the impact of the EP score and clinico-pathological parameters on the prediction of early and late metastases, we retrospectively analysed 1702 ER-positive, HER2-negative postmenopausal breast cancer patients from the ABCSG6 and ABCSG8 trials (patient characteristics – Supplementary Table S1).

In all, 49% (n=832) of all patients were classified as low risk according to the EP score. Kaplan–Meier analysis demonstrated that the EP low-risk group had a significantly improved clinical outcome in the first (0–5 years; P<0.001) and second time interval (>5 years; P=0.002; Figure 1). The EP low-risk group showed an absolute freedom from distant recurrence of 96.29% (93.48%–99.11%) between 5 and 10 years of follow-up. Nodal status was also significantly associated with the clinical outcome in both time intervals, with node-positive tumours showing a considerably higher rate of late recurrence events in comparison to patients with node-negative disease (Supplementary Figure S1). In contrast, grading and Ki67 levels were not significantly associated with late metastases (Supplementary Figure S2). Multivariate analysis showed that EP is an independent prognostic parameter after adjustment for age, grade, lymph node status, tumour size and Ki67 in the first and second time interval (Table 1).

Figure 1

Kaplan–Meier plots. Kaplan–Meier plots of distant recurrence by the EP groups between (A) 0–5 years of follow-up and (B) 5–10 years of follow-up in the combined ER+/HER2− cohort (ABCSG6/8, n=1702). Cutoff point for EP was prespecified at 5. The numbers in parentheses indicate the 95% CI of the HR.

Table 1 Multivariate Cox proportional hazard models for estimating the contribution of variables to predict distant recurrence in the time interval 0–5 years and after 5 years (1702 ER+/HER2− tumours, ABCSG6/8)

Contribution of proliferative and ER signalling/differentiation-associated genes to early and late relapse events – an explorative analysis

The EP test identified a subgroup of patients who have a low likelihood of developing early and late metastases. Multivariate analyses demonstrated that the test provides complementary prognostic information to clinico-pathological parameters. To analyse the underlying biology behind these findings, the prognostic genes of the EP test were subdivided according to biological functions. Although the genes of interest cover several cellular processes such as apoptosis, DNA repair, cell adhesion, and cell signalling, the genes are also co-regulated with genes reflecting two relevant biological modules known to contribute to recurrence risk: proliferation and ER signalling/differentiation. The expression levels of BIRC5, UBE2C and DHCR7 were used as a surrogate marker for proliferation/cell cycle, whereas the expression levels of RBBP8, IL6ST, AZGP1, MGP and STC2 were used as a surrogate marker for ER signalling/cell differentiation.

Multivariate analysis included the same variables described previously (Table 1), but the multigene algorithm was now subdivided into the surrogates of proliferation and ER signalling. Proliferation genes add independent prognostic information to all clinical parameters included in the model for the prediction of early recurrences (0–5 years): a high expression of genes thought to contribute to cell cycle progression is significantly associated with higher rates of distant metastasis during the first 5 years but no longer adds significant prognostic information thereafter (Table 2). In contrast, genes associated with ER signalling were not significantly associated with early metastases but showed additional prognostic information in the second time interval (Table 2).

Table 2 Multivariate Cox proportional hazard models for estimating the contribution of variables to predict distant recurrence in the time interval 0–5 years and after 5 years (1702 ER+/HER2− tumours, ABCSG6/8)

Combination of molecular and clinical parameters improves the prediction of late recurrences

C-indices, a statistical measure of prognostic performance, were calculated for all common clinico-pathological parameters and the EP test to assess the prognostic performance and individual contribution to the prediction of late distant metastasis (Figure 2). The combination of clinical parameters resulted in a c-index of 0.644 in the combined cohort. The addition of the EP score to the combination of clinico-pathological parameters resulted in a c-index of 0.716. The prognostic performance was significantly improved by adding the molecular information of the EP test (P<0.001; Figure 2). Comparable results were obtained when the Adjuvant! Online score was used for classification: the c-index for prediction of late metastases significantly increased from 0.674 to 0.765 by adding the prognostic information of the EP score. Adjuvant! Online, Ki67 and quantitative ER IHC were further combined, resulting in a c-index of 0.64. EP also improved the combination of these parameters (Figure 2, c-index: 0.713). Analysis of proliferation and ESR1 signalling modules demonstrated that ER signalling contributes to the prognostic performance, whereas the addition of the proliferation module did not show a significant effect (Supplementary Figure S3). These results provide further evidence that ESR1 signalling is biologically involved in late metastasis.

Figure 2

C-indices demonstrating the prognostic performance of different clinical and molecular parameters in 1702 ER+/HER2− breast cancer patients (ABCSG6/8) after 5 years of follow-up. The values on the x axis are unbiased estimates of the c-index of the linear combination of one or more variables by Cox regression. Statistical tests indicate whether the c-index increases significantly by addition of EP to a fixed set of clinico-pathological variables. Abbreviations: EP, EndoPredict (continuous); ER, oestrogen receptor (categorical); G, grade (categorical); N, nodal status (categorical); T, tumour size (categorical).

The EPclin – a predefined combination of the EP and the clinical risk factors nodal status and tumour size – showed the best performance in predicting late relapse events with a c-index of 0.786 (Figure 2). Splitting the EPclin score into the clinical and molecular information demonstrated that the molecular information (EP score) adds significant (P<0.001; Figure 2) prognostic information to the weighted clinical score of nodal status and tumour size (N/T score=0.35·T+0.64·N). The c-index increased from 0.666 to 0.771, which is still lower than that of the predefined EPclin score (0.786, Figure 2). EPclin subgroups were further analysed using Kaplan–Meier curves with dichotomised EPclin low- and EPclin high-risk groups. Significant differences in metastasis-free survival were found between the EPclin risk groups between 0 and 5 years and after 5 years of follow-up (Figure 3). After 10 years of follow-up, the absolute freedom from distant recurrence of patients with EPclin low risk and EPclin high risk was 98.20% (96.54–99.85%) and 87.69% (82.86–92.52%), respectively.
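The weighted clinical score quoted above can be written as a one-line function. The sketch assumes that T and N enter as ordinal integer category codes; the exact coding of the categories is not stated in this section, so the inputs in the example are hypothetical.

```python
def nt_score(t: int, n: int) -> float:
    """Weighted clinical score: N/T score = 0.35*T + 0.64*N.

    `t` and `n` are assumed to be ordinal category codes for tumour
    size and nodal status (the coding itself is an assumption).
    """
    return 0.35 * t + 0.64 * n


# Hypothetical category codes for one patient.
print(round(nt_score(t=2, n=1), 2))
```

With these weights, a one-category step in nodal status shifts the score by almost twice as much as a one-category step in tumour size.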

Figure 3

Kaplan–Meier plots. Kaplan–Meier plot of distant recurrence by EPclin groups between (A) 0–5 years of follow-up and (B) 5–10 years of follow-up in the combined ER+/HER2− cohort (ABCSG6/8, n=1702). Cutoff point for EPclin was prespecified at 3.3. The numbers in parentheses indicate the 95% CI of the HR.

EP as well as EPclin were also significantly associated with early and late recurrences when the analyses were restricted to patients who had been treated with 5 years of tamoxifen (excluding patients treated with tamoxifen for 2 years followed by anastrozole for 3 years; Supplementary Figure S4).

Discussion

In this study, we have assessed whether clinical and pathological variables and the multigene expression signature EP are associated with distant metastasis arising later than 5 years after the diagnosis of breast cancer.

This is a retrospective analysis of a large biomarker sample set obtained from two phase III clinical trials from ABCSG asking endocrine treatment questions and relying on prospectively acquired clinical data (Schmid et al, 2003; Dubsky et al, 2012b). This sample set has previously been analysed to validate the EP test (none of these samples were included in the training phase) (Filipits et al, 2011). According to the revised level of evidence classification by Simon et al (2009), the EP test has level Ib evidence.

We show that EP is significantly associated with both early and late distant metastasis and that the signature provides additional prognostic information regarding late metastasis to common prognostic variables or a combination thereof (e.g., the Adjuvant! Online algorithm). An exploratory analysis of the biological modules contained in the EP score suggests that the expression of genes thought to be regulated by the oestrogen receptor contributes to late distant metastatic risk. A high expression of these genes is associated with a lower incidence of late distant metastasis in this homogeneous cohort of women treated with endocrine treatment in the absence of chemotherapy.

The basic hypothesis of this analysis states that tumour-derived biological factors contribute to late distant metastasis. However, during a full decade of survivorship many other factors may have a profound impact on cancer recurrence: the retrospective nature of our analysis does not account for changes in lifestyle, nor for all competing health problems or psychosocial issues. In line with assumptions that would prioritise host factors, the training of first-generation multigene signatures was carried out to predict early but not late recurrence. For instance, Mammaprint was developed to identify metastases that occurred within 5 years after initial diagnosis (van 't Veer et al, 2002). Most of the prognostic multigene tests rely heavily on genes associated with cell cycle progression/proliferation. Not surprisingly, these signatures failed to identify late events (Esserman et al, 2011; Sgroi et al, 2013). For instance, the Oncotype DX assay failed to identify late distant metastases in the ATAC trial (Sgroi et al, 2013). In contrast, the Breast Cancer Index (BCI) was found to be a significant prognostic factor in the same trial (Sgroi et al, 2013). BCI combines genes that are associated with proliferation and oestrogen-receptor signalling (Ma et al, 2008). Our analysis also suggested that proliferation genes predict early recurrence events but show a decreased prognostic performance after 5 years of follow-up. The results indicate that the underlying molecular mechanisms leading to the development of early, as opposed to late, metastases differ substantially.

Additionally, we analysed how clinical factors contribute to the prediction of late relapse events. As already described by Kennecke et al (2007), nodal status and tumour size are important prognostic parameters for late recurrences. Thus, both molecular and clinical parameters contribute to the prediction of late distant metastases. The EPclin, a previously validated score that incorporates tumour size and nodal status in addition to gene expression, was consequently identified as the best predictor of late recurrence events and correctly identifies a large subset of patients with very low late relapse rates.

The main limitation of this study lies within the biomarker sample gained from ABCSG6 and ABCSG8. Both studies addressed endocrine treatment questions in the absence of adjuvant or neo-adjuvant chemotherapy. Accordingly, there is a clear selection bias toward clinically low-risk ER+ postmenopausal breast cancer patients: only one-third of patients had involved lymph nodes and only 5% had >3 metastatic nodes. In all, 32% had T2 tumours and only 2% had tumours >5 cm. Due to the inclusion criteria and the study designs, only 4% of women had high-grade tumours. A total of 998 women were at risk 5 years after diagnosis and the median follow-up of this group is just over 7 years. Clearly, an even longer follow-up for a study that specifically investigates late relapse would be desirable but could not be obtained for the current analysis. This is partially due to the fact that some of the patients from the ABCSG6 and ABCSG8 trials were censored for the late time interval, because they were treated with extended adjuvant therapy (ABCSG6a, ABCSG16). We decided to censor those patients to analyse a homogeneous cohort of ER+/HER2− breast cancer patients who were treated with 5 years of endocrine therapy only. We tested whether the selective censoring could potentially lead to any bias. Age and treatment arm (tamoxifen vs tamoxifen/anastrozole) were significantly different between patients who were enrolled into ABCSG6a/ABCSG16 and those patients who were not. This is due to the fact that younger patients were more frequently recruited for extended endocrine trials. As age is not a prognostic parameter in postmenopausal breast cancer patients, it is unlikely that the results were overestimated or underestimated. None of the other molecular and clinical parameters was significantly different between the compared groups.

The 5-year cutoff was chosen because most patients are treated with 5 years of endocrine therapy, and the decision whether to prolong adjuvant endocrine treatment is normally made after 5 years of treatment. Other cutoff levels might also be reasonable for future studies if longer follow-up data become available.

With regard to these limitations, our findings are in need of validation in further populations; retrospective analysis of phase III trials investigating extended endocrine treatment would be preferable.

ER+/HER2− breast cancer displays a proclivity for late recurrence. Several phase III trials have addressed, or are currently addressing, this medical need therapeutically by testing extended adjuvant therapy. Prospective and retrospective analyses from these data sets suggest that women with premenopausal status at diagnosis (Goss et al, 2013), with a co-expression of ER and PgR (Jakesz et al, 2007), a larger tumour size (Mamounas et al, 2008) and involved lymph nodes (Goss et al, 2005) derive an increased benefit from extended AI treatment. Data showing significant statistical interaction with extended adjuvant treatment that would suggest a predictive factor are lacking.

In the absence of predictive factors, the identification of women with an extremely low risk who can be spared a full decade of endocrine treatment is an important goal of clinical research. The low-risk group of women identified by the EPclin score comprises 64% of patients from our biomarker sample after 5 years of follow-up. These patients have an absolute risk of distant metastasis of 1.8% between 5 and 10 years of follow-up and might be sufficiently treated with 5 years of adjuvant endocrine therapy. The side effects and the economic burden of extended adjuvant endocrine therapy in addition to competing health risks should be weighed against such a projected outcome.