Main

Breast cancer is a heterogeneous disease. The great diversity within tumours determines different tumour profiles, different degrees of treatment sensitivity and different rates of disease progression (Polyak, 2011). Identifying, validating and applying suitable predictive and prognostic factors remains a major challenge in improving the selection of a treatment scheme for each individual patient.

The role of core needle biopsy has been clearly established as an important diagnostic tool for breast lesions, and it is considered the standard diagnostic method for breast disease (Di Loreto et al, 1996; Pettine et al, 1996; Pinder et al, 1996; Pijnappel et al, 1997; Crowe et al, 2003; Cipolla et al, 2006). The accuracy of core biopsy for the diagnosis of breast carcinoma is superior to the cytological diagnosis by fine needle aspiration, and a good concordance rate has been reported between core biopsy and surgical biopsy (91–100%), with a specificity rate ranging from 96% to 100% (Verkooijen et al, 2000).

In addition to the pathological diagnosis, there is a growing demand for predictive and prognostic factors, which are determined by tumour biology. Selection of patients for systemic treatment relies on histological features from non-surgical diagnostic samples. However, core biopsy may not accurately reflect the histological features of the tumour. It provides a relatively small sample of tumour tissue, and representative samples can be difficult to obtain because of peritumoral or intratumoral heterogeneity (Morris et al, 2002).

Histological grade (HG) as defined by Elston and Ellis is a major prognostic and predictive factor in breast cancer (Amat et al, 2002; Wang et al, 2002; Petit et al, 2004; Guarneri et al, 2006). This is especially important in HR-positive/HER2-negative patients, in whom HG discriminates low-grade, low-proliferation, good-prognosis tumours from high-grade, high-proliferation, poor-prognosis tumours and is consequently a central determinant of the treatment scheme. In contrast, patients with HER2-positive or triple-negative tumours will most likely receive chemotherapy irrespective of the HG.

However, a broad range of HG concordance between core biopsy and surgical excision specimen has been previously described (59–91%) (Sharifi-Salamatian et al, 2000; Komaki et al, 2006; Usami et al, 2007; Rakha et al, 2010). The largest study (Harris et al, 2003) found a concordance rate of 67% in 500 patients. The grade in core biopsy was more often underestimated than overestimated, and mitotic index had the highest rate of discordance, with an underestimation of 35%.

However, physicians need an accurate grading determination in two different settings: (1) In patients whose first-line treatment is surgery, the identification of aggressive pathological features from the core biopsy can select patients who will require adjuvant chemotherapy. A port-a-cath will be implanted during the same surgical procedure; and (2) in neoadjuvant treatment if a complete pathological response is achieved, the core biopsy specimen is the only sample available to evaluate prognostic and predictive factors.

The information obtained from core biopsy may be the only information available for determining the candidates for (neo) adjuvant treatment.

To our knowledge, no studies have reported the correlation between the HG discordance associated with tumour subtypes and the impact on treatment planning. Therefore, the present study aimed to analyse the HG concordance between core biopsies and corresponding surgical specimens related to different tumour subtypes, especially in HR-positive HER2-negative invasive breast carcinoma and to assess whether grade discrepancy could alter the treatment decision-making process.

Patients and methods

The medical records and pathological reports of 350 patients who had both a core biopsy for invasive breast carcinoma and a surgical excision of the tumour at the Institut Curie, Paris, France between January 2008 and December 2009 were reviewed.

Only invasive ductal carcinomas were included in our study. Patients who underwent neoadjuvant treatment (chemotherapy, hormonal therapy or radiotherapy), multicentric and multifocal tumours and patients with a personal history of other cancer were excluded.

Clinical, radiological and pathological data such as patient age, menopausal status, tumour size determined by clinical and ultrasound assessment, lymph node involvement, histological tumour grade, estrogen receptor (ER), progesterone receptor (PR) and HER2 status were collected through a retrospective review of medical and pathological records.

All breast cancer cases were pathologically diagnosed by palpation-guided, stereotactic (using an 8- or 11-gauge automated needle) or ultrasound-guided (14-gauge automated needle) core biopsy. Core biopsies were then fixed in formalin, paraffin embedded and processed according to the standard protocol.

Histological and immunohistochemical study

Histological tumour grade

The HG was scored for tubule formation, nuclear pleomorphism and mitotic count. Mitotic index was assessed on histological sections stained with haematoxylin and eosin (Figure 1). The criteria of Van Diest et al (1992a, 1992b) were used to define the mitotic figure. The mitotic index corresponded to the mitotic score defined in the Nottingham grade. The number of mitoses observed in 10 consecutive high-power fields using a Leica (Wetzlar, Germany) DMRB microscope with a × 40 objective and a × 10 ocular was counted. Cutoffs of <11, 12–22 and >22 mitoses were used to define low, intermediate and high mitotic indexes, respectively. The final tumour grade was scored according to the Elston and Ellis modification of the Scarff–Bloom–Richardson grading system (Elston and Ellis, 1991).
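As an illustration only, the scoring described above can be sketched in a few lines of Python. Two points are assumptions rather than statements from the text: the published cutoffs leave a count of exactly 11 unassigned, and it is treated here as low; and the mapping from the summed component scores to the final grade (3–5, 6–7, 8–9) is the usual Elston and Ellis convention, which this section does not spell out. The function names are hypothetical.

```python
MITOTIC_SCORE = {"low": 1, "intermediate": 2, "high": 3}


def mitotic_category(mitoses_per_10_hpf: int) -> str:
    """Categorise a mitotic count over 10 high-power fields."""
    if mitoses_per_10_hpf < 0:
        raise ValueError("mitotic count cannot be negative")
    # Assumption: a count of exactly 11 is treated as low, because the
    # stated cutoffs (<11, 12-22, >22) do not assign it to any category.
    if mitoses_per_10_hpf <= 11:
        return "low"
    if mitoses_per_10_hpf <= 22:
        return "intermediate"
    return "high"


def elston_ellis_grade(tubules: int, pleomorphism: int, mitoses: int) -> int:
    """Combine three component scores (each 1 to 3) into a grade of 1, 2 or 3."""
    for score in (tubules, pleomorphism, mitoses):
        if score not in (1, 2, 3):
            raise ValueError("each component score must be 1, 2 or 3")
    total = tubules + pleomorphism + mitoses
    # Assumption: conventional mapping, 3-5 -> grade I, 6-7 -> II, 8-9 -> III.
    if total <= 5:
        return 1
    if total <= 7:
        return 2
    return 3


# Hypothetical tumour: few tubules (3), marked pleomorphism (3), 25 mitoses.
grade = elston_ellis_grade(3, 3, MITOTIC_SCORE[mitotic_category(25)])
```

Because the final grade is a thresholded sum, a one-point shift in a single component (for example the mitotic score) can move a tumour across a grade boundary.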

Figure 1 Grade III tumour.

ER and PR status

After rehydration and antigen retrieval in citrate buffer (10 mM, pH 6.1), the tissue sections were stained for ER (clone 6F11, Novocastra, Leica Biosystems, Newcastle, UK; 1/200) and PR (clone 1A6, Novocastra, 1/200). Staining was revealed using the Vectastain Elite ABC peroxidase mouse IgG kit (Vector, Burlingame, CA, USA) with diaminobenzidine (Dako A/S, Glostrup, Denmark) as chromogen. Positive and negative controls were included in each slide run. Cases were considered positive for ER and PR according to the standardised guidelines using a cutoff of 10% stained tumour nuclei (Balaton et al, 1995, 1996).

HER2 status

After rehydration and antigen retrieval in citrate buffer (10 mM, pH 6.1), the tissue sections were stained for HER2 (clone CB11, Novocastra, 1/1000). Staining was revealed using the Vectastain Elite ABC peroxidase mouse IgG kit (Vector) with diaminobenzidine (Dako A/S) as chromogen. Positive and negative controls were included in each slide run. HER2 overexpression was determined according to the GEFPICS guidelines, with FISH performed in all cases with a HER2 2+ result (Penault-Llorca et al, 2010).

Definition of tumour subtypes

Tumour subtypes were defined from the immunohistochemistry features. Patients were categorised as HR positive/HER2 negative, HER2 positive or ER negative/PR negative/HER2 negative (triple negative); the triple-negative category also encompassed tumours with ER and PR staining in 1% to 10% of cells.
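The categorisation can be sketched as follows; this is a minimal illustration, not the pathology workflow actually used. It assumes that HR positivity means ER or PR at or above the 10% cutoff (the text gives the cutoff but not whether it is inclusive), that HER2 3+ is positive, 0/1+ is negative and 2+ is resolved by the FISH result, and that HER2-positive tumours form one group regardless of HR status. All function and parameter names are hypothetical.

```python
from typing import Optional


def hr_positive(er_percent: float, pr_percent: float, cutoff: float = 10.0) -> bool:
    # Assumption: staining at or above the cutoff counts as positive.
    return er_percent >= cutoff or pr_percent >= cutoff


def her2_positive(ihc_score: int, fish_amplified: Optional[bool] = None) -> bool:
    # Assumption: 3+ is positive, 0 and 1+ are negative, 2+ depends on FISH.
    if ihc_score == 3:
        return True
    if ihc_score == 2:
        if fish_amplified is None:
            raise ValueError("a FISH result is required for HER2 2+")
        return fish_amplified
    if ihc_score in (0, 1):
        return False
    raise ValueError("HER2 IHC score must be 0, 1, 2 or 3")


def tumour_subtype(er_percent: float, pr_percent: float,
                   ihc_score: int, fish_amplified: Optional[bool] = None) -> str:
    """Assign one of the three subtypes used in this study."""
    if her2_positive(ihc_score, fish_amplified):
        return "HER2+"
    if hr_positive(er_percent, pr_percent):
        return "HR+/HER2-"
    # Includes tumours with weak (1% to 10%) ER and PR staining.
    return "triple negative"
```

Under these assumptions, a tumour with 5% ER, 3% PR and HER2 1+ falls in the triple-negative group, which matches the 1% to 10% note above.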

Surgery and adjuvant treatment

Patients underwent either mastectomy or breast-conserving surgery with axillary lymph node dissection depending on tumour stage and lymph node involvement. After surgery, adjuvant treatment (local-regional radiotherapy, hormonal therapy or adjuvant chemotherapy) was given according to the Institut Curie’s Treatment Guidelines: HR-positive tumours with good prognostic factors (grade I, small tumour size, no lymph node involvement) could be managed without adjuvant medical treatment, while HR-positive tumours with poor prognostic factors were treated with hormone therapy. Adjuvant chemotherapy was given according to tumour subtype, age and prognostic factors (tumour stage, lymph nodes, HG).

Statistical analysis

Quantitative variables were described using the median and range (minimum–maximum). Qualitative variables were described as proportions. The core biopsy results were compared with the final histology from surgical specimens. Baseline characteristics were compared using Chi-square or Fisher’s exact tests for categorical variables and Student’s t-tests for continuous variables. All tests were two-sided, and the significance level was set at 0.05.

The agreement between groups was calculated using Cohen’s Kappa coefficient. Complete agreement corresponds to a Kappa score of 1. Kappa values close to or below 0 indicate poor agreement. An absolute concordance rate was calculated for all three corresponding grades. Kappa was also calculated for each grading component separately (tubule formation, nuclear pleomorphism, mitotic index and Elston and Ellis grade).
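For readers unfamiliar with the statistic, Cohen’s Kappa compares the observed agreement p_o with the agreement p_e expected by chance from the marginal grade frequencies: Kappa = (p_o − p_e) / (1 − p_e). The sketch below is a generic implementation of that formula, not the software used in this study; the function name and the example grades are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(ratings_a: Sequence[Hashable], ratings_b: Sequence[Hashable]) -> float:
    """Cohen's Kappa for two paired sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: proportion of pairs with identical ratings.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over categories.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades for ten tumours (not data from this study).
biopsy_grades = [1, 1, 2, 2, 2, 3, 3, 3, 2, 1]
surgical_grades = [1, 2, 2, 2, 3, 3, 3, 3, 2, 1]
kappa = cohen_kappa(biopsy_grades, surgical_grades)
```

Because Kappa discounts chance agreement, it is lower than the raw concordance rate whenever one grade dominates the series, which is why both figures are reported together.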

Results

A total of 350 consecutive biopsies were assessed and compared with surgical specimens: 299 cases (85%) were HR positive/HER2 negative, 29 cases (8%) were HER2 positive and 22 cases (6%) were triple negative.

HR+/HER2− tumour subgroup

The median minimum size of the tumour cylinder core biopsies was 7 mm (range 1–22) and the median maximum size was 12 mm (range 3–30). The median number of core biopsies per patient was 3 (1–11) (Figure 2).

Figure 2 HR+/HER2− tumour.

Clinical and pathological characteristics are detailed in Supplementary Table S1. From this group, 67% of core biopsies were performed under ultrasound guidance and 33% under clinical or stereotactic guidance.

The overall concordance rate for HG between the two sample types was 75% (224 patients), Kappa 0.59. Analysing grades separately, the concordance rates were 78% in grade I, 68% in grade II and 95% in grade III tumours (Table 1).

Table 1 Concordance rate for histological grade in the HR+/HER2− group

In the 75 discordant cases (25%), HG was underestimated in 55 patients (73%, 18% of all cases) and overestimated in 20 patients (27%, 7% of all cases). The Elston–Ellis grading error did not exceed one grade in 73 cases (97%, 24% of all cases). No tumour graded III on biopsy was downgraded to grade I on the surgical specimen. Two cases graded I on core biopsy were upgraded to grade III on the surgical specimen (2% of grade I cases).
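The percentages in this paragraph follow directly from the reported counts, as the short check below shows. The counts are taken from the text; the helper name `pct` is hypothetical, and rounding to the nearest whole percent is an assumption about how the figures were reported.

```python
def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


total_cases = 299   # HR+/HER2- subgroup
concordant = 224    # same grade on biopsy and surgical specimen
undergraded = 55    # biopsy grade lower than surgical grade
overgraded = 20     # biopsy grade higher than surgical grade
within_one_grade = 73
discordant = undergraded + overgraded

rates = {
    "concordance": pct(concordant, total_cases),
    "discordance": pct(discordant, total_cases),
    "undergraded_of_discordant": pct(undergraded, discordant),
    "overgraded_of_discordant": pct(overgraded, discordant),
    "undergraded_of_all": pct(undergraded, total_cases),
    "overgraded_of_all": pct(overgraded, total_cases),
    "within_one_grade_of_discordant": pct(within_one_grade, discordant),
}
```

Each rate is expressed against one of two denominators, the 75 discordant cases or all 299 cases, which is why two percentages accompany each count.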

Regarding tubule formation, nuclear pleomorphism and mitotic index, the concordance rate was 75% (Kappa 0.56), 66.5% (Kappa 0.41) and 75% (Kappa 0.44), respectively. Overestimation of the grade was due to overrating the nuclear pleomorphism (50 cases, 17% of all the cases), whereas underestimation of the grade was mainly due to undergrading the mitotic index (70 cases, 23% of all the cases).

Potential factors for grading errors were analysed (Table 2). A discordant grade was found more frequently in premenopausal patients (37% vs 21%, P=0.005), in non-ultrasound-guided biopsies (32% vs 22%, P=0.04) and in larger tumours, measured by both clinical and ultrasound examination (P=0.02 and P=0.007, respectively). No statistically significant difference was found for the number of core cylinders or for their minimal or maximal size.

Table 2 Factors associated with grade misclassification

Table 3 shows the data related to factors that might have increased the risk of misclassification of the mitotic index. No multivariate model was identified to correct the discrepancy of these factors, most likely due to the strong interaction among them.

Table 3 Factors associated with mitotic index misclassification

Keeping in mind a potential reduction of chemotherapy administration in ER+/HER2− patients, the grade II and III cases were analysed together vs the grade I index cases. The concordance rate was 87% (261 cases), whereas underestimation from core biopsies was found in 20 cases (7%) (Table 4); the Kappa value was 0.69. Analysing grades II and III together improved the consistency compared with a separate analysis of each grade subgroup (Kappa 0.69 vs 0.59, respectively). No factors were found to influence the risk of misclassification (Supplementary Table S2).

Table 4 Concordance rate in grade I vs grade II/III

Triple-negative and HER2+ tumour subgroups

Twenty-two patients (6%) had triple-negative tumours, of which 19 (86%) were grade III. The overall concordance rate for HG was 91% (Kappa 0.69); the discrepancies involved grade II cases (two tumours graded II on biopsy proved to be grade III on the postoperative specimen), not grade III cases.

Twenty-nine HER2-positive tumours were identified (8%); 19 cases (66%) were grade III, and the concordance rate was 79% (Kappa 0.6). All grade III cases were correctly scored on core biopsy; of the 14 cases with grade II on core biopsy, 5 were upgraded on the surgical specimen (Supplementary Table S3).

Consequences on treatment recommendations

In the HR+/HER2− group, 127 patients (43%) actually received chemotherapy, 146 hormonal therapy (49%) and 26 patients (9%) had no adjuvant therapy (Table 5).

Table 5 Changes in the recommended treatment due to grading discrepancies between core biopsies and surgical specimens in HR+/HER2− subgroup

The distribution of discordant HG cases in the different treatment subgroups was as follows (P<0.05): 48 patients (38%) in the adjuvant chemotherapy subgroup, 26 patients (18%) in the hormone therapy subgroup and 3 patients (12%) in the no adjuvant treatment subgroup. The discordance within the adjuvant chemotherapy group was mostly related to an underestimation of HG on core biopsies (94% of cases).

Hypothetical changes in treatment were identified in 7 cases (2%): 6 patients (2%) who underwent adjuvant chemotherapy would initially have received hormone therapy according to the biopsy results, and 1 patient (0.35%) who was finally treated with hormone therapy was originally scheduled to receive chemotherapy. No changes in treatment recommendations before and after surgery were identified among patients who did not have an indication for adjuvant systemic therapy.

Discussion

Breast cancer treatment has become increasingly complex over time with the expansion of surgical and systemic therapy options. The need for accurate prognostic and predictive factors from core biopsy specimens has increased in importance as many therapeutic decisions are based on their results. Clinicians need to obtain reliable information from core biopsy to identify patients with good prognosis tumours as well as patients who are most likely to benefit from systemic therapy. Patients who have HR-positive tumours tend to have a better prognosis for disease-free survival and overall survival than those with HR-negative tumours. They are also much more likely to respond to endocrine therapy. HER2-overexpressing or triple-negative tumours are associated with a higher risk of recurrence and mortality, relative resistance to endocrine therapy and high chemosensitivity (D’Alfonso et al, 2010).

Although HR-positive tumours have a good prognosis, a subgroup of patients with HR-positive tumours has been observed to have a more aggressive tumour biology and may benefit from chemotherapy. Tumour grade as defined by Elston and Ellis is a central node of the treatment decision-making process in HR+/HER2− breast carcinoma, as it discriminates low-grade, low-proliferation, good-prognosis tumours from high-grade, high-proliferation, poor-prognosis tumours. Age at diagnosis, tumour size (clinical, ultrasound, MRI) and lymph node involvement are other major treatment decision factors in this setting.

Multiple studies have investigated the concordance between core biopsy and surgical specimens, usually in small patient series and with widely discrepant results. Our study was undertaken: (1) to assess the correlation between the tumour grade (Elston and Ellis) obtained from core biopsy samples and from corresponding surgical specimens according to different subtypes, mainly in HR+/HER2− breast cancer, and (2) to assess how grade discrepancy could affect the treatment decision-making process.

We found that the concordance rate for HG in HR+/HER2− tumours was 75%. Grade from core biopsy tends to be lower than from the full tumour specimen, owing to an underestimation of mitotic index (Harris et al, 2003; Badoual et al, 2005; Burge et al, 2006; Park et al, 2009). Our Kappa statistics indicated good agreement for tubule formation and global HG, whereas the Kappa values for nuclear pleomorphism and mitotic index were poorer. Fifty-five cases (18%) were undergraded, but few cases differed by more than one grade (only two tumours classified as grade I on biopsy were subsequently reclassified as grade III, and all grade III tumours on biopsy were confirmed on the postoperative specimen). When we analysed grades II and III jointly vs grade I, the concordance rate increased (87% of all cases) and underestimation was lower (7% of all cases); the Kappa value showed a stronger correlation (Kappa 0.69). Such underestimation is due to the inherently small sample size of the core biopsy procedure. The tissue sampled may not be fixed optimally or may not include the growing edge of the tumour, so that mitotic index cannot be assessed accurately. To improve the correlation, some standardised parameters in tissue fixation, tissue processing and grading classification method have been established (Rakha et al, 2008).

Another cause of discordance is intratumoral heterogeneity. Morris et al (2002) showed that histological heterogeneity between the periphery and the centre of the tumour was found in 29% of large tumours. That study concluded that multiple foci of the same tumour should be studied in order to achieve a reliable score of HG.

In our study, small tumours had greater concordance and large tumours had greater heterogeneity and more unrepresentative samples. The only independent factors associated with discordance were premenopausal status, tumour size and the guidance type of core biopsy performed (ultrasound guidance allowed more specific sampling of the lesion). We found no differences in concordance rates related to the number and the size of the core biopsies. This is supported by O’Leary et al (2004), who showed in a series of 113 patients that the number and size of biopsies did not influence the concordance between preoperative and postoperative histopathology reports. In contrast, a short report published by McIlhenny et al (2002) found that a single core biopsy was only 32% accurate in predicting the grade, whereas this rate rose to 74% for four core biopsies (P=0.001).

Predictive factors in the HR+/HER2− group are important for deciding on adjuvant systemic treatment or neoadjuvant chemotherapy. The greatest discordance rate was found in the subgroup of patients with an indication for adjuvant chemotherapy (38% in the chemotherapy group, 18% in the hormone therapy group and 12% in the no adjuvant systemic treatment group). This is potentially explained by a more aggressive tumour biology and greater heterogeneity. This discrepancy had little or no therapeutic impact, because systemic treatment decision-making also integrates other variables, such as tumour size, nodal involvement and the age of the patient.

From the present study, we can conclude that HG discrepancy is mainly related to an underestimation of mitotic index and to factors probably associated with the marked tumour heterogeneity of breast cancer, such as premenopausal status, tumour size and the type of core biopsy performed. However, grade discordance does not appear to modify the therapeutic decision in ER+/HER2− breast carcinoma.