Introduction

The introduction of rituximab combined with cyclophosphamide, adriamycin, vincristine, and prednisone (R-CHOP) has improved survival outcomes in patients with diffuse large B cell lymphoma (DLBCL) [1–8]. DLBCL is a heterogeneous group of B cell non-Hodgkin lymphomas (NHL) rather than a single clinicopathologic entity [9]. Multiple histologic subtypes are recognized, and several molecular and genetic abnormalities are variably present. In recent years, most studies have focused on identifying molecular markers in order to define new prognostic factors. However, no relevant prognostic molecular markers have been validated, and agreement on prognostic models has not yet been reached [10].

Aggressive NHL, including DLBCL, has been staged according to the Ann Arbor staging system, which was originally designed for Hodgkin lymphoma (HL). The International Prognostic Index (IPI) is the primary clinical tool used to predict outcome in patients with aggressive NHL. It is based on the number of negative prognostic factors present at diagnosis: Ann Arbor stage III/IV, age ≥60 years, elevated lactate dehydrogenase level, Eastern Cooperative Oncology Group performance status ≥2, and more than one extranodal site [11]. However, because NHL is more heterogeneous than HL and disseminates hematogenously rather than by contiguous lymphatic spread, the Ann Arbor staging system has limited value for assessing tumor burden accurately in NHL. For instance, a patient with stage II disease and a high tumor burden can have an IPI score of zero, whereas a patient with stage III disease scores one point even if the tumor burden is low.
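The scoring logic described above can be illustrated with a minimal sketch. The function name, argument names, and the two example patients are hypothetical and are not taken from the study; the five factors are those listed in the text. The point of the example is that tumor burden is not an input to the score.

```python
def ipi_score(age, ldh_elevated, ecog_ps, ann_arbor_stage, extranodal_sites):
    """Count the adverse IPI factors (0 to 5) listed in the text."""
    factors = [
        age >= 60,             # age >= 60 years
        ldh_elevated,          # elevated lactate dehydrogenase level
        ecog_ps >= 2,          # ECOG performance status >= 2
        ann_arbor_stage >= 3,  # Ann Arbor stage III/IV
        extranodal_sites > 1,  # more than one extranodal site
    ]
    return sum(bool(f) for f in factors)


# Hypothetical patients: identical except for stage. Tumor volume never
# enters the calculation, so a bulky stage II case can still score zero.
bulky_stage_ii = ipi_score(age=45, ldh_elevated=False, ecog_ps=0,
                           ann_arbor_stage=2, extranodal_sites=0)
small_stage_iii = ipi_score(age=45, ldh_elevated=False, ecog_ps=0,
                            ann_arbor_stage=3, extranodal_sites=0)
print(bulky_stage_ii, small_stage_iii)  # 0 1
```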

New imaging techniques such as 18F-fluorodeoxyglucose (18F-FDG) positron emission tomography (PET) have been used as prognostic tools in NHL [12, 13]. Supported by several positive data sets, imaging has become an important tool in clinical decisions on therapeutic strategies for aggressive NHL, including DLBCL [14–16]. The objective of the present study was to investigate whether metabolic tumor volume (MTV) measured by PET can be used as a prognostic tool, in comparison with the Ann Arbor stage, in patients with stage II and III nodal DLBCL.

Materials and methods

One hundred sixty-nine patients with de novo nodal DLBCL who underwent PET/CT at diagnosis between July 2004 and November 2008 at five medical centers (Pusan National University Hospital, Dong-A University Medical Center, Kosin University Gospel Hospital, Busan Paik Hospital, and Gyeongsang National University Hospital) were enrolled in the present study. All patients received six to eight cycles of R-CHOP therapy according to Coiffier et al. [1]. The median follow-up duration was 36 months, and the male-to-female ratio was 1.56:1 (Table 1).

Table 1 The baseline characteristics and comparison between stages II and III nodal DLBCL patients

Inclusion and exclusion criteria

Patients were included if they had de novo DLBCL with primary nodal localization, had stage II or III disease according to Ann Arbor staging, and were available for clinical follow-up. Patients were excluded if they had any extranodal site of involvement, DLBCL secondary to low-grade NHL, or additional treatment (including radiotherapy after R-CHOP therapy or autologous stem cell transplantation), or if the lymph node (LN) findings were discrepant between PET and conventional computed tomography (CT). In addition, patients were excluded if they had uncontrolled diabetes mellitus, evidence of infection at the time of diagnosis (especially active tuberculosis), or antibodies against human immunodeficiency virus.

Measurement of MTV by PET/CT

Dual-modality PET/CT was performed on a Biograph scanner (Siemens Medical Solution, Hoffman Estates, IL, USA), which combines a dual-slice helical CT with a full-ring PET tomograph. FDG-PET images were evaluated for regions of focally increased tracer uptake. Within the target lesions, a standardized uptake value (SUV) of ≥2.5 was used as the contouring border and was considered to represent lymphoma, as suggested by Freudenberg et al. [17]. The CT images were used for PET attenuation correction. Image reconstruction of the corrected emission data was performed after Fourier transform with AWOSEM software (two iterations, eight subsets, 5 mm Gaussian filter). The CT criterion for pathologic LN was a size exceeding 1.0 cm in all regions except the groin. On PET, areas of focal tracer uptake with an SUV of ≥2.5 were considered pathologic LN, and the MTV was measured after CT attenuation correction. CT images were acquired with 130 mAs, 130 kV, and slice width (or 5 min and table feed) of 8 mm per rotation. Intravenous or oral contrast agents were used in all patients, and a standardized breathing protocol was applied. PET images were interpreted by nuclear medicine physicians at each institution. Data were then reviewed by two nuclear medicine experts at Pusan National University Hospital.
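The threshold-based volume measurement described above can be sketched as follows. This is only an illustration under simplifying assumptions, not the software used in the study: the function name, the voxel size, and the SUV values are hypothetical, and the sketch assumes the voxels passed in already belong to a target lesion (it does not exclude physiologic uptake or delineate lesions).

```python
def metabolic_tumor_volume(suv_voxels, voxel_volume_cm3, threshold=2.5):
    """Sum the volume of all voxels whose SUV is at or above the threshold."""
    n_above = sum(1 for suv in suv_voxels if suv >= threshold)
    return n_above * voxel_volume_cm3


# Hypothetical voxel of 4 x 4 x 5 mm, expressed in cm (0.08 cm^3 per voxel).
voxel_cm3 = 0.4 * 0.4 * 0.5

# Hypothetical SUVs sampled inside one target lesion.
suv = [0.8, 1.9, 2.5, 3.1, 7.4, 2.4, 12.0]

mtv = metabolic_tumor_volume(suv, voxel_cm3)
print(round(mtv, 2))  # four voxels reach SUV >= 2.5, so 4 * 0.08 = 0.32
```

A patient-level MTV would then be the sum of this quantity over all target lesions.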

Pretreatment and response evaluation

Pretreatment staging and response evaluation after six or eight cycles of R-CHOP therapy were based on clinical examination, CT scan of the neck, chest, abdomen, and pelvis, bone marrow biopsy, and PET/CT. Response was assessed according to the revised International Workshop Criteria [14]. The criteria were as follows. Complete response (CR) is defined as (a) no signs or symptoms of disease, (b) negative PET and regression to normal size on CT, and (c) normal bone marrow. Partial response (PR) is defined as a ≥50% decrease in tumor size but with a positive PET at the prior disease site. Stable disease (SD) is defined as (a) positive PET at the prior sites of disease and no new sites on CT or PET and (b) PET negative and no change in size of previous lesions on CT. Progressive disease (PD) is defined as (a) appearance of a new lesion >1.5 cm in any axis, ≥50% increase in the sum of the product of the diameters of more than one node, or ≥50% increase in the longest diameter of a previously identified node >1 cm in short axis and (b) lesions PET positive if FDG-avid lymphoma or PET positive prior to therapy.

Statistical analyses

The Mann–Whitney U test was used to assess differences in the frequency of independent prognostic factors between the stage II and stage III groups. Progression-free survival (PFS) was calculated from the date of diagnosis to documented disease progression; observations were censored on the date the patient was last known to be alive or, for patients dying as a result of causes unrelated to lymphoma or treatment, the date of death. Overall survival (OS) was calculated from the date of diagnosis until death as a result of any cause or the date last known to be alive. PFS and OS were estimated by the Kaplan–Meier method, and differences were compared using the log-rank test. Receiver operating characteristic (ROC) curve analysis was performed to estimate the accuracy of the ideal MTV cutoff value, and sensitivity and specificity were estimated at that cutoff value. SPSS software for Macintosh (SPSS 15.0; Chicago, IL, USA) was used for statistical data processing. A probability value <0.05 was considered statistically significant.
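For readers unfamiliar with the Kaplan–Meier method, the following minimal sketch shows the product-limit calculation with censoring. It is an illustration only (the study used SPSS); the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : follow-up time for each patient (e.g. months from diagnosis)
    events : 1 if the event (progression or death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # events observed exactly at time t
        n_leaving = 0  # patients leaving the risk set at t (events + censored)
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical cohort of eight patients (months, event indicator).
times = [6, 12, 12, 20, 30, 36, 36, 40]
events = [1, 1, 0, 1, 0, 0, 1, 0]
curve = kaplan_meier(times, events)
print([(t, round(s, 3)) for t, s in curve])
# [(6, 0.875), (12, 0.75), (20, 0.6), (36, 0.4)]
```

Censored patients reduce the number at risk at later times without producing a step in the curve, which is why the estimate differs from a simple proportion of patients alive.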

Results

Patient characteristics

One hundred sixty-nine patients with stage II/III nodal DLBCL were treated with R-CHOP from 2004 to 2008; their baseline characteristics are summarized in Table 1. No differences in several independent prognostic factors were observed between the stage II and stage III groups. However, MTVs were larger in the stage III group than in the stage II group (p < 0.001). CR and PR rates were comparable between the two groups (p = 0.786 and p = 0.236, respectively), whereas SD was more frequent in the stage III group than in the stage II group (p = 0.036, Table 1).

Measurement of cutoff value of MTV in patients at diagnosis

ROC curve analysis was employed to assess the accuracy of the ideal cutoff value, which was used to distinguish the low MTV group from the high MTV group. The estimated area under the ROC curve was 0.857 (p < 0.001; 95% confidence interval, 0.782–0.932), which suggests that MTV was informative for the prediction of survival. Various cutoff values of MTV were examined to obtain a reasonable balance of sensitivity and specificity; a cutoff of 220 cm³ yielded a sensitivity of 91.7% and a specificity of 65.3% (Fig. 1).
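The cutoff search described above can be sketched as follows. The text states only that a reasonable balance of sensitivity and specificity was sought; maximizing Youden's J (sensitivity + specificity − 1) is an assumption made here for illustration and is not stated to be the criterion used in the study. The function names and data are hypothetical, and the sketch assumes both outcomes occur at least once.

```python
def sens_spec(mtv_values, died, cutoff):
    """Sensitivity and specificity of the rule 'MTV >= cutoff predicts death'."""
    pairs = list(zip(mtv_values, died))
    tp = sum(1 for v, e in pairs if e and v >= cutoff)
    fn = sum(1 for v, e in pairs if e and v < cutoff)
    tn = sum(1 for v, e in pairs if not e and v < cutoff)
    fp = sum(1 for v, e in pairs if not e and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(mtv_values, died):
    """Scan the observed MTVs and keep the cutoff with the largest Youden's J."""
    best = None
    for c in sorted(set(mtv_values)):
        se, sp = sens_spec(mtv_values, died, c)
        j = se + sp - 1
        if best is None or j > best[1]:
            best = (c, j, se, sp)
    return best


# Hypothetical MTVs (cm^3) and outcomes (1 = died, 0 = alive).
mtv = [50, 90, 150, 210, 230, 300, 420, 600]
died = [0, 0, 0, 0, 1, 0, 1, 1]

cutoff, j, se, sp = best_cutoff(mtv, died)
print(cutoff, round(se, 3), round(sp, 3))  # 230 1.0 0.8
```

Other criteria, such as requiring a minimum sensitivity, would select a different point on the same curve.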

Fig. 1 Receiver operating characteristic curve analysis in survival prediction according to MTV in 160 stages II and III nodal DLBCL patients (continuous variable). Area under the curve was 0.857 (p < 0.001, 95% CI 0.782–0.932), and 220 cm³ was determined as the cutoff value of MTV for comparison. Sensitivity and specificity of the dichotomized MTV (≥220 versus <220 cm³) were 91.7% and 65.3%, respectively

Clinical outcome according to stage or MTV

Three-year PFS and OS were significantly higher in the stage II group than in the stage III group (PFS, 80.0% in stage II versus 63.8% in stage III, p = 0.011; OS, 85.0% in stage II versus 64.2% in stage III, p = 0.001, Table 1). Clinical outcome was also compared between the low MTV group and the high MTV group, and the results are shown in Fig. 2a, b. PFS and OS were significantly higher in the low MTV group than in the high MTV group (PFS, 89.8% versus 55.6%, p < 0.001; OS, 93.2% versus 58.0%, p < 0.001).

Fig. 2 Comparisons of survival according to the cutoff value of MTV and stage combined with the cutoff value of MTV. a PFS and b OS according to the cutoff value of MTV were higher in the low MTV group compared with the high MTV group (PFS, p < 0.001; OS, p < 0.001). c PFS and d OS of stages II and III with the low MTV groups were higher compared with other groups, whereas survival between the two low MTV groups (PFS 90.5% in stage II versus 88.0% in stage III, p = 0.703; OS 95.2% in stage II versus 88.0% in stage III, p = 0.268) or high MTV groups (PFS 60.5% in stage II versus 51.2% in stage III, p = 0.347; OS 65.8% in stage II versus 51.2% in stage III, p = 0.175) were not different

Clinical outcome according to stage combined with MTV

Further analysis was performed to determine whether tumor burden was of clinical importance between stages II and III nodal DLBCL patients. Outcomes were compared among the four subgroups based on tumor burden and stage II or III (stage II group with low MTV, stage II group with high MTV, stage III group with low MTV, and stage III group with high MTV). The high MTV group, regardless of stage, had lower PFS and OS patterns, compared with the low MTV group (PFS and OS in stage II with low MTV, 90.5% and 95.2%; in stage III with low MTV, 88.0% and 88.0% versus in stage II with high MTV, 60.5% and 65.8%; in stage III with high MTV, 51.2% and 51.2%; p < 0.001, p < 0.001), whereas the prognostic impact of stage in the same MTV group was absent (in the low MTV group, difference of PFS and OS according to stage, p = 0.703, p = 0.268; in the high MTV, p = 0.347, p = 0.175, Fig. 2c, d).

Univariate and multivariate analysis

In the univariate analysis, stage III was still a poor prognostic factor for PFS and OS (PFS, hazard ratio (HR) = 2.094, 95% confidence interval (CI) = 1.162–3.773, p = 0.014; OS, HR = 2.758, 95% CI = 1.454–5.234, p = 0.002). In addition, high MTV was also shown to be a predictive parameter for poor survival (PFS, HR = 5.799, 95% CI = 2.787–12.055, p < 0.001; OS, HR = 8.097, 95% CI = 3.395–19.309, p < 0.001, Table 2). To further investigate the prognostic value of high MTV, multivariate analysis using a Cox proportional hazard model was performed with high MTV and stage III as covariates. This analysis showed that high MTV was an independent factor for the prediction of an unfavorable outcome (PFS, HR = 5.300, 95% CI = 2.517–11.162, p < 0.001; OS, HR = 7.009, 95% CI = 2.902–16.927, p < 0.001), whereas stage III had no significant value (PFS, HR = 1.496, 95% CI = 0.822–2.724, p = 0.187; OS, HR = 1.894, 95% CI = 0.988–3.628, p = 0.054, Table 3).
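As a reading aid, a hazard ratio and its 95% CI can be related to a two-sided p-value. The sketch below assumes that the reported intervals are Wald intervals on the log-hazard scale, which is the usual form for Cox model output but is not stated in the text; the function name is hypothetical. Applied to the multivariate OS result for stage III, it gives a value consistent with a reported p of 0.054.

```python
import math


def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Two-sided p-value implied by a hazard ratio and its 95% CI.

    Assumes a Wald interval: log(HR) +/- z_crit * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided standard normal tail probability via erfc.
    return math.erfc(abs(z) / math.sqrt(2))


# Stage III, OS, multivariate model: HR = 1.894, 95% CI = 0.988-3.628.
p_stage = wald_p_from_ci(1.894, 0.988, 3.628)
print(round(p_stage, 3))  # approximately 0.054

# An interval that includes 1 always corresponds to p > 0.05 under this
# assumption, which is why stage III is not significant here.
```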

Table 2 Univariate analysis for prognostic factors in patients
Table 3 Multivariate analysis for prognostic factors in patients

Discussion

Since its publication in 1971, Ann Arbor staging has been used as the staging system for both HL and NHL [15]. However, the number of involved nodal sites was not considered in this original staging system. For this reason, the Ann Arbor scheme was revised, and modifications in the staging procedures were recommended within the framework of the Ann Arbor Classification by a committee meeting in the Cotswolds under the auspices of the Cancer Research Campaign and Imperial Cancer Research Fund [16]. However, accurate tumor burden was still not considered in the revised staging system. Interestingly, one previous study demonstrated that tumor burden could be discriminated (i.e., low versus high tumor burden) based on the number of extensive nodal areas and extranodal sites [18]. The authors of that study suggested that tumor burden measured by their method was an excellent prognostic factor in the CHOP era. However, this method involved only a simple arithmetic system, and no imaging techniques were utilized. Because accurate tumor surveillance is a fundamental precondition for assessment of prognosis and therapeutic options in patients with NHL, a more accurate staging model should be developed, especially in the era of rituximab.

A recent meta-analysis, based on data from three large clinical trials, suggested that treatment with rituximab resulted in significant improvement of the treatment outcome within each of the four IPI factors, including Ann Arbor stage [19]. Of particular interest, that analysis presented data from the MabThera International Trial in which advanced stage (III/IV) was no longer an independent factor for OS in multivariate analysis. Data from the MegaCHOEP Trial also showed that advanced stage had only a borderline correlation with PFS and was not associated with OS. In addition, data from the RICOVER-60 trial showed that advanced stage was not an independent factor for PFS or OS. These findings indicate that treatment with rituximab improved the outcome of advanced stage patients and narrowed the survival gap between limited stage and advanced stage according to the Ann Arbor staging system. For these reasons, we do not believe that advanced stage itself is a true poor prognostic factor in the era of rituximab. According to the results of the meta-analysis of the above clinical trials, the authors demonstrated that four IPI factors were independent factors. However, the meta-analysis did not address several discrepancies among the trials, including differences in patient characteristics, the use of regimens other than CHOP, and different treatment schedules.

Development of imaging techniques such as 18F-FDG-PET has resulted in increased diagnostic accuracy and allowed clinicians to distinguish primary malignant lesions from benign areas. Thus, 18F-FDG-PET has been reported to provide superior information on staging of NHL when compared with conventional CT scans. Interestingly, two recent studies showed that PET could be used to measure the actual tumor burden of lymphoma [20, 21]. One of these studies showed that active tumor burden based on PET might be a prognostic indicator of volumetric response [21]. Volume assessment in these studies was based on percent reduction in SUVmax. In the present study, however, we used a volume measurement process based on an absolute SUV cutoff value, as described by Freudenberg et al. [17].

PET using the tracer 18F-FDG incorporates metabolic tumor function with anatomic localization. Tumor volumes by PET in solid tumors have been associated with clinical outcome in several studies [22, 23]. However, to the best of our knowledge, clinical application of tumor burden by PET as a new staging tool has not yet been reported in DLBCL patients treated with R-CHOP therapy.

The findings reported in the present study suggest that total tumor burden of lymphoma is a more important prognostic parameter than Ann Arbor stage for assessing DLBCL. In the multivariate analysis, high MTV had greater clinical significance for survival than stage III. This result suggests that the Ann Arbor staging system has limited use in assessing DLBCL because of the heterogeneous spread pattern of NHL in contrast to HL. Therefore, overall assessment of the tumor burden of lymphoma may be needed before treatment strategies can be developed. In addition, clinical outcome did not differ between patients in the same MTV group even when their stages differed. These results suggest that a simple prognostic classification based on whether disease lies on one or both sides of the diaphragm would not be wise, at least for DLBCL, in the era of rituximab. The present study analyzed the clinical importance of MTV only in patients with stage II and III nodal DLBCL. Therefore, a further well-designed study including all stages and extranodal sites is needed.

In conclusion, quantitative assessment of metabolic tumor volume using PET may be more useful than the Ann Arbor staging system for predicting clinical outcome in patients with stage II or III, exclusively nodal DLBCL treated with R-CHOP.