Main

Prostate cancer (PrCa) is one of the most commonly diagnosed cancers in men worldwide (Jemal et al, 2011), representing the second most common cause of male cancer-related deaths in the United States, the third in the European Union and the sixth worldwide with over 250 000 deaths per year (Ferlay et al, 2013). Established risk factors for PrCa include age, family history of PrCa and ethnicity, although many common genetic variants that individually contribute to moderately increased risk have also been identified (Goh et al, 2012). Prostate cancer has also been observed to aggregate with other types of familial cancer, in particular with breast and ovarian cancers (Hemminki and Chen, 2005). Several genes that were initially implicated as high risk for breast or ovarian cancer predisposition, for example, BRCA1, BRCA2, CHEK2 and BRIP1, have subsequently been shown to increase the risk of PrCa as well (Dong et al, 2003; Cybulski et al, 2004; Agalliu et al, 2007; Kote-Jarai et al, 2009). This suggests that shared genetic and/or environmental factors may be causal for multiple cancer types. Further evidence for pleiotropy has come from the recent Collaborative Oncological Gene-environment Study (COGS), a multicancer mega-consortium, and from previous genome-wide association studies (GWAS), which report sharing of common loci between cancers and especially hormonal-related malignancies (Bojesen et al, 2013; Eeles et al, 2013; Garcia-Closas et al, 2013; Kote-Jarai et al, 2013; Michailidou et al, 2013). Genomic instability is the hallmark of most cancers and therefore the investigation of DNA repair pathways in hereditary cancer risk is widely established. For example, mutations in the DNA mismatch repair (MMR) pathway cause hereditary non-polyposis colon cancer (HNPCC/Lynch syndrome), and mutations in Fanconi anaemia (FA) genes, which include BRCA2, BRIP1 and PALB2, predispose carriers to multiple cancers including PrCa (Silva et al, 2009; Kottemann and Smogorzewska, 2013).

We propose that additional moderate penetrance genes for PrCa have yet to be discovered and that their discovery will be facilitated by the recent development of massively parallel sequencing. This allows the candidate gene study to be easily expanded both in terms of depth and breadth, enabling the targeting of multiple genes across multiple samples in a single experiment. In this study, we investigated whether deleterious mutations in a set of DNA repair genes have a role in familial PrCa predisposition. We selected for this study the BROCA tumour suppressor gene set designed by Walsh et al (2010), which comprises known high- and moderate-risk breast/ovarian cancer genes from the BRCA–Fanconi anaemia complex and also genes involved in rare multiorgan cancer syndromes (Table 1); some of these overlap with genes previously implicated in PrCa predisposition.

Table 1 BROCA 22 tumour suppressor gene set

Materials and methods

Study population

We selected a series of men with PrCa from the UK Genetic Prostate Cancer Study (UKGPCS; UKCRN ID 869) (Eeles et al, 1997), based primarily on their PrCa family history. Subjects were eligible if they had two or more relatives affected by PrCa; a total of 191 men were included in this study. Germline DNA was isolated from peripheral whole blood samples using the Nucleon DNA purification system (GE Healthcare Life Sciences, Pittsburgh, PA, USA) or methods described in Edwards et al (1997). The study was approved by the Royal Marsden NHS Trust, Local Research Ethics Committee.

Target capture enrichment and sequencing

A custom Agilent SureSelect bait library (Agilent Technologies, Santa Clara, CA, USA) was used to target capture 22 genes from germline DNA. Capture regions were designed to cover coding, non-coding and intronic sequences with an additional 10 kilobase (kb) genomic sequence flanking each gene. After repetitive DNA elements were masked, the total DNA targeted was 939 kb (Walsh et al, 2010). Sequencing libraries were prepared in batches of 48 and each sample was ‘barcoded’ with a 6 base pair (bp) index to allow multiplexed sequencing. An initial batch of 48 libraries was prepared using standard Agilent protocols, while the remaining 3 batches of 48 libraries used a modified Agilent protocol with prehybridisation pooling to allow 3 libraries to be captured at once with a single bait library (Cummings et al, 2010). All libraries were clustered and sequenced on Illumina cBOT and HiSeq 2000 instruments (Illumina, San Diego, CA, USA), using v.2 flowcells and Truseq reagents to produce 2 × 78 bp ‘paired-end’ reads and a 6 bp ‘index’ read.

Sequencing data analysis and variant annotation

Raw sequencing data were base-called and demultiplexed using Illumina CASAVA software (v.1.8.1, Illumina) and purity filtered reads were removed to produce paired FASTQ files. Each set of paired FASTQ files was aligned using BWA 0.5.8 to ‘The 1000 Genomes Project’ Phase 1 reference, human_g1k_v37.fasta (Li and Durbin, 2009). The ‘Best Practice Variant Detection with the Genome Analysis Toolkit (GATK) v.3’ workflow for targeted resequencing was implemented with Picard (v.1.52) and the GATK (all tools v.1.0.5216M except Unified genotyper v.1.6-9-g47df7bb) (DePristo et al, 2011) for realignment, recalibration and genotyping. Variants were annotated with a February 2013 build of ANNOVAR using the summarize_annovar.pl script (Wang et al, 2010). This maps variants to RefSeq genes and to known variation from dbSNP137, and annotates the predicted functional consequence of missense variants using six in silico tools (SIFT, PolyPhen-2, LRT, MutationTaster, phyloP and GERP++) from dbNSFP v.1.3 (Liu et al, 2011). Additional clinical variant annotation was obtained from NCBI ClinVar (last accessed July 2013; http://www.ncbi.nlm.nih.gov/clinvar/).

LoF mutation definition and validation

Putative loss-of-function (LoF) mutations were defined as variants that are protein truncating or result in significant alteration of the protein sequence. This encompasses stop codon gain/loss, insertion/deletion frameshifts or splice site loss variants.
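The LoF definition above can be expressed as a simple filter over annotated variants. The following is a minimal sketch only: the function name and the annotation strings are assumptions loosely modelled on ANNOVAR-style output, and the exact labels used in the study are not given in the text.

```python
from typing import Optional

# Hypothetical annotation labels, loosely modelled on ANNOVAR-style output;
# the exact strings used in the study are not stated in the text.
LOF_EXONIC_EFFECTS = {
    "stopgain",
    "stoploss",
    "frameshift insertion",
    "frameshift deletion",
}


def is_putative_lof(region: str, exonic_effect: Optional[str]) -> bool:
    """Return True if an annotation matches the LoF definition above.

    region        -- where the variant falls, e.g. "exonic" or "splicing"
    exonic_effect -- predicted coding consequence, or None outside exons
    """
    # Variants annotated at a splice site are treated as splice site loss.
    if region == "splicing":
        return True
    # Otherwise only truncating or frameshifting exonic changes count.
    return region == "exonic" and exonic_effect in LOF_EXONIC_EFFECTS
```

Under these assumed labels, a missense change (for example `"nonsynonymous SNV"`) is not counted as LoF, which matches the separate handling of missense variants by in silico prediction.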

Loss-of-function variants were validated by Sanger sequencing in the probands and in other family members if available. Polymerase chain reaction amplicons were designed in Primer-Z (Tsai et al, 2007), except for the PMS2 variant, for which published primers from De Vos et al (2004) were used because of the pseudogene PMS2CL (primer sequences available in Supplementary Table S1). Amplicons were sequenced on an ABI3730 Genetic Analyzer using a 1/16th BigDye v.3 protocol (Applied Biosystems, Foster City, CA, USA) and analysed using Mutation Surveyor 3.97 (Softgenetics, State College, PA, USA) against the appropriate RefSeq accession sequence (Table 1).

Statistical and segregation analysis

We investigated the correlations between LoF mutation status and clinical features using Fisher’s exact test, Mann–Whitney U-test or the Mantel–Haenszel test for linear trend; patients with missing data for a particular clinical feature were excluded from that analysis (Table 3). All statistical analyses were performed using R 2.15.1; ‘stats’ and ‘vcdExtra’ 0.5-7 packages (R Core Team, 2012; Friendly, 2013).
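As an illustration of the first of these tests, the sketch below computes a two-sided Fisher's exact P value for a 2 × 2 table (for example, carrier status against presence or absence of a clinical feature) using only the Python standard library. It is not the R code used in the study, and the function name is an assumption. The two-sided P value is taken as the sum of the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is as likely as, or less likely than,
    # the observed table (small tolerance guards against rounding).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
```

For a table with complete separation in six subjects, `fisher_exact_two_sided(3, 0, 0, 3)` gives 0.1, since only the two most extreme tables (each with probability 1/20) are counted.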

Table 3 Clinical characteristics of LoF mutation carriers

To obtain an estimate of the average PrCa risk conferred by the LoF mutations identified, we carried out a modified segregation analysis using information on 186 probands of European ancestry. This was implemented in the pedigree analysis software MENDEL (Lange et al, 2013). The analysis was based on PrCa occurrence in male family members. Unaffected male subjects were censored at the age of 85 years, the age at death or last observation, whichever occurred first. As no proband was found to carry LoF mutations in more than one gene, we assumed a genetic model where all identified LoF mutations across all genes represent the alleles of a single genetic locus and assumed that all alleles conferred the same relative risk of PrCa. We assumed that the PrCa incidence depends on the underlying genotype through a model of the form $\lambda(t) = \lambda_0(t)\exp(\beta g)$, where $\lambda_0(t)$ is the baseline incidence at age $t$ in non-mutation carriers, $\beta$ is the log risk ratio associated with the LoF mutation and $g$ takes value 0 for non-mutation carriers and 1 for LoF mutation carriers. The overall PrCa incidence, over all possible genotypes in the model, was constrained to agree with the population incidences for England and Wales in the period 1993–1997 (Parkin et al, 2002). We assumed that the total mutation carrier frequency in the model was equal to the sum of mutation carrier frequencies in the genes, as estimated in previous studies (total mutation frequency=1.4%; frequencies obtained from UK studies (European ancestry) where available or NHLBI GO Exome Sequencing Project (ESP); last accessed November 2013; http://evs.gs.washington.edu/EVS/) (Antoniou et al, 2002; The CHEK2 Breast Cancer Case–Control Consortium, 2004; Thompson et al, 2005; Seal et al, 2006; Rahman et al, 2007). The models were parameterised in terms of the log-relative risk ratios for PrCa. Parameters were estimated using maximum-likelihood estimation. To adjust for ascertainment, we modelled the conditional likelihood of all family phenotypes and mutation status of all tested family members (including the index/proband), given the disease phenotypes of all family members.
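The incidence model described above is of proportional-hazards form: the incidence in carriers is the baseline incidence multiplied by exp(β). The sketch below shows how such a model translates a log risk ratio into cumulative risk. It is illustrative only and is not the MENDEL implementation: the baseline incidences are hypothetical placeholders rather than the England and Wales rates, competing mortality is ignored, and the constraint to population incidence is not applied.

```python
from math import exp, log


def cumulative_risk(baseline_incidence, beta, g):
    """Cumulative risk under incidence lambda0(t) * exp(beta * g).

    baseline_incidence -- annual baseline incidences lambda0(t), one per year
    beta               -- log risk ratio associated with the LoF mutation
    g                  -- 0 for non-carriers, 1 for LoF mutation carriers
    """
    # Sum the genotype-specific annual incidences to a cumulative hazard,
    # then convert to the probability of disease by the end of follow-up.
    cumulative_hazard = sum(lam0 * exp(beta * g) for lam0 in baseline_incidence)
    return 1.0 - exp(-cumulative_hazard)


# Hypothetical annual baseline incidences, for illustration only.
baseline = [0.001] * 10 + [0.005] * 10
beta = log(2.0)  # an assumed relative risk of 2, chosen only for the example

non_carrier_risk = cumulative_risk(baseline, beta, 0)
carrier_risk = cumulative_risk(baseline, beta, 1)
```

Because the hazard, not the cumulative risk, is scaled by exp(β), the ratio of cumulative risks is slightly smaller than the relative risk itself.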

Results

Patient characteristics

Of the 191 men with at least 3 PrCa cases in their family, 128 men also had at least 1 relative affected by breast, ovarian or colon cancer. Ethnicity was known for 72% (137) of our series, of whom 96% (131) were of white European descent; the remainder included two men of black African descent, three men of black Caribbean descent and one Ashkenazi Jewish man. The method of diagnosis was available for 69% of patients, with an even split between clinically detected and PSA screened patients (64 and 68 patients, respectively).

Sequencing and variant quality control

All samples reached the required coverage threshold of 20 × read depth across 80% of the target regions. The median value of average target region read depth was 135.85, and STK11 had about half as much median read coverage compared with the rest of the target genes (76.8; Figure 1).
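The coverage criterion above (at least 20 × read depth across 80% of the target regions) can be checked per sample as sketched below. This is a minimal illustration: the function name and the representation of coverage as a list of per-base depths are assumptions, not the pipeline used in the study.

```python
def passes_coverage_qc(depths, min_depth=20, min_fraction=0.80):
    """Return True if enough target bases reach the required read depth.

    depths       -- per-base read depths across the target regions
    min_depth    -- required depth at a base (20x in this study)
    min_fraction -- required fraction of target bases (80% in this study)
    """
    if not depths:
        # No coverage data: the sample cannot pass QC.
        return False
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths) >= min_fraction
```

A sample with exactly 80% of bases at 20 × or more passes, because both thresholds are treated as inclusive in this sketch.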

Figure 1

Average read coverage per gene. The 22 genes are shown along the x axis and the average sequencing coverage is shown along the y axis. All genes except STK11 had over 150x median coverage, and even STK11 coverage was above the required 20x.

After QC filtering, 7138 exonic and splice site variants were genotyped, corresponding to 300 unique variants. Of the initial 14 unique putative LoF mutations identified, a BRCA2 stop-gain K3326X (rs11571833) was classed as ‘non-pathogenic’ in ClinVar (Wu et al, 2005) and consequently removed from further analysis. Non-carriers were defined as patients who did not carry an LoF mutation, an SNV predicted to be deleterious by all six in silico tools in dbNSFP v.1.3 or an SNV classed as ‘pathogenic’ in the NCBI ClinVar database for a disease other than PrCa (Table 3 and Supplementary Tables S2 and S3). Therefore, 14 LoF mutation carriers and 140 non-carriers were selected for further analysis.

Frequency and type of LoF mutations

Thirteen LoF mutations in eight genes were identified in 14 familial PrCa cases (Figure 2 and Table 2). We found three frameshift mutations and one stop-gain mutation in BRCA2; two stop-gain mutations in ATM; a recurring stop-gain mutation in BRIP1 affecting two families and two frameshift mutations in CHEK2. One mutation was found in each of BRCA1, MUTYH, PALB2 and PMS2. Five of the 13 unique LoF variants found were not listed in dbSNP137, the 1000 Genomes Project (April 2012 data release) or the NHLBI GO ESP (ES6500SI data release). No subject carried more than one LoF mutation; therefore, 7.3% (14 of 191 men) of these familial cases were carriers of a deleterious mutation in 1 of the 22 tumour suppressor genes investigated here.

Figure 2

Proportion of LoF mutations by gene. Contribution of each gene to the 14 LoF mutations identified in this study.

Table 2 List of loss-of-function mutations and characteristics of proband and family with the mutations

LoF mutations and clinical characteristics

Table 3 shows clinical characteristics of LoF carriers vs non-carriers. Median age at diagnosis in LoF carriers was very similar to that in non-carriers: 58.5 and 59.0 years, respectively (P=0.334). Median presenting PSA was higher in carriers than in non-carriers, but the difference was not significant (11.10 vs 8.25 ng ml−1, P=0.156). Gleason scores were categorised into three groups: ≤6, 7 and ≥8. There was no significant association between LoF carrier status and the Gleason grade groups (P=0.312), or when analysed against high grades only (Gleason ≥8; P=0.193). There was also no significant association between LoF carrier status and tumour stage trend (P=0.476), or when analysed against high tumour stage (T3–T4; P=0.704). However, there was a significant association between LoF carrier status and the presence of nodal involvement (42.9% vs 1.3%; P=0.0014) and metastasis (30.0% vs 6.3%; P=0.043).

LoF mutations and advanced PrCa

To further investigate the associations seen with nodal involvement and metastasis, we applied the AJCC Stage IV prognostic grouping for advanced disease, as defined either by nodal involvement, metastasis or primary tumour grade of T4 (Edge et al, 2010). We performed a logistic regression on LoF carrier status vs Stage IV status, controlling for the effect of age at PrCa diagnosis. This showed that LoF mutation carriers have significantly higher odds of having advanced disease (OR 15.09, 95% CI: 2.95–95.81, P=0.00164) (Table 3). Even after excluding the BRCA2 mutations (as these have been shown to be associated with poorer prognosis; Castro et al, 2013), the association remained significant (P=0.00285), indicating that this correlation is independent of BRCA2 LoF status.
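The advanced-disease grouping used here reduces to a simple rule over three clinical features. The sketch below encodes that rule as stated in the text; the function name, argument names and the string encoding of tumour stage are assumptions made for illustration.

```python
def is_stage_iv(t_stage: str, nodal_involvement: bool, metastasis: bool) -> bool:
    """Apply the advanced-disease grouping described above.

    A case is counted as advanced if it has nodal involvement, metastasis
    or a primary tumour classified as T4 (e.g. "T4", "T4a").
    """
    has_t4_tumour = t_stage.strip().upper().startswith("T4")
    return nodal_involvement or metastasis or has_t4_tumour
```

The resulting binary outcome is what would be regressed on LoF carrier status, with age at diagnosis as a covariate, in the logistic regression described above.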

LoF mutations and family history of cancer

We also investigated the role of the proband’s family history on the odds of having an LoF mutation; however, no association was found with the total number of PrCa cases within a family (P=0.808). We then used a modified risk prediction algorithm developed by Macinnis et al (2011) to assess the association between LoF mutations and family history of PrCa with and without 25 common risk SNPs, but no significant association was found with either (P=0.456 and P=0.856, respectively). However, this is not unexpected, as all study cases were selected for having multiple relatives diagnosed with PrCa and only a subset of DNA repair genes was tested. On the other hand, LoF carriers were more likely to have a family history of breast cancer than non-carriers (OR 3.94, 95% CI: 1.07–18.10, P=0.023) and there was also a significant trend with the increase in the total number of breast cancer cases within a family (P=0.0035). Table 4 shows the 191 families grouped by family history of cancers other than PrCa and the percentage of LoF mutations found in each group.

Table 4 LoF mutations by family history of PrCa and other cancers

LoF mutations and familial segregation

Of the 13 unique LoF mutations, 8 were frameshift and 5 were stop-gain. The eight affected genes represent four DNA damage repair or response pathways: homologous recombination (HR) and FA, ataxia–telangiectasia-mutated signalling (ATM), MMR and base excision repair (BER). As might be expected, mutations in the ATM and MMR pathways were observed in families where PrCa coaggregated with colon cancer cases (three of the four families), whereas mutations affecting the HR/FA pathway were found mainly in families with breast cancer, and to a lesser extent in families with ovarian cancer in addition to PrCa (six of the families) (Table 2). The only mutation found exclusively in families reporting only PrCa was a previously known (rs137852986) stop-gain mutation c.2392C>T (p.(Arg798*)) in the BRIP1 gene that was present in two families.

The most frequently mutated gene in this study was BRCA2, with four protein truncating mutations. Three of these were in exon 11 (two frameshifts and one stop-gain), and a stop-gain in exon 25. Of the four men affected, we had additional DNA available from other family members in the two men with exon 11 frameshifts. We found partial segregation of the c.4981del (p.(Tyr1661Ilefs*9); Supplementary Figure S5) mutation, where the proband had two brothers with PrCa: one diagnosed (Dx) at 67 years who did not carry the mutation, whereas the other did and died of PrCa 4 years after diagnosis (69 years) at 73 years. The second exon 11 frameshift c.4876_4877del (p.(Asn1626Serfs*12); Supplementary Figure S6) was not present in the proband’s brother with PrCa.

Two mutations were found in the ATM gene. An exon 50 stop-gain c.7327C>T (p.(Arg2443*); Supplementary Figure S1) was found in a young proband (Dx 59 years) and his affected brother also carried the same mutation (Dx 61 years); furthermore, the proband had an additional MSH2 (NM_000251.2) c.1275A>G substitution at the −2 position in the 3′ end of exon 7, which has been characterised as causing partial exon skipping at the RNA level (Pagenstecher et al, 2006). The second ATM mutation, a stop-gain c.7777C>T (p.(Gln2593*)) in exon 52 was present in a family (Supplementary Figure S2) where the father had colon cancer; four out of eight brothers had PrCa and a sister was reported to have had an unspecified leukaemia. We confirmed the mutation in the proband (Dx 65 years) and in a brother (Dx 68 years, the only available sample). It is also worth noting that both brothers also had a secondary cancer diagnosis of the colon.

We identified two mutations in the CHEK2 gene. An exon 12 frameshift c.1263del (p.(Ser422Valfs*15)) was found in a family with multiple PrCa cases. DNA samples were available for five brothers; two of the three PrCa cases carried the mutation (Dx 65 years and 72 years; the non-carrier was Dx 74 years). Of the two unaffected brothers, one was also a carrier of this mutation (Supplementary Figure S10). The second CHEK2 mutation was an exon 8 frameshift c.869del (p.(Asn290Thrfs*14)), which was present in a patient diagnosed with PrCa at the age of 53 years, but was not carried by his affected father, the only available sample (Dx 68 years); therefore, it is likely that this mutation was inherited from the maternal line, which contains two colon cancers (in the proband’s mother and grandfather; Supplementary Figure S11).

Only one variant was observed in more than one family: the BRIP1 stop-gain mutation c.2392C>T (p.(Arg798*)), first discovered in FA and later described in breast cancer and PrCa families (Levran et al, 2005; Seal et al, 2006; Kote-Jarai et al, 2009). This mutation resides in exon 17 and has an MAF of 0.02% in 4300 European Americans from the ESP variant server. In the first family this variant was present in two brothers, both diagnosed young, at 59 and 56 years (Supplementary Figure S9). In the second family there were three PrCa cases, but the only DNA available was from the proband, who was diagnosed at 59 years (Supplementary Figure S8).

In four other genes, a single mutation was found. The BRCA1 mutation c.4065_4068del (p.(Asn1355Lysfs*10), Dx 58 years) was identified in a family with multiple (4) breast cancers and four PrCa cases (Supplementary Figure S3). The PALB2 frameshift mutation c.3507_3508del (p.(His1170Phefs*19), Dx 58 years) carrier had an affected brother (Dx 63 years) who also carried the mutation (Supplementary Figure S13) and similarly the MUTYH stop-gain mutation c.940C>T (p.(Gln314*), Dx 55 years) carrier had an affected brother (Dx 69 years) with the same mutation (Supplementary Figure S12). The only homozygous LoF mutation found was a PMS2 frameshift mutation c.2186_2187del (p.(Leu729Glnfs*6), Dx 51 years); this has a 2% MAF in African-American data from ESP but occurs as a homozygote in 0.14% (3 of 2121). The carrier of this homozygous mutation in our study was a man of black African ancestry with a family history of four additional PrCa cases; however, we had no other DNA available from his family to test for segregation (Supplementary Figure S14).

LoF mutations and risk of PrCa

We carried out a modified segregation analysis of the seven genes found to contain putative LoF mutations in probands of European ancestry using estimated mutation frequencies from previous UK studies, except for MUTYH, where UK data were unavailable and we therefore used the European ancestry data from the NHLBI GO ESP (last accessed November 2013; http://evs.gs.washington.edu/EVS/) (Antoniou et al, 2002; The CHEK2 Breast Cancer Case–Control Consortium, 2004; Thompson et al, 2005; Seal et al, 2006; Rahman et al, 2007).

Loss-of-function mutations in the DNA repair genes investigated in the present study were estimated to confer a relative risk of PrCa of 1.94 (95% CI: 1.56–2.42). After excluding the families found to carry BRCA1 and BRCA2 mutations, the relative risk was estimated to be slightly lower but still elevated at 1.80 (95% CI: 1.38–2.35).

As the mutation frequencies of these LoF mutations were based on sparse published data, we repeated the analysis by assuming that the mutation frequencies were 50% greater or smaller than in the studies listed above. This resulted in small changes to the estimated relative risks for all LoF mutations ranging from 1.67 (95% CI: 1.34–2.09) to 2.42 (95% CI: 1.97–2.99). Much larger series of screened families would be required to obtain cancer risk estimates associated with LoF mutations in specific genes.

Discussion

We have analysed the coding sequences of 22 tumour suppressor genes in 191 familial PrCa cases in the United Kingdom and found that 7.3% (14 of 191) of these cases were carriers of a putative LoF mutation. These mutations showed partial segregation with PrCa within the families, which is consistent with previous observations of moderately penetrant genes mutated within families. The eight affected genes represent four DNA damage repair or response pathways: HR and FA, ATM, MMR and BER. The most frequently mutated gene was BRCA2, which is in concordance with previous studies showing BRCA2 as the most strongly associated PrCa predisposition gene identified to date (Kote-Jarai et al, 2011). There is currently an international targeted PrCa screening study in men with germline mutations in BRCA1 and BRCA2 (IMPACT; Mitra et al, 2011). Our study provides further evidence that ATM, CHEK2, BRCA1 and BRIP1 are also involved in familial PrCa predisposition (Dong et al, 2003; Angèle et al, 2004; Kote-Jarai et al, 2009; Leongamornlert et al, 2012).

Most importantly, we have shown that LoF mutation carriers were more likely to have advanced disease, defined by nodal involvement, metastasis or primary tumour grade of T4 (OR 15.09, 95% CI: 2.95–95.81, P=0.00164). We have previously shown that BRCA2 LoF mutation carriers have significantly reduced survival and present with a more aggressive disease (Castro et al, 2013); however, even when excluding the BRCA2 LoF mutations, this significant association was preserved. This finding could have important clinical implications as men with deleterious germline mutations in these genes should be considered for more intensive screening and treatment. Furthermore, some of the genes studied here are in the HR repair pathway, in which targeted agents such as poly (ADP-ribose) polymerase inhibitors (PARPi) can be considered.

In addition to the clearly deleterious mutations, we identified several missense variants; 13 of these are predicted deleterious by a consensus of six in silico tools (Supplementary Table S2). Therefore, some of these may be classified in the future as deleterious, and accordingly, the frequency of deleterious mutations presented here is likely to be an underestimate. Previous studies have shown that common SNPs identified by GWAS can be used to stratify cumulative risk of PrCa. We were able to calculate a 25 SNP risk score using the algorithm developed by Macinnis et al (2011); the median scores of LoF carriers and non-carriers were 0.223 and 0.194, respectively, but this difference was not statistically significant (P=0.456). This would seem to suggest that, in our sample set, there is no correlation between putative LoF carrier status and PrCa risk score using the common risk SNPs and family history data. This is perhaps not surprising, as all our subjects had a strong family history and the non-carriers may have additional rare LoF mutations in genes not tested here.

To further investigate the association of LoF carrier status with additional breast cancer family history, we applied the BOADICEA model, commonly used in clinical practice to predict the probability that a proband carries a BRCA1 or BRCA2 mutation. BOADICEA models the genetic susceptibility to PrCa only in terms of the effects of BRCA1 and BRCA2. This model predicted that 8.6% of our probands (15 of the 175 cases that could be scored) had a combined BRCA1 and BRCA2 mutation carrier probability ≥10% and therefore would have been recommended to have genetic testing in the United Kingdom under the upcoming (Q3 2013) National Institute for Health and Care Excellence (NICE) guidelines for familial breast cancer. Of the 14 men with an LoF mutation, only three (PRS4, 6 and 7; Table 2) of the five BRCA1/2 mutation carriers would have been eligible for genetic testing in the United Kingdom based on these new guidelines (Supplementary Figures S1–14). On the basis of our results, we would therefore recommend that in the future a panel of DNA repair genes should be tested in PrCa families with 3 or more PrCa cases.

The limitations of this study include the lack of events to allow overall survival analysis and the lack of functional evidence to enable the characterisation of the potentially pathogenic missense mutations. Nonetheless, our results suggest that mutations in a wider range of DNA repair genes, other than BRCA2, predispose to PrCa and that such cases are more likely to have advanced disease. Therefore, this warrants further investigation of an expanded set of genes within these pathways in a larger sample set. Also, we highlight that current genetic testing criteria would only have identified 3 of the 14 LoF carriers with breast cancer family history, and therefore current PrCa testing guidelines are likely to be inadequate in the era of personalised genomic medicine.

To our knowledge, this is the first study to apply second-generation sequencing to screen for germline mutations in multiple DNA repair genes in a familial PrCa cohort. We identified frequent deleterious mutations in these genes and the mutation carriers were more likely to present with advanced disease. These findings present strong evidence that genes in DNA repair pathways are good candidates for PrCa predisposition. The clinical utility of these and future findings within these pathways should become increasingly important as targeted screening (such as is undertaken in IMPACT; Mitra et al, 2011) and targeted therapies such as PARPi (Sandhu et al, 2013) become more widespread. If we can more effectively screen these men, clinicians can potentially offer more tailored screening, staging and treatment pathways.