Introduction

Thyroid hemiagenesis (THA) is a rare congenital anomaly occurring when one of the thyroid lobes fails to develop. The incidence of the disorder is estimated at 0.05–0.5% of the general population. THA occurs usually as an isolated feature, more frequently in women than in men [1]. THA belongs to the broad and heterogeneous spectrum of thyroid dysgenesis, but in contrast to the other more severe forms, clinical consequences of abnormality rarely include congenital hypothyroidism [2]. The accurate etiology of THA remains unknown, even though the abnormality was found to be an inherited condition in a few families, signifying its genetic origin [3]. Additionally, THA family members commonly present other thyroid developmental anomalies (i.e., thyroid agenesis, ectopy or thyroglossal duct cyst) [4]. This, in turn, suggests common genetic background of different thyroid developmental anomalies, but also contribution of other factors modulating expressivity and severity of the definitive phenotype. The genes known to regulate thyroid embryogenesis (i.e., NKX2-1, FOXE1, PAX8) are rarely altered in THA patients, and only one THA familial case was reported to be caused by a heterozygous mutation in PAX8 [5]. Valuable insights into the etiology of the condition provide genetic syndromes, where THA appears as a part of complex phenotype. THA is occasionally concomitant with Williams and DiGeorge syndromes [6]. In both disorders the responsible altered genetic regions are bearing SHH and TBX genes. Their mutated mouse orthologues were evidenced to cause THA in an animal model [7]. TBX1 knockout mice presented normal development of follicular cells, whereas due to disturbed formation of ultimobranchial bodies the gland lacked C cells [7]. The animal THA model however, does not reflect accurately the human species, because in surgical specimens obtained from the THA patients the presence of normal C cells was demonstrated [8]. Human SHH gene is responsible for severe developmental conditions like holoprosencephaly, and there are no evidences of SHH allelism, that may lead to milder phenotype [9]. Other clues coming from THA mouse knockouts encouraged researchers to pay particular attention on human homeobox gene family. The Hoxa3 null mice particularly show diminished number of follicular cells, hypoplasia or absence of one thyroid lobe which corresponds to human THA [10]. In the recent study by Kizys et al. HOXA3, HOXB3, HOXD3, and PITX2 genes were analyzed in THA, however no causative mutations were found [11].

Lack of success in identification of common genetic mechanism in THA indicate the need to search for other genetic targets. In this study, we aimed to look for novel genetic factors that contribute to the development of THA in a uniquely large cohort of patients.

Patients and methods

The studied group consisted of 34 unrelated patients and two familial cases, with THA as an inherited condition transmitted from mother to daughter. The diagnosis of THA was confirmed by ultrasound examination and scintiscan. Recruited patients did not present any concomitant congenital developmental disorders. DNA of all affected individuals and first-degree relatives in THA family was obtained from blood samples. A population cohort of 100 individuals originating from the same region of Poland was used as a control group.

Genetic examinations

Firstly, we analyzed the genes which role in the thyroid development was documented in previous studies. Sequencing of PAX8 (NM_003466), TTF1 (NKX2-1, NM_001079668), TTF2 (FOXE1, NM_004473), HHEX (NM_002729), SHH (NM_000193), and TBX1 (NM_080647) genes encompassed coding sequences, with neighboring intronic regions. Microarray analyses was performed using Illumina Infinium Human OmniExpress-12v1.0 beadchip (San Diego, USA) and then for selected samples on Affymetrix CytoScanHD arrays (2,7M, Santa Clara, USA). For assays, 200–250 ng of genomic DNA was used. Array scans were managed using Illumina GenomeStudio and Affymetrix Chromosome Analysis Suite (ChAS). Log2 R ratios and B-allele frequencies were used for copy number variations (CNVs) identification in all samples. Population specific CNVs were filtered out (control cohort of 100 healthy individuals were analyzed previously using CytoScan 750 K arrays and included in Poznan University of Medical Sciences database) and then identified abnormalities were compared with CNVs annotated in the database of genomic variants. Findings were validated using Real Time qPCR (ΔΔCT method, Life Technologies). Complete genomic coordinates regarding THA patients were deposited in ClinVar database (Table 1). The whole exome analysis was conducted in a three-generation family, presenting transmission of THA as an isolated feature (pedigree presented on Supplementary Fig. 2). Four family members were examined (proband and their first-degree relatives). A total of 1 μg of genomic DNA from subjects was used for the construction of a library with the TruSeq DNA Sample Preparation Kit (Illumina). Whole exome enrichment was performed with the use of DNA library and TruSeq Exome Enrichment Kit (Illumina). For pathogenicity evaluation, the following features were applied: (a) gene/transcript annotations (downloaded from UCSC GenomeBrowser, hg18), (b) known sequence variants from dbSNP (version 132), 1000 Genomes project (The 1000 Genomes Project Consortium). NGS findings were validated using Sanger sequencing. Strategy presented on Supplementary Fig. 1.

Table 1 List of recurrent genomic locations identified in sporadic THA patients

Prediction of mutations impact and protein network analysis

The identified mutations were analyzed for their impact on protein conformation and functionality. The amino acid changes were examined using Sift [12], and PolyPhen2 [13] as well as Splice Site mutations using Human Splicing Finder (HSF3.0) [14], and MutationTaster [15]. CNVs encompassing genes, were subjected to a search for phenotype relevancy using GENECODIS, whereas functional annotation was conducted with the database for annotation, visualization, and integrated discovery [16]. P-values and numerical scores were calculated to rank networks according to their degree of relevance in regards to the different gene lists.

Results

Sequencing of known thyroid transcription factor genes (PAX8, NKX2-1, FOXE1, and HHEX) and selected targets, demonstrated in previous studies to contribute to the thyroid development (TBX1, SHH) did not reveal mutations in THA patients. This prompted us to look for genomic abnormalities across sporadic patients, as well as one THA family. The examination using microarrays revealed CNVs (ranging in size from 5 Kb up to 3.2 Mb). We focused particularly on cytoregions bearing genes emerging 63 unique deleterious CNVs. Three deletions turned out to be found in more than one sporadic patient. Analogous investigation was conducted for duplications and selection of 18 gains in copy number were established. Within these loci, a total number of 24 genes were identified and one duplication was found in two unrelated patients. In total four recurrent CNVs (three deletions, one duplication) were detected. We explored DGV to find overlaps with CNVs reported in large control populations, and we found an ultra-rare variability for three out of four regions (not present for PSMD3 gene, where point mutation in THA family was identified). All four regions were statistically significant and not coincidently enriched comparing to large control populations. Microarray data are summarized in Table 1. Recurrent deletions were harboring highly conservative proteasome genes PSMA1 (NM_002786, two patients), PSMA3 (NM_002788, two patients) and additionally deletion encompassing PSMD3 gene (NM_002809, found in one sporadic patient). In the studied group of patients, deletions of two other proteasome-associated genes VPS13C (NM_020821, two patients) and RAD23B (NM_002874, one patient) were detected.

Exome-wide sequencing in a THA family resulted in identification of 93 alterations. One change, was a splice site mutation in a proteasome gene PSMD2 (c.612 T > C cDNA.1170 T > C, g.3271 T > C, NM_002808) found in affected mother and daughter, and regarding microarray findings it was particularly interesting. The mutation introduced an exonic splice sequence (silencer motif), knocking out one PSMD2 allele. Both microarray and whole exome sequencing (WES) data were concordant and subsequently indicating haploinsufficiency of core proteasome genes (Fig. 1).

Fig. 1
figure 1

Proteasome and protein units shown to be altered in THA sporadic patients (microarray studies, encircled in red) and familial case (WES, an arrow). The corresponding gene names as follow: Rpn1—PSMD2, Rpn3— PSMD3, α6—PSMA1, α7—PSMA3

Recurrent abnormalities were found in 7 out of 34 patients (20%), and therefore an attempt was made to emerge THA pathways, based on functional interactions between other detected genes/regions. We used all identified CNVs (63 deleterious, 18 gains), bearing in total 175 known genes. This examination resulted in identification of four significant clusters: (a) protein degradation via proteasome, (b) transmembrane transport (solute carriers), (c) cytoskeleton maintenance (actin related) and (d) transcriptional regulation, respectively (Supplementary Table 1). Except proteasome-related genes, none of the above genes altered in more than one patient.

Discussion

The development of thyroid gland in vertebrates is a multistep complex process. Recently, a comprehensive review summarizing the contribution of transcription factors in thyroid embryogenesis and their role in the differentiation of thyroid follicular cells was published [17]. Mutations of these genes are associated with broad spectrum of clinical features i.e., thyroid hypoplasia, ectopy, agenesis or even syndromal phenotypes characterized by multiple organ development failure. Although the knowledge of the mechanisms involved in the thyroid development has been expanded, in a substantial portion of patients with thyroid dysgenesis it was not possible to identify causative mutations. This suggests that other unknown genes may contribute to the disorder. The aim of the present study was to search for genetic abnormalities explaining the origin of thyroid bilobation failure in humans presenting as an isolated condition.

The transcription factors (TTFs) (PAX8, FOXE1, HHEX or, NKX2-1) represent different classes of conservative transcription factors. They show expression across various tissues, but appear together exclusively in the thyroid. Their interaction is crucial for proper organogenesis, and single TTF perturbation is leading to severe dysgenesis [18]. In order to identify THA genetic background, a cohort THA patients and a THA family were screened for mutations in TTFs. This examination failed to find any abnormalities in known thyroid genes in THA patients under study. This supports previous findings, that patients with isolated THA in majority do not present mutations in TTFs [4]. Therefore, we searched for defects in other genes that would affect thyroid bud migration, or stabilization of the bilobar thyroid structure. We identified recurrent abnormalities affecting proteasome genes comprising recently an attractive target for the studies on developmental mechanisms. The major function of proteasomes in the cell is degradation of ubiquitinated proteins. Their functional insufficiency leads to accumulation of undegraded protein aggregates, which may exert a toxic effect on the cell. Mutations in proteasome related genes are linked to a broad spectrum of phenotypes encompassing neurological, autoimmune, and developmental disorders [1921]. We found recurrent deletions of proteasome genes, accounting for portion of patients, that nevertheless is unlikely coincidental. All of the genes—PSMA1, PSMA3 and PSMD3 as well as PSMD2 gene in familial case present haploinsufficiency. None of those genes has been linked to the human disease so far. The detected c.612T > C mutation in a PSMD2 gene, which is transmitted together with a THA in a familial case, introduces an exonic splicing silencer sequence. Remarkably, all the identified genes are coding proteins that orchestrates proteasome, although they represent different functional subunits (PSMA1, PSMA3—core particle and PSMD3 as well as PSMD2—regulatory particles), depicted on Fig. 1. The proteasome-related pathomechanism was additionally supported by deletions of other proteasome associated genes VPS13C (two patients) and RAD23B (one patient).

The development of the thyroid is indispensably linked to the cardiovascular system due to juxtaposition of thyroid bud and cardiogenic mesoderm. This fact is supported by high occurrence of heart defects in children with congenital hypothyroidism [22]. On the other hand, the dysfunction of proteasomal-ubiquitin system is recently thoroughly explored in the context of cardiac diseases [23, 24].

Heritability of thyroid dysgenesis accounts for only about 2% of patients and for those cases, the contribution of monogenic background is evidenced (TTFs). So far none of the hypotheses proposed by the Abramowicz et al. (epigenetic regulation, early somatic mutation, stochastic events) referring to missing causes of thyroid developmental anomalies including THA were evidenced [25]. Taking into account high variability and a high frequency of de novo occurring genomic events (particularly CNVs), the model of autosomal dominant transmission proposed by the Macchia et al. is still valid and feasible [5].

In conclusion, our results shed a new light on etiology of THA, so far suspected to be linked only to mutations in the genes directly involved in thyroid development, like TTFs. We demonstrated, for the first time, that genomic alterations in proteasome-associated genes co-occur in patients presenting this developmental anomaly.