As compared to day 3 blastomere (spp) biopsy followed by fluorescence in situ hybridization (FISH), PGS 1.0 [1], the utilization of trophectoderm biopsy (days 5–6 embryos) combined with comprehensive chromosome screening (CCS) tests for embryonic aneuploidy, PGS 2.0, has been suggested to improve in vitro fertilization (IVF) outcome [2], though not without criticisms [3, 4]. Here, we draw attention to several underlying factors that will influence decisions to employ PGS 2.0 in routine clinical practice.

PGS and mosaicism

Though excessive mosaicism in cleavage-stage in comparison to blastocyst-stage embryos was given as a principal reason for the potential superiority of PGS 2.0 over PGS 1.0 [5], mosaicism has been reported in both cleavage- and blastocyst-stage embryos derived from IVF [6], with mitotic rather than meiotic errors as the main causes [7]. Liu et al. [8] reported that 69 % of blastocyst-stage embryos from women of advanced age are mosaic for inner cell mass as well as trophectoderm, while Johnson et al. [9] reported that in younger women only 20 % of blastocyst-stage embryos are aneuploid, with a majority, in addition, presenting with only one or two structural chromosome abnormalities. Their observation would suggest a lower, but still clinically critical, level of mosaicism at the blastocyst stage at younger ages [6]. These data were to a degree confirmed by Munné’s group, who at the 2015 ASRM meeting reported clearly increasing aneuploidy rates with advancing female age but surprisingly similar mosaicism rates at all ages, averaging 30.2 %; the youngest women, below age 35, actually demonstrated the highest rate among all age groups at 33.2 % [10].

In assessing the potential clinical value of PGS 2.0 in improving IVF outcomes, before discussing costs, complexities, and the obvious lack of properly conducted prospective clinical trials based on “intent to treat” [3], one really has to assess whether the basic biology of early human embryos allows for the accurate diagnosis of euploidy versus aneuploidy based on a single trophectoderm biopsy at blastocyst stage.

In this issue of JARG, Tortoriello et al. report highly divergent outcomes when trophectoderm biopsies from the same embryos were referred for PGS 2.0, at different laboratories, using varying assay platforms [11]. Chromosomal analyses after two sequential trophectoderm biopsies from the same embryos revealed only 11 % (3/27) ploidy detection concordance between microarray-based comparative genomic hybridization (aCGH) and next-generation sequencing (NGS). Moreover, 9/27 (33 %) of originally reported aneuploid embryos, upon repeat assessment, were found to be euploid. The predicted mosaicism rate was 51 % (19/37).
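The proportions quoted above rest on small denominators, so their precision is limited. The following is a minimal sketch, not part of the cited report, that recomputes the quoted percentages from the counts and attaches an approximate 95 % Wilson score interval to each. The function name `wilson_interval` and the choice of the Wilson method are assumptions made for illustration only.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Approximate 95 % Wilson score interval for a binomial proportion.

    Illustrative helper (assumption): the cited study did not report intervals.
    """
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Counts exactly as quoted in the text from Tortoriello et al. [11].
reported = {
    "aCGH/NGS ploidy concordance": (3, 27),
    "aneuploid, then euploid on repeat": (9, 27),
    "predicted mosaicism": (19, 37),
}

for label, (k, n) in reported.items():
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {100 * k / n:.1f} % "
          f"(approx. 95 % interval {100 * low:.1f} to {100 * high:.1f} %)")
```

With denominators of 27 and 37, the intervals are wide, which is one reason such concordance data should be read as indicative rather than precise.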

Such findings can have only three possible explanations: laboratory techniques applied in one or more of the utilized PGS laboratories are disappointingly inaccurate; there is an inherent lack of concordance between platforms; or individual trophectoderm biopsies submitted for analyses differed in their chromosomal makeup (i.e., the embryo was mosaic). Observed discrepancies, of course, could also be caused by a combination of all three. It is noteworthy, with regard to the poor reproducibility of results between laboratories, that the FDA has recently proposed to begin regulation of lab-developed genetic testing (http://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/ucm509814.htm).

From a clinician’s viewpoint, what causes such discrepancies may not matter much because all three explanations suggest an unacceptably low level of reproducibility of test results with PGS 2.0, demonstrating the test’s inability to accurately differentiate between euploidy and aneuploidy of any given embryo. It is, nevertheless, of utmost importance to understand why PGS 2.0, like its precursor PGS 1.0, appears to fail clinically again in improving IVF outcomes.

The importance of defining the principal cause(s) for this repeat failure lies in the need to prevent yet a third premature marketing campaign of an unproven PGS product (i.e., PGS 3.0) under false expectations and with improper designs of validation studies.

The biology of mosaicism

It has been known for decades that aneuploid islands of cells, a rarity in humans, are a frequent finding in placentae of newborns [12]. With the placenta being a product of the trophectoderm, that observation alone should have cast significant doubts on the original claim of PGS 2.0 proponents that a single trophectoderm biopsy can reliably reflect embryo ploidy.

A recently published mouse study shed additional light onto the embryo’s ability to self-correct an inherent mosaic state of mixed euploid and aneuploid cells [13]. Treating mouse embryos with a spindle assembly checkpoint inhibitor during the 4- to 8-cell divisions, the authors generated aneuploid cells, which they followed via live-embryo imaging and single-cell tracking in chimeric embryos with euploid and aneuploid cells. They found that aneuploid cells in the fetal lineage (i.e., inner cell mass producing the fetus) were eliminated by apoptosis, while those in the placental lineage (i.e., the trophectoderm) showed proliferative defects but survived. However, aneuploid cells were progressively depleted from the blastocyst stage on. Any ploidy determination at blastocyst stage, therefore, is of questionable value, especially if based on a trophectoderm biopsy. Moreover, even aneuploidy within the fetal lineage does not prevent ultimate birth of normal offspring as long as there is an adequate number of euploid cells present at early embryo stages.

If also applicable to human embryos, these data, therefore, would suggest that it is biologically impossible with a single trophectoderm biopsy at blastocyst stage to accurately assess an embryo’s ploidy. The primary concept of PGS, which suggests that by determining embryo ploidy in human embryos prior to embryo transfer IVF outcomes will be improved because transfers of aneuploid embryos are avoided, is, therefore, likely unsustainable.

In addition to the manuscript discussed here [11], a recently published study using NGS [14] evaluated the concordance of multiple trophectoderm biopsies in eight top-quality embryos. In four embryos, the inner cell mass was also analyzed separately. Discordant results (mosaicism) were observed in three out of the eight embryos, and three out of 18 (16.6 %) trophectoderm biopsies were, in addition, inconclusive. Overall, 8/22 biopsies (36.6 %) revealed either mosaicism or inconclusive results. Also supporting the mouse study described above, two independent groups reported surprisingly high live birth rates of healthy, genetically normal infants after transfer of embryos reported by PGS 2.0 to be aneuploid (mosaic) [15, 16]. These results are indicative of significant false-positive rates following PGS 2.0 and raise serious concerns about the potential discarding of perfectly normal embryos in large quantities in current PGS 2.0 utilization.

Further relevant clinical data

PGS 2.0, utilizing trophectoderm biopsy and comprehensive chromosome screening (CCS) for embryonic aneuploidy, was predicated on an apparently improved ability to accurately diagnose embryonic aneuploidies without compromising the embryo’s implantation potential. Several retrospective studies and supposedly prospective trials have, indeed, alleged improved clinical outcomes following PGS 2.0. These trials and some observational studies were recently evaluated by Dahdouh et al. [17] in a meta-analysis, aiming to study whether PGS 2.0 improves clinical implantation rates and sustained pregnancies (beyond 20 weeks) compared to routine embryo selection in IVF cycles. Of the 29 eligible articles, only three prospective trials and eight observational studies met inclusion criteria; they suggested significantly higher clinical and sustained pregnancy rates with the use of PGS 2.0 only in patients with normal ovarian reserve.

Virtually every method of embryo selection, however, improves IVF outcomes in good prognosis patients, who even without embryo selection achieve excellent pregnancy outcomes and, therefore, need outcome improvements the least. Embryo selection, however, does not benefit average prognosis patients and usually is outright harmful to poor prognosis patients [18]. All studies that favored PGS 2.0 also biased their conclusions by only reporting on live birth rates following a first embryo transfer in a fresh IVF cycle, while, ultimately, the total reproductive potential of each initiated IVF cycle is really the more relevant outcome, and should include the initial fresh cycle plus subsequent frozen/thawed transfers.

Aside from favorable patient selection, the most important design flaw in almost all alleged prospectively randomized clinical trials of PGS 2.0 published so far was the definition of pregnancy outcomes with reference to embryo transfer rather than by intent to treat (i.e., with reference to cycle start), since such an assessment of outcomes excludes poorer prognosis patients who do not reach embryo transfer.

Finally, a recent analysis of the national US data for 2011–2012 failed to demonstrate outcome benefits for PGS over non-PGS cycles, with more PGS than non-PGS cycles reaching embryo transfer (ET) (64.2 vs. 62.3 %), for all practical purposes confirming favorable patient selection biases for patients undergoing PGS. Moreover, live birth rates per cycle start (25.2 vs. 28.8 %) and per embryo transfer (39.3 vs. 46.2 %) were significantly better in non-PGS cycles, whereas miscarriage rates were similar (13.7 vs. 13.9 %) [19]. It thus appears that PGS not only does not improve IVF outcomes but actually negatively affects them in the clinical reality of the national US data.
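The relationship between the per-transfer and per-cycle-start rates quoted above can be checked with simple arithmetic. The sketch below is an illustration only, not the analysis performed in [19]. It assumes that the live birth rate per cycle start equals the fraction of cycles reaching ET multiplied by the live birth rate per ET, and the function name `live_birth_per_start` is hypothetical.

```python
def live_birth_per_start(frac_reaching_et, live_birth_per_et):
    """Live births per cycle start, assuming the simple identity
    (transfers / starts) * (live births / transfers)."""
    return frac_reaching_et * live_birth_per_et


# Figures as quoted in the text from the 2011-2012 national US data [19]:
# (fraction of cycles reaching ET, live birth rate per ET).
cohorts = {
    "PGS": (0.642, 0.393),
    "non-PGS": (0.623, 0.462),
}

for name, (reach_et, per_et) in cohorts.items():
    per_start = live_birth_per_start(reach_et, per_et)
    print(f"{name}: {100 * reach_et:.1f} % reach ET x "
          f"{100 * per_et:.1f} % per ET = {100 * per_start:.1f} % per cycle start")
```

Under that assumption, the products agree with the reported per-cycle-start rates of 25.2 and 28.8 % to within rounding. The per-start figure is the intent-to-treat measure, since it retains the cycles that never reached transfer.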

The required RCT

Whether PGS 2.0 will obtain clinical utility will depend on the publication of properly conducted prospectively randomized studies, which appropriately define patient populations and assess cumulative live birth rates of all fresh and frozen embryos obtained in one IVF cycle.

A model for such a study was recently published based on data in the literature on blastulation and aneuploidy rates, the rate of mosaicism, technical errors, and implantation/live birth rates of PGS and non-PGS cycles of day-3 and blastocyst stage embryos [21]. It clearly demonstrated superiority of non-PGS embryo transfers (day 3 and blastocyst stage) over PGS blastocyst transfers in cumulative live birth rates (18.2–50 vs 7.6–12.6 %, respectively).

A final word and conclusions

The PGS experience, thus, should serve as a reminder to medical journals that a fair peer review process places the burden of proof primarily on proponents of treatments, not their opponents.

We conclude that, based on significant doubts that have arisen about the utility of PGS 2.0 in improving IVF outcomes and considerable concern that PGS 2.0 may actually harm IVF outcomes especially in poor prognosis patients, PGS 2.0 should be withdrawn from routine clinical utilization in IVF under the medical primacy of “doing no harm” over other considerations.

We do, however, support the continued utilization of PGS 2.0 as an experimental procedure in proper, non-hypothetical prospectively randomized clinical trials, not because we believe that PGS 2.0 will ultimately, miraculously, prove to be effective in improving IVF outcomes, but because such studies may help in shedding further light on the physiological role of trophectoderm mosaicism in embryo implantation. In oncology, aneuploidy has been associated with tumor invasiveness. Would it not be paradoxically delightful to learn that the physiologically so prevalent aneuploidy in trophectoderm has a similar function in the early embryo by fostering implantation? PGS may then, ultimately, have found a purpose!