Summary
Background
Every seventh diagnosis is wrong. Each year, 1.5 million people worldwide could be saved by a correct diagnosis. Physicians must consider more than 20,000 causes. In 2015, researchers at Harvard University tested 19 symptom checkers and found that, with a diagnostic accuracy of only 29–71%, they were not fit for practical use.
Research question
This study evaluates the diagnostic accuracy of the new technologies from an ENT perspective.
Materials and methods
The authors updated the aforementioned study on the diagnostic accuracy of symptom checkers by (1) adding the symptom checkers Symptoma, Ada, FindZebra, Mediktor, and Babylon and (2) normalizing the earlier results of the previously tested symptom checkers to the total number of patient cases. The winner was then pitted, in an ENT-specific test with cases from the British Medical Journal, against the two tools most studied scientifically to date (Isabel and FindZebra).
Results
Most of the new symptom checkers showed a diagnostic accuracy within the range of those previously tested, with the exception of Symptoma, which listed the correct diagnosis in first place in 82.2% of cases and in the top 3 and top 10 in 100% of cases, exceeding the previous state of the art by 38, 29, and 16 percentage points, respectively. For the ENT cases, Symptoma showed the highest diagnostic accuracy at 64.3% (top 1), 92.9% (top 3), and 100% (top 10), compared with Isabel (21.4%, 40.5%, 61.9%) and FindZebra (26.2%, 42.9%, 54.8%).
Conclusions
Symptoma established itself as the first and only usable solution in this market. Larger studies should be conducted to further validate the performance of symptom checkers and to test them on rare diseases.
Abstract
Background
Every seventh diagnosis is a misdiagnosis. Each year, 1.5 million lives could be saved worldwide with the correct diagnosis. Physicians have to consider over 20,000 diseases. A study from Harvard University published in 2015 tested 19 symptom checkers and found them to be insufficient, with only 29–71% accuracy in diagnosis.
Objective
The current study investigates the diagnostic accuracy of new symptom checkers from an ENT perspective.
Materials and methods
The authors updated the above-mentioned comparison of diagnostic accuracy by (1) including the five new symptom checkers Symptoma, Ada, FindZebra, Mediktor, and Babylon and (2) normalizing the results of the previously tested symptom checkers so that every accuracy figure refers to the same set of patient vignettes. The winner was then compared with the two symptom checkers backed by the most scientific evidence, Isabel and FindZebra, in an ENT-specific test using patient vignettes sourced from the British Medical Journal.
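One way to read the normalization in step (2) is that each checker's hit count is divided by the total number of vignettes, not only by the vignettes that checker was able to process. The following is a minimal sketch under that assumption; the function names and the example counts are hypothetical and are not taken from the study.

```python
def reported_accuracy(hits: int, evaluated: int) -> float:
    """Accuracy over only the vignettes a checker could process."""
    if evaluated <= 0:
        raise ValueError("evaluated must be positive")
    if not 0 <= hits <= evaluated:
        raise ValueError("hits must lie between 0 and evaluated")
    return hits / evaluated


def normalized_accuracy(hits: int, total_vignettes: int) -> float:
    """Accuracy over ALL vignettes (assumed meaning of 'normalizing').

    Vignettes the checker could not process count as misses.
    """
    if total_vignettes <= 0:
        raise ValueError("total_vignettes must be positive")
    if not 0 <= hits <= total_vignettes:
        raise ValueError("hits must lie between 0 and total_vignettes")
    return hits / total_vignettes


# Hypothetical checker: 15 correct diagnoses, 30 vignettes processed,
# 45 vignettes in the full set.
raw = reported_accuracy(15, 30)
norm = normalized_accuracy(15, 45)
print(f"reported: {raw:.1%}, normalized: {norm:.1%}")
```

Under this reading, a checker that skips difficult vignettes can no longer appear more accurate than one that attempts all of them.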
Results
Most of the new symptom checkers demonstrated diagnostic accuracy rates within the previously established range. The exception was Symptoma, which ranked the correct diagnosis first in 82.2% of cases (+38 percentage points over the previous state of the art) and within the top 3 and the top 10 in 100% of cases (+29 and +16 percentage points, respectively). In the ENT-specific test, diagnostic accuracy for Symptoma vs. Isabel vs. FindZebra was 64.3% vs. 21.4% vs. 26.2% (top 1), 92.9% vs. 40.5% vs. 42.9% (top 3), and 100% vs. 61.9% vs. 54.8% (top 10).
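The top 1, top 3, and top 10 figures are top-k accuracies: the share of cases in which the correct diagnosis appears among the first k suggestions. The sketch below shows how such figures and a percentage-point gap could be computed. The function names and the example rank list are hypothetical; only the 64.3% and 21.4% values in the final line come from the results above.

```python
from typing import Optional, Sequence


def top_k_accuracy(ranks: Sequence[Optional[int]], k: int) -> float:
    """Share of cases whose correct diagnosis is ranked at position <= k.

    ranks holds the 1-based rank of the correct diagnosis for each case,
    or None when the checker did not list the correct diagnosis at all.
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


def percentage_point_gap(a: float, b: float) -> float:
    """Difference between two proportions, in percentage points."""
    return (a - b) * 100


# Hypothetical ranks for five cases (not study data).
example_ranks = [1, 1, 2, 5, None]
for k in (1, 3, 10):
    print(f"top {k}: {top_k_accuracy(example_ranks, k):.0%}")

# Top-1 gap between Symptoma and Isabel on the ENT cases, as reported above.
print(f"{percentage_point_gap(0.643, 0.214):.1f} percentage points")
```

A gap of this kind is a difference of two percentages, which is why it is stated in percentage points and not as a relative change in percent.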
Conclusions
Symptoma is the first and only viable solution in this market. Large-scale studies should be conducted to further validate these results as well as to assess the actual practical performance of the symptom checkers and their ability to diagnose rare diseases.
Ethics declarations
Conflict of interest
J. Nateqi declares that he and T. Lutz are founders and shareholders of the tested search engine Symptoma and that the team of co-authors (S. Lin, H. Krobath, S. Gruarin, T. Lutz, T. Dvorak, A. Gruschina, and R. Ortner) is employed by it. There are no further conflicts of interest.
For this article, the authors did not perform any studies on humans or animals. The ethical guidelines stated in each of the cited studies apply to those studies.
Cite this article
Nateqi, J., Lin, S., Krobath, H. et al. Vom Symptom zur Diagnose – Tauglichkeit von Symptom-Checkern. HNO 67, 334–342 (2019). https://doi.org/10.1007/s00106-019-0666-y
DOI: https://doi.org/10.1007/s00106-019-0666-y
Keywords
- Diagnostic errors
- Diagnosis, computer-assisted
- Self care
- Quality of health care
- Information seeking behavior