OpenAI's Reasoning Model Outperforms Doctors in Harvard ER Trial

What Happened

A Harvard-led study published in Science used electronic medical records to compare OpenAI's reasoning model against emergency-room triage physicians. The model reached 67% diagnostic accuracy compared to the 50-55% range for doctors. The trial focused on initial triage diagnosis, not treatment decisions, and used retrospective records rather than live patient encounters.

My Take

Every health system board will see this paper, and most will misread it. The right takeaway isn't "replace triage doctors" — it's that AI is now better than humans at the pattern-matching part of medicine, while humans remain better at uncertainty, communication, and edge cases. Hospitals that pair AI triage with senior MDs will see a step-change in throughput. Malpractice law catches up by 2028 and quietly mandates AI as a standard of care.

Read Original Source