AI is reshaping healthcare, but headlines often outpace reality. Behind the “AI beats doctors” hype lies a complex story of limited evidence, clinical needs, and human trust. Real progress will come from collaboration, not competition.
It’s been years since Artificial Intelligence (AI) became one of the most overused buzzwords in modern science, invoking the same hopes and fears that surrounded the rise of computers in the second half of the 20th century. Nowhere is this tension more visible than in healthcare.
Headlines regularly proclaim that “AI is better than doctors at diagnosing disease X”. It’s a narrative that captivates readers, but it also raises important questions: Is it really true? Do we need this “AI versus clinician” framing? And are such claims the work of a creative headline writer on a deadline, or a symptom of something deeper within the research ecosystem?
The answer, as usual, is complex and hides in the details–and in the data.
As early as March 2020, a study published in the British Medical Journal critically examined research claiming that AI could match or surpass clinicians in interpreting medical images. The authors found that many such studies were of poor quality and arguably exaggerated.
At the time, only a handful of randomised clinical trials on deep learning in medical imaging had been completed or were underway, and even fewer had been tested in real-world clinical environments. Limited data access and opaque methodologies made reproducibility nearly impossible.
In many studies, the “human benchmark” was based on a small sample, typically a median of just four clinicians per study. Sometimes non-experts were included in the comparison group, skewing results to make AI appear superior, and key metrics such as diagnostic speed were often missing altogether.
The issue extends beyond methodology. Many abstracts contained enthusiastic statements suggesting AI performed as well as, or better than, doctors, while failing to mention critical caveats or limitations. Since journalists and lay readers often rely on abstracts or press releases, this omission fuels media misinterpretation. It’s not necessarily malice; it’s momentum: the scientific excitement around AI meets the public’s appetite for it.
In 2024, the picture had not changed much. A paper in npj Digital Medicine proposed ethical guidelines for reporting claims that AI “outperforms” human clinicians. Its authors found that such claims remain highly uncertain, largely because models are often tested under unrealistic conditions.
AI systems might indeed perform better than some clinicians, but only under highly specific, controlled circumstances that rarely mirror real-world medical practice. When those nuances are stripped from titles or abstracts, the message simplifies to “AI beats doctor”, feeding a feedback loop of hype and misunderstanding. Among the authors’ suggestions to journal editors, reviewers, and scientists is to check whether outperformance claims are specific, contextualised, and empirically grounded, and whether study conditions are transparently described.
But what do doctors themselves think about AI in healthcare?
The need for digital support in healthcare is clear. Around the world, medical systems face staff shortages, rising patient loads, and an aging population. Clinicians are often overwhelmed by administrative work and time pressure, which erodes both decision-making and morale and drives rising levels of burnout. AI, and in silico medicine technologies more broadly, could help alleviate some of that burden, if they are designed and implemented with real clinical needs in mind.
A 2023 study published in Healthcare surveyed 265 US healthcare professionals. Only 17% reported using AI in their practice; among those, 31% found it difficult to learn and 28% said they needed significant technical expertise to use it effectively.
By analysing the responses, the study authors found a strong positive correlation between trust in AI and the perception that it reduced workload. Clinicians also emphasized that AI should be seamlessly integrated into their workflows rather than adding new layers of complexity. In other words, clinicians welcome AI in their workflows if it is easy to use and instrumental to their work. Many respondents also said they would welcome AI tools to assist with note-taking and the early identification of high-risk patients, while calling for clear governance protocols for clinical use.
Taken together, the picture that emerges from these studies is one of misalignment:
Clinicians want AI that is reliable, transparent and seamlessly integrated into their workflows;
Researchers and publishers often report models tested under narrow conditions, with an emphasis on “beating the doctors”;
Media outlets amplify oversimplified claims;
Patients receive overhyped information that doesn't match the reality of care.
To bridge these gaps, the field needs more meaningful collaboration. Developing truly actionable models requires deep domain knowledge, a recognition that they exist within larger systems and practices, and the humility to state clearly what algorithms can, and cannot, actually do, and under which conditions.
Ultimately, realising the promise of in silico medicine will demand an ecosystem approach: one that includes clinicians, scientists, ethicists, social scientists, regulators, policymakers, and patients from the start. Only through such a holistic approach can the field of in silico medicine, of which AI in healthcare is a part, move beyond hype and start delivering the care revolution it has long promised.
References: