Just saw this in one of the random newsletters I get. Makes your point - human interaction is 'messy'
https://www.psypost.org/top-ai-models-fail-spectacularly-when-faced-with-slightly-altered-medical-questions/
In real-world medicine, patients often present with overlapping symptoms, incomplete histories, or unexpected complications. If an AI system cannot handle minor shifts in question formatting, it may also struggle with this kind of real-life variability.
“These AI models aren’t as reliable as their test scores suggest,” Bedi said. “When we changed the answer choices slightly, performance dropped dramatically, with some models going from 80% accuracy down to 42%. It’s like having a student who aces practice tests but fails when the questions are worded differently. For now, AI should help doctors, not replace them.”
Nothing can replace wisdom. It takes time and experience to practice medicine well. Shift work makes that a bit harder to obtain, but it can still be accomplished.