Andrew J Fishman and Nikola Jankulovic
Artificial intelligence systems are increasingly embedded in clinical decision-making environments, including high-risk surgical domains. While most evaluations emphasize benchmark accuracy, far less attention has been paid to how artificial intelligence alters expert cognition under real-world constraints. This study presents a controlled comparative analysis between a senior subspecialist surgeon and a multimodal large language model deployed as a peer-level consultant on high-complexity clinical reasoning tasks. Both agents achieved identical overall accuracy. However, qualitative divergence analysis revealed fundamentally different cognitive and error profiles: the artificial intelligence system demonstrated superior semantic precision and strict adherence to textual decision rules, while the human expert outperformed in visual interpretation, contextual inference, and clinical safety judgment. Critically, the study documents an automation-induced deference error, in which the human expert yielded to a confident but incorrect artificial intelligence output and thereby reached an incorrect final decision. These findings indicate that artificial intelligence functions not as a neutral instrument but as a persuasive cognitive actor capable of reshaping expert judgment. A Parallel Review model is proposed in which artificial intelligence serves as a precision safeguard rather than a decision authority, preserving human accountability while mitigating automation bias.