AI-based clinical decision support systems and conventional diagnostic tools overlap in important ways. Both produce outputs that can feed into a CDSS, both make performance a crucial concern, and the outcomes of both can be documented. The inner workings of laboratory tests, and frequently of other diagnostics such as imaging, are well understood, so they are not regarded as “black box” techniques. Even so, it can be difficult to explain why an individual test produced a particular result. This points to two levels of explainability from a medical standpoint: the first level lets us understand how the system reaches its conclusions in general, while the second level identifies the features that were decisive for an individual prediction.
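As a rough illustration of these two levels, the sketch below uses a hypothetical scikit-learn logistic regression on synthetic data (the feature names and values are invented, and this is not any particular CDSS): the model-wide coefficient ranking stands in for first-level explainability, and the per-patient breakdown of contributions stands in for the second level.

```python
# Minimal sketch of first-level (model-wide) vs second-level (per-patient)
# explainability, using a logistic regression on synthetic data.
# Feature names and data are hypothetical, for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
features = ["age", "creatinine", "crp", "heart_rate"]
X = rng.normal(size=(500, len(features)))
y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

scaler = StandardScaler().fit(X)
model = LogisticRegression().fit(scaler.transform(X), y)

# First level: which features drive the model's conclusions overall?
global_importance = dict(zip(features, np.abs(model.coef_[0])))
print("Global importance:", global_importance)

# Second level: how did each feature push this one patient's prediction?
patient = scaler.transform(X[:1])
local_contrib = dict(zip(features, model.coef_[0] * patient[0]))
print("Per-patient contributions:", local_contrib)
```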
Individual predictions can then be checked for patterns that point to a possible error, such as an unusual feature distribution indicating an out-of-sample case. An AI-based CDSS will frequently offer this second-level explainability, while other diagnostic tests will not. This also affects how explainability information is presented to clinicians (and patients): depending on the risk associated with a particular clinical use case, first-level explanations may be sufficient, whereas other use cases frequently require second-level explanations to protect patients. Clinical validation is usually the first criterion discussed for a medical AI system; explainability is often given serious thought only afterward.
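One simple way to make that out-of-sample check concrete is to compare a new patient’s feature values against the ranges seen in the training data and flag anything that falls outside them. The sketch below is a simplified, hypothetical plausibility check with invented thresholds and feature names, not a validated out-of-distribution detector.

```python
# Sketch: flag predictions whose inputs look out-of-sample by checking
# each feature against the 1st-99th percentile range of the training data.
# Thresholds and feature names are illustrative assumptions.
import numpy as np

def fit_ranges(X_train, low=1, high=99):
    return np.percentile(X_train, [low, high], axis=0)

def flag_out_of_range(x_new, ranges, feature_names):
    lo, hi = ranges
    return [name for name, v, l, h in zip(feature_names, x_new, lo, hi)
            if v < l or v > h]

X_train = np.random.default_rng(1).normal(size=(1000, 3))
ranges = fit_ranges(X_train)
suspicious = flag_out_of_range([0.2, 7.5, -0.1], ranges,
                               ["age_z", "lactate_z", "sodium_z"])
if suspicious:
    print("Review this prediction: unusual values for", suspicious)
```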
CDSSs, whether AI-powered or not, must go through a rigorous validation procedure to meet the regulatory requirements for medical certification. When this procedure succeeds, there is evidence that the system performs in highly varied, real-world clinical settings. It is therefore critical to understand how clinical validation is assessed. The typical indicator is prediction accuracy or performance. There are various accuracy metrics adapted to particular use cases, but they all reflect the quality of the predictions and the overall clinical usefulness of a model. Improving prediction performance and achieving low error rates are consequently two key objectives of model development, and overall, AI-powered systems have produced lower error rates than conventional techniques.
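For instance, a retrospective validation typically reports discrimination and error rates along the lines of the sketch below. The labels and scores are made up, and the metric choices (sensitivity, specificity, AUROC) are common examples rather than a regulatory checklist.

```python
# Sketch: common clinical validation metrics from held-out predictions.
# y_true / y_score are placeholders, not real patient data.
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.8, 0.65, 0.3, 0.9, 0.55, 0.35])
y_pred = (y_score >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # share of true cases that are caught
specificity = tn / (tn + fp)   # share of non-cases correctly cleared
auroc = roc_auc_score(y_true, y_score)
print(f"Sensitivity={sensitivity:.2f}, Specificity={specificity:.2f}, AUROC={auroc:.2f}")
```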
Despite all such efforts, however, AI systems cannot achieve perfect accuracy, for several reasons. To begin with, it is practically impossible to build a flawless model because medical datasets are inherently noisy and imperfect. Since part of this error is random, there will always be some false positive and false negative predictions.
AI bias is another significant source of inaccuracy. AI biases can cause systematic errors, that is, consistent deviations from the desired prediction behavior. A key objective in developing healthcare AI products is to approach a bias-free state through rigorous clinical validation and the use of diverse data sources. While this keeps AI bias to a minimum, it will remain impossible to create AI tools that are completely free of bias.
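A simple way to surface such systematic error during validation is to stratify performance by patient subgroup. The sketch below uses hypothetical groups and outcomes to compare sensitivity across two subgroups and flag a large gap; the 0.10 threshold is an arbitrary illustration, not a clinical standard.

```python
# Sketch: audit for systematic error by comparing sensitivity per subgroup.
# Group labels and outcomes are synthetic placeholders.
import numpy as np

def sensitivity(y_true, y_pred):
    mask = y_true == 1
    return (y_pred[mask] == 1).mean() if mask.any() else float("nan")

y_true = np.array([1, 1, 0, 1, 1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 1, 0, 1, 0, 0, 0, 1, 0, 0])
group  = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

per_group = {g: sensitivity(y_true[group == g], y_pred[group == g])
             for g in np.unique(group)}
print(per_group)
if max(per_group.values()) - min(per_group.values()) > 0.10:
    print("Possible bias: sensitivity gap between subgroups exceeds 0.10")
```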
From a medical perspective, then, clinical explainability plays a role as crucial as clinical validation in the clinical situation. When an AI system and human experts disagree, explainability makes it possible to resolve the disagreement. It should be highlighted that this works best for systematic error or AI bias rather than for random error: random errors are far harder to spot, and when the tool and the physician happen to agree they are likely to go undetected, while otherwise they simply surface as disagreement between the tool and the physician.
Explainability results are frequently presented as visualizations or explained in natural language; both show clinicians how individual factors influenced the final recommendation. In other words, explainability helps doctors assess system recommendations against their own clinical expertise and experience. This lets users decide for themselves whether to trust a given recommendation, which may in turn increase their confidence in the system. Explainability also allows clinicians to verify whether the parameters the system considered make sense from a clinical point of view, especially when the CDSS makes suggestions that are markedly out of line with their expectations.
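As one possible presentation, per-patient contribution scores (for example from a linear model or an explanation tool) can be rendered as a short plain-language note for the clinician. The sketch below assumes such scores are already available and simply formats the largest ones; the values shown are invented.

```python
# Sketch: turn per-feature contribution scores into a plain-language note.
# The contribution values here are invented for illustration.
def explain(contributions, top_n=3):
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    parts = [f"{name} {'increased' if value > 0 else 'decreased'} the risk estimate"
             for name, value in ranked[:top_n]]
    return "Main drivers of this recommendation: " + "; ".join(parts) + "."

print(explain({"CRP": 0.42, "age": 0.15, "heart rate": -0.08, "sodium": 0.02}))
```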
Because trust in these systems has not yet been established, explainability may be a major factor in the adoption of AI-driven CDSSs in clinical practice. It is also important to remember that any use of an AI-based CDSS may sway a doctor’s judgment, so creating transparent records of how suggestions were generated will be very important.
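One lightweight way to keep such records is to log each recommendation together with the model version, the inputs, and the explanation shown to the clinician. The sketch below writes a hypothetical JSON audit entry; the field names and file path are illustrative and not tied to any specific regulatory format.

```python
# Sketch: append a transparent audit record for each CDSS recommendation.
# Field names and file path are illustrative assumptions.
import json
from datetime import datetime, timezone

def log_recommendation(path, model_version, inputs, recommendation, explanation):
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "inputs": inputs,
        "recommendation": recommendation,
        "explanation": explanation,
    }
    with open(path, "a") as f:          # one JSON object per line
        f.write(json.dumps(record) + "\n")

log_recommendation("cdss_audit.jsonl", "demo-model-1.2",
                   {"crp": 84, "age": 71}, "flag for sepsis review",
                   "CRP and age were the main drivers")
```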
Written by:
Samridhhi Mandawat – Healthark Insights