Conclusions
The software, dubbed Tak, performs comparably to humans on straightforward and intermediate-difficulty cases but is outperformed by humans on challenging clinical cases. Tak outperforms a decision tree classifier at all levels of difficulty. Our results are a proof of concept that, in a restricted domain, probabilistic logic networks can perform medical reasoning comparably to humans.
2. Convolutional Neural Networks for the evaluation of cancer in Barrett's esophagus: Explainable AI to lighten up the black-box
Even though artificial intelligence and machine learning have demonstrated remarkable performance in medical image computing, they must also provide accountability and transparency in such evaluations. The reliability of machine learning predictions must be explained and interpreted, especially when diagnostic support is at stake. For this task, the black-box nature of deep learning techniques must be lightened up to transfer their promising results into clinical practice. Hence, we investigate the use of explainable artificial intelligence techniques to quantitatively highlight discriminative regions during the classification of early-cancerous tissues in patients diagnosed with Barrett's esophagus. Four Convolutional Neural Network models (AlexNet, SqueezeNet, ResNet50, and VGG16) were analyzed using five interpretation techniques (saliency, guided backpropagation, integrated gradients, input × gradients, and DeepLIFT), and their outputs were compared with experts' previous annotations of cancerous tissue. Saliency attributions matched the experts' manual delineations best. Moreover, we found a moderate-to-high correlation between a model's sensitivity and the agreement between human and computational segmentations: the higher the sensitivity, the stronger the agreement. These results reveal a relevant relation between computational learning and experts' insights, demonstrating how human knowledge may influence correct computational learning.
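To make the agreement measurement concrete, the following is a minimal sketch assuming a trained PyTorch classifier and a binary expert-annotation mask; the plain gradient saliency map and the Dice-style overlap of the top-salient pixels are illustrative stand-ins, not the authors' exact pipeline.

```python
import torch

def saliency_map(model, image, target_class):
    """Gradient saliency: |d(class score)/d(pixel)|, max over color channels."""
    model.eval()
    image = image.clone().requires_grad_(True)          # shape (1, C, H, W)
    score = model(image)[0, target_class]
    score.backward()
    return image.grad.abs().max(dim=1).values[0]        # shape (H, W)

def agreement_with_expert(saliency, expert_mask, keep_fraction=0.1):
    """Dice overlap between the top-salient pixels and the expert delineation."""
    k = int(keep_fraction * saliency.numel())
    threshold = saliency.flatten().topk(k).values.min()
    predicted = (saliency >= threshold).float()
    intersection = (predicted * expert_mask).sum()
    return (2 * intersection / (predicted.sum() + expert_mask.sum() + 1e-8)).item()

# Hypothetical usage, given a trained model, an endoscopic image tensor, and a
# binary (H, W) mask delineated by an expert:
# dice = agreement_with_expert(saliency_map(model, image, target_class=1), expert_mask)
```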
3. Explainable AI for COVID-19 CT Classifiers: An Initial Comparison Study
Artificial Intelligence (AI) has advanced by leaps and bounds across all industrial sectors, especially since the introduction of deep learning. Deep learning learns the behavior of an entity by recognising and interpreting patterns. Despite its enormous potential, how deep learning algorithms reach a decision in the first place remains a mystery. Explainable AI (XAI) is the key to unlocking this black box. An XAI model is programmed to explain its goals, logic, and decision making so that end users can understand it. The end users can be domain experts, regulatory agencies, managers and executive board members, data scientists, users who use AI with or without awareness, or anyone affected by the decisions of an AI model. Chest CT has emerged as a valuable tool for the clinical diagnosis and treatment management of lung diseases associated with COVID-19. AI can support rapid evaluation of CT scans to differentiate COVID-19 findings from other lung diseases. However, how these AI tools or deep learning algorithms reach their decisions, and which features derived from these typically deep neural networks are the most influential, is not clear. The aim of this study is to propose and develop XAI strategies for COVID-19 classification models and to compare them. The results demonstrate promising quantification and qualitative visualizations that can give clinicians more granular information from the learned XAI models to support their understanding and decision making.
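As a rough illustration of the kind of quantitative comparison between XAI outputs mentioned above, the sketch below scores the agreement of two explanation heatmaps for the same CT slice; the Pearson correlation and top-region overlap are assumed metrics, not necessarily those used in the study.

```python
import numpy as np

def compare_heatmaps(h1, h2, top_fraction=0.1):
    """Quantitatively compare two explanation heatmaps of the same CT slice."""
    a = np.asarray(h1, dtype=float).ravel()
    b = np.asarray(h2, dtype=float).ravel()
    pearson = np.corrcoef(a, b)[0, 1]                 # agreement of the raw maps
    k = int(top_fraction * a.size)                    # overlap of the most salient pixels
    top_a, top_b = set(np.argsort(a)[-k:]), set(np.argsort(b)[-k:])
    return {"pearson": pearson, "top_region_overlap": len(top_a & top_b) / k}

# Hypothetical usage with heatmaps produced by two different XAI methods
# (e.g. a gradient-based map and an occlusion-based map) for the same image:
# scores = compare_heatmaps(heatmap_method_a, heatmap_method_b)
```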
4. Explainable AI-based clinical decision support system for hearing disorders
In clinical system design, human-computer interaction and explainability are important topics of research. Clinical systems need to provide users not only with results but also with an account of their behavior. In this research, we propose a knowledge-based clinical decision support system (CDSS) for the diagnosis and therapy of hearing disorders such as tinnitus, hyperacusis, and misophonia. Our prototype eTRT system offers explainable output, which we expect to increase its trustworthiness and acceptance in the clinical setting. Within this paper, we: (1) present the problem area of tinnitus and its treatment; (2) describe our data-driven approach based on machine learning, such as association-rule and action-rule discovery; (3) present the evaluation results from inference on the extracted rule-based knowledge and chosen test cases of patients; (4) discuss the advantages of explainable output incorporated into a graphical user interface; and (5) conclude with the results achieved and directions for future work.
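The paper's rule-discovery step is not reproduced here, but the following sketch shows what association-rule discovery over one-hot-encoded patient records can look like, using the mlxtend library; the attribute names and thresholds are purely hypothetical.

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Hypothetical one-hot-encoded patient records: symptoms, therapy, outcome.
records = pd.DataFrame(
    [
        {"tinnitus": 1, "hyperacusis": 1, "sound_therapy": 1, "improved": 1},
        {"tinnitus": 1, "hyperacusis": 0, "sound_therapy": 1, "improved": 1},
        {"tinnitus": 1, "hyperacusis": 1, "sound_therapy": 0, "improved": 0},
        {"tinnitus": 0, "hyperacusis": 1, "sound_therapy": 1, "improved": 1},
    ]
).astype(bool)

# Mine frequent attribute combinations, then derive "if X then Y" rules whose
# antecedents, support, and confidence can be surfaced to the clinician as an
# explanation of the recommendation.
itemsets = apriori(records, min_support=0.5, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```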
5. Improvement of a Prediction Model for Heart Failure Survival through Explainable Artificial Intelligence
Cardiovascular diseases, and the associated disorder of heart failure (HF), are among the leading causes of death globally, making it a priority for doctors to detect and predict their onset and medical consequences. Artificial Intelligence (AI) allows doctors to discover clinical indicators and enhance their diagnoses and treatments. Specifically, explainable AI offers tools to improve clinical prediction models whose results are otherwise poorly interpretable. This work presents an explainability analysis and evaluation of a prediction model for heart failure survival using a dataset of 299 patients who suffered heart failure. The model employs a data workflow pipeline able to select the best ensemble tree algorithm as well as the best feature selection technique. Moreover, different post-hoc techniques were used for the explainability analysis of the model. The paper's main contribution is an explainability-driven approach to select the best prediction model for HF survival based on an accuracy-explainability balance. The most balanced explainable prediction model implements an Extra Trees classifier over 5 selected features (follow-up time, serum creatinine, ejection fraction, age, and diabetes) out of 12, achieving a balanced accuracy of 85.1% with cross-validation and 79.5% on new unseen data. Follow-up time is the most influential feature, followed by serum creatinine and ejection fraction. The explainable prediction model for HF survival presented in this paper would foster further adoption of clinical prediction models by providing doctors with intuitions to better understand the reasoning of usually black-box AI clinical solutions and to make more reasonable, data-driven decisions.
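A minimal sketch of such a pipeline, assuming the publicly known 299-patient heart-failure dataset layout and the five selected features; the permutation-importance step stands in for the paper's post-hoc explainability techniques, which are not detailed above, and the file and column names are assumptions.

```python
import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import cross_val_score, train_test_split

FEATURES = ["time", "serum_creatinine", "ejection_fraction", "age", "diabetes"]

# Hypothetical CSV of the 299-patient dataset; column names are assumptions.
df = pd.read_csv("heart_failure_clinical_records.csv")
X, y = df[FEATURES], df["DEATH_EVENT"]

model = ExtraTreesClassifier(n_estimators=300, random_state=0)
print("CV balanced accuracy:",
      cross_val_score(model, X, y, cv=5, scoring="balanced_accuracy").mean())

# Hold out unseen data, then explain the fitted model post hoc.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)
model.fit(X_tr, y_tr)
imp = permutation_importance(model, X_te, y_te, scoring="balanced_accuracy", random_state=0)
for name, value in sorted(zip(FEATURES, imp.importances_mean), key=lambda t: -t[1]):
    print(f"{name}: {value:.3f}")
```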
6. Explainable Artificial Intelligence for Bias Detection in COVID CT-Scan Classifiers
Problem: An application of explainable artificial intelligence methods to COVID CT-scan classifiers is presented. Motivation: Classifiers may be exploiting spurious artifacts in dataset images to achieve high performance, and explainable techniques can help identify this issue. Aim: Several explanation approaches were used in tandem in order to create a complete overview of the classifications. Methodology: The techniques used included GradCAM, LIME, RISE, Squaregrid, and direct gradient approaches (Vanilla, Smooth, Integrated). Main results: Among the deep neural network architectures evaluated for this image classification task, VGG16 was the most affected by biases towards spurious artifacts, while DenseNet was notably more robust against them. Further impacts: For DenseNet architectures, small differences in validation accuracy can cause drastic changes in the explanation heatmaps, indicating that such small changes may have large impacts on the biases the networks learn. Notably, the strong performance metrics achieved by all of these networks (accuracy, F1 score, and AUC all in the 80 to 90% range) could give users the erroneous impression that there is no bias; analysis of the explanation heatmaps, however, exposes it.
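For reference, a minimal Grad-CAM sketch for a VGG16-style classifier is given below, assuming a torchvision model fine-tuned for COVID vs. non-COVID CT slices; inspecting whether the resulting heatmap concentrates on lung tissue or on peripheral artifacts is the kind of bias check described above.

```python
import torch
import torch.nn.functional as F
from torchvision import models

def grad_cam(model, image, target_class, conv_layer):
    """Grad-CAM: weight the chosen conv layer's feature maps by pooled gradients."""
    feats, grads = [], []
    h1 = conv_layer.register_forward_hook(lambda m, i, o: feats.append(o.detach().clone()))
    h2 = conv_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0].detach()))
    try:
        model.eval()
        model.zero_grad()
        model(image)[0, target_class].backward()
    finally:
        h1.remove()
        h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)          # pooled gradients
    cam = F.relu((weights * feats[0]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    return (cam / (cam.max() + 1e-8)).squeeze()                # (H, W) heatmap in [0, 1]

# Hypothetical usage on a preprocessed CT slice of shape (1, 3, 224, 224):
# model = models.vgg16(weights=None)          # assumed fine-tuned for COVID vs. non-COVID
# heatmap = grad_cam(model, ct_slice, target_class=1, conv_layer=model.features[28])
```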
7. Brain Hemorrhage Classification in CT Scan Images Using Minimalist Machine Learning
Over time, a myriad of applications have been generated for pattern classification algorithms. Several case studies use parametric classifiers such as the Multi-Layer Perceptron (MLP), one of the most widely used today. Others use non-parametric classifiers such as Support Vector Machine (SVM), K-Nearest Neighbors (K-NN), Naïve Bayes (NB), AdaBoost, and Random Forest (RF). However, there is still little work directed toward a newer trend in Artificial Intelligence (AI) known as eXplainable Artificial Intelligence (X-AI), which seeks to make Machine Learning (ML) algorithms increasingly simple and easy for users to understand. Following this trend, in this work the authors develop a new pattern classification methodology based on the novel Minimalist Machine Learning (MML) paradigm and a higher-relevance attribute selection algorithm, which we call dMeans. We examine and compare the performance of this methodology against MLP, NB, K-NN, SVM, AdaBoost, and RF classifiers on the task of classifying Computed Tomography (CT) brain images. These grayscale images have an area of 128 × 128 pixels, and the dataset contains two classes: CT without Hemorrhage and CT with Intra-Ventricular Hemorrhage (IVH), which were classified using the Leave-One-Out Cross-Validation method. Most of the models tested under Leave-One-Out Cross-Validation achieved between 50% and 75% accuracy, while sensitivity and specificity ranged between 58% and 86%. The experiments performed using our methodology matched the best observed classifier with 86.50% accuracy and outperformed all state-of-the-art algorithms in specificity with 91.60%. This performance is achieved with simple and practical methods, in line with the trend toward easily explainable algorithms.
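The MML/dMeans methodology itself is not reproduced here, but the Leave-One-Out evaluation protocol can be sketched as follows, assuming flattened 128 × 128 CT images and one of the baseline classifiers named above (K-NN); the file names and choice of baseline are hypothetical.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical data: N grayscale CT slices of 128 x 128 pixels flattened into
# feature vectors; labels 0 = no hemorrhage, 1 = intra-ventricular hemorrhage.
X = np.load("ct_slices.npy").reshape(-1, 128 * 128)
y = np.load("ct_labels.npy")

# Leave-One-Out: every image is held out once while all the others train the model.
pred = cross_val_predict(KNeighborsClassifier(n_neighbors=3), X, y, cv=LeaveOneOut())
print("accuracy:   ", accuracy_score(y, pred))
print("sensitivity:", recall_score(y, pred, pos_label=1))
print("specificity:", recall_score(y, pred, pos_label=0))
```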
8. The Ethics of Artificial Intelligence in Pathology and Laboratory Medicine: Principles and Practice
Artificial intelligence (AI) is transforming society and health care. Growing numbers of artificial intelligence applications are being developed and applied to pathology and laboratory medicine. These technologies introduce risks and benefits that must be assessed and managed through the lens of ethics.
There is great enthusiasm for the potential of these AI tools to transform and improve health care. This is reflected in efforts by pathology and laboratory medicine professionals both to enhance the practice of pathology and laboratory medicine and to advance medical knowledge based on the data that we generate.
AI application developers use real-world data sets to “train” their applications to generate the desired output. Applications are ideally validated using separate real-world data sets to assess the accuracy and generalizability of the AI output. In pathology AI, for example, a training or validation data set might consist of digitized microscopic images together with the associated diagnoses as assessed by human expert pathologists.
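As a rough illustration of that train/validate separation, the sketch below holds out an independent set of (features, expert diagnosis) pairs and reports how a hypothetical classifier generalizes to them; the file names, feature representation, and classifier are assumptions, not a description of any specific pathology AI product.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical per-slide features with expert-assigned diagnoses (0/1).
features = np.load("slide_features.npy")
diagnoses = np.load("expert_diagnoses.npy")

# Train on one real-world set; validate on a separate held-out set to estimate
# the accuracy and generalizability of the output.
X_train, X_valid, y_train, y_valid = train_test_split(
    features, diagnoses, test_size=0.3, stratify=diagnoses, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_valid, model.predict(X_valid)))
```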
Clinical laboratories performing in vitro diagnostic tests (including histopathologic diagnosis) constitute one of the largest single sources of objective and structured patient-level data within the health care system.
Ethics in medicine, scientific research, and computer science all have deep academic roots. The foundational principles of medical ethics as articulated by Beauchamp and Childress are autonomy, beneficence, nonmaleficence, and justice.
9. Explanatory artificial intelligence for diabetic retinopathy diagnosis
Diabetic Retinopathy (DR) is a leading and growing cause of vision impairment and blindness: by 2040, around 600 million people throughout the world will have diabetes, a third of whom will have DR (Yau et al., 2012). Early diagnosis is key to slowing down the progression of DR and therefore preventing the occurrence of blindness. Annual retinal screening, generally using Color Fundus Photography (CFP), is thus recommended for all diabetic patients.
In order to improve DR screening programs, numerous Artificial Intelligence (AI) systems have thus been developed to automate DR diagnosis using CFP (Ting et al., 2019b). However, due to the “black-box” nature of state-of-the-art AI, these systems still need to gain the trust of clinicians and patients.
There now exists an eXplanatory Artificial Intelligence (XAI) algorithm that reaches the same level of performance as black-box AI for the task of classifying Diabetic Retinopathy (DR) severity using Color Fundus Photography (CFP). This algorithm, called ExplAIn, learns to segment and categorize lesions in images; the final image-level classification derives directly from these multivariate lesion segmentations. The novelty of this explanatory framework is that it is trained end to end, with image supervision only, just like black-box AI algorithms: the concepts of lesions and lesion categories emerge by themselves. For improved lesion localization, foreground/background separation is trained through self-supervision, in such a way that occluding foreground pixels transforms the input image into a healthy-looking image. The advantage of such an architecture is that automatic diagnoses can be explained simply by an image and/or a few sentences. ExplAIn is evaluated at the image level and at the pixel level on various CFP image datasets. We expect this new framework, which jointly offers high classification performance and explainability, to facilitate AI deployment.
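The following is a schematic sketch of the general explain-by-segmentation idea, in which the image-level grade is pooled from per-pixel lesion maps that serve as the explanation; the tiny backbone, layer sizes, and class counts are placeholders, not the ExplAIn architecture.

```python
import torch
import torch.nn as nn

class SegmentThenClassify(nn.Module):
    """Schematic explain-by-segmentation classifier: per-pixel lesion maps are
    produced first, and the image-level grade is pooled from them."""

    def __init__(self, n_lesion_types=4, n_grades=5):
        super().__init__()
        self.encoder = nn.Sequential(            # stand-in for a real segmentation backbone
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_lesion_types, 1),
        )
        self.classifier = nn.Linear(n_lesion_types, n_grades)

    def forward(self, x):
        lesion_maps = self.encoder(x).softmax(dim=1)   # (B, L, H, W): the "explanation"
        pooled = lesion_maps.mean(dim=(2, 3))          # per-lesion-type evidence
        return self.classifier(pooled), lesion_maps    # image-level grade + pixel maps

# Only the image-level grade needs supervision during training; the lesion maps
# come along as the explanation. Hypothetical usage:
# logits, maps = SegmentThenClassify()(torch.randn(1, 3, 512, 512))
```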
10. Machine Learning and XAI approaches for Allergy Diagnosis
This work presents a computer-aided framework for allergy diagnosis which is capable of handling comorbidities. The system was developed using datasets collected from allergy testing centers in South India. Intradermal skin test results of 878 patients were recorded, and the data were observed to contain very few samples for comorbid conditions. Modified data sampling techniques were applied to handle this imbalance and improve the efficiency of the learning algorithms. The algorithms were cross-validated to choose the optimal trained model for multi-label classification. The transparency of the machine learning models was ensured using post-hoc explainable artificial intelligence approaches. The system was tested by verifying the performance of a trained random forest model on the test data. The training and validation accuracies of the decision tree, support vector machine, and random forest are 81.62%, 81.04%, and 83.07% respectively. During evaluation, the random forest achieved 86.39% overall accuracy and 75% sensitivity for the comorbid Rhinitis-Urticaria class. The average performance of the clinicians before and after using the decision support system was 77.21% and 81.80% respectively.
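A minimal sketch of the multi-label setup with a crude oversampling step and a post-hoc look at feature importances; the file names, duplication factor, and importance ranking are illustrative assumptions rather than the paper's modified sampling or XAI approach.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical encoding: one row per patient of intradermal skin-test reactions,
# and a multi-label indicator matrix (e.g. rhinitis, urticaria, asthma) so that
# comorbidities appear as co-occurring 1s in a row.
X = np.load("skin_test_features.npy")
Y = np.load("allergy_labels.npy")

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)

# Crude stand-in for the paper's modified sampling: duplicate the rare comorbid
# rows in the training split only, so the learner sees them more often.
rare = Y_tr.sum(axis=1) > 1
X_bal = np.vstack([X_tr, np.repeat(X_tr[rare], 3, axis=0)])
Y_bal = np.vstack([Y_tr, np.repeat(Y_tr[rare], 3, axis=0)])

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_bal, Y_bal)
print("subset accuracy on test data:", accuracy_score(Y_te, model.predict(X_te)))

# Post-hoc transparency: which skin-test features drive the forest overall.
for idx in np.argsort(model.feature_importances_)[::-1][:5]:
    print("influential feature index:", idx)
```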
Bibliography
1. Diagnosis of Acute Poisoning using explainable artificial intelligence
Michael Chary, Ed W. Boyer, Michele M. Burns
Computers in Biology and Medicine, Volume 134, July 2021
2. Convolutional Neural Networks for the evaluation of cancer in Barrett's esophagus: Explainable AI to lighten up the black-box
Luis A. de Souza Jr., Robert Mendel, Sophia Strasser, Alanna Ebigbo, Andreas Probst, Helmut Messmann, João P. Papa, Christoph Palm
Computers in Biology and Medicine, Volume 135, August 2021
3. Explainable AI for COVID-19 CT Classifiers: An Initial Comparison Study
Q. Ye, J. Xia and G. Yang
IEEE 34th International Symposium on Computer-Based Medical Systems (CBMS), July 2021
4. Explainable AI-based clinical decision support system for hearing disorders
Katarzyna A. Tarnowska, Brett C. Dispoto, Jordan Conragan
Published online May 2021
5. Improvement of a Prediction Model for Heart Failure Survival through Explainable Artificial Intelligence
Pedro A. Moreno-Sanchez
August 2021
6. Explainable Artificial Intelligence for Bias Detection in COVID CT-Scan Classifiers
Iam Palatnik de Sousa, Marley M. B. R. Vellasco, Eduardo Costa da Silva
August 2021
7. Brain Hemorrhage Classification in CT Scan Images Using Minimalist Machine Learning
José-Luis Solorio-Ramírez, Magdalena Saldana-Perez, Miltiadis D. Lytras, Marco-Antonio Moreno-Ibarra, Cornelio Yáñez-Márquez
August 2021
8. The Ethics of Artificial Intelligence in Pathology and Laboratory Medicine: Principles and Practice
Brian R. Jackson, Ye Ye, James M. Crawford, Michael J. Becich, Somak Roy, Jeffrey R. Botkin, Monica E. de Baca, Liron Pantanowitz
9. ExplAIn: Explanatory artificial intelligence for diabetic retinopathy diagnosis
Gwenolé Quellec, Hassan Al Hajj, Mathieu Lamard, Pierre-Henri Conze, Pascale Massin, Béatrice Cochener
August 2021
10. Machine Learning and XAI approaches for Allergy Diagnosis
Ramisetty Kavya, Jabez Christopher, Subhrakanta Panda, Y. Bakthasingh Lazarus
August 2021