Speaker: Danielle Belgrave
The revolution in Artificial Intelligence (AI) and machine learning began with the recognition of the paramount importance of data. The speaker highlighted the increasing digitalization of healthcare, which creates opportunities to understand patient health and wellness in real time. The data generated can further improve clinical decision-making by profiling patients and tailoring treatments to individual needs. Moving from a one-size-fits-all approach to personalized medicine has therefore become crucial, and AI is used to understand heterogeneity in patient response. Generative AI, including large language models like ChatGPT, is valuable for advancing scientific knowledge and discovery. Despite this, AI for healthcare faces several technical challenges. Probabilistic modelling is a key one: it requires learning latent patient representations and inferring distinct condition subtypes from longitudinal patient profiles. Time series data are often sparse, and integrating multimodal data compounds the complexity. Sparse datasets make it difficult to identify consistent patterns and effective treatments across patient cohorts and to establish causality between interventions and outcomes. Ensuring robustness and fairness in AI models is a further challenge, requiring assessment of replicability, robustness, and representation within datasets. Finally, models must be able to generalize across diverse populations.
Utilizing AI for healthcare involves data-driven methodologies aimed at comprehending disease heterogeneity. The focus is on uncovering distinct endotypes of conditions and understanding their underlying mechanisms. The aim is to distinguish responsive from non-responsive patients and then elucidate the mechanisms behind non-response to identify targets for effective treatment. Although the problem remains complex, it can be addressed by integrating multimodal data sources, including CT diagnostics, Human Neutrophil Antigens (HNA) staining, pathological data, Electronic Health Records (EHR), and transcriptomics data. Multimodal data addresses the noisy and incomplete patient profiles often observed when individual datasets are examined in isolation. Integrating several data modalities gives a more comprehensive picture of each patient and compensates for missing data or blind spots within specific domains. The approach combines weak signals from multiple modalities to reduce noise and to separate true patterns from random variation. With several modalities, response variables such as treatment success can be inferred more precisely, which makes complex data and outcomes easier to interpret.
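The idea of combining weak signals while tolerating missing modalities can be illustrated with a minimal late-fusion sketch. This is not the method presented in the session; the function `fuse_modalities`, the patient identifiers, and the score values are hypothetical, and simple averaging stands in for whatever fusion model a real pipeline would learn.

```python
from statistics import mean
from typing import Dict, List, Optional, Sequence


def fuse_modalities(scores: Sequence[Optional[float]]) -> Optional[float]:
    """Average the per-modality scores that are present.

    None marks a modality that is missing for this patient.
    Returns None when no modality is available at all.
    """
    available = [s for s in scores if s is not None]
    return mean(available) if available else None


# Hypothetical treatment-response scores in [0, 1] from three modalities
# (for example imaging, EHR, transcriptomics); values are illustrative only.
patients: Dict[str, List[Optional[float]]] = {
    "patient_a": [0.62, 0.58, 0.71],  # all modalities observed
    "patient_b": [0.40, None, 0.35],  # one modality missing
    "patient_c": [None, None, None],  # nothing observed
}

fused = {pid: fuse_modalities(s) for pid, s in patients.items()}
```

Averaging is the simplest possible choice; it shows how a patient with one missing modality still receives an estimate from the remaining ones, whereas a patient with no data does not.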
The session highlighted a study in Nature Reviews that explored multimodal data integration to advance precision oncology and other healthcare areas. Unimodal models serve as the building blocks for multimodal models, but it is crucial to move beyond them and integrate different data modalities to infer correlations. The landscape of machine learning techniques includes both supervised and unsupervised methods. In unsupervised learning, unlabelled data are analyzed to discover intrinsic patterns, which is beneficial for approaches like endotype discovery. The aim is to recognize interesting patterns within raw data even when the response is unknown, which often requires external validation to confirm the identified subtypes or patterns. Supervised learning, commonly used in statistics and predictive modelling, aims to predict labels of interest, such as treatment response. In recent years, deep learning techniques such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been developed. They have been used primarily on imaging data and in generative modelling to label images, and have become common in clinical diagnostics despite the challenges of limited labels and inputs. CNNs are used to analyze large-scale imaging data, while RNNs focus on understanding patterns over time.
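Unsupervised pattern discovery of the kind described for endotype discovery can be sketched with a plain k-means clustering in standard-library Python. This is an illustrative assumption, not the algorithm used in the work discussed; the function `kmeans`, the two-marker `profiles`, and the choice of initial centroids are hypothetical.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans(points: Sequence[Point],
           initial_centroids: Sequence[Point],
           iterations: int = 10) -> List[int]:
    """Plain k-means: returns the cluster index assigned to each point."""
    centroids = list(initial_centroids)
    labels: List[int] = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels


# Hypothetical patient profiles: two unlabelled marker values per patient.
profiles: List[Point] = [
    (1.0, 1.2), (0.9, 1.0), (1.1, 0.8),
    (5.0, 5.2), (5.1, 4.9), (4.8, 5.0),
]
labels = kmeans(profiles, initial_centroids=[profiles[0], profiles[3]])
```

No response label is used anywhere: the grouping comes only from the structure of the data, which is why clusters found this way would still need external validation before being treated as subtypes.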
Three primary challenges in AI for healthcare emerged: scientific discovery, to understand disease mechanisms for early detection and prevention; disease treatment, to predict health outcomes for preventive, personalized care; and the development of fairer AI systems. Efforts included leveraging models for biomarker discovery, which is essential for individualized healthcare. The speaker presented a computational pathology study in which pipelines were built to automate annotation and identify different tumour and cell types in whole slide images. This approach made biomarker identification more efficient and cost-effective. Additionally, diverse multimodal data, including genetic, epigenetic, transcriptomic, proteomic, and morphological data, were combined to understand genetic profiles and distinguish between pulmonary fibrosis and chronic obstructive pulmonary disease (COPD). This approach helped build detailed patient representations and cluster patients according to their profiles.
The speaker further addressed the importance of augmenting data distributions to tackle a primary challenge in AI and machine learning: data quality and representation. Training data often do not represent the test data, which leads to poor outcomes across different settings. In one study, a diffusion model was trained to learn probabilistic patterns in histopathology, radiology (chest X-ray), and dermatology data and to generate samples more representative of the population, addressing underrepresentation when models are translated to new settings. Classifiers trained on data augmented with these synthetic images performed better, narrowing the gap between training and test sets. Credible images were generated across different dermatological conditions, taking into account attributes such as age, gender, and race.
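The bookkeeping behind augmenting an unbalanced training set can be sketched as follows. The summary does not describe how many synthetic samples were generated per subgroup, so balancing every group up to the size of the largest one is an assumption made here for illustration; `synthetic_quota` and the age bands are hypothetical, and the generative model itself is out of scope.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Optional


def synthetic_quota(groups: Iterable[Hashable],
                    target: Optional[int] = None) -> Dict[Hashable, int]:
    """Number of synthetic samples to add per group to reach `target`.

    Assumption: when `target` is not given, every group is topped up
    to the size of the largest group in the training set.
    """
    counts = Counter(groups)
    if not counts:
        return {}
    goal = target if target is not None else max(counts.values())
    return {group: max(goal - n, 0) for group, n in counts.items()}


# Hypothetical subgroup label for each training image (age band).
training_groups = ["18-40"] * 5 + ["41-65"] * 3 + ["65+"] * 1
quota = synthetic_quota(training_groups)
```

A generative model conditioned on the subgroup attribute would then be asked for `quota[group]` samples per group before the classifier is retrained on the combined data.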
The speaker stressed that deep contextual understanding is vital in machine learning for healthcare, as is aligning problem-solving with the data and the clinical context. Good science also involves integrating various perspectives to form a comprehensive view. AI for healthcare should therefore combine data-driven and domain-knowledge approaches and prioritize patient-centric solutions.
European Academy of Allergy and Clinical Immunology (EAACI), 2024 31st May-3rd June, Valencia