Artificial Intelligence in Medicine: Challenges & Opportunities Ahead

In this week’s essay, GDI Research Analyst Vincent Jeanselme of the University of Cambridge’s MRC Biostatistics Unit provides an overview of medical AI, discussing its evolution, limitations, and a likely path forward.

Modern medicine leverages artificial intelligence (AI) to aid doctors in their daily work, with AI systems assisting during surgery, alerting staff when patients become unstable, or even analysing X-ray scans. This technology has evolved drastically over the past few years to allow the analysis of an ever-growing amount of data. However, these AI-assisted tools are still rarely integrated into current practice. What limitations prevent them from being applied, and what are the potential benefits of these AI technologies?

To answer those questions, we first need to recall what artificial intelligence is and what it aims to do. The goal of AI, simply put, is to reproduce human reasoning so that technology can make decisions. More specifically, a sub-branch of AI known as machine learning (ML) has become the subject of growing interest thanks to recent computational and algorithmic advances. ML is distinctive in that it builds a model from observed data instead of approaching a problem through hard-coded rules specified by experts (i.e., the machine learns from the observed data). In medicine, the objective of ML is to mimic doctors’ decisions using observed patient data.
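The contrast between a hard-coded rule and a rule learned from data can be sketched in a few lines. This is only an illustration: the heart-rate readings, the labels, and the expert cut-off of 100 are hypothetical values chosen for the example, and the "model" is the simplest one possible (a single threshold), not the kind of system used in practice.

```python
def expert_rule(heart_rate):
    """Hard-coded rule: the cut-off is fixed in advance (hypothetical value)."""
    return heart_rate > 100


def learn_threshold(values, labels):
    """Learn a cut-off from data: keep the candidate with the fewest training errors."""
    best_threshold, best_errors = None, len(values) + 1
    for candidate in sorted(set(values)):
        # Predict "unstable" whenever the reading exceeds the candidate cut-off.
        errors = sum((v > candidate) != bool(label) for v, label in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


# Hypothetical observations: heart rate and whether the patient was judged unstable.
heart_rates = [62, 70, 75, 88, 95, 110, 118, 125, 130, 142]
unstable = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

learned_threshold = learn_threshold(heart_rates, unstable)


def learned_rule(heart_rate):
    """Rule whose cut-off comes from the observed data rather than from an expert."""
    return heart_rate > learned_threshold


print("learned cut-off:", learned_threshold)
print("reading of 100 -> expert:", expert_rule(100), "| learned:", learned_rule(100))
```

On this toy data the two rules disagree for a reading of exactly 100, which shows how a learned model depends entirely on the observations it was given.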

For example: a radiologist makes a diagnosis when analysing an X-ray. By gathering data from thousands of patients’ X-rays and diagnoses, the ML model aims to reach conclusions similar to those of the practitioners it is learning from. This type of technology could help doctors make quicker decisions by providing them with a ‘first’ diagnostic opinion, freeing up their time to focus on more complex cases and to help more patients.

Recently, Google AI developed such a model that showed ‘more accurate’ predictive results than physicians themselves. Using de-identified electronic health records (EHR) from two US academic medical centers, covering 216,221 patients hospitalized for at least 24 hours, the researchers acquired over 46.8 billion data points. These were used in deep learning methods to predict in-hospital mortality, prolonged length of stay, and all of a patient’s final discharge diagnoses. Both this study and the claims of similar studies have been highly controversial.

The Trouble with Medical AI Data

This case study illustrates a key problem in medical AI: reproducibility. Published experiments are often difficult for other teams to reproduce, even though reproducibility is central to how scientific knowledge is tested and developed.

Multiple factors explain this phenomenon. First, data have enormous commercial value, both because of their potential and because of the cost of gathering, cleaning, storing, and labelling them. In a competitive medical market, this prevents data from being shared with the medical community, which could benefit from them. In addition, medical data in particular are highly sensitive. Ensuring privacy is crucial, as the data are not only valuable but could harm the patient if misused (would your insurer be interested in knowing your entire medical history?). Maintaining patients’ privacy in studies like these remains a technical challenge: de-identification no longer guarantees anonymity, because so much data is gathered on each individual.
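One way to see why removing names does not guarantee anonymity is to count how many patients share the same combination of remaining attributes. The sketch below uses k-anonymity, a standard privacy measure that the essay itself does not name, brought in here purely as an illustration. The records and attribute names are entirely hypothetical.

```python
from collections import Counter

# Hypothetical de-identified records: no names, only a few attributes.
records = [
    {"age": 34, "postcode": "CB1", "sex": "F", "diagnosis": "asthma"},
    {"age": 34, "postcode": "CB1", "sex": "M", "diagnosis": "diabetes"},
    {"age": 34, "postcode": "CB2", "sex": "F", "diagnosis": "migraine"},
    {"age": 51, "postcode": "CB1", "sex": "F", "diagnosis": "hypertension"},
    {"age": 51, "postcode": "CB2", "sex": "M", "diagnosis": "asthma"},
    {"age": 51, "postcode": "CB2", "sex": "M", "diagnosis": "arthritis"},
]


def k_anonymity(rows, attributes):
    """Size of the smallest group of rows sharing the same values for `attributes`.

    k == 1 means at least one person is unique on those attributes alone.
    """
    groups = Counter(tuple(row[a] for a in attributes) for row in rows)
    return min(groups.values())


k_age_only = k_anonymity(records, ["age"])
k_all = k_anonymity(records, ["age", "postcode", "sex"])

print("k using age only:", k_age_only)
print("k using age, postcode and sex:", k_all)
```

With age alone, every patient hides in a group of three. Once postcode and sex are added, some patients become unique, and anyone who knows those three facts about them can read off their diagnosis. The more attributes are recorded per person, the faster the groups shrink.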

Next, data frequently suffer from biases reflecting prior beliefs and choices. For example, a medical procedure that was standard in the 1980s but is now challenged might be over-represented in historical data pulled from older records. This could lead to a model that is no longer relevant. Moreover, some populations that would benefit from a newly developed ML model might be underrepresented in the data used to build it because of historic and present practices. Well-known examples include pregnant women, who are rarely included in clinical trials, and racial groups that are not adequately reflected in the data sample. Other biases specific to medicine arise because disease-specific medications are developed on the patients most likely to suffer from the disease in question, so rare cases are ignored and end up being treated by the ML model as abnormalities. These biases lead to models that do not work for some patients or, even worse, harm them.

Another limiting aspect of using healthcare data is their incompleteness: doctors make diagnoses based not only on a series of numbers, but also on interacting with the patient and judging their overall condition from clues that cannot be fully captured by numbers alone. To adjust an old saying: algorithms cannot model what is not measured.

Will Medical AI Replace Doctors?

These issues all lead to a key limitation of AI: the models do not interact with the patient but instead draw conclusions from specific data points. This is essential to consider, especially in the medical context, where better patient-doctor communication has been linked to improved health outcomes. Moreover, the feeling of a human connection is indispensable. Research has repeatedly shown that patients achieve better health outcomes when their doctors share similar characteristics with them (e.g., cultural background, gender, or socioeconomic background), as this seems to make conversations easier and therefore allows the doctor to better evaluate their patient’s overall state.

Properly diagnosing a patient requires seeing them as a human who is embedded in multiple social and environmental systems, any combination of which could be contributing to their medical issues, and interpreting their condition holistically. A patient cannot be diagnosed through their symptoms alone; the social and economic aspects of their life also play a key role in determining health and well-being. Machines are so far unable to reproduce these interactions, which I would argue are key to our humanity and a fundamental aspect of what makes well-practised medicine so powerful.

The resulting mistrust in machines is accentuated by the complexity of the models themselves, as their developers strive to cope with the many challenges linked to medical data. This complexity makes these models sensitive to their training data and also limits our understanding of any specific decision. This lack of interpretability is one reason why medical AI has not been widely accepted in the medical community. Instead, simpler (but better understood) models are used.

This non-acceptance stems from the tension between two domains that meet in medical AI: engineering and statistics. The first has led to more complex models which maximise a given measure of performance, whereas the second is reluctant to add complexity to a model unless it follows from precise assumptions about the data.

Moreover, there is some reluctance towards new technologies in the field because of the threats they pose to jobs and to certain kinds of expertise. Yet I think similar fears have been associated with every major new tool: for instance, a reliance on calculators leaves us less efficient at mental addition, yet allows us to focus our attention on more complex problems. In the same way, AI might make us less effective at interpreting monitoring systems, but free doctors’ time to focus on a given disease’s origins. Regarding the impact on jobs: technologies have transformed, and will continue to transform, the workplace by removing some tasks while simultaneously creating new opportunities. Besides, the medical world today suffers from the opposite problem, with a greater need for doctors than are currently available. AI might provide a needed solution (or at least, an aid) in responding to this global medical need.

AI models are optimised to achieve the best performance on a given metric. This key component of ML is its strength, but also its weakness. If not understood or used properly, this optimisation can lead to models that favour the majority and ignore rare cases, resulting in ill-suited decisions. Put another way: in a totally automated system, a small modelling error might lead to AI-recommended medical treatments which could worsen patients’ conditions. This last point also raises the legal question of responsibility. Who would be responsible for wrongdoing when an algorithm exhibits “abnormal” behaviour, or provides medical advice that leads to dire consequences? Though beyond the scope of this essay, it is clear that strong ethical considerations and subsequent legal regulations must enter this conversation before models can be responsibly applied in real cases, with real human lives at stake.
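A small worked example shows how optimising a single metric can favour the majority. The cohort below is hypothetical (990 patients without a rare condition, 10 with it), and the "model" is deliberately trivial: it always predicts the majority class. Accuracy and recall are standard metrics, chosen here only for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of all patients classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive patients that the model identifies."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical cohort: 0 = no rare condition, 1 = rare condition.
y_true = [0] * 990 + [1] * 10

# A "model" that ignores its input and always predicts the majority class.
majority_pred = [0] * len(y_true)

majority_accuracy = accuracy(y_true, majority_pred)
majority_recall = recall(y_true, majority_pred)

print(f"accuracy: {majority_accuracy:.2f}")
print(f"recall on rare cases: {majority_recall:.2f}")
```

Judged on accuracy alone, this model looks excellent at 99%, yet it misses every one of the ten patients with the rare condition. A system tuned only for the headline metric can therefore look successful while failing exactly the patients who most need attention.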

Looking Ahead

Addressing these many challenges is necessary if we are to apply medical AI in our hospitals, which is why they are such an active area of research. Interpretability is being studied in depth, and better-understood models are now used to create more robust and interpretable tools. Extensive testing and validation techniques are being explored to ensure their effectiveness. Code and data sharing initiatives are appearing to allow researchers to build on each other’s previous work. These steps are the key way forward if we are to use these promising medical AI tools more widely.

As a lighter form of technological intervention, and without going through an endless list of machine learning applications in healthcare, a multitude of models have been developed across medical specialties to replace human cognition in repetitive tasks. These applications have tremendous advantages thanks to their capacity to quickly process large amounts of multivariate data. This has allowed them to automatically discover unsuspected signals in data, leading to new medical discoveries. These technologies are therefore not limited to reproducing human behaviour, but can also enhance medical knowledge. This has been observed in simulations which leverage the speed of computers, with human medical expertise being guided by machine learning. For instance, this has so far helped researchers to focus on promising molecule configurations that could lead to more efficient drugs and vaccines.

The future of AI in medicine is clearly complex. However, ML offers a glimpse of a future that could allow a shift from an effect-based to a cause-based medicine, as long-term changes in health can be better understood through the evolution of observed data. This trajectory would help foster a more preventive approach to medicine, focusing on avoiding a disease in the first place instead of curing some of its symptoms. Artificial intelligence holds great promise but, as with any tool, it also comes with limitations and risks. Harnessing the strengths of medical AI will require building the tools and frameworks needed to address its weaknesses, specifically the technological, ethical, and legal challenges touched on here.
