A 49-year-old man notices a painless rash on his shoulder but doesn’t seek care. Months later, during a routine physical, his doctor notices the rash and diagnoses it as a benign skin condition. More time passes, and during a routine screening test, a nurse points out the rash to another physician who urges the patient to see a dermatologist. A dermatologist performs a biopsy. The pathology report reveals a noncancerous lesion. The dermatologist seeks a second reading of the pathology slides. This time, a different verdict: invasive melanoma. The patient is immediately started on chemotherapy. Weeks later, a physician friend asks him why he’s not on immunotherapy instead.
Albeit hypothetical, a version of this scenario plays out all too often in modern health care—not because of negligence, but due to sheer human fallibility and systemic errors.
If done right, artificial intelligence could drastically reduce both systemic glitches and errors in the decision-making of individual clinicians, according to a commentary written by scientists at Harvard Medical School and Google.
The article, published April 4 in The New England Journal of Medicine, offers a blueprint for integrating machine learning into the practice of medicine and outlines the promises and pitfalls of a technological advance that has captivated the imaginations of bioinformaticians, clinicians and nonscientists alike.
The vast processing and analytic capacity of machine learning can amplify the unique capabilities of human decision-making—common sense and the ability to detect nuance. The combination, the authors argue, could optimize the practice of clinical medicine.
Machine learning defined
Machine learning is a form of artificial intelligence that is not predicated on predefined parameters and rules but instead relies on adaptive learning. Thus, with each exposure to new data, an algorithm grows better at recognizing patterns. In other words, machine learning exhibits neural plasticity not unlike the cognitive plasticity of the human brain. However, where human brains can learn complex associations from small bits of data, machine learning requires far more examples to learn the same task. Machines are far slower at learning but have greater operational capacity and produce fewer errors of interpretation.
“A machine-learning model can be trained on tens of millions of electronic medical records with hundreds of billions of data points without lapses in attention,” said commentary author Isaac Kohane, chair of the Department of Biomedical Informatics in the Blavatnik Institute at Harvard Medical School. “But it’s impossible for a human physician to see more than a few tens of thousands of patients in an entire career.”
Thus, the authors said, deploying machine learning could offer individual physicians the collective wisdom of billions of medical decisions, billions of patient cases and billions of outcomes to inform the diagnosis and treatment of an individual patient.
In situations where predictive accuracy is critical, the ability of a machine-learning system to spot telltale patterns across millions of samples could enable “superhuman” performance, the authors said.
To err is human
A 1999 report by the Institute of Medicine, now known as the National Academy of Medicine, titled “To Err is Human,” recognized the imperfections of human decision-making and the limits of individual clinician knowledge. The latter is poised to become a growing problem for frontline clinicians who must synthesize, interpret and apply an ever-growing amount of biomedical knowledge stemming from an exponential rate of new discoveries.
“We must have the humility to recognize that keeping up with the pace of biomedical knowledge and new discoveries is humanly impossible for the individual practitioner,” Kohane said. “AI and machine learning can help reduce, even eliminate errors, optimize productivity and provide clinical decision support.”
According to the Institute of Medicine’s report, clinical errors encompass four broad categories:
- Diagnostic: failure to order appropriate tests or to properly interpret test results; use of outdated tests; wrong diagnosis or delay of accurate diagnosis; and failure to act on test results.
- Treatment: choosing suboptimal, outdated or wrong therapies; errors in administering the treatment; errors of medication dosing; and treatment delays.
- Prevention: failures in preventive follow-up and administration of prophylactic therapies such as vaccinations.
- Other: errors involving communication or equipment failures, among others.
Machine learning has the potential to reduce many of these errors, even eliminate some, the authors of the commentary said.
A well-designed system could alert providers when suboptimal medication is chosen; it could eliminate dosing errors; and it could triage records of patients with vague, mysterious symptoms to a panel of rare-disease experts for remote consults.
Machine-learning models hold the greatest promise in the following areas:
• Prognosis: the ability to identify patterns predictive of outcomes based on vast numbers of already documented outcomes. For example, what is a patient’s likely trajectory? How soon will the patient return to work? How fast will the patient’s disease progress?
• Diagnosis: the capacity to help identify likely diagnoses during clinical visits and raise awareness of possible future diagnoses based on a patient’s profile and totality of previous laboratory test results, imaging tests and other available data. Machine-learning models could be used as back-up intelligence to prod physicians to consider alternative conditions or to ask probing questions. This could be particularly valuable in scenarios with high diagnostic uncertainty or when patients present with particularly confounding symptoms.
• Treatment: Machine-learning models can be “taught” to identify the optimal treatment for a given patient with a given condition based on vast datasets of treatment outcomes for patients with the same diagnosis.
• Clinical workflow: Machine learning could improve and simplify current electronic medical record (EMR) keeping, which places a significant burden on clinicians. Greater efficiency and less time spent on EMRs would allow physicians to spend more time in direct contact with patients.
• Expanding access to expertise: the ability to improve access to care for patients living in remote geographic locations or regions with a scarcity of medical specialists. Such models could provide patients with nearby care options or alert them when symptoms demand urgent attention or a visit to an emergency room.
Deus ex machina…not
AI and machine learning are not perfect, nor will they solve all glitches in clinical care.
Machine-learning models will be only as good as the data they are provided. For example, a machine-learning model for treatment solutions would be only as good as the accuracy of the therapies recorded in the database on which the model was trained.
The most significant barrier to developing optimal machine-learning models is the scarcity of high-quality clinical data that includes ethnically, racially and otherwise diverse populations, the authors said. Other hurdles are more technical in nature. For example, the current separation of clinical data across and within institutions is a significant, yet not insurmountable, barrier to building robust machine-learning models. One solution would be to put the data in the hands of patients to enable patient-controlled databases.
Other obstacles include differing legal requirements and policies, and a mishmash of technical platforms across health systems and tech providers that may not be easily compatible with one another and thus compromise access to data.
“The adage ‘garbage in, garbage out’ very much applies here,” Kohane said, referring to the computing maxim that the final capabilities of any computational system are only as good as the data fed into it in the first place.
AI-optimized MD
One unintended consequence of machine learning could be overreliance on computer algorithms and a reduction in physician vigilance—outcomes that would increase clinical errors, the authors cautioned.
“Understanding the limitations of machine learning is vital,” Kohane said. “This includes understanding what the model is designed and, more importantly, what it’s not designed to do.”
One way to minimize such risks would be to include confidence ranges for all machine-learning models, informing clinicians exactly how accurate a model is likely to be. Even more importantly, all models should be subject to periodic reevaluations and exams, not unlike the periodic board exams physicians must take to maintain certifications in a given field of medicine.
If done right, machine learning will act as a form of focused-intelligence back-up, enhancing the clinician-patient encounter, rather than being a substitute for human physicians.
The human encounter, a human physician’s sensibility, sensitivity and appreciation for fine nuance and complexity of human life will never go away, the authors said.
“This is very much a case of together with and not instead of,” Kohane added. “This is not about machine versus human, but very much about optimizing the human physician and patient care by harnessing the strengths of AI.”
Alvin Rajkomar and Jeffrey Dean, both at Google, co-authored the commentary.