AI model learns how to make cancer treatment less toxic

An AI model learns from patient data to make treatment for cancer less toxic but still effective. AI stands for artificial intelligence. The AI model, a machine-learning system, determines the smallest and fewest doses that can still shrink brain tumors.

AI refers to software technologies that make robots or computers think and behave like human beings. Artificial intelligence contrasts with natural intelligence. Animals, including humans, have natural intelligence.

AI model – glioblastoma treatment

The MIT researchers say that their AI model may improve cancer patients’ quality of life. The researchers are investigating how it may reduce toxic radiotherapy and chemotherapy dosing for glioblastoma.

Glioblastoma or glioblastoma multiforme (GBM) is the most aggressive brain cancer that starts within the brain. Glioblastoma, which can also occur in the spinal cord, can affect people of any age. However, it is more common among older adults.

The prognosis for adults with glioblastoma is poor: patients rarely live longer than five years after diagnosis. They have to endure a combination of multiple medications and radiation therapy.

Doctors generally administer maximum safe drug doses to shrink tumors as much as possible. However, they are powerful drugs which cause debilitating side effects in patients.

Less toxic dosing regimens

Researchers are creating an AI model that could make dosing regimens much less toxic. However, they would still be effective. MIT Media Lab researchers are presenting their research at the 2018 Machine Learning for Healthcare Conference at Stanford University.

The AI model is powered by a ‘self-learning’ machine-learning technique. It looks at treatment regimens that are currently in use and iteratively adjusts their doses. Iteratively means ‘in an iterative manner,’ i.e., repeating a process until one reaches a desired goal.

It eventually finds an optimal treatment plan. The plan has the lowest possible potency and dose frequency without losing efficacy. In this context, efficacy refers to the treatment’s ability to shrink tumors.
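To make ‘iteratively’ concrete, here is a minimal sketch in Python. It is a plain greedy loop, not the researchers’ machine-learning method, and the efficacy rule, the function names, and every number in it are hypothetical. It lowers the potency and then the frequency of a regimen, one step at a time, for as long as a toy efficacy check still passes.

```python
def shrinks_tumor(dose_fraction: float, doses_per_year: int) -> bool:
    """Hypothetical stand-in for an efficacy check; NOT a clinical model."""
    # Toy rule: the regimen "works" if yearly exposure stays above a threshold.
    return dose_fraction * doses_per_year >= 1.5


def reduce_regimen(dose_fraction: float = 1.0, doses_per_year: int = 12,
                   step: float = 0.25) -> tuple[float, int]:
    """Repeat small reductions until no further reduction passes the toy check."""
    changed = True
    while changed:
        changed = False
        # First try a weaker dose.
        if dose_fraction - step > 0 and shrinks_tumor(dose_fraction - step, doses_per_year):
            dose_fraction -= step
            changed = True
        # Otherwise try giving one dose fewer per year.
        elif doses_per_year > 1 and shrinks_tumor(dose_fraction, doses_per_year - 1):
            doses_per_year -= 1
            changed = True
    return dose_fraction, doses_per_year


print(reduce_regimen())  # (0.25, 6) under the toy rule above
```

Under these made-up assumptions, the loop stops at a quarter-strength dose given six times a year. The figures only illustrate the repeat-until-done idea and say nothing about real treatment.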

AI model reduced dose potency significantly

In a simulation – a trial involving fifty patients – the AI model designed treatment cycles that reduced the potency of almost all doses to a quarter or half. However, the treatments maintained the same tumor-shrinking potential.

Sometimes the model skipped doses completely. In some cases, it scheduled administration just twice a year rather than monthly.

Pratik Shah, a lead investigator at the Media Lab who supervised this research, said:

“We kept the goal, where we have to help patients by reducing tumor sizes but, at the same time, we want to make sure the quality of life – the dosing toxicity – doesn’t lead to overwhelming sickness and harmful side effects.”

AI model – reinforcement learning

The researchers’ AI model uses an approach called reinforcement learning or RL. In this method, the model learns specific behaviors that lead to desired outcomes.

The technique consists of artificially intelligent ‘agents’ that act in an unpredictable, complex environment to reach a desired ‘outcome.’

Whenever the model completes an action, it either receives a penalty or reward. What it receives depends on whether the action gets it nearer to the desired outcome.

AI model seeks ‘rewards’

The agent subsequently adjusts its actions so that it receives more rewards rather than penalties.

Penalties and rewards are basically negative and positive numbers, for example, plus or minus one. Their values vary according to the action the agent takes and are calculated from the probability of failing or succeeding at the outcome.

Essentially, the agent is attempting to numerically optimize all its actions, which it bases on penalty/reward values.

It aims, above all, to get the best score possible for each task it undertakes.
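The reward-and-penalty loop described above can be sketched in a few lines of Python. This is a generic textbook-style illustration, not the researchers’ model. The two actions, their success probabilities, and the function name `train_agent` are hypothetical. The agent earns plus one or minus one after each action, keeps a running average score per action, and mostly picks whichever action currently scores best.

```python
import random


def train_agent(steps: int = 2000, epsilon: float = 0.1, seed: int = 0) -> dict:
    """Learn, by trial and error, which of two actions earns more reward."""
    rng = random.Random(seed)
    # Hypothetical chance that each action moves the agent toward its goal.
    success_prob = {"a": 0.8, "b": 0.3}
    value = {"a": 0.0, "b": 0.0}  # running average reward per action
    count = {"a": 0, "b": 0}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(["a", "b"])     # occasionally explore
        else:
            action = max(value, key=value.get)  # otherwise use the best estimate
        # Reward is +1 on success and -1 on failure.
        reward = 1 if rng.random() < success_prob[action] else -1
        count[action] += 1
        # Incremental update of the running average for this action.
        value[action] += (reward - value[action]) / count[action]
    return value


scores = train_agent()
print(max(scores, key=scores.get))  # the action the agent ends up preferring
```

With enough trials, the average score for the more reliable action rises above the other, so the agent chooses it more and more often.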

DeepMind used the RL technique to train AlphaGo, the computer program that beat one of the world’s best players at the game ‘Go’ in 2016.

Engineers also used the technique to train driverless cars in maneuvers such as parking the vehicle or merging into traffic. With the RL approach, the vehicle practices again and again, adjusting its course, until it gets everything right.

AI model for glioblastoma treatments

The researchers adapted an RL AI model for glioblastoma treatment. It used a combination of the drugs temozolomide (TMZ) and procarbazine, lomustine, and vincristine (PCV). The patients received medications over weeks or months.

The model’s agent scanned through every traditional regimen that doctors had administered. The regimens included protocols that researchers had used for decades in animal and clinical trials.

Oncologists use these established protocols to calculate what doses to administer to patients, which they base on bodyweight.

AI model had to make many decisions

The AI model explored each regimen. It subsequently decided on one of several possible actions. The model might, initially, either administer or withhold a dose.

It then decided whether to use the entire dose or only a portion. In other words, it decided how much was necessary.

Each time it carried out an action, it pinged another clinical model. That second model is often used to predict how a tumor responds to treatments.

It did this to determine whether the action had resulted in a reduction of the mean tumor diameter. If it shrank the tumor, the AI model got a reward.
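A single decision of this kind might look like the Python sketch below. Everything in it is an assumption made for illustration: the function names are hypothetical, and the toy tumor-response formula (5% growth when untreated, shrinkage in proportion to dose) is invented, not the clinical model the researchers used.

```python
def toy_tumor_model(diameter_mm: float, dose_fraction: float) -> float:
    """Hypothetical response model: grows 5% untreated, shrinks with dose."""
    return diameter_mm * (1.05 - 0.30 * dose_fraction)


def step(diameter_mm: float, give_dose: bool, dose_fraction: float) -> tuple[float, int]:
    """Apply one action, ask the toy model what happens, and score the result."""
    # Withholding a dose is the same as giving a dose of zero.
    dose = dose_fraction if give_dose else 0.0
    new_diameter = toy_tumor_model(diameter_mm, dose)
    # Reward +1 if the tumor shrank, otherwise penalty -1.
    reward = 1 if new_diameter < diameter_mm else -1
    return new_diameter, reward


print(step(40.0, True, 0.5))    # half dose: the toy tumor shrinks, reward 1
print(step(40.0, False, 1.0))   # dose withheld: the toy tumor grows, reward -1
```

The agent never needs to know how the response model works. It only sees the new tumor size and the reward that follows.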

Preventing maximum doses

However, the investigators also had to make sure that the AI model did not simply administer maximum doses. The model received a penalty every time it chose to administer a full dose. It, therefore, chose smaller doses.

The same happened when it administered the treatment too often. It got a penalty. Therefore, it chose fewer doses.
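One way to picture these two penalties is as deductions from a regimen’s score. The Python sketch below is only an illustration under stated assumptions: the function `regimen_score` and its penalty weights are hypothetical, and the article does not give the formula the researchers actually used.

```python
def regimen_score(shrinkage_mm: float, dose_fractions: list[float],
                  dose_penalty: float = 0.5, frequency_penalty: float = 0.2) -> float:
    """Score a regimen: reward shrinkage, deduct for strong and for frequent doses."""
    total_dose = sum(dose_fractions)                      # stronger doses cost more
    doses_given = sum(1 for d in dose_fractions if d > 0)  # each dose given costs a little
    return shrinkage_mm - dose_penalty * total_dose - frequency_penalty * doses_given


# Two hypothetical regimens that achieve the same shrinkage.
full = regimen_score(10.0, [1.0, 1.0, 1.0, 1.0])    # four full doses
light = regimen_score(10.0, [0.5, 0.0, 0.5, 0.0])   # two half doses, two skipped
print(full < light)  # True: the lighter regimen scores higher
```

When two regimens shrink the tumor equally, the one with weaker and fewer doses scores higher, which nudges the agent toward it.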

Shah said:

“If all we want to do is reduce the mean tumor diameter, and let it take whatever actions it wants, it will administer drugs irresponsibly.”

“Instead, we said, ‘We need to reduce the harmful actions it takes to get to that outcome.’”

First RL model of its kind

This was the first time an RL model weighed the potential negative consequences of its actions against an outcome. In other words, it weighed the doses it administered against changes in tumor size.

Historically, RL models have worked toward a single outcome, such as winning a game. They take whatever actions they need to achieve that outcome.

Regarding what happened in this study, in a press release, MIT said:

“The researchers’ model, at each action, has flexibility to find a dose that doesn’t necessarily solely maximize tumor reduction, but that strikes a perfect balance between maximum tumor reduction and low toxicity.”

“This technique has various medical and clinical trial applications, where actions for treating patients must be regulated to prevent harmful side effects.”

AI model – optimal regimens

The researchers trained the AI model on fifty patients. This was a simulation. The simulation randomly selected the patients from a database of glioblastoma patients. They had all been recipients of traditional treatments.

The model conducted approximately 20,000 trial-and-error test runs on each patient.

By the end of the training, the AI model had learned the parameters for optimal regimens. When the model received new patients, it used those parameters to formulate new regimens. The model based the new regimens on several constraints that the researchers had provided.

The team then tested the AI model on fifty new simulated patients. They compared the results with those of a conventional regimen using both PCV and TMZ.

Dosage penalty vs. no penalty

When the model received no dosage penalty, its regimens were almost identical to those that humans had formulated.

However, when it received small and large dosing penalties, the model significantly cut the frequency and potency of doses. It still reduced tumor sizes with those weaker, less frequent doses.
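The effect of the penalty size can be illustrated with a toy calculation in Python. It assumes, purely for illustration, that tumor shrinkage shows diminishing returns as the dose rises. The response curve, the penalty weights, and the function names are hypothetical and are not taken from the study.

```python
def toy_shrinkage(dose_fraction: float) -> float:
    """Hypothetical diminishing-returns response: doubling the dose less than doubles shrinkage."""
    return 10.0 * dose_fraction ** 0.5


def best_dose(penalty_weight: float) -> float:
    """Pick the dose fraction whose shrinkage minus dose penalty is highest."""
    options = [0.0, 0.25, 0.5, 0.75, 1.0]
    return max(options, key=lambda d: toy_shrinkage(d) - penalty_weight * d)


for weight in (0, 8, 12):  # no penalty, small penalty, large penalty
    print(weight, best_dose(weight))
# 0 1.0
# 8 0.5
# 12 0.25
```

In this toy setting, no penalty leads to the full dose, while small and large penalties lead to half and quarter doses. The toy shrinkage falls as the dose drops, so the sketch does not reproduce the study’s finding that tumor-shrinking potential was maintained.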

The team also designed the model to treat patients individually, as well as in a single cohort. The results were similar (the researchers had access to the medical data of each patient).

Traditionally, groups of patients get the same dosing regimen. However, differences in genetic profiles, biomarkers, medical histories, and tumor size may all alter a patient’s treatment requirement.

Traditional clinical trials do not consider these variables. Consequently, there are poor responses to therapy in large populations, Shah explained.

‘Eye-balling’ method

The AI model offers a major improvement over the traditional ‘eye-balling’ method. The traditional method involves administering doses, observing patient response, and adjusting accordingly.

Clinical trial design expert, Nicholas J. Schork, said:

“(Humans don’t) have the in-depth perception that a machine looking at tons of data has, so the human process is slow, tedious, and inexact.”

“Here, you’re just letting a computer look for patterns in the data, which would take forever for a human to sift through, and use those patterns to find optimal doses.”

Schork is a professor and director of human biology at the J. Craig Venter Institute.

Prof. Schork believes that this work may be of particular interest to the US FDA. The FDA is currently seeking ways to leverage data and AI to develop health technologies.

FDA stands for Food and Drug Administration. The US FDA is the regulatory agency in charge of medical devices, medications, foods, tobacco products, and cosmetics.

Prof. Schork said:

“(Regulations still need to be established) but I don’t doubt, in a short amount of time, the FDA will figure out how to vet these appropriately, so they can be used in everyday clinical programs.”