Personalizing cancer treatment through machine learning

Researchers at the University of Waterloo’s Cheriton School of Computer Science have applied machine learning to identify tumour-specific antigens, which could help make personalized cancer vaccines practically feasible and more accurate.

In cancer, a mutation in a cell’s DNA can produce a substitution in one of the cell’s peptides. Our immune system flags the result as an invader. This mutated peptide, which appears on the surface of cancer cells, is referred to as a neoantigen.

“If we can figure out what the neoantigens are on cancer cells, they can be used to develop a cancer vaccine—a vaccine that’s personalized to the cancer patient and that uses the patient’s own immune system to attack the tumour,” explains Hieu Tran, adjunct professor at the Cheriton School of Computer Science.

“When a cell becomes cancerous, the body knows about it,” adds Ming Li, a professor at the Cheriton School of Computer Science, who also holds the Canada Research Chair in Bioinformatics. “That’s because the human leukocyte antigen or HLA system—which is responsible for the regulation of the immune system—can showcase whether a peptide on the cell’s surface is normal or mutated. If the HLA system presents a normal peptide, our immune system doesn’t attack it. Our immune system will attack only the cells with mutations, the ones with neoantigens, otherwise known as cancerous tumour cells, on their surface.”

The trick, however, is finding these tumour-specific neoantigens, essentially a needle in a large haystack. Not surprisingly, this is extremely difficult with conventional methods, yet it is crucial to developing a personalized cancer vaccine.

Tailoring medicine to the individual

Amino acids are the building blocks of peptides and ultimately protein molecules. Without them, we wouldn’t have an immune system or be able to digest food, grow or procreate. By convention, amino acids are labelled using a one-letter code. For example, the amino acid alanine is labelled A, arginine is labelled R, asparagine is labelled N and so on. A peptide’s amino acid sequence can be considered as a word composed of these letters.
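
To make the "peptide as a word" idea concrete, here is a minimal Python sketch of the one-letter code for the 20 standard amino acids. The function names peptide_to_word and is_valid_peptide are hypothetical, chosen only for illustration; they do not come from the researchers' software.

```python
# One-letter codes for the 20 standard amino acids.
ONE_LETTER = {
    "alanine": "A", "arginine": "R", "asparagine": "N", "aspartic acid": "D",
    "cysteine": "C", "glutamine": "Q", "glutamic acid": "E", "glycine": "G",
    "histidine": "H", "isoleucine": "I", "leucine": "L", "lysine": "K",
    "methionine": "M", "phenylalanine": "F", "proline": "P", "serine": "S",
    "threonine": "T", "tryptophan": "W", "tyrosine": "Y", "valine": "V",
}


def peptide_to_word(residues):
    """Spell a peptide as a word, given its amino acid names in order."""
    return "".join(ONE_LETTER[name.lower()] for name in residues)


def is_valid_peptide(word):
    """Check that a non-empty word uses only the 20 standard letters."""
    return len(word) > 0 and set(word) <= set(ONE_LETTER.values())


# The three amino acids named in the article, in order.
word = peptide_to_word(["alanine", "arginine", "asparagine"])
print(word)
```

Treating each peptide as a plain string like this is what lets techniques borrowed from text processing be applied to amino acid sequences.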

“If you are familiar with natural language processing, you’ve likely seen your mobile phone guess the next word you might have typed as you compose a message. You write ‘how’ and it suggests ‘are’ and if you type ‘are’ it suggests ‘you’,” says Hieu Tran.

“We applied a similar machine-learning model to determine the amino acid sequence of neoantigens based on this one-letter amino acid code. If I know your immunopeptidome—the thousands of short 8-to-12 amino acid peptide antigens displayed on the cell surface—and I know that a neoantigen is different from your existing peptides by just one mutation, I can train a machine-learning model using your normal peptides to predict the mutated peptides. We used a recurrent neural network—a machine-learning model we call DeepNovo—to predict the amino acid sequence of neoantigens.”
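
The next-word analogy can be illustrated with a toy example. The Python sketch below is not DeepNovo and is not a recurrent neural network; it swaps in a much simpler letter-pair counting model to show the general idea of learning from a patient's normal peptides which amino acid tends to follow which. The peptide strings are made up for illustration, and all names (train_bigram, predict_next, differs_by_one) are assumptions rather than anything from the researchers' code.

```python
from collections import Counter, defaultdict

START = "^"  # hypothetical marker for the start of a peptide


def train_bigram(peptides):
    """Count which amino acid follows which across the training peptides."""
    counts = defaultdict(Counter)
    for peptide in peptides:
        prev = START
        for aa in peptide:
            counts[prev][aa] += 1
            prev = aa
    return counts


def predict_next(counts, prefix):
    """Suggest the most frequently seen next letter, or None if unseen."""
    prev = prefix[-1] if prefix else START
    followers = counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def differs_by_one(a, b):
    """True if two equal-length peptides differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


# Made-up 9-letter peptides standing in for a patient's normal peptides.
normal_peptides = ["ALAKRLGYV", "ALAKQLGYV", "SLAKRMGYV"]
model = train_bigram(normal_peptides)

print(predict_next(model, "AL"))  # letter most often seen after L
print(differs_by_one("ALAKRLGYV", "ALAKQLGYV"))  # one substitution apart
```

A recurrent neural network plays the same role as the counting table here, but it can take the whole preceding sequence into account rather than only the previous letter.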

To do this, the researchers downloaded the immunopeptidome datasets of five patients with melanoma, a type of skin cancer, which they then used to train, validate and test their machine-learning model.

The machine-learning model also personalizes its results: it identifies specific neoantigens for each individual patient, which can inform personalized treatment and care.

“Cancer immunotherapy is quickly becoming a fourth modality of cancer treatment, alongside surgery, chemotherapy and radiotherapy,” adds Ming Li. “Every patient is different and every cancer is different, so cancer treatment shouldn’t be the same for all. Treatment should be tailored to the patient and that’s what our personalized machine-learning model allows us to do.”

This article was adapted and republished with permission from the University of Waterloo.
