DNA "echoes" of viruses that infected our ancestors millions of years ago may help our immune system to identify and kill cancer cells, according to a study.
The study, published in the journal Genome Research, looked at "endogenous retroviruses," fragments of DNA in the human genome that were left behind by viruses that infected our ancestors.
Over millions of years, our ancestors were infected by countless viruses, and the DNA those viruses left behind now makes up more of our genome than human genes do.
Around eight per cent of the human genome is made up of retroviral DNA, while known genes make up only 1-2 per cent.
"This viral DNA typically lies dormant, as it is either non-functional or our bodies have evolved to suppress it," said George Kassiotis from The Francis Crick Institute in the UK, who led the study.
"However, when a cell becomes cancerous, some of these suppression mechanisms can fail and this ancient viral DNA can be reactivated," Kassiotis said.
"We looked for viral DNA that is reactivated by cancer and produces products that the immune system can see. The hope is that if we can train the immune system to spot these, we can selectively target cancer cells," he said.
Genes are pieces of DNA that contain instructions to produce proteins, which perform important functions in the cell or the body.
These instructions are transcribed into RNA "messenger" molecules before the proteins are produced. However, this transcription process can be influenced by DNA outside the gene, including endogenous retroviruses.
To study the effects of endogenous retroviruses on transcription, the team looked at patient samples from 31 different cancer types using a technology called "RNASeq" that can read short, random fragments of RNA.
However, as each "read" only delivers a small part of the sequence in an unknown order, it takes up to 50 million "reads" per sample to build a complete picture of transcriptional activity.
"Piecing together a full transcriptional profile is a monumental task," said Kassiotis.
"All you have is random fragments, so to piece them together you need to see where they overlap," he said.
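The overlap idea Kassiotis describes can be illustrated with a toy sketch. The article does not name the algorithms the team used, so the greedy merging below is only an assumed, simplified stand-in, and the function names and example fragments are hypothetical.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the two fragments that share the longest overlap."""
    fragments = list(reads)
    while len(fragments) > 1:
        best_size, best_i, best_j = 0, -1, -1
        # Compare every ordered pair of fragments to find the best overlap.
        for i, left in enumerate(fragments):
            for j, right in enumerate(fragments):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no remaining fragments overlap; stop merging
        merged = fragments[best_i] + fragments[best_j][best_size:]
        fragments = [f for k, f in enumerate(fragments) if k not in (best_i, best_j)]
        fragments.append(merged)
    return fragments


# Hypothetical short reads, given in scrambled order.
toy_reads = ["AACGGA", "ATGGCT", "GCTAAC"]
assembled = greedy_assemble(toy_reads)
print(assembled)  # ['ATGGCTAACGGA']
```

Comparing every pair of fragments, as this sketch does, becomes very slow as the number of reads grows, which gives a sense of why stitching together billions of reads is such a heavy computation.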
The team used RNA sequencing data from 768 patient samples, with almost 40 billion reads to piece together.
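As a rough consistency check, the figures quoted in the article can be multiplied out. The sketch assumes every sample reaches the stated maximum of 50 million reads, which gives an upper bound rather than the study's actual count.

```python
# Figures quoted in the article; the per-sample value is the stated maximum.
MAX_READS_PER_SAMPLE = 50_000_000
PATIENT_SAMPLES = 768

# Upper bound on the total reads if every sample hit the maximum.
total_reads = MAX_READS_PER_SAMPLE * PATIENT_SAMPLES
print(f"{total_reads / 1e9:.1f} billion reads")  # 38.4 billion reads
```

That upper bound of 38.4 billion is in line with the "almost 40 billion" reads reported.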
Even using sophisticated algorithms, a desktop computer would need to run constantly for 24 years to stitch this data together.
From the full transcriptional data, the team developed a catalogue of over 130,000 different RNA transcripts produced by endogenous retroviruses, more than half of which had not been previously discovered.
Of these, roughly 6,000 transcripts were found specifically in cancer samples and not in healthy tissue.
Many of these were specific to the type of cancer, with most individual cancers expressing high levels of a few hundred transcripts.
"We focused on melanoma-specific transcripts and applied an algorithm to predict which could code for material that is visible to the immune system," said Kassiotis.
"We found 14 candidate transcripts from 8 different regions of the genome that could produce unique cancer antigens," he said.
The researchers inspected mass spectrometry data to see which of these antigens were present in real patient samples.
This narrowed the candidates down to nine unique peptides that could be visible to the immune system.
"We hope this approach could form the basis of future cancer therapies, if we can vaccinate the immune system to recognise and attack cancer cells presenting these peptides," Kassiotis said.