Identification of local clusters of mutation hotspots in cancer-related genes and their biological relevance

  • Je Keun Rhee
  • Jinseon Yoo
  • Kyu Ryung Kim
  • Jeeyoon Kim
  • Yong Jae Lee
  • Byoung Chul Cho
  • Tae Min Kim

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

Mutation hotspots are either solitary amino acid residues or stretches of amino acids that show elevated mutation frequency in cancer-related genes, but their prevalence and biological relevance are not completely understood. Here, we developed a Smith-Waterman algorithm-based mutation hotspot discovery method, MutClustSW, to identify mutation hotspots consisting of either single or clustered amino acid residues. We identified 181 missense mutation hotspots from the COSMIC and TCGA mutation databases. In addition to 77 single amino acid residue hotspots (42.5%), including well-known mutation hotspots such as IDH1 p.R132 and BRAF p.V600, we identified 104 mutation hotspots (57.5%) as clusters or stretches of multiple amino acids; the hotspots on MUC2, EPPK1, KMT2C, and TP53 were larger than 50 amino acids. Twelve of 27 nonsense mutation hotspots (44.4%) were observed in four cancer-related genes, TP53, ARID1A, CDKN2A, and PTEN, suggesting that truncating mutations on some tumor suppressor genes are not randomly distributed as previously assumed. We also show that hotspot mutations have higher mutation allele frequency than non-hotspot mutations, and that the hotspot information can be used to prioritize cancer drivers. Together, the proposed algorithm and the mutation hotspot information can serve as valuable resources in the selection of functional driver mutations and associated genes.
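
The abstract does not spell out MutClustSW's scoring scheme, so the following is only a minimal sketch of the general idea behind Smith-Waterman-style local scoring applied to mutation counts, not the authors' implementation. It assumes a hypothetical list of per-residue mutation counts and a hypothetical fixed `background` penalty per residue. The running score is reset to zero whenever it becomes non-positive, which is the one-dimensional form of the Smith-Waterman recurrence, and the best-scoring contiguous segment is reported. Under these assumptions the same scan can return either a single residue or a stretch of residues.

```python
from typing import List, Tuple


def best_local_segment(counts: List[int], background: float) -> Tuple[int, int, float]:
    """Return (start, end, score) of the highest-scoring contiguous segment.

    Positions are 0-based and inclusive. Each residue contributes
    (count - background); both the scoring and the background value are
    illustrative assumptions, not the published method's parameters.
    Returns (-1, -1, 0.0) when no residue scores above zero.
    """
    best_score = 0.0
    best_start = best_end = -1
    running = 0.0
    start = 0
    for i, count in enumerate(counts):
        # Smith-Waterman-style reset: never carry a non-positive score forward.
        if running <= 0.0:
            running = 0.0
            start = i
        running += count - background
        if running > best_score:
            best_score = running
            best_start, best_end = start, i
    return best_start, best_end, best_score


# Hypothetical counts along a short protein: two adjacent residues are enriched.
clustered = [0, 1, 0, 12, 9, 0, 1, 0]
print(best_local_segment(clustered, background=2.0))

# Hypothetical counts with one dominant residue, as in a single-residue hotspot.
solitary = [0, 0, 30, 0, 0]
print(best_local_segment(solitary, background=1.0))
```

A real analysis would also need a significance test for each segment and a way to report more than one hotspot per gene; both are left out of this sketch.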

Original language: English
Article number: 3370687
Pages (from-to): 1656-1662
Number of pages: 7
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 16
Issue number: 5
State: Published - Sep 2019

Bibliographical note

Publisher Copyright:
© 2019 Institute of Electrical and Electronics Engineers Inc. All rights reserved.

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Bioinformatics
  • clustering methods
  • computational biology
  • genetics
  • oncology
