Abstract

Motivation

Chromosomal patterning of gene expression in cancer can arise from aneuploidy, genome disorganization, or abnormal DNA methylation. To map such patterns, we introduce a weighted univariate clustering algorithm that guarantees linear runtime, optimality, and reproducibility.

Results

We present the chromosome clustering method, establish its optimality and runtime, and evaluate its performance. It uses dynamic programming enhanced with an algorithm that reduces the search space in place to decrease runtime overhead. Using the method, we delineated outstanding genomic zones in 17 human cancer types. We identified strong continuity in dysregulation polarity (dominance by either up- or down-regulated genes in a zone) along chromosomes in all cancer types. We found significantly polarized dysregulation zones specific to individual cancer types, which offer potential diagnostic biomarkers. A total of 109 loci with conserved dysregulation polarity across cancer types, not previously reported, give insights into pan-cancer mechanisms. Efficient chromosomal clustering opens a window to characterize molecular patterns in cancer genomes and beyond.
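To make the dynamic-programming idea concrete, the following is a minimal sketch of optimal weighted clustering of one-dimensional data. It is not the published algorithm: it uses a plain quadratic-time recurrence without the in-place search-space reduction that yields the linear runtime, and it assumes the objective is the weighted within-cluster sum of squares. The function name `weighted_kmeans_1d` and its signature are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def weighted_kmeans_1d(
    x: Sequence[float], w: Sequence[float], k: int
) -> Tuple[List[int], float]:
    """Illustrative O(k * n^2) dynamic program (assumed objective:
    weighted within-cluster sum of squares). Returns cluster labels in
    the original input order and the optimal objective value."""
    n = len(x)
    if len(w) != n or not 1 <= k <= n:
        raise ValueError("need len(w) == len(x) and 1 <= k <= len(x)")

    # Sort by position; in 1-D, optimal clusters are contiguous runs.
    order = sorted(range(n), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ws = [w[i] for i in order]

    # Prefix sums of weight, weight*x, and weight*x^2 give O(1) segment costs.
    W = [0.0] * (n + 1)
    S = [0.0] * (n + 1)
    Q = [0.0] * (n + 1)
    for i in range(n):
        W[i + 1] = W[i] + ws[i]
        S[i + 1] = S[i] + ws[i] * xs[i]
        Q[i + 1] = Q[i] + ws[i] * xs[i] * xs[i]

    def cost(j: int, i: int) -> float:
        """Weighted sum of squares of sorted points j .. i-1 about their mean."""
        wt = W[i] - W[j]
        if wt <= 0:
            return 0.0
        s = S[i] - S[j]
        return max(Q[i] - Q[j] - s * s / wt, 0.0)

    inf = float("inf")
    # D[m][i]: best cost of splitting the first i sorted points into m clusters.
    D = [[inf] * (n + 1) for _ in range(k + 1)]
    # B[m][i]: start index of the last cluster in that best split.
    B = [[0] * (n + 1) for _ in range(k + 1)]
    D[0][0] = 0.0
    for m in range(1, k + 1):
        for i in range(m, n + 1):
            for j in range(m - 1, i):
                c = D[m - 1][j] + cost(j, i)
                if c < D[m][i]:
                    D[m][i] = c
                    B[m][i] = j

    # Backtrack to recover cluster boundaries, then undo the sort.
    labels_sorted = [0] * n
    i = n
    for m in range(k, 0, -1):
        j = B[m][i]
        for t in range(j, i):
            labels_sorted[t] = m - 1
        i = j
    labels = [0] * n
    for pos, idx in enumerate(order):
        labels[idx] = labels_sorted[pos]
    return labels, D[k][n]
```

Because every candidate split is examined, the result is deterministic and globally optimal for the assumed objective, which is the property the abstract refers to as optimality and reproducibility. Weights change the answer: for positions `[0, 4, 10]` with `k = 2`, unit weights group the first two points, whereas weights `[10, 1, 0.1]` isolate the heavily weighted first point.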

Availability

Weighted univariate clustering algorithms are implemented within the R package ‘Ckmeans.1d.dp’ (4.0.0 or above), freely available at https://cran.r-project.org/package=Ckmeans.1d.dp

Supplementary information

Supplementary data are available at Bioinformatics online.


Author notes

Current address: Fred Hutchinson Cancer Research Center, Seattle, WA 98196, USA.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Associate Editor: Russell Schwartz
