In:
Bioinformatics, Oxford University Press (OUP), Vol. 38, No. 22 (2022-11-15), p. 5121-5123
Abstract:
Several high-throughput protein–DNA binding methods currently available produce highly reproducible measurements of binding affinity at the level of the k-mer. However, understanding where a k-mer is positioned along a binding site sequence depends on alignment. Here, we present Top-Down Crawl (TDC), an ultra-rapid tool designed for the alignment of k-mer level data in a rank-dependent and position weight matrix (PWM)-independent manner. As the framework only depends on the rank of the input, the method can accept input from many types of experiments (protein binding microarray, SELEX-seq, SMiLE-seq, etc.) without the need for specialized parameterization. Measuring the performance of the alignment using multiple linear regression with 5-fold cross-validation, we find TDC to perform as well as or better than computationally expensive PWM-based methods. Availability and implementation: TDC can be run online at https://topdowncrawl.usc.edu or locally as a Python package available through pip at https://pypi.org/project/TopDownCrawl. Supplementary information: Supplementary data are available at Bioinformatics online.
Type of Medium:
Online Resource
ISSN:
1367-4803, 1367-4811
DOI:
10.1093/bioinformatics/btac653
Language:
English
Publisher:
Oxford University Press (OUP)
Publication Date:
2022
ZDB-ID:
1468345-3
SSG:
12