Sensitive and accurate identification of protein-DNA binding events in ChIP-chip assays using higher order derivative analysis

Nucleic Acids Res. 2011 Mar;39(5):1656-65. doi: 10.1093/nar/gkq848. Epub 2010 Nov 4.

Abstract

Immuno-precipitation of protein-DNA complexes followed by microarray hybridization is a powerful and cost-effective technology for discovering protein-DNA binding events at the genome scale. It is still an unresolved challenge to comprehensively, accurately and sensitively extract binding event information from the produced data. We have developed a novel strategy composed of an information-preserving signal-smoothing procedure, higher order derivative analysis and application of the principle of maximum entropy to address this challenge. Importantly, our method does not require any input parameters to be specified by the user. Using genome-scale binding data of two Escherichia coli global transcription regulators for which a relatively large number of experimentally supported sites are known, we show that ∼90% of known sites were resolved to within four probes, or ∼88 bp. Over half of the sites were resolved to within two probes, or ∼38 bp. Furthermore, we demonstrate that our strategy delivers significant quantitative and qualitative performance gains over available methods. Such accurate and sensitive binding site resolution has important consequences for accurately reconstructing transcriptional regulatory networks, for motif discovery, for furthering our understanding of local and non-local factors in protein-DNA interactions and for extending the usefulness horizon of the ChIP-chip platform.
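As a rough illustration of the smoothing-plus-derivative idea named in the abstract, the sketch below marks candidate binding peaks in a one-dimensional probe-intensity profile. It is not the authors' algorithm: the Gaussian smoother, the `sigma` parameter, and the function names `gaussian_smooth` and `candidate_peaks` are assumptions made for illustration, and the maximum-entropy step and the parameter-free behavior described in the abstract are not reproduced. Only first and second derivatives are used here, as a simple stand-in for higher order derivative analysis.

```python
import numpy as np


def gaussian_smooth(signal, sigma=2.0):
    """Smooth a 1-D profile with a normalized Gaussian kernel.

    Assumption: a plain Gaussian filter stands in for the paper's
    information-preserving smoother, which the abstract does not specify.
    """
    radius = int(4 * sigma)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    # Reflect-pad so the smoothed output has the same length as the input.
    padded = np.pad(signal, radius, mode="reflect")
    return np.convolve(padded, kernel, mode="valid")


def candidate_peaks(signal, sigma=2.0):
    """Return probe indices of local maxima in the smoothed profile.

    A candidate is a point where the first derivative changes sign from
    positive to non-positive and the second derivative is negative.
    """
    smooth = gaussian_smooth(np.asarray(signal, dtype=float), sigma)
    d1 = np.gradient(smooth)  # first derivative (per probe)
    d2 = np.gradient(d1)      # second derivative (per probe)
    crossing = np.where((d1[:-1] > 0) & (d1[1:] <= 0))[0]
    # The maximum lies at one of the two probes around the sign change;
    # keep whichever has the larger smoothed value.
    idx = np.where(smooth[crossing + 1] > smooth[crossing], crossing + 1, crossing)
    return [int(i) for i in idx if d2[i] < 0]


# Hypothetical synthetic profile: two Gaussian-shaped enrichment peaks.
probes = np.arange(200)
profile = (np.exp(-0.5 * ((probes - 60) / 5.0) ** 2)
           + 0.6 * np.exp(-0.5 * ((probes - 140) / 5.0) ** 2))
peaks = candidate_peaks(profile)
```

On real ChIP-chip data the profile would be noisy and unevenly spaced along the genome, so the smoothing width and any threshold for accepting a peak would matter; the abstract states that the published method avoids such user-specified parameters.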

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Binding Sites
  • Chromatin Immunoprecipitation*
  • DNA-Binding Proteins / analysis*
  • Escherichia coli Proteins / analysis
  • Factor For Inversion Stimulation Protein / analysis
  • Leucine-Responsive Regulatory Protein / analysis
  • Oligonucleotide Array Sequence Analysis*
  • Sensitivity and Specificity

Substances

  • DNA-Binding Proteins
  • Escherichia coli Proteins
  • Factor For Inversion Stimulation Protein
  • Fis protein, E coli
  • Lrp protein, E coli
  • Leucine-Responsive Regulatory Protein