Entry Date:
December 15, 2006

Biophysics of Protein-DNA Recognition

Principal Investigator Leonid Mirny


We developed a theoretical framework and a computational model of how a DNA-binding protein finds its target site on a long DNA molecule. Using this model, we addressed several biological questions and showed that (i) to find its target quickly, a protein should alternate between sliding along DNA and diffusing through the cell; (ii) a protein needs to be flexible and to change its conformation rapidly while searching; (iii) rapid target search may constrain the organization of bacterial genomes, a prediction that we verified using bioinformatics and genomics; and (iv) fast binding to regulatory sites requires a certain chromatin organization, a prediction that is consistent with recent nucleosome maps.
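As an illustration of point (i), the trade-off between sliding and three-dimensional diffusion can be sketched with a toy calculation. This is a minimal sketch, not the model described above: it assumes the search proceeds in rounds of one slide plus one 3D excursion, that a slide of duration tau_1d scans about 2*sqrt(D_1d * tau_1d) base pairs, and that all parameter values (TAU_3D, GENOME_LEN_BP, D_1D) are hypothetical placeholders chosen only for illustration.

```python
import math


def search_time(tau_1d, tau_3d, genome_len_bp, d_1d):
    """Mean search time in a toy sliding-plus-3D-diffusion picture.

    Assumption: each round scans n = 2*sqrt(d_1d * tau_1d) base pairs,
    so roughly genome_len_bp / n rounds are needed to cover the DNA.
    """
    scanned_per_round = 2.0 * math.sqrt(d_1d * tau_1d)
    rounds = genome_len_bp / scanned_per_round
    return rounds * (tau_1d + tau_3d)


def best_sliding_time(tau_3d, genome_len_bp, d_1d, grid):
    """Return the sliding duration on the grid that minimizes the search time."""
    return min(grid, key=lambda t: search_time(t, tau_3d, genome_len_bp, d_1d))


# Hypothetical parameter values, for illustration only.
TAU_3D = 1e-3          # duration of one 3D excursion, seconds
GENOME_LEN_BP = 5e6    # DNA length, base pairs
D_1D = 1e5             # 1D diffusion coefficient, bp^2 per second

# Logarithmic grid of sliding durations spanning 0.01x to 100x TAU_3D.
grid = [TAU_3D * 10 ** (k / 100.0) for k in range(-200, 201)]
best = best_sliding_time(TAU_3D, GENOME_LEN_BP, D_1D, grid)

print(f"optimal sliding time / 3D excursion time = {best / TAU_3D:.2f}")
```

Under these assumptions the search time scales as (tau_1d + tau_3d) / sqrt(tau_1d), which is minimized when the protein spends equal time sliding and diffusing in 3D. Sliding much longer or much shorter than that slows the search, which is the qualitative content of point (i).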

Using a different approach, an information-theoretical analysis, we asked whether individual motifs of DNA-binding transcription factors contain enough information for precise DNA recognition. The analysis of known motifs gave two strikingly different answers for bacteria and eukaryotes. Motifs of bacterial proteins contain sufficient information to recognize cognate sites in their genomes, while motifs of eukaryotic transcription factors are much less specific and do not contain sufficient information to find a cognate site in a eukaryotic genome. This information deficiency has a profound biological implication: the widespread binding of eukaryotic transcription factors to thousands of spurious sites on accessible DNA. Our information-theoretical approach provides a new framework for interpreting functional genomics measurements and for understanding gene regulation in eukaryotes.
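The comparison between motif information and genome size can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the analysis used in the project: it takes the standard estimate that singling out one position among N requires about log2(N) bits, measures motif information as relative entropy against a uniform base composition, and uses two hypothetical, perfectly conserved motifs and rough genome sizes (BACTERIAL_GENOME_BP, EUKARYOTIC_GENOME_BP) chosen only for illustration.

```python
import math


def position_information(freqs, background=0.25):
    """Information (bits) of one motif position relative to a uniform background."""
    return sum(p * math.log2(p / background) for p in freqs if p > 0)


def motif_information(pwm):
    """Total information (bits) of a motif given as per-position A/C/G/T frequencies."""
    return sum(position_information(column) for column in pwm)


def required_information(genome_len_bp):
    """Bits needed to single out one position among genome_len_bp positions."""
    return math.log2(genome_len_bp)


def expected_spurious_sites(genome_len_bp, info_bits):
    """Rough expected number of chance matches to a motif carrying info_bits."""
    return genome_len_bp * 2.0 ** (-info_bits)


# Rough genome sizes, for illustration only.
BACTERIAL_GENOME_BP = 4.6e6
EUKARYOTIC_GENOME_BP = 3e9

# Hypothetical motifs: every position perfectly conserved (2 bits each).
long_motif = [[1.0, 0.0, 0.0, 0.0]] * 12   # 12 positions
short_motif = [[1.0, 0.0, 0.0, 0.0]] * 6   # 6 positions

for name, pwm, genome in [
    ("long motif, bacterial-size genome", long_motif, BACTERIAL_GENOME_BP),
    ("short motif, eukaryotic-size genome", short_motif, EUKARYOTIC_GENOME_BP),
]:
    info = motif_information(pwm)
    need = required_information(genome)
    hits = expected_spurious_sites(genome, info)
    print(f"{name}: {info:.1f} bits available, {need:.1f} bits needed, "
          f"about {hits:.3g} chance matches")
```

In this toy setting the long motif carries more bits than the bacterial-size genome requires, so chance matches are rare, while the short motif falls well below the requirement for the eukaryotic-size genome and is expected to match at many positions by chance.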