Jinjiang: Jinjiang's WebWatcher on Biology (4)

Protein-DNA binding databases (TF-DNA & ligand-DNA; excluding histone-DNA):

A new curated collection of yeast transcription factor DNA binding specificity data from the Bulyk Lab.

This is just the paper; the web interface for the database is not available yet (28.12.2011).

FlyTF: Drosophila transcription factor database

FlyTF currently contains 129 proteins for which PWMs are available.

TRANSFAC - a commercial database of TFs, their binding sites, regulated genes and PWMs.

TRANSFAC consists of free and paid sections. The binding sites it provides are experimentally verified. Human TF weight matrices can be viewed through the web interface of the UCSC Genome Browser.

JASPAR: transcription factor binding profile database

The JASPAR CORE database contains a curated, non-redundant set of profiles, derived from published collections of experimentally defined transcription factor binding sites for eukaryotes. The main difference from TRANSFAC is that the data are open access.

KDBI: Kinetic Data of Biomolecular Interactions

KDBI is a collection of experimentally determined kinetic data for protein-protein, protein-RNA, protein-DNA, protein-ligand, RNA-ligand and DNA-ligand binding events described in the literature.

ProNIT - a database of experimental thermodynamic protein-DNA interaction data.

ProNIT currently contains more than 4900 entries. Each entry has the protein and nucleic acid information, experimental conditions and the following binding thermodynamic data: dissociation constant Kd, energies, stoichiometry of binding and activity (Km and kcat).
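The dissociation constant and the binding free energy listed in such entries are related by the standard thermodynamic identity ΔG° = RT ln(Kd / c°), with c° = 1 M. Below is a minimal Python sketch of that conversion. The function names and the default temperature of 298.15 K are illustrative assumptions, not part of ProNIT.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.98720425864083e-3


def kd_to_delta_g(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from Kd given in mol/L.

    Uses dG = R*T*ln(Kd / 1 M); tighter binding (smaller Kd) gives a
    more negative value. The default temperature is an assumption.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def delta_g_to_kd(delta_g_kcal, temperature_k=298.15):
    """Inverse conversion: Kd in mol/L from a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # A 1 nM binder, a typical order of magnitude for a specific TF site.
    print(round(kd_to_delta_g(1e-9), 2))
```

With these assumptions, a Kd of 1 nM corresponds to roughly -12.3 kcal/mol at room temperature.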

UniPROBE - an online database of protein binding microarray data on protein-DNA interactions.

UniPROBE contains data on the preferences of proteins for all possible sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In total, the database currently hosts DNA binding data for 391 nonredundant proteins (individual proteins or in some cases heterodimers) from a diverse collection of organisms.
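To make the 'k-mer' and PWM vocabulary concrete, here is a small Python sketch that enumerates all DNA words of length k, builds a log-odds PWM from a few aligned sites, and ranks the words by PWM score. The example sites are made up, and the pseudocount and uniform background are assumptions. This is generic PWM scoring, not the procedure UniPROBE uses to derive its matrices from microarray data.

```python
import itertools
import math

ALPHABET = "ACGT"


def all_kmers(k):
    """Yield every DNA word of length k (4**k words in total)."""
    for letters in itertools.product(ALPHABET, repeat=k):
        yield "".join(letters)


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds PWM from equal-length aligned sites.

    Pseudocount and uniform background are illustrative assumptions.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2(counts[base] / total / background) for base in ALPHABET}
        )
    return pwm


def score(pwm, word):
    """Sum of per-position log-odds for a word as long as the PWM."""
    return sum(column[base] for column, base in zip(pwm, word))


# Hypothetical aligned binding sites, for illustration only.
sites = ["TGACGT", "TGACGT", "TGACGC", "TGATGT"]
pwm = build_pwm(sites)
ranked = sorted(all_kmers(6), key=lambda word: score(pwm, word), reverse=True)

if __name__ == "__main__":
    for word in ranked[:3]:
        print(word, round(score(pwm, word), 2))
```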

Drosophila transcription factor weight matrices collected by Daniel Pollard

This is a personal collection. It currently contains ~50 matrices (last checked: 06.10.2010).

BindingDB - a public database of measured protein-small ligand binding affinities.

DPInteract: DNA-protein interactions for E. coli (last updated in 1998).

Calculating TF affinity (binding constant) from weight matrices and directly from experiments:*

TRAP - TRanscription factor Affinity Prediction

TRAP calculates binding affinity based on the matrix description of a given TF and a set of DNA sequences to be annotated (input). It requires the specification of two biophysically-motivated parameters. The freely available program code is written in C. Further details are available in the paper by Roider et al., 2007.
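The general idea of turning a matrix plus a sequence into an affinity can be sketched in Python as follows. This is not the TRAP code (which is written in C) but a simplified occupancy model in the same spirit: each site gets a mismatch energy relative to the consensus, and the expected number of bound factors is summed over all sites on both strands. The parameter names `r0` and `lam`, their default values, and the pseudocount are assumptions made for this illustration. See Roider et al., 2007 for the actual model.

```python
import math

ALPHABET = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def mismatch_energies(count_matrix, pseudocount=1.0):
    """Per-position energy of each base relative to the most frequent base.

    count_matrix is a list of dicts mapping base -> count, one per position.
    """
    energies = []
    for column in count_matrix:
        smoothed = {base: column.get(base, 0) + pseudocount for base in ALPHABET}
        best = max(smoothed.values())
        energies.append({base: math.log(best / smoothed[base]) for base in ALPHABET})
    return energies


def site_occupancy(energies, site, r0, lam):
    """Binding probability of one site: w / (1 + w), w = r0 * exp(-E / lam)."""
    energy = sum(column[base] for column, base in zip(energies, site)) / lam
    weight = r0 * math.exp(-energy)
    return weight / (1.0 + weight)


def expected_bound(energies, sequence, r0=1.0, lam=1.0):
    """Expected number of bound factors, summed over both strands.

    Assumes an upper-case sequence containing only A, C, G and T.
    """
    width = len(energies)
    total = 0.0
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        reverse_site = site.translate(COMPLEMENT)[::-1]
        total += site_occupancy(energies, site, r0, lam)
        total += site_occupancy(energies, reverse_site, r0, lam)
    return total


if __name__ == "__main__":
    # Hypothetical 3-bp count matrix with consensus ACG.
    energies = mismatch_energies([{"A": 10}, {"C": 10}, {"G": 10}])
    print(expected_bound(energies, "TTACGTT"))
    print(expected_bound(energies, "TTTTTTT"))
```

Because every site contributes a probability rather than a yes/no hit, weak sites also add to the total, so no score cutoff is needed.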

STAP - Sequence To Affinity Prediction

STAP uses a biophysical model to analyze transcription factor (TF)-DNA binding data, such as ChIP-chip or ChIP-seq data. The program assumes that the measured affinity of a sequence to a TF (TF_exp) in some ChIP-chip or ChIP-seq experiment is determined by: 1) the number and strength of binding sites of TF_exp in this sequence; 2) the presence of other sites that may interact cooperatively with the sites of TF_exp in the neighborhood. Specifically, it takes as input a set of DNA sequences, their binding affinities to some TF as measured by experiments (TF_exp), and the position weight matrices (PWMs) of a set of TFs, including TF_exp. It then learns the relevant parameters of the biophysical model, including those of TF-DNA interaction and those of TF-TF cooperative interactions. **To be tested.

MatrixREDUCE - Predicting TF binding through alignment-free and affinity-based analysis of orthologous promoter sequences

The input to MatrixREDUCE is a sequence file in FASTA format and an expression data file in tab-delimited text format (missing values are allowed). Output data include PSAMs in numeric and graphical format, parameters of the fitted model, and an HTML summary page.

BayesPI - estimation of TF binding energy matrices, binding affinity and chemical potential from ChIP-Chip experiments

BayesPI integrates Bayesian model regularization with biophysical modeling of protein-DNA interactions and nucleosome positioning to study protein-DNA interactions, using a high-throughput dataset. **To be tested.

Creating PWMs of transcription factors using 3D structure-based computation of protein-DNA free binding energies

A scoring function calibrated against crystallographic data on protein-DNA contacts can recover PWMs, sometimes outperforming experimental PWMs. **To be tested.

*Section under construction. Check again later, and feel free to submit your links and comments.

General-purpose numbers relevant for gene regulation:

BioNumbers - the database of key numbers in molecular and cell biology

Jinjiang

Friday, 24 February 2012

Jinjiang's WebWatcher on Biology (4) - Protein-DNA databases
