Identifying Arabidopsis cell-cycle transcription factor binding motifs


We applied our method to identify transcription factor binding motifs (TFBMs) of 1,081 cell-cycle regulated genes of A. thaliana, which were identified in a high-throughput expression profiling experiment by the Murray lab. After removing homologues, 1,030 genes remained in the final set. The FASTA file is available here. The promoter sequences were obtained from the TAIR database (http://www.arabidopsis.org/). We ran WordSpy to find motifs of length up to 10. Gene expression data from the Weigel lab were used to calculate motif G-scores.
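To make the "motifs of length up to 10" setting concrete, here is a minimal sketch that reads promoter sequences in FASTA format and counts every word of length 1 to 10. It is not the WordSpy algorithm and involves no statistical model or G-score; it only illustrates the raw word space such a search works over. The function names `read_fasta` and `count_words` and the two toy sequences are assumptions made for this illustration and do not come from the original data set.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def count_words(sequences: Iterable[str], max_len: int = 10) -> Dict[int, Counter]:
    """Count every word of length 1..max_len; words with non-ACGT symbols are skipped."""
    alphabet = set("ACGT")
    counts = {k: Counter() for k in range(1, max_len + 1)}
    for seq in sequences:
        for k in range(1, max_len + 1):
            for i in range(len(seq) - k + 1):
                word = seq[i:i + k]
                if set(word) <= alphabet:
                    counts[k][word] += 1
    return counts


# Hypothetical two-promoter input, for illustration only.
example = """>gene1
ACGTACGT
>gene2
ACGNAC
""".splitlines()

records = list(read_fasta(example))
word_counts = count_words(seq for _, seq in records)
```

On real data, `example` would be replaced by the lines of the promoter FASTA file, and the resulting counts would only be the starting point for a motif finder, which must still decide which words are over-represented.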

Results:

All putative motifs for Arabidopsis cell-cycle genes.
Putative motif clusters based on G-score ranking.

Putative motif clusters based on Zg-score ranking.

The dictionaries built by WordSpy: