Evolution of gene regulation and developmental systems
A key property of living objects is that each one, whether a protein, a cell, or a whole organism, has an associated generating process: a decoding process whereby stored information is converted into a complex, functioning biological object. For example, generating a protein involves translation and folding; generating an organism involves a cascade of gene regulatory and cell biological processes. We are interested in such "bio-generative processes" and in understanding the general principles that govern them. Questions include how to infer the organizational structure of such generative processes from available data, how the generative processes evolve, and how the generative processes, and selection on them, affect the final form of the biological object. Currently, we have three related projects in this area.
Whole-genome gene expression regulation and evolution of the transcriptome
Recently, it has become possible to obtain transcriptional profiles of the entire genome. We are interested in asking how we can deduce the molecular interactions among genes from such transcriptional profiles and whether there are organizational regularities in the structure of those interactions. In particular, we are interested in large-scale properties such as the organization of connectivity (i.e., how many genes interact with a particular gene), the modularity of the interactions, and the dimensionality of gene expression (i.e., the degrees of freedom in coordinated gene expression). We are also interested in the evolution of the transcriptome. In this area, we are collaborating with Kevin White at Yale University to study the macro-evolution and mutational dynamics of the transcriptome in Drosophila species. We have generated comparative expression data for six lineages of Drosophila and for mutation accumulation lines. We are interested in the tempo and mode of evolution of gene expression, the mutational effects on gene expression, and changes in the functional importance of gene expression along a developmental trajectory.
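To make the notions of connectivity and dimensionality concrete, here is a minimal Python sketch. It is an illustration only, not the analysis used in our projects: it assumes a samples-by-genes expression matrix, uses thresholded correlation as a crude stand-in for molecular interaction, and counts the principal components needed to explain a chosen fraction of variance as a rough proxy for degrees of freedom. The function names, the thresholds, and the synthetic data are all hypothetical.

```python
import numpy as np

def effective_dimensionality(expr, var_threshold=0.9):
    """Count the principal components needed to explain `var_threshold`
    of the total variance. `expr` is a samples x genes array."""
    centered = expr - expr.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variance = singular_values ** 2
    cumulative = np.cumsum(variance) / variance.sum()
    return int(np.searchsorted(cumulative, var_threshold) + 1)

def connectivity(expr, corr_threshold=0.8):
    """Per-gene degree in a co-expression graph where two genes are linked
    if |Pearson correlation| >= corr_threshold (an assumed proxy for
    interaction, not a measured one)."""
    corr = np.corrcoef(expr, rowvar=False)  # genes are columns
    adjacency = np.abs(corr) >= corr_threshold
    np.fill_diagonal(adjacency, False)      # ignore self-correlation
    return adjacency.sum(axis=0)

# Synthetic demo: 50 samples, 20 genes arranged in two planted modules of
# 10 genes, each module driven by one latent factor plus small noise.
rng = np.random.default_rng(0)
factors = rng.standard_normal((50, 2))
loadings = np.zeros((2, 20))
loadings[0, :10] = 1.0
loadings[1, 10:] = 1.0
expr_demo = factors @ loadings + 0.05 * rng.standard_normal((50, 20))

print(effective_dimensionality(expr_demo))
print(connectivity(expr_demo))
```

In this toy setting the planted modules appear both as a low effective dimensionality and as a block pattern in the degree counts. Real expression data are far noisier, and correlation alone does not establish a molecular interaction.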
Dynamics of whole-genome gene expression
Transcriptional regulation is a dynamic process, and a transcriptional profile has a naturally associated temporal dimension. We have been developing computational tools and laboratory experiments to understand the dynamics of whole-genome gene expression regulation. We recently developed tools for visualizing and analyzing transcriptome time series under periodic events such as the cell cycle. With Paul Sniegowski at Penn, we are now generating time-series data from natural strains of yeast. We are pursuing the idea that comparing changes in the dynamics of gene expression between strains (so-called heterochrony) will allow us to efficiently infer the modular, co-regulated gene groups involved in generating the cell cycle.
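As a rough illustration of how a timing change might be quantified for a single gene, the sketch below estimates the circular lag between two periodic expression profiles using FFT-based cross-correlation. This is an assumed, simplified approach and not the tool described above: it presumes that both profiles are sampled at the same evenly spaced time points over exactly one period, and the function name and synthetic profiles are hypothetical.

```python
import numpy as np

def phase_shift(ref, other):
    """Return the circular lag (in samples) by which `other` is delayed
    relative to `ref`, chosen to maximize their circular cross-correlation."""
    ref = np.asarray(ref, dtype=float)
    other = np.asarray(other, dtype=float)
    ref = ref - ref.mean()
    other = other - other.mean()
    # Circular cross-correlation via the FFT: c[k] = sum_n other[n + k] * ref[n]
    xcorr = np.fft.ifft(np.fft.fft(other) * np.conj(np.fft.fft(ref))).real
    return int(np.argmax(xcorr))

# Synthetic demo: one gene sampled at 24 evenly spaced points per cycle,
# with the second strain's profile delayed by 3 samples.
t = np.arange(24)
ref_demo = np.sin(2 * np.pi * t / 24)
shifted_demo = np.roll(ref_demo, 3)

print(phase_shift(ref_demo, shifted_demo))
```

Under these assumptions, genes that show similar lags between strains would be candidates for membership in a common co-regulated module. Deciding whether a shared lag is statistically meaningful requires additional modeling that this sketch does not attempt.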
Protein structure evolution
Proteins are generated by a process of translation and statistical mechanical folding. Proteins also display macroevolutionary diversity in shape and function, raising the question of whether such diversity is due to chance, molecular function (e.g., a particular catalytic activity), or constraints in the folding process. Here we are interested in the contribution of the folding process to the final form. In particular, we are statistically characterizing the regularity of form, which we define as the presence of self-similar substructures, and relating this regularity to the prevalence of particular structures in nature. We have found statistical evidence that common protein structures, so-called superfolds, are more self-similar. We hypothesize that this self-similarity is due to selection for efficient folding. We are currently attempting to address whether more self-similar structures are more efficient and robust in their folding properties.