Kim Lab of Computational Evolutionary Biology
Biology Department
School of Arts and Sciences
University of Pennsylvania
103I Lynch Laboratory
433 S University Avenue
Philadelphia, PA 19104 USA

off: (215) 746-5187
lab: (215) 898-8395
fax: (215) 898-8780

email: junhyong@sas.upenn.edu

Kim01

The completion of the Drosophila melanogaster genome marks another significant milestone in the growth of sequence information. But it also contributes to the ever-widening gap between sequence information and biological knowledge. One important approach to reducing this gap is theoretical inference through computational technologies. A multitude of computer programs have been designed to annotate genomic sequences with biologically relevant information. Here, I suggest that all of these methods share a common structure in which the sequence fragments are "coordinatized" by some description method, such as Hidden Markov Models. The key to the algorithms lies in constructing the most efficient set of coordinates, one that allows extrapolation and interpolation from existing knowledge. Extrapolation and interpolation are efficient when the sequence fragments acquire a natural geometrical structure in the coordinatized description. Finding such a coordinate frame is an inductive problem with no algorithmic solution. The greater part of the problem of genomic annotation therefore lies in biological modeling of the data rather than in algorithmic improvements.
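
As a rough illustration of what "coordinatizing" sequence fragments can mean, the sketch below maps each fragment to a vector of k-mer frequencies and annotates an unlabeled fragment with the label of its nearest labeled neighbor. This is not the method of the paper: k-mer frequencies are substituted here for a Hidden Markov Model description purely to keep the example short, and the function names, the toy sequences, and the labels are all hypothetical.

```python
from itertools import product
from math import sqrt


def kmer_coordinates(seq, k=2):
    """Map a DNA fragment to its k-mer frequency vector (assumed coordinate frame)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {kmer: 0 for kmer in kmers}
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing symbols outside ACGT
            counts[kmer] += 1
            total += 1
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


def distance(u, v):
    """Euclidean distance between two coordinate vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def annotate(fragment, labeled, k=2):
    """Interpolate from existing knowledge: copy the label of the nearest fragment."""
    x = kmer_coordinates(fragment, k)
    nearest = min(labeled, key=lambda item: distance(x, kmer_coordinates(item[0], k)))
    return nearest[1]


# Toy "existing knowledge": hypothetical fragments with hypothetical labels.
labeled = [
    ("GCGCGCGGCCGCGCGG", "GC-rich"),
    ("ATATATTTAAATATAT", "AT-rich"),
]

print(annotate("GGCCGCGCGCCG", labeled))
```

How well nearest-neighbor annotation works depends entirely on whether biologically similar fragments end up close together in the chosen coordinates. Choosing k, or replacing k-mer frequencies with a different description altogether, is the modeling decision that the abstract argues matters more than the search algorithm.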