| Kim Lab of Computational Evolutionary Biology | |
| Public Private Project1 Project2 Project3 Project4 Archive | ||
|
Home People Projects Publications Downloads Cluster Jobs Discussions Biology Department School of Arts and Sciences University of Pennsylvania
103I Lynch Laboratory 433 S University Avenue Philadelphia, PA 19104 USA off: (215) 746-5187 lab: (215) 898-8395 fax: (215) 898-8780 email: junhyong@sas.upenn.edu |
KirkupKimThe analysis of large-scale data sets is rapidly becoming a central problem in phylogenetic biology and molecular evolution. The algorithms for phylogenetic inference are extremely difficult (NP-hard) for many popular estimation methods. Large datasets force the use of heuristic algorithms that are sensitive to the structure of how the function value relates to the function arguments. The relationship between the objective function value and function argument is called the objective function landscape. By examining landscapes constructed from real genetic data for the Maximum Parsimony function we characterized the variables relevant to the success or failure of heuristic algorithms. As expected, the structure of the objective function landscape became more complex and more difficult with increasing numbers of taxa for all search graphs. There were increasing numbers of local optima, decreasing size of the attraction basins, and decreasing correlation between the depth of the optima and their attraction basins. In particular, we found Nearest-Neighbor Interchange search strategy to be inappropriate for heuristic searches. The trees at different local optima can be quite different. We found a topological difference of 40-50% among local optima differing by only 0.5-1.0% in tree length. This indicates a need for different heuristic strategies for large-scale phylogenies such as stochastic search strategies, divide-and-conquer, or direct incremental estimation.
Related pages: Publications | |