Kim Lab of Computational Evolutionary Biology
Biology Department, School of Arts and Sciences, University of Pennsylvania
103I Lynch Laboratory, 433 S University Avenue, Philadelphia, PA 19104, USA
Office: (215) 746-5187; Lab: (215) 898-8395; Fax: (215) 898-8780; Email: junhyong@sas.upenn.edu
CIPResSimulationData
Sequence simulation based on a complex evolutionary model is part of the CIPRes project. The simulated RNA data can be downloaded here.
The simulations all start from the same root RNA sequence with secondary structure but run on trees with different numbers of taxa. These trees are randomly sampled subtrees of a 1-million-taxon ultrametric binary tree, which resembles real phylogenetic trees and was created by Tracy Heath of the Hillis/Bull lab at the University of Texas at Austin.
Eight sets of subtrees were used, with the number of taxa ranging from 128 to 16384. These subtrees are also ultrametric binary trees, and two trees with the same number of taxa are not necessarily identical, since each was randomly sampled. Ultrametric here refers to the absolute time model of branching; the actual expected number of changes in each lineage is a function of the molecule and is not clock-like.
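The statement that a sampled subtree is still an ultrametric binary tree can be checked with a small sketch. Everything in it is illustrative: the `Node` class, the toy tree, and its labels and branch lengths are hypothetical, and pruning to the subtree induced by a random set of tips is only an assumption about how the subtrees might have been drawn, not the procedure actually used for this dataset.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    label: str
    length: float = 0.0  # length of the branch leading to this node
    children: List["Node"] = field(default_factory=list)


def tip_depths(node: Node, depth: float = 0.0) -> Dict[str, float]:
    """Map each tip label to its path length measured from `node`."""
    if not node.children:
        return {node.label: depth}
    depths: Dict[str, float] = {}
    for child in node.children:
        depths.update(tip_depths(child, depth + child.length))
    return depths


def is_ultrametric(root: Node, tol: float = 1e-9) -> bool:
    """True if every tip is equally far from the root (in time units)."""
    depths = tip_depths(root).values()
    return max(depths) - min(depths) <= tol


def is_binary(node: Node) -> bool:
    """True if every internal node has exactly two children."""
    if not node.children:
        return True
    return len(node.children) == 2 and all(is_binary(c) for c in node.children)


def induced_subtree(node: Node, keep: Set[str]) -> Optional[Node]:
    """Prune to the tips in `keep`; merge single-child nodes, summing lengths."""
    if not node.children:
        return Node(node.label, node.length) if node.label in keep else None
    kept = [s for s in (induced_subtree(c, keep) for c in node.children) if s]
    if not kept:
        return None
    if len(kept) == 1:
        # The surviving child absorbs this node's branch and keeps its own label.
        only = kept[0]
        return Node(only.label, only.length + node.length, only.children)
    return Node(node.label, node.length, kept)


# Hypothetical 5-tip ultrametric tree; every tip is 3.0 time units from the root.
full_tree = Node("_I0", 0.0, [
    Node("_I1", 1.0, [Node("A", 2.0), Node("B", 2.0)]),
    Node("_I2", 2.0, [
        Node("C", 1.0),
        Node("_I3", 0.5, [Node("D", 0.5), Node("E", 0.5)]),
    ]),
])

rng = random.Random(0)  # fixed seed so the sketch is repeatable
kept_tips = set(rng.sample(sorted(tip_depths(full_tree)), 3))
subtree = induced_subtree(full_tree, kept_tips)
print(sorted(tip_depths(subtree)), is_ultrametric(subtree), is_binary(subtree))
```

Because merged nodes keep their original labels, some internal labels of the full tree simply disappear from the subtree, which is one way gaps in the node numbering can arise.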
Simulation parameters were tuned so that the simulated sequences resemble real small subunit rRNA (ssu rRNA) sequences in terms of sequence identity, number of indels, the ratio of substitutions to indels, etc.
The dataset is presented in NEXUS format with three blocks. The tree block records the tree used for the simulation. The character block contains the aligned ancestral and extant RNA sequences. The labels of ancestral sequences start with "_I" (these trees use the original node labels from the 1-million-taxon tree, so the labels in a subtree are not consecutive). In the crimson block, the aligned secondary structures of the RNA molecules are listed in Vienna format. An example is given below:
#NEXUS
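Once the sequences and structures have been read out of a file, two small helpers cover the conventions described above: separating ancestral from extant sequences by the "_I" label prefix, and turning a Vienna (dot-bracket) string into base-pair positions. This is a minimal sketch, not a NEXUS parser. The function names and the toy labels, sequences and structures are hypothetical, and treating every character other than "(" and ")" (for example "." or an alignment gap "-") as unpaired is an assumption about the aligned structure lines.

```python
from typing import Dict, List, Tuple

ANCESTRAL_PREFIX = "_I"  # ancestral sequence labels start with this prefix


def split_by_origin(seqs: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split a label -> sequence mapping into (ancestral, extant)."""
    ancestral = {k: v for k, v in seqs.items() if k.startswith(ANCESTRAL_PREFIX)}
    extant = {k: v for k, v in seqs.items() if not k.startswith(ANCESTRAL_PREFIX)}
    return ancestral, extant


def vienna_pairs(structure: str) -> List[Tuple[int, int]]:
    """Return sorted 0-based (i, j) column pairs from a dot-bracket string."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at column {i}")
            pairs.append((stack.pop(), i))
        # Any other character is treated as unpaired (assumption).
    if stack:
        raise ValueError(f"unmatched '(' at column {stack[-1]}")
    return sorted(pairs)


# Hypothetical aligned records, for illustration only.
toy_seqs = {"_I17": "GGG-AACCC", "taxonA": "GGGAAACCC"}
ancestral, extant = split_by_origin(toy_seqs)
print(list(ancestral), list(extant))
print(vienna_pairs("(((...)))"))
```

The pair positions are alignment columns, so they can be compared directly across aligned structures without removing gaps first.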
The CIPRes simulation data are managed by Sheng Guo. Email me with any concerns.