edu.upenn.crimson
Class Tree

java.lang.Object
  extended by edu.upenn.crimson.Tree
All Implemented Interfaces:
java.lang.Cloneable

public class Tree
extends java.lang.Object
implements java.lang.Cloneable

A rooted tree that is composed of Species objects. Species must be added to the tree using addSpecies() and similarly removed from the tree with removeSpecies().

Version:
$Id: Tree.java,v 1.61 2009/07/21 01:19:12 fisher Exp $
Notes:
Should encapulate Species.add/removeChildren() in order to maintain a HshSet of all leaves/species. This would optimize numSpecies/numLeaves and would allow for the temporary storage of min/max values (ie level and temporal depth). These values could be stored when first computed and zeroed out when a species is added/removed from the tree. Similarly, min/max Stem Lengths could be pre-computed. Would need to also encapsulate setting of stem length and parent.

Field Summary
private  boolean binary
          True when all inner nodes have exactly 2 children.
(package private) static boolean DEBUG_OUTPUT
          If true, then whenever a tree is sampled, the list of species selected will be output to the file 'species.list'.
private  java.lang.String id
          This is a unique name for the Tree object.
private  int maxLevel
          The maximum number of species between the leaves and root.
private  double maxStemLength
          The maximum distance between the species and their parents.
private  double maxTempDepth
          The maximum distance between the leaves and root.
private  int minLevel
          The minimum number of species between the leaves and root.
private  double minStemLength
          The minimum distance between the species and their parents.
private  double minTempDepth
          The minimum distance between the leaves and root.
private  java.lang.String modelID
          Model specific to this Tree object.
private  java.lang.String notes
          Notes specific to this Tree object.
private  int numLeaves
          The number of leaves is stored, in case the tree has not been built from the newick string.
private  int numSpecies
          The number of species is equivalent to speciesList.size().
private  java.util.HashSet partitions
          This contains the partitions loaded for this tree.
private  Species root
          The root Species of the tree.
private  java.util.ArrayList speciesList
          Array containing references to all species in the tree.
private  boolean ultrametric
          True when all leaves have the same temporal depth: (minTempDepth = maxTempDepth)
 
Constructor Summary
Tree(java.lang.String id)
           
Tree(java.lang.String id, boolean addToPool)
           
Tree(java.lang.String id, Species root, Model model, java.lang.String notes, int numSpecies, int numLeaves, boolean binary, boolean ultrametric, int minLevel, int maxLevel, double minTempDepth, double maxTempDepth, double minStemLength, double maxStemLength)
          Trees should not be instantiated by the user, using this routine.
 
Method Summary
private  java.util.Queue addChildrenToQueue(java.util.Queue queue, Species[] children)
          This will add the children to the queue.
 void addPartition(Partition partition)
           
 void addSpecies(Species species)
           
 boolean buildTree()
          This will build the tree from the nexick string.
 void clearStructure()
          This will release the tree structure from memory.
 java.lang.Object clone()
          This will duplicate the tree and all species.
 void collapse()
          This will remove all internal species that have only one child, linking the child to species' parent and adjusting the child's stem lengths accordingly.
 boolean computeBinary()
          Returns true if the tree is binary.
 void computeStats()
          This will compute the various tree statistics.
 java.lang.String getID()
          Get the ID.
static Species getLCA(Species s1, Species s2)
          Returns the LCA for the tree, assuming this species is the root of the tree.
 java.util.ArrayList getLeafCounts(java.util.ArrayList subroots)
          This will return a list of leaf counts for each subtree specified by the subroots.
 java.util.ArrayList getLeaves()
          Returns the set of all leaves in the tree.
 int getLength()
          This is the sum of the length of all partitions.
 int getMaxLevel()
           
 double getMaxStemLength()
           
 double getMaxTempDepth()
           
 int getMinLevel()
           
 double getMinStemLength()
           
 double getMinTempDepth()
           
 java.lang.String getModelID()
          Returns the ID for the model underlying this tree.
 java.lang.String getNewick()
          Returns the newick string for this tree, as stored in the database.
 java.lang.String getNotes()
           
 int getNumLeaves()
          This will return the number of leaves in the tree.
 int getNumPartitions()
          The number of partitions loaded for this tree.
 int getNumSpecies()
          This will return the number of species in the tree.
 java.util.HashSet getPartitions()
          This will return the set of partitions.
 java.util.ArrayList getPartitionsByModel(java.lang.String modelID)
          This will return an ArrayList containing all of the partitions in the current tree that are based on the specified model.
 Species getRoot()
           
 Species getRootLCA(int num)
          This will return the LCA for a random sampling of leaves from the tree.
 java.util.ArrayList getSpeciesByID(java.util.ArrayList speciesIn)
          This will take a list of species IDs and return a list of species objects.
 Species getSpeciesByID(java.lang.String id)
          This steps through every species in the tree and compares the species IDs until it finds the species requested.
 java.util.ArrayList getSpeciesList()
          Returns the list of all species in the tree.
 Tree getSubtree(Species subroot)
          Returns the subtree given the specified root.
 java.util.ArrayList getTreesByLevel(int level)
          Returns the set of root species for any subtrees that are below a specified level.
 java.util.ArrayList getTreesByTempDepth(double tempDepth)
          Returns the set of root species for any subtrees that are below a specified level.
 boolean isBinary()
           
 boolean isBuilt()
          Returns true if the tree has been built from the newick string.
 boolean isUltrametric()
           
 Tree manualSampleByID(java.util.ArrayList oldLeaves)
          This will return a subtree that contains the specified leaves.
 java.lang.String printTree()
          This will print out the tree, one species per line, indenting each level.
 Tree randomSample(int num)
          This will randomly sample the tree, returning a subtree that contains 'num' leaves.
 void removePartition(Partition partition)
           
 void removeSpecies(Species species)
           
 void setModelID(Model model)
          Sets the ID for the model underlying this tree.
 void setNewick(java.lang.String newick)
          This will update the newick string stored in the database.
 void setNotes(java.lang.String notes)
          Updates the local notes field and the TREES table.
 void setRoot(Species root)
          The root's label will be set to "_Root" if there is currently no label.
 java.lang.String toNewick()
          This will print out this species and all children as a newick formatted tree.
 java.lang.String toNewick(boolean incSequence)
          This will print out this species and all children as a newick formatted tree.
 java.lang.String toNewick(boolean incSequence, boolean rooted)
          This will output this species and all children as a newick formatted tree.
 java.lang.String toString()
          This will print lots of specs about the tree.
 Tree uniformSampleByLevel(int level, int num)
          Returns the random, uniform sampling of species from each subtree demarkated by the level value.
 Tree uniformSampleByTempDepth(double tempDepth, int num)
          Returns the random, uniform sampling of species from each subtree demarkated by the temporal depth value.
 Tree weightedSampleByLevel(int level, int num)
          Returns the random, weighted sampling of species from each subtree demarkated by the level value.
 Tree weightedSampleByTempDepth(double tempDepth, int num)
          Returns the random, weighted sampling of species from each subtree demarkated by the temporal depth value.
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

DEBUG_OUTPUT

static final boolean DEBUG_OUTPUT
If true, then whenever a tree is sampled, the list of species selected will be output to the file 'species.list'.

See Also:
Constant Field Values

id

private java.lang.String id
This is a unique name for the Tree object. We don't allow the ID to change after a tree is created.


modelID

private java.lang.String modelID
Model specific to this Tree object.


notes

private java.lang.String notes
Notes specific to this Tree object.


root

private Species root
The root Species of the tree.


speciesList

private java.util.ArrayList speciesList
Array containing references to all species in the tree.


numSpecies

private int numSpecies
The number of species is equivalent to speciesList.size(). It is stored separately, in case the tree has not been built from the newick string.


numLeaves

private int numLeaves
The number of leaves is stored, in case the tree has not been built from the newick string.


binary

private boolean binary
True when all inner nodes have exactly 2 children.


ultrametric

private boolean ultrametric
True when all leaves have the same temporal depth: (minTempDepth = maxTempDepth)


minLevel

private int minLevel
The minimum number of species between the leaves and root. This is inclusive of the root. Thus a leaf that is a child of the root would have a level of 1.


maxLevel

private int maxLevel
The maximum number of species between the leaves and root. This is inclusive of the root. Thus a leaf that is a child of the root would have a level of 1.


minStemLength

private double minStemLength
The minimum distance between the species and their parents.


maxStemLength

private double maxStemLength
The maximum distance between the species and their parents.


minTempDepth

private double minTempDepth
The minimum distance between the leaves and root.


maxTempDepth

private double maxTempDepth
The maximum distance between the leaves and root.


partitions

private java.util.HashSet partitions
This contains the partitions loaded for this tree.

Constructor Detail

Tree

public Tree(java.lang.String id)
     throws InvalidIDException
Throws:
InvalidIDException

Tree

public Tree(java.lang.String id,
            boolean addToPool)
     throws InvalidIDException
Throws:
InvalidIDException

Tree

public Tree(java.lang.String id,
            Species root,
            Model model,
            java.lang.String notes,
            int numSpecies,
            int numLeaves,
            boolean binary,
            boolean ultrametric,
            int minLevel,
            int maxLevel,
            double minTempDepth,
            double maxTempDepth,
            double minStemLength,
            double maxStemLength)
     throws InvalidIDException
Trees should not be instantiated by the user, using this routine. It is used to load trees from the database.

Throws:
InvalidIDException
Method Detail

getID

public java.lang.String getID()
Get the ID.


getModelID

public java.lang.String getModelID()
Returns the ID for the model underlying this tree.


setModelID

public void setModelID(Model model)
Sets the ID for the model underlying this tree. We require a model object rather than an ID, to make sure that the model really exists.


getNotes

public java.lang.String getNotes()

setNotes

public void setNotes(java.lang.String notes)
Updates the local notes field and the TREES table.


getRoot

public Species getRoot()

setRoot

public void setRoot(Species root)
The root's label will be set to "_Root" if there is currently no label. The root will also be added to the species list.


getSpeciesList

public java.util.ArrayList getSpeciesList()
Returns the list of all species in the tree.


getPartitions

public java.util.HashSet getPartitions()
This will return the set of partitions.


getNumSpecies

public int getNumSpecies()
This will return the number of species in the tree.


getNumLeaves

public int getNumLeaves()
This will return the number of leaves in the tree.


getMinLevel

public int getMinLevel()

isBinary

public boolean isBinary()

isUltrametric

public boolean isUltrametric()

getMaxLevel

public int getMaxLevel()

getMinStemLength

public double getMinStemLength()

getMaxStemLength

public double getMaxStemLength()

getMinTempDepth

public double getMinTempDepth()

getMaxTempDepth

public double getMaxTempDepth()

getNewick

public java.lang.String getNewick()
Returns the newick string for this tree, as stored in the database. This is retrieved from the database as needed to conserve RAM, instead of storing it as a string in the Tree object.


setNewick

public void setNewick(java.lang.String newick)
This will update the newick string stored in the database.


isBuilt

public boolean isBuilt()
Returns true if the tree has been built from the newick string.


buildTree

public boolean buildTree()
This will build the tree from the nexick string.


addChildrenToQueue

private java.util.Queue addChildrenToQueue(java.util.Queue queue,
                                           Species[] children)
This will add the children to the queue.


clearStructure

public void clearStructure()
This will release the tree structure from memory. This can be used to free up memory after large trees are built but the built structure is no longer needed.


addSpecies

public void addSpecies(Species species)

removeSpecies

public void removeSpecies(Species species)

getSpeciesByID

public java.util.ArrayList getSpeciesByID(java.util.ArrayList speciesIn)
This will take a list of species IDs and return a list of species objects.


getSpeciesByID

public Species getSpeciesByID(java.lang.String id)
This steps through every species in the tree and compares the species IDs until it finds the species requested.


computeStats

public void computeStats()
This will compute the various tree statistics.


computeBinary

public boolean computeBinary()
Returns true if the tree is binary.


getLeaves

public java.util.ArrayList getLeaves()
Returns the set of all leaves in the tree.


clone

public java.lang.Object clone()
This will duplicate the tree and all species.

Overrides:
clone in class java.lang.Object

toNewick

public java.lang.String toNewick()
This will print out this species and all children as a newick formatted tree. This is meant to be used for trees which are created by the application. When trees are loaded into the system from a NEXUS file, their newick structure is stored in the database and the method "newick()" can be used to reall the structure. Trees that are created by a query do not have newick structures and thus this routine can be used to create that structure. The inner sequence IDs will be included and the tree will be rooted.

Returns:
returns newick string

toNewick

public java.lang.String toNewick(boolean incSequence)
This will print out this species and all children as a newick formatted tree. The tree will be rooted. This is meant to be used for trees which are created by the application. When trees are loaded into the system from a NEXUS file, their newick structure is stored in the database and the method "newick()" can be used to recall the structure. Trees that are created by a query do not have newick structures and thus this routine can be used to create that structure.

Parameters:
incSequence - when true, inner sequence IDs will be included in the newick output
Returns:
returns newick string

toNewick

public java.lang.String toNewick(boolean incSequence,
                                 boolean rooted)
This will output this species and all children as a newick formatted tree. This is meant to be used for trees which are created by the application. When trees are loaded into the system from a NEXUS file, their newick structure is stored in the database and the method "newick()" can be used to recall the structure. Trees that are created by a query do not have newick structures and thus this routine can be used to create that structure.

Parameters:
incSequence - when true, inner sequence IDs will be included in the newick output
rooted - when true, the tree will rooted
Returns:
returns newick string

printTree

public java.lang.String printTree()
This will print out the tree, one species per line, indenting each level. This is meant to be used as a debugging tool and is not meant to be used to print out large trees.

Notes:
This will give a stack overflow for deep trees.

toString

public java.lang.String toString()
This will print lots of specs about the tree.

Overrides:
toString in class java.lang.Object

getNumPartitions

public int getNumPartitions()
The number of partitions loaded for this tree.


getLength

public int getLength()
This is the sum of the length of all partitions.


addPartition

public void addPartition(Partition partition)

removePartition

public void removePartition(Partition partition)

getPartitionsByModel

public java.util.ArrayList getPartitionsByModel(java.lang.String modelID)
This will return an ArrayList containing all of the partitions in the current tree that are based on the specified model.


collapse

public void collapse()
This will remove all internal species that have only one child, linking the child to species' parent and adjusting the child's stem lengths accordingly.


getLeafCounts

public java.util.ArrayList getLeafCounts(java.util.ArrayList subroots)
This will return a list of leaf counts for each subtree specified by the subroots. This can be used in conjunction with getTreesByTempDepth() and getTreesByLevel() to get the leaf counts for subtrees below a threshold value.


getSubtree

public Tree getSubtree(Species subroot)
Returns the subtree given the specified root. This will not create a unique Tree but instead will contain the species objects from the original tree. The trees returned will not be added to the tree pool and are meant to be used as temporary trees. If a unique trees are needed, then use Tree.clone().


getTreesByLevel

public java.util.ArrayList getTreesByLevel(int level)
Returns the set of root species for any subtrees that are below a specified level.


uniformSampleByLevel

public Tree uniformSampleByLevel(int level,
                                 int num)
Returns the random, uniform sampling of species from each subtree demarkated by the level value. The number of species included from each subtree will be 'num' or the total number of leaves in each subtree if a subtree has less than 'num' leaves.


weightedSampleByLevel

public Tree weightedSampleByLevel(int level,
                                  int num)
Returns the random, weighted sampling of species from each subtree demarkated by the level value. The number of leaves included from each subtree is a function of the total number of leaves requested and the relative number of leaves in each subtree. This is computed by pooling all leaves from all subtrees and then randomly selecting leaves from the pool.


getTreesByTempDepth

public java.util.ArrayList getTreesByTempDepth(double tempDepth)
Returns the set of root species for any subtrees that are below a specified level.


uniformSampleByTempDepth

public Tree uniformSampleByTempDepth(double tempDepth,
                                     int num)
Returns the random, uniform sampling of species from each subtree demarkated by the temporal depth value. The number of species included from each subtree will be 'num' or the total number of leaves in each subtree if a subtree has less than 'num' leaves.


weightedSampleByTempDepth

public Tree weightedSampleByTempDepth(double tempDepth,
                                      int num)
Returns the random, weighted sampling of species from each subtree demarkated by the temporal depth value. The number of leaves included from each subtree is a function of the total number of leaves requested and the relative number of leaves in each subtree. This is computed by pooling all leaves from all subtrees and then randomly selecting leaves from the pool.


manualSampleByID

public Tree manualSampleByID(java.util.ArrayList oldLeaves)
This will return a subtree that contains the specified leaves. The tree returned will contain species that are copies of those in the original tree. The input for this method is an ArrayList containing species IDs or species objects. It takes a significant amount of time to convert from species IDs to species objects.


randomSample

public Tree randomSample(int num)
This will randomly sample the tree, returning a subtree that contains 'num' leaves. The tree returned will contain species that are copies of those in the original tree.


getRootLCA

public Species getRootLCA(int num)
This will return the LCA for a random sampling of leaves from the tree.


getLCA

public static Species getLCA(Species s1,
                             Species s2)
Returns the LCA for the tree, assuming this species is the root of the tree. This requires that the level values have been set.




Copyright 2006 Stephen Fisher, Susan Davidson, and Junhyong Kim, University of Pennsylvania.