edu.upenn.gloDB
Class Track

java.lang.Object
  extended by edu.upenn.gloDB.Track
All Implemented Interfaces:
java.lang.Cloneable

public class Track
extends java.lang.Object
implements java.lang.Cloneable

Tracks are collections of Features and allow accessing the Features as a sorted Set (sorted by source and position information) or grouped by the Feature source information.

Version:
$Id: Track.java,v 1.1.2.34 2007/03/01 21:17:33 fisher Exp $

Nested Class Summary
private static class Track.FeatureMaxComparator
           
 
Field Summary
protected  java.util.HashMap attributes
          This is similar to "qualifiers" in GenBank (ex: scores, strand (+/-), phase (within codon)).
private  java.util.TreeSet features
          TreeSet of Feature objects comprising the Track.
protected  java.lang.String id
          This is a unique name for the Track that is used by the parser to identify the Track.
private static java.util.Random random
          Used to create random IDs.
private  java.util.HashMap sources
          Map of Sequence object IDs to Features.
 
Constructor Summary
Track()
          Create a new Track object and add it to the trackPool.
Track(boolean addToPool)
          Create a new Track object and add it to the trackPool if addToPool is true.
Track(boolean addToPool, java.lang.String id)
          Create a new Track object with the specified ID and add it to the trackPool if addToPool is true.
Track(java.lang.String id)
          Create a new Track object with the specified ID, and add it to the trackPool.
 
Method Summary
 void addAttribute(java.lang.Object key, java.lang.Object value)
          Add an attribute.
 void addFeature(Feature newFeature)
          Adds a Feature to 'features'.
 void addFeature(Feature newFeature, boolean rebuildPool)
          Adds a Feature to 'features'.
 void addFeatures(java.util.TreeSet features)
          This will add 'features' to the current feature set.
 java.lang.Object clone()
          Create a shallow clone of the existing object (clone the structure but not the Objects).
 Track cloneMerged()
          Create a shallow clone of the existing object (clone the structure but not the Objects).
 java.lang.Object cloneTrack(boolean addToPool)
          Create a shallow clone of the existing object (clone the structure but not the Objects).
 Track cluster(java.lang.String id, int maxSpace, int threshold)
          Deprecated. Replaced with cluster.py
 int compareTo(java.lang.Object o)
          Compares this object with the specified object for order.
 boolean contains(Feature feature)
          Returns 'true' if 'feature' exists in this Track.
 int contains(int pos)
          Returns '-1' if 'pos' occurs before this Track, '0' if 'pos' is contained in this Track, and '1' if 'pos' occurs after this Track.
 boolean contains(java.lang.String source)
          Returns 'true' if this Track contains any Features on 'source'.
 boolean containsAttribute(java.lang.Object key)
          Returns true if attribute 'key' exists.
 void delAttribute(java.lang.Object key)
          Remove an attribute.
 void erase()
          Erases all Track information, except for the ID.
 java.util.Iterator featureIterator()
          Returns an Iterator over 'features'.
 java.util.TreeSet featuresBySource(Sequence sequence)
          Get the set of Features based on the Sequence object.
 java.util.TreeSet featuresBySource(java.lang.String sequence)
          Get the set of Features based on the Sequence ID.
 void filterOnAttribute(java.lang.String key, java.lang.String value)
          This will remove all Features from this Track that do not contain the specified attribute.
 void filterOnLength(int min, int max)
          This will remove all Features from this Track that are outside of the specified range.
 void filterOnRepeat(int minR, int maxR, int minW, int maxW)
          This will remove all Features from this Track that do not conform to the repeat criterion.
 void filterOnSeqPos(int min, int max)
          This will remove all Features from this Track that are not within the 'min'/'max' boundaries.
 void filterOnSequence(java.lang.String sequence)
          This will remove all Features from this Track that do not exist on the specified Sequence.
 Track flip()
          Inverts the positions of each feature in the Track.
 java.lang.Object getAttribute(java.lang.Object key)
          Get value for attribute 'key'.
 java.util.HashMap getAttributes()
          Get the attributes.
 java.util.ArrayList getData()
          Returns the sequence data.
 java.lang.String getDataFASTA()
          Returns the sequence data formatted as a multi-sequence FASTA file.
 java.lang.String getDataFormatted()
          Returns the sequence data formatted with "\n" inserted every Sequence.FORMAT_WIDTH characters and blank lines inserted between sequences.
 java.util.TreeSet getFeatures()
          Get the features, sorted by their min values.
 java.util.TreeSet getFeaturesByMax()
          Get the features, sorted by their max values.
 java.lang.String getID()
          Get the ID.
 int getMax()
          Returns the maximum stop position in the Track.
 int getMin()
          Returns the minimum start position in the Track.
 java.util.HashMap getSources()
          Get the map of source Sequences to Features.
 java.util.Set getSourceSet()
          Get the set of source Sequence objects.
 boolean isContiguous()
          Returns 'true' if the Track does not contain gaps between Features.
 boolean isSingleSource()
          Returns 'true' if the Features contained in the Track all refer to the same sequence.
 int length()
          Returns the number of positions contained in the Track.
 void mergeContiguous()
          This will merge all overlapping Features in the Track, creating new Feature objects as necessary.
 Track noRepeats()
          This will return a copy of the track without any duplicate features (based on start/stop values).
 int numFeatures()
          Returns the number of Features contained in the Track.
 int numSources()
          Returns the number of Sources spanned by the Track.
 boolean overlaps(Feature featureB)
          Returns 'true' if the Feature 'featureB' overlaps at least one Feature in this Track.
 boolean overlaps(Track trackB)
          Returns 'true' if a Feature in trackB overlaps at least one Feature in this Track.
static java.lang.String randomID(java.lang.String base)
          Uses 'base' to create a random ID string that doesn't already exist in the trackPool.
 void removeFeature(Feature newFeature)
          Removes a Feature from 'features'.
 void setAttributes(java.util.HashMap attributes)
          Set the attributes.
 void setFeatures(java.util.TreeSet features)
          This will replace 'features' with the TreeSet argument.
 void setID(java.lang.String id)
          Set the ID.
 void setID(java.lang.String id, boolean updatePool)
          Set the ID.
static java.util.TreeSet sortByMax(java.util.TreeSet features)
          Sort the Feature TreeSet by max values.
 java.lang.String toString()
          Only returns Feature start/stop position information.
 java.lang.String toStringFull()
          Returns all description and Feature information.
 java.lang.String toStringMore()
          Only returns Feature start/stop position information.
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

id

protected java.lang.String id
This is a unique name for the Track that is used by the parser to identify the Track. This is 'protected' to allow ObjectHandles to change the value.


attributes

protected java.util.HashMap attributes
This is similar to "qualifiers" in GenBank (ex: scores, strand (+/-), phase (within codon)).


features

private java.util.TreeSet features
TreeSet of Feature objects comprising the Track.

Notes:
We should be able to remove this set after we create 'sources' since that is a more useful structure for the Feature data.

sources

private java.util.HashMap sources
Map of Sequence object IDs to Features. Maintaining this set slows down the adding/removing of Features but should speed up the management of the Features. It allows for easy access of Features by Sequence. For each sequence, a TreeSet of Features is maintained.

Notes:
This cannot be changed directly by the user; it is created and updated based on 'features'.
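
The relationship between 'features' and 'sources' can be illustrated with a minimal sketch. This is not the gloDB implementation: Feat is a hypothetical stand-in for Feature, the ordering (source, then min, then max) is an assumption, and unlike the real Track this simplified comparator collapses Features that share source, min and max.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

class TrackIndexSketch {
    // Hypothetical stand-in for Feature: a source Sequence ID plus min/max positions.
    static final class Feat {
        final String source;
        final int min;
        final int max;

        Feat(String source, int min, int max) {
            this.source = source;
            this.min = min;
            this.max = max;
        }
    }

    // Assumed ordering: source first, then min, then max.
    static final Comparator<Feat> ORDER = Comparator
            .comparing((Feat f) -> f.source)
            .thenComparingInt(f -> f.min)
            .thenComparingInt(f -> f.max);

    private final TreeSet<Feat> features = new TreeSet<>(ORDER);
    private final Map<String, TreeSet<Feat>> sources = new HashMap<>();

    void addFeature(Feat f) {
        if (f == null) return;                        // null is ignored
        features.add(f);
        // Keep the per-source grouping in step with 'features'.
        sources.computeIfAbsent(f.source, k -> new TreeSet<>(ORDER)).add(f);
    }

    void removeFeature(Feat f) {
        if (f == null) return;
        features.remove(f);
        TreeSet<Feat> bySource = sources.get(f.source);
        if (bySource != null) {
            bySource.remove(f);
            if (bySource.isEmpty()) sources.remove(f.source);  // drop empty groups
        }
    }

    int numFeatures() { return features.size(); }
    int numSources() { return sources.size(); }
    TreeSet<Feat> featuresBySource(String sourceId) { return sources.get(sourceId); }
}
```

Every add or remove touches both structures, which is the extra cost the description mentions; in exchange, lookups by Sequence ID become a single map access.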

random

private static java.util.Random random
Used to create random IDs.

Constructor Detail

Track

public Track()
Create a new Track object and add it to the trackPool.


Track

public Track(java.lang.String id)
Create a new Track object with the specified ID, and add it to the trackPool.


Track

public Track(boolean addToPool)
Create a new Track object and add it to the trackPool if addToPool is true.

Notes:
This should probably be 'protected' instead of 'public' because all Tracks should really be added to trackPool.

Track

public Track(boolean addToPool,
             java.lang.String id)
Create a new Track object with the specified ID and add it to the trackPool if addToPool is true.

Notes:
This should probably be 'protected' instead of 'public' because all Tracks should really be added to trackPool.
Method Detail

setID

public void setID(java.lang.String id)
           throws InvalidIDException
Set the ID. If the new ID is the same as the current ID, this method does nothing. If the new ID already exists in the trackPool, an InvalidIDException is thrown.

Parameters:
id - a String that is a unique identifier for the Track.
Throws:
InvalidIDException

setID

public void setID(java.lang.String id,
                  boolean updatePool)
           throws InvalidIDException
Set the ID. If the new ID is the same as the current ID, this method does nothing. If the new ID already exists in the trackPool, an InvalidIDException is thrown. If 'updatePool' is true, the trackPool is updated. 'updatePool' must be true if the Track is in the trackPool; otherwise the trackPool will become out of sync.

Parameters:
id - a String that is a unique identifier for the Track.
Throws:
InvalidIDException
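
The renaming rules above can be sketched as follows. This is an illustration only: the static map is a hypothetical stand-in for the trackPool, and IllegalArgumentException stands in for InvalidIDException, whose definition is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

class SetIdSketch {
    // Hypothetical stand-in for the trackPool: ID -> track.
    static final Map<String, SetIdSketch> pool = new HashMap<>();

    private String id;

    SetIdSketch(String id) {
        this.id = id;
        pool.put(id, this);
    }

    String getID() { return id; }

    // IllegalArgumentException is used here in place of InvalidIDException.
    void setID(String newId, boolean updatePool) {
        if (newId.equals(id)) return;                 // same ID: nothing to do
        if (pool.containsKey(newId)) {
            throw new IllegalArgumentException("ID already in pool: " + newId);
        }
        if (updatePool) {
            pool.remove(id);                          // drop the old key
            pool.put(newId, this);                    // register under the new key
        }
        id = newId;
    }
}
```

If 'updatePool' were false for a pooled object, the map would still hold the old key, which is the out-of-sync state the description warns about.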

getID

public java.lang.String getID()
Get the ID.


setAttributes

public void setAttributes(java.util.HashMap attributes)
Set the attributes.

Parameters:
attributes - a HashMap of Feature attributes

getAttributes

public java.util.HashMap getAttributes()
Get the attributes.


setFeatures

public void setFeatures(java.util.TreeSet features)
This will replace 'features' with the TreeSet argument. This will update the sources HashMap based on the new set of Features.


getFeatures

public java.util.TreeSet getFeatures()
Get the features, sorted by their min values.


getFeaturesByMax

public java.util.TreeSet getFeaturesByMax()
Get the features, sorted by their max values. The TreeSet returned is effectively a clone of this Track's TreeSet and thus changes to the TreeSet will not be reflected in the Track's 'features' TreeSet.
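
A minimal sketch of re-sorting by max values, assuming each Feature is reduced to a hypothetical {min, max} pair. The comparators below are illustrative and are not the private Track.FeatureMaxComparator.

```java
import java.util.Comparator;
import java.util.TreeSet;

class SortByMaxSketch {
    // Each feature is modelled as {min, max}; a hypothetical simplification.
    static final Comparator<int[]> BY_MIN =
            Comparator.<int[]>comparingInt(f -> f[0]).thenComparingInt(f -> f[1]);

    // Ties on max are broken by min so that distinct features are not dropped.
    static final Comparator<int[]> BY_MAX =
            Comparator.<int[]>comparingInt(f -> f[1]).thenComparingInt(f -> f[0]);

    static TreeSet<int[]> sortByMax(TreeSet<int[]> features) {
        TreeSet<int[]> byMax = new TreeSet<>(BY_MAX);
        byMax.addAll(features);   // new set: later changes do not affect the input set
        return byMax;
    }
}
```

Because the result is a separate TreeSet, adding to or removing from it leaves the original set untouched, matching the behaviour described for getFeaturesByMax().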


getSources

public java.util.HashMap getSources()
Get the map of source Sequences to Features.


sortByMax

public static java.util.TreeSet sortByMax(java.util.TreeSet features)
Sort the Feature TreeSet by max values.


filterOnSequence

public void filterOnSequence(java.lang.String sequence)
This will remove all Features from this Track that do not exist on the specified Sequence.


filterOnSeqPos

public void filterOnSeqPos(int min,
                           int max)
This will remove all Features from this Track that are not within the 'min'/'max' boundaries. If 'max' is -1, the upper boundary is the maximum Sequence length.

Notes:
Should throw an exception if max < min.
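
A minimal sketch of this filter, assuming Features are hypothetical {min, max} pairs on one Sequence and that "within" means fully inside the boundaries. The argument check is the behaviour the note proposes, not something the source confirms is implemented.

```java
import java.util.ArrayList;
import java.util.List;

class FilterSeqPosSketch {
    // Returns the features that lie entirely inside [min, max]; max == -1 means no upper bound.
    static List<int[]> filterOnSeqPos(List<int[]> features, int min, int max) {
        if (max != -1 && max < min) {
            throw new IllegalArgumentException("max < min");
        }
        List<int[]> kept = new ArrayList<>();
        for (int[] f : features) {
            boolean aboveMin = f[0] >= min;
            boolean belowMax = (max == -1) || f[1] <= max;
            if (aboveMin && belowMax) kept.add(f);
        }
        return kept;
    }
}
```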

filterOnLength

public void filterOnLength(int min,
                           int max)
This will remove all Features from this Track that are outside of the specified range.

Notes:
Should throw an exception if max < min.

filterOnRepeat

public void filterOnRepeat(int minR,
                           int maxR,
                           int minW,
                           int maxW)
This will remove all Features from this Track that do not conform to the repeat criterion. The within values ('minW'/'maxW') define the minimum and maximum space allowed between Features. The repeat values ('minR'/'maxR') define the minimum and maximum number of Features that must occur in a row, each satisfying the within criterion, for those Features to be included.

Notes:
Should throw an exception if max < min.
Need to allow for within values that don't have a min.

filterOnAttribute

public void filterOnAttribute(java.lang.String key,
                              java.lang.String value)
This will remove all Features from this Track that do not contain the specified attribute.


addAttribute

public void addAttribute(java.lang.Object key,
                         java.lang.Object value)
Add an attribute.


delAttribute

public void delAttribute(java.lang.Object key)
Remove an attribute.


containsAttribute

public boolean containsAttribute(java.lang.Object key)
Returns true if attribute 'key' exists.


getAttribute

public java.lang.Object getAttribute(java.lang.Object key)
Get value for attribute 'key'.


addFeatures

public void addFeatures(java.util.TreeSet features)
This will add 'features' to the current feature set. This will update the sources HashMap based on the new set of Features.


addFeature

public void addFeature(Feature newFeature)
Adds a Feature to 'features'. This will update the sources HashMap. If 'features' doesn't exist a new TreeSet will be created. If 'newFeature' is null, then this method won't do anything.


addFeature

public void addFeature(Feature newFeature,
                       boolean rebuildPool)
Adds a Feature to 'features'. This will update the sources HashMap. If 'features' doesn't exist a new TreeSet will be created. If 'newFeature' is null, then this method won't do anything.


removeFeature

public void removeFeature(Feature newFeature)
Removes a Feature from 'features'. This will update the sources HashMap. If 'newFeature' is null, then this method won't do anything.


numFeatures

public int numFeatures()
Returns the number of Features contained in the Track. If Features exactly overlap, they will still be counted separately.


numSources

public int numSources()
Returns the number of Sources spanned by the Track.


featureIterator

public java.util.Iterator featureIterator()
Returns an Iterator over 'features'.


getSourceSet

public java.util.Set getSourceSet()
Get the set of source Sequence objects.


featuresBySource

public java.util.TreeSet featuresBySource(Sequence sequence)
Get the set of Features based on the Sequence object.


featuresBySource

public java.util.TreeSet featuresBySource(java.lang.String sequence)
Get the set of Features based on the Sequence ID.


getData

public java.util.ArrayList getData()
Returns the sequence data. Sequence data that occurs on different contigs or is non-contiguous is returned as separate items in the ArrayList.

Notes:
Should return sets of Sequences: a new set for each Sequence and, within each Sequence set, a new set for each non-contiguous Feature. However, if two Sequences have the same data (i.e., two contigs are a repeat), then using Sets won't work.

getDataFormatted

public java.lang.String getDataFormatted()
Returns the sequence data formatted with "\n" inserted every Sequence.FORMAT_WIDTH characters and blank lines inserted between sequences.


getDataFASTA

public java.lang.String getDataFASTA()
Returns the sequence data formatted as a multi-sequence FASTA file. New lines ("\n") are inserted every Sequence.FORMAT_WIDTH characters and a blank line is inserted between sequences.
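
The line wrapping and record layout can be sketched as below. The header text (a sequence name after '>') and passing the width as a parameter are assumptions; the actual value of Sequence.FORMAT_WIDTH and the header that getDataFASTA() emits are not given here.

```java
import java.util.Map;

class FastaSketch {
    // Break 'data' into lines of at most 'width' characters, each ending in "\n".
    static String wrap(String data, int width) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length(); i += width) {
            sb.append(data, i, Math.min(i + width, data.length())).append('\n');
        }
        return sb.toString();
    }

    // One ">name" header per sequence, wrapped data, and a blank line between records.
    static String toFasta(Map<String, String> sequences, int width) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (Map.Entry<String, String> e : sequences.entrySet()) {
            if (!first) sb.append('\n');
            first = false;
            sb.append('>').append(e.getKey()).append('\n');
            sb.append(wrap(e.getValue(), width));
        }
        return sb.toString();
    }
}
```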


isContiguous

public boolean isContiguous()
Returns 'true' if the Track does not contain gaps between Features. If the Features occur on different sequences, then this will return 'false'.


mergeContiguous

public void mergeContiguous()
This will merge all overlapping Features in the Track, creating new Feature objects as necessary.
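
One common way to merge overlapping intervals is a sort followed by a single sweep, sketched below. It assumes hypothetical {min, max} pairs on a single Sequence with inclusive coordinates; whether the real method also merges Features that merely touch is not stated, so this sketch merges only those that share at least one position.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class MergeSketch {
    static List<int[]> merge(List<int[]> features) {
        List<int[]> sorted = new ArrayList<>(features);
        sorted.sort(Comparator.comparingInt((int[] f) -> f[0]));

        List<int[]> merged = new ArrayList<>();
        for (int[] f : sorted) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && f[0] <= last[1]) {
                // Overlaps the current merged span: extend it if needed.
                last[1] = Math.max(last[1], f[1]);
            } else {
                // Start a new span; a fresh array keeps the inputs unmodified.
                merged.add(new int[] {f[0], f[1]});
            }
        }
        return merged;
    }
}
```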


cluster

public Track cluster(java.lang.String id,
                     int maxSpace,
                     int threshold)
Deprecated. replaced with cluster.py

This will merge all Features in the Track that are within maxSpace of each other. New Features will be created to span the entire cluster. Threshold sets the minimum number of Features necessary to be considered a cluster and thus included in the output set. A new Track will be returned containing the clusters. This will return 'null' if there is no match.

Parameters:
id - the name of the new Track
maxSpace - the maximum allowed space between Features in a cluster
threshold - the minimum number of Features needed in a cluster, for the cluster to be included in the output
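
The clustering rule described above can be sketched as follows. This is not the deprecated implementation or cluster.py: Features are hypothetical {min, max} pairs on one Sequence, and the space between Features is assumed to be the next Feature's min minus the largest max seen so far in the cluster.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ClusterSketch {
    // Returns one {min, max} span per cluster, or null if no cluster reaches 'threshold'.
    static List<int[]> cluster(List<int[]> features, int maxSpace, int threshold) {
        List<int[]> sorted = new ArrayList<>(features);
        sorted.sort(Comparator.comparingInt((int[] f) -> f[0]));

        List<int[]> clusters = new ArrayList<>();
        int start = 0, end = 0, count = 0;
        for (int[] f : sorted) {
            if (count > 0 && f[0] - end <= maxSpace) {
                end = Math.max(end, f[1]);            // close enough: grow the cluster
                count++;
            } else {
                if (count > 0 && count >= threshold) clusters.add(new int[] {start, end});
                start = f[0];                         // begin a new cluster
                end = f[1];
                count = 1;
            }
        }
        if (count > 0 && count >= threshold) clusters.add(new int[] {start, end});
        return clusters.isEmpty() ? null : clusters;
    }
}
```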

noRepeats

public Track noRepeats()
This will return a copy of the track without any duplicate features (based on start/stop values).
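
Removing duplicates by start/stop values can be sketched with a sorted set whose comparator looks only at those two values. Features are hypothetical {start, stop} pairs here, and source information is ignored for brevity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class NoRepeatsSketch {
    static List<int[]> noRepeats(List<int[]> features) {
        // Two entries with equal start and stop compare as equal, so only one is kept.
        TreeSet<int[]> unique = new TreeSet<>(
                Comparator.<int[]>comparingInt(f -> f[0]).thenComparingInt(f -> f[1]));
        unique.addAll(features);
        return new ArrayList<>(unique);   // a copy; the input list is unchanged
    }
}
```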


isSingleSource

public boolean isSingleSource()
Returns 'true' if the Features contained in the Track all refer to the same sequence. This is similar to isContiguous() but allows for gaps between Features.


flip

public Track flip()
Inverts the positions of each feature in the Track. For example, if a feature had a start position of 10 and a stop position of 20 on a contig that was 100 positions long, then flipping the feature would result in a new Feature object with a start position of 80 and a stop position of 90. Flipping a Track will result in the creation of new Feature objects for each feature in the Track.

Returns:
Returns a new Track object in which the positions of all features are flipped.
Notes:
Not yet implemented.
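
Since flip() is not yet implemented, the following only sketches the arithmetic implied by the worked example above: the new start is the contig length minus the old stop, and the new stop is the contig length minus the old start. The method name and the {start, stop} array are illustrative.

```java
class FlipSketch {
    // Returns a new {start, stop} pair mirrored within a sequence of the given length.
    static int[] flip(int start, int stop, int sequenceLength) {
        return new int[] {sequenceLength - stop, sequenceLength - start};
    }
}
```

For a Feature at 10..20 on a contig of length 100 this yields 80..90, as in the description; flipping twice returns the original positions.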

getMin

public int getMin()
Returns the minimum start position in the Track. This will return '-1' if there are no Features or if the Track contains Features on different contigs (i.e., isSingleSource() returns 'false').


getMax

public int getMax()
Returns the maximum stop position in the Track. This will return '-1' if there are no Features or if the Track contains Features on different contigs (i.e., isSingleSource() returns 'false'). Note that the Features are sorted by min values, so the maximum can only be found by testing each Feature.


length

public int length()
Returns the number of positions contained in the Track. Overlapping positions will only be counted once.
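
Counting each covered position once can be sketched with a sorted sweep. The sketch assumes hypothetical {min, max} pairs on one Sequence with inclusive coordinates (so 1..10 covers ten positions); the source does not state the coordinate convention.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LengthSketch {
    static int length(List<int[]> features) {
        List<int[]> sorted = new ArrayList<>(features);
        sorted.sort(Comparator.comparingInt((int[] f) -> f[0]));

        int total = 0;
        int coveredTo = Integer.MIN_VALUE;            // last position already counted
        for (int[] f : sorted) {
            int from = Math.max(f[0], coveredTo + 1); // skip positions counted earlier
            if (f[1] >= from) {
                total += f[1] - from + 1;
                coveredTo = f[1];
            }
        }
        return total;
    }
}
```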


compareTo

public int compareTo(java.lang.Object o)
Compares this object with the specified object for order. Returns a negative integer, zero, or a positive integer as this object is less than, equal to, or greater than the specified object.

Notes:
This is necessary for 'Comparable'.
Not yet implemented.

contains

public int contains(int pos)
Returns '-1' if 'pos' occurs before this Track, '0' if 'pos' is contained in this Track, and '1' if 'pos' occurs after this Track.

Notes:
This assumes 'pos' is positive within this Track's Sequence boundaries.
Not clear how to deal with Sequences in Tracks.
For Tracks, this should test contains() for each Feature within the Track.
Not yet implemented.

contains

public boolean contains(Feature feature)
Returns 'true' if 'feature' exists in this Track.


contains

public boolean contains(java.lang.String source)
Returns 'true' if this Track contains any Features on 'source'.


overlaps

public boolean overlaps(Feature featureB)
Returns 'true' if the Feature 'featureB' overlaps at least one Feature in this Track.

Notes:
Should use Sequences to limit the searches
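
The overlap test itself can be sketched as a pairwise interval check. Features are hypothetical {min, max} pairs with inclusive coordinates, and source Sequences are ignored, so this corresponds to the unrestricted linear search that the note suggests improving.

```java
import java.util.List;

class OverlapSketch {
    // Two inclusive intervals overlap when each starts no later than the other ends.
    static boolean overlaps(int[] a, int[] b) {
        return a[0] <= b[1] && b[0] <= a[1];
    }

    // True if 'featureB' overlaps at least one feature in 'track'.
    static boolean overlapsAny(List<int[]> track, int[] featureB) {
        for (int[] f : track) {
            if (overlaps(f, featureB)) return true;
        }
        return false;
    }
}
```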

overlaps

public boolean overlaps(Track trackB)
Returns 'true' if a Feature in trackB overlaps at least one Feature in this Track.

Notes:
Should use Sequences to limit the searches

cloneMerged

public Track cloneMerged()
Create a shallow clone of the existing object (clone the structure but not the Objects). This differs from clone() in that the clone will have the Features merged.

Notes:
Although public, this is not meant for use by the end user and will not add the Track to the ObjectHandles Track pool.

clone

public java.lang.Object clone()
Create a shallow clone of the existing object (clone the structure but not the Objects). This clone will be added to ObjectHandles.trackPool.

Overrides:
clone in class java.lang.Object

cloneTrack

public java.lang.Object cloneTrack(boolean addToPool)
Create a shallow clone of the existing object (clone the structure but not the Objects).

Notes:
This could probably be done in a much more efficient way by cloning each field of a Track, rather than rebuilding the features. However, rebuilding the features allows us to use IGNORE_ATTRIBUTES to remove repeats.

erase

public void erase()
Erases all Track information, except for the ID.


randomID

public static java.lang.String randomID(java.lang.String base)
Uses 'base' to create a random ID string that doesn't already exist in the trackPool.
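
Generating an ID that is not yet in use can be sketched as a retry loop. The ID format (base followed by a random non-negative integer) and the Set standing in for the trackPool are assumptions; the format used by randomID() is not described here.

```java
import java.util.Random;
import java.util.Set;

class RandomIdSketch {
    private static final Random random = new Random();

    // Keeps drawing candidates until one is absent from 'pool'.
    static String randomID(String base, Set<String> pool) {
        String id;
        do {
            id = base + random.nextInt(Integer.MAX_VALUE);
        } while (pool.contains(id));
        return id;
    }
}
```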


toString

public java.lang.String toString()
Only returns Feature start/stop position information.

Overrides:
toString in class java.lang.Object

toStringMore

public java.lang.String toStringMore()
Only returns Feature start/stop position information.


toStringFull

public java.lang.String toStringFull()
Returns all description and Feature information.




Copyright 2012 Stephen Fisher and Junhyong Kim, University of Pennsylvania. All Rights Reserved.