RNA Self Containment

Maintained by Miler Lee. License (pdf)


Publication: M.T. Lee and J. Kim. 2008. "Self Containment, a Property of Modular RNA Structures, Distinguishes microRNAs." PLoS Comp. Biol., 4(8):e1000150 doi:10.1371/journal.pcbi.1000150.

The following is an implementation of the Self-Containment Index (SC), as described in Lee & Kim (2008). It measures the robustness of RNA structures to changes in the surrounding sequence context, which we hypothesize to be a hallmark of structural modularity. SC values range from 0.0 (no self containment) to 1.0 (completely self contained).
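As a rough illustration of the idea, the sketch below scores how much of a structure survives when the sequence is folded inside a larger context. It assumes that structures are given as dot-bracket strings (the notation printed by the Vienna package) and that SC is the average fraction of the original base pairs that reappear in each embedded fold. That reading, and every function name here, is an assumption for illustration; the exact definition is in Lee & Kim (2008) and in selfcontain.py.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def preserved_fraction(alone, embedded, offset):
    """Fraction of base pairs in `alone` that reappear in `embedded`.

    `offset` is the position of the query within the embedded sequence.
    """
    original = base_pairs(alone)
    if not original:
        # Assumption: a query with no predicted structure scores 0.0.
        return 0.0
    shifted = {(i + offset, j + offset) for i, j in original}
    return len(shifted & base_pairs(embedded)) / len(original)


def self_containment(alone, embedded_structures, offset):
    """Average preserved fraction over all context structures (hypothetical SC)."""
    scores = [preserved_fraction(alone, s, offset) for s in embedded_structures]
    return sum(scores) / len(scores)
```

For example, a hairpin `((..))` that is fully preserved in one context and half preserved in another would score 0.75 under this reading.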

Web:

Visit the RNA Self Containment Web Server to calculate SC for your RNA sequences online, or download an offline version below.

Download:

Requirements:
    • Two almost-identical versions are provided, one for Unix variants and one for Windows.
    • For Linux/Unix/MacOSX: Python version 2.3 or later (available for all platforms from http://www.python.org/download/). Earlier versions of Python may work but have not been tested. For Windows, Python version 2.4 or later is required.
    • Vienna RNA Secondary Structure Package, version 1.7 (available from http://www.tbi.univie.ac.at/~ivo/RNA/). Versions back to 1.4 should also work but have not been tested. For Windows users, it should be sufficient to download the RNAFold.exe executable and put it in the same directory as the Python script.

Running:

The zip archive contains the Python source file along with sample files and a copy of this documentation. After unzipping the archive, run the Python script from the command line, e.g.

>python selfcontain.py -s UGGGAUGAGGUAGUAGGGUAUAUUAGGUCACACCCACC

to calculate SC using the default settings (100 random contexts of length equal to the input string). To calculate SC for the sequences in a FASTA file, use the -i switch:

>python selfcontain.py -i sample.fa
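The default settings mentioned above (100 random contexts) can be pictured with the following sketch, which places a query between random flanking sequences. It assumes uniformly random nucleotides and a flank on each side as long as the query; both are guesses about what "length equal to the input string" means, and the function names are hypothetical rather than taken from selfcontain.py.

```python
import random


def random_rna(length, rng):
    """Return a random RNA string with uniformly drawn nucleotides."""
    return "".join(rng.choice("ACGU") for _ in range(length))


def embed_in_random_contexts(query, n_contexts=100, flank_length=None, seed=None):
    """Return (embedded_sequence, offset_of_query) for each random context.

    Assumption: each flank defaults to the length of the query.
    """
    rng = random.Random(seed)
    if flank_length is None:
        flank_length = len(query)
    embedded = []
    for _ in range(n_contexts):
        left = random_rna(flank_length, rng)
        right = random_rna(flank_length, rng)
        embedded.append((left + query + right, len(left)))
    return embedded
```

Each embedded sequence would then be folded, and the offset records where the query sits so its base pairs can be compared with the fold of the query alone.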

Other parameters can be set; to see the possible switches, run

>python selfcontain.py -h

The Python module can also be imported and incorporated into your own programs. See the example() method in selfcontain.py for example usage.

Caveats:

Calculating SC is computationally intensive. Using the default settings on the sample input file should take about a minute, depending on the speed of the machine and available memory. Increasing the number of random contexts increases the computation time proportionally, while increasing the length of the query sequence or contexts increases it according to the running time of RNAFold (approximately cubic in sequence length).
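The scaling described above can be turned into a back-of-the-envelope estimate. The sketch below assumes that time is exactly linear in the number of contexts and exactly cubic in the folded length, which is only an approximation; the reference values of 100 contexts and 100 nucleotides are arbitrary placeholders.

```python
def relative_cost(n_contexts, folded_length, ref_contexts=100, ref_length=100):
    """Estimated running time relative to a reference run.

    Assumes cost is proportional to n_contexts * folded_length ** 3.
    """
    return (n_contexts / ref_contexts) * (folded_length / ref_length) ** 3
```

Under these assumptions, doubling the number of contexts doubles the estimate, while doubling the folded length multiplies it by eight.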

Self containment is only meaningful if the input sequence possesses some predicted secondary structure, so interpret the results accordingly.

When using random sequence contexts, multiple runs on the same sequence may yield slightly different SC values. Correlation between runs will increase when you increase the number of random contexts used.
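The effect of the number of contexts on run-to-run variation can be seen in a toy simulation. The sketch below does not fold any RNA: it assumes, purely for illustration, that each context contributes an independent score drawn uniformly from 0 to 1, and measures how much the averaged value varies between repeated runs.

```python
import random
import statistics


def simulated_sc(n_contexts, rng):
    """Toy SC estimate: the mean of n_contexts hypothetical per-context scores."""
    return sum(rng.random() for _ in range(n_contexts)) / n_contexts


def run_to_run_spread(n_contexts, n_runs=200, seed=0):
    """Standard deviation of the toy SC estimate across repeated runs."""
    rng = random.Random(seed)
    estimates = [simulated_sc(n_contexts, rng) for _ in range(n_runs)]
    return statistics.stdev(estimates)
```

For independent scores the spread shrinks roughly with the square root of the number of contexts, so quadrupling the contexts should about halve the variation between runs.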

The Python script needs to be able to run RNAFold from the Vienna RNA Secondary Structure Package and assumes that RNAFold is either in the same directory as the script or in a directory listed in the PATH environment variable. If this is not the case, the path to RNAFold can be entered manually in selfcontain.py by setting the VIENNA_ROOT variable to the directory containing RNAFold, e.g. '/usr/bin/'.
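To check in advance whether the executable can be found, a lookup along the lines described above can be sketched as follows. This is not the code in selfcontain.py: it uses shutil.which, which requires Python 3.3 or later, and the function name, the script_dir parameter, and the two spellings of the executable name tried are assumptions made for illustration.

```python
import shutil


def find_rnafold(vienna_root=None, script_dir=".", names=("RNAfold", "RNAFold")):
    """Return the path to the folding executable, or None if it is not found.

    Search order (an assumption): an explicit root directory, then the
    script's directory, then the directories on PATH.
    """
    for directory in (vienna_root, script_dir):
        if not directory:
            continue
        for name in names:
            found = shutil.which(name, path=directory)
            if found:
                return found
    for name in names:
        found = shutil.which(name)  # searches the PATH environment variable
        if found:
            return found
    return None
```

If the function returns None, setting VIENNA_ROOT as described above is the documented remedy.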

To improve efficiency, several queries are passed to RNAFold at once; the default is 250 for the Unix version. If the script detects that RNAFold returned aberrant output, try reducing the number of queries per call by changing the RNAFOLD_CHUNK_SIZE variable in selfcontain.py. This is particularly a problem on Windows, where RNAFOLD_CHUNK_SIZE should be set to 1 for the script to run reliably (albeit a little more slowly).
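The batching described above amounts to splitting the list of sequences into fixed-size groups. A minimal sketch follows, with a hypothetical helper name; how selfcontain.py actually passes each group to RNAFold is not shown here.

```python
def chunked(queries, chunk_size=250):
    """Yield successive slices of `queries` with at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(queries), chunk_size):
        yield queries[start:start + chunk_size]
```

With a chunk size of 1, every sequence is sent in its own call, which trades speed for the reliability noted above.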