Penn SCAP-T Pipeline: Documentation

Module: STAR

This module will run the STAR aligner. The STAR alignment program must be installed separately (see Requires below).

Usage:
    ngs.sh star [-i inputDir] -p numProc -s species [-se] sampleID
Input:
    sampleID/INPUTDIR/unaligned_1.fq
    sampleID/INPUTDIR/unaligned_2.fq (paired-end reads)
Output:
    sampleID/star/STAR.bam (all alignments)
    sampleID/star/STAR_Unique.bam (uniquely aligned reads)
Requires:
    STAR ( http://code.google.com/p/rna-star )
    samtools ( http://samtools.sourceforge.net/ )
Options:
    -i inputDir - location of source files (default: trim).
    -p numProc - number of CPU cores to use.
    -s species - species from repository: /lab/repo/resources/star.
    -se - single-end reads (default: paired-end)
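
The input layout above can be checked before a run. The following is a minimal sketch only: check_star_inputs is a hypothetical helper, not part of ngs.sh, and it assumes the file names listed under Input and the documented -i default of trim.

```shell
#!/bin/sh
# Sketch only: check_star_inputs is a hypothetical helper, not part of ngs.sh.
# Usage: check_star_inputs sampleID [inputDir] [se|pe]
check_star_inputs() {
    sample_id=$1
    input_dir=${2:-trim}   # mirrors the documented -i default
    mode=${3:-pe}          # "se" mirrors the -se flag; default is paired-end

    # Read 1 is expected for both single-end and paired-end runs.
    if [ ! -f "$sample_id/$input_dir/unaligned_1.fq" ]; then
        echo "missing: $sample_id/$input_dir/unaligned_1.fq" >&2
        return 1
    fi

    # Read 2 is only expected for paired-end runs.
    if [ "$mode" != "se" ] && [ ! -f "$sample_id/$input_dir/unaligned_2.fq" ]; then
        echo "missing: $sample_id/$input_dir/unaligned_2.fq" >&2
        return 1
    fi
    return 0
}
```

For example, `check_star_inputs sampleA trim se` would succeed when only sampleA/trim/unaligned_1.fq exists (sampleA is a placeholder sample ID).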

Runs STAR on the input files, by default the trimmed files in sampleID/trim. Output is stored in sampleID/star. The STAR stats file (Log.final.out) is renamed sampleID.star.stats.txt. STAR runs on a single machine using numProc cores. Depending on the size of the genome, a machine with at least 32 GB of RAM is recommended.
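
The stats-file rename described above might look like the sketch below. This is an illustration, not the code in ngs_STAR.sh: rename_star_stats is a hypothetical helper, and it assumes that Log.final.out is written to sampleID/star and that the renamed file stays in that directory.

```shell
#!/bin/sh
# Sketch only: rename_star_stats is a hypothetical helper, not part of ngs.sh.
# Assumes STAR wrote Log.final.out into sampleID/star.
rename_star_stats() {
    sample_id=$1
    star_dir="$sample_id/star"

    if [ ! -f "$star_dir/Log.final.out" ]; then
        echo "no STAR stats file found in $star_dir" >&2
        return 1
    fi
    # Rename to the sampleID.star.stats.txt name given in the documentation.
    mv "$star_dir/Log.final.out" "$star_dir/$sample_id.star.stats.txt"
}
```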

STAR parameters hard-coded in ngs_STAR.sh are listed below, each with the STAR default for comparison:

--outFilterScoreMin 0  [STAR default: 0]
--outFilterScoreMinOverLread 0  [STAR default: 0.66]
--outFilterMatchNmin 30  [STAR default: 0]
--outFilterMismatchNmax 100  [STAR default: 10]
--outFilterMismatchNoverLmax 0.3  [STAR default: 0.3]
--outReadsUnmapped Fastx
--genomeLoad LoadAndRemove
--outSAMtype BAM Unsorted  [STAR default: SAM]

for SE reads: --outFilterMatchNminOverLread 0.6  [STAR default: 0.66]
for PE reads: --outFilterMatchNminOverLread 0.4  [STAR default: 0.66]
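
To show how the single-end and paired-end settings differ, the sketch below assembles the option list above for either mode. star_filter_args is a hypothetical function written for illustration; how ngs_STAR.sh actually builds its STAR command line is not shown here.

```shell
#!/bin/sh
# Sketch only: star_filter_args is a hypothetical helper, not part of ngs_STAR.sh.
# Prints the hard-coded STAR options listed above, one per line.
# Usage: star_filter_args se|pe
star_filter_args() {
    mode=$1
    case "$mode" in
        se) match_over_lread=0.6 ;;   # single-end threshold from the list above
        pe) match_over_lread=0.4 ;;   # paired-end threshold from the list above
        *)  echo "mode must be 'se' or 'pe'" >&2; return 1 ;;
    esac

    printf '%s\n' \
        "--outFilterScoreMin 0" \
        "--outFilterScoreMinOverLread 0" \
        "--outFilterMatchNmin 30" \
        "--outFilterMismatchNmax 100" \
        "--outFilterMismatchNoverLmax 0.3" \
        "--outReadsUnmapped Fastx" \
        "--genomeLoad LoadAndRemove" \
        "--outSAMtype BAM Unsorted" \
        "--outFilterMatchNminOverLread $match_over_lread"
}
```

Calling `star_filter_args pe` prints nine option lines; only the last one changes between modes.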