Penn SCAP-T Pipeline

License (pdf)

The PennSCAP-T Pipeline automates the processing of next-generation sequencing data. Each processing step in the primary analysis pipeline (e.g. sequence trimming, alignment) is managed as a separate module file. The master script '' is used to run the module files. Module files can be run on their own, or the '' module file can be used to link processing steps into a complete workflow.

This pipeline was designed in part through the efforts of the NIH Single Cell Analysis Program - Transcriptome (SCAP-T) project and was used to process the SCAP-T RNAseq data.


Update Log

    • Sep 14, 2017 - PennSCAP-T v2.2
      Notable changes:
      • Adds an option for a pipeline that keeps a smaller, less comprehensive set of output files
      • locates reads with a specific barcode, allowing for mismatches. Barcoded reads can either be trimmed or masked with N
      • BARCODE: is the pipeline module wrapper for
      • TRIM: added poly-base trimming, allowing for removal of long strings of the same base from either end of the read and parallelized trimming
      • BLASTQC: allow for short reads (shorter than 50 bp)
      • allow for running pre-configured pipeline instances in parallel using a csv config file
      • run Illumina bcl2fastq v2.17
      Detailed release notes are available here.
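
The two read-level operations named in the v2.2 notes (locating a barcode while allowing mismatches, then trimming or masking it with N; and poly-base trimming from either end of a read) can be illustrated with a short Python sketch. This is not the pipeline's own code: the function names, the Hamming-distance matching, the default thresholds, and the choice to trim the barcode together with everything before it are all assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_barcode(read: str, barcode: str, max_mismatches: int = 1) -> int:
    """Return the leftmost index where the barcode matches within the
    mismatch budget, or -1 if no window qualifies (hypothetical helper)."""
    k = len(barcode)
    for i in range(len(read) - k + 1):
        if hamming(read[i:i + k], barcode) <= max_mismatches:
            return i
    return -1


def mask_barcode(read: str, barcode: str, max_mismatches: int = 1) -> str:
    """Replace the matched barcode window with N, keeping read length."""
    i = find_barcode(read, barcode, max_mismatches)
    if i < 0:
        return read
    return read[:i] + "N" * len(barcode) + read[i + len(barcode):]


def trim_barcode(read: str, barcode: str, max_mismatches: int = 1) -> str:
    """Assumption: trimming drops the barcode and all bases before it."""
    i = find_barcode(read, barcode, max_mismatches)
    if i < 0:
        return read
    return read[i + len(barcode):]


def trim_poly_base(read: str, min_run: int = 10) -> str:
    """Strip a run of one repeated base from either end of the read
    when the run is at least min_run bases long."""
    if read:
        stripped = read.rstrip(read[-1])  # run at the 3' end
        if len(read) - len(stripped) >= min_run:
            read = stripped
    if read:
        stripped = read.lstrip(read[0])   # run at the 5' end
        if len(read) - len(stripped) >= min_run:
            read = stripped
    return read
```

Masking keeps the read length unchanged, which can matter when downstream steps expect fixed-length reads; trimming shortens the read instead.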

    • Sep 30, 2016 - PennSCAP-T v2.1
      Notable changes:
      • Renamed various internal and output files
      • BLAST: don’t resample reads if BLAST has already been run; numerous tweaks
      • BOWTIE: allow for single end
      • PIPELINE: added more parameters to simplify run types; bug fixes
      • POST: allow setting of GUID
      • STAR: updated for STAR 2.4.0 or later; tweaks and bug fixes
      • HTSEQ: exon and intron counting by intersection-strict
      • STATS: added simple (i.e. quiet) mode for output
      • TRIM: allow for parallelized trimming
      • VERSE: added Verse module
      • bug fixes, minor tweaks
      • included more target species, bug fixes
      • allow for parallelized trimming
      • added to merge gene count files based on intersection or union operators
      Detailed release notes are available here.
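
Merging gene count files by intersection or union, as mentioned in the v2.1 notes, can be sketched in Python as follows. This is an illustrative sketch, not the pipeline's implementation: it assumes each count file has already been parsed into a gene-to-count dictionary, that the operator decides which genes are kept across files, and that a gene missing from a file is reported as 0 under union. The function name `merge_counts` is hypothetical.

```python
from typing import Dict, List


def merge_counts(tables: List[Dict[str, int]],
                 how: str = "union") -> Dict[str, List[int]]:
    """Merge per-sample gene count tables into one table.

    how="union" keeps every gene seen in any table (missing counts become 0);
    how="intersection" keeps only genes present in all tables.
    """
    if how not in ("union", "intersection"):
        raise ValueError("how must be 'union' or 'intersection'")
    if not tables:
        return {}
    key_sets = [set(t) for t in tables]
    if how == "union":
        genes = set.union(*key_sets)
    else:
        genes = set.intersection(*key_sets)
    # One row per gene, one column per input table, in input order.
    return {g: [t.get(g, 0) for t in tables] for g in sorted(genes)}
```

For example, merging `{"g1": 5, "g2": 0}` with `{"g2": 3, "g3": 7}` keeps all three genes under union and only `g2` under intersection.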

    • Jan 20, 2015 - PennSCAP-T v2.0
      Notable changes:
      • updated STAR parameters
      • updated HTSeq parameters
      • improved data provenance
      • improved QC output
      Detailed release notes are available here.

    • Feb 26, 2014 - PennSCAP-T v1.0
      Initial release.