latest news

06.29.2019

VisCello: for visualization of single-cell data.

access info ...

06.27.2018

Sample data provenance from 1,347 RNAseq samples.

access info ...

06.07.2018

ORNASEQ: Ontology for RNA sequencing.

access info ...

Penn SCAP-T Pipeline

Maintained by Stephen Fisher. License (pdf)

The PennSCAP-T Pipeline automates the processing of next-generation sequencing data. Each processing step in the primary analysis pipeline (e.g. sequence trimming or alignment) is managed as a separate module file. The master script 'ngs.sh' runs the module files. A module file can be run on its own, or the 'ngs_PIPELINE.sh' module file can be used to link processing steps into a complete workflow.
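The module layout described above can be illustrated with a minimal Python sketch. This is not the pipeline's code: the real modules are shell module files run by 'ngs.sh', and the step functions, the ALIGN name, and the file-name suffixes below are hypothetical stand-ins chosen only to show how independent steps can be run alone or chained.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for module files (assumption: each step takes a
# sample name and returns the name of its output).
def trim(sample: str) -> str:
    return f"{sample}.trimmed"

def align(sample: str) -> str:
    return f"{sample}.aligned"

# Registry mapping a module name to the step it runs.
MODULES: Dict[str, Callable[[str], str]] = {"TRIM": trim, "ALIGN": align}

def run_module(name: str, sample: str) -> str:
    """Run a single module on its own."""
    if name not in MODULES:
        raise ValueError(f"unknown module: {name}")
    return MODULES[name](sample)

def run_pipeline(sample: str, steps: List[str]) -> str:
    """Link modules into a workflow: each step's output feeds the next."""
    for step in steps:
        sample = run_module(step, sample)
    return sample
```

Keeping each step behind the same interface is what lets a step be tested alone and then reused unchanged inside a longer workflow.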

This pipeline was designed in part through the efforts of the NIH Single Cell Analysis Program - Transcriptome (SCAP-T) project and was used to process the SCAP-T RNAseq data.

Documentation...

Update Log

    • Sep 14, 2017 - PennSCAP-T v2.2
      Notable changes:
      • Adds an option for a pipeline that keeps a smaller, less comprehensive set of output files
      • kmerFind.py: locates reads with a specific barcode, allowing for mismatches. Barcoded reads can either be trimmed or masked with N
      • BARCODE: pipeline module wrapper for kmerFind.py
      • TRIM: added poly-base trimming (removal of long runs of the same base from either end of the read) and parallelized trimming
      • BLASTQC: allow for short reads (shorter than 50 bp)
      • launcher.sh: allow for running pre-configured pipeline instances in parallel using a CSV config file
      • run_bcl2fastq.sh: run Illumina bcl2fastq v2.17
      Detailed release notes are available here.
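Two of the v2.2 changes listed above, barcode location with mismatches and poly-base trimming, can be sketched in a few lines of Python. This is an illustrative sketch, not the code in kmerFind.py or the TRIM module: the function names, the Hamming-distance matching, the default thresholds, and the choice to trim everything up to the end of the barcode are all assumptions.

```python
from typing import Optional

def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_barcode(read: str, barcode: str, max_mismatches: int = 1) -> Optional[int]:
    """Return the first offset where the barcode matches within
    max_mismatches substitutions, or None if there is no such offset."""
    k = len(barcode)
    for i in range(len(read) - k + 1):
        if hamming(read[i:i + k], barcode) <= max_mismatches:
            return i
    return None

def mask_barcode(read: str, barcode: str, max_mismatches: int = 1) -> str:
    """Replace the matched barcode with N, leaving read length unchanged."""
    pos = find_barcode(read, barcode, max_mismatches)
    if pos is None:
        return read
    return read[:pos] + "N" * len(barcode) + read[pos + len(barcode):]

def trim_barcode(read: str, barcode: str, max_mismatches: int = 1) -> str:
    """Assumed trimming rule: drop the barcode and everything before it."""
    pos = find_barcode(read, barcode, max_mismatches)
    if pos is None:
        return read
    return read[pos + len(barcode):]

def trim_poly_base(read: str, min_run: int = 10) -> str:
    """Remove a run of one repeated base from either end of the read
    when that run is at least min_run bases long."""
    if read:
        tail = len(read) - len(read.rstrip(read[-1]))
        if tail >= min_run:
            read = read[:-tail]
    if read:
        head = len(read) - len(read.lstrip(read[0]))
        if head >= min_run:
            read = read[head:]
    return read
```

Masking keeps read coordinates intact for downstream steps, while trimming shortens the read; which one is appropriate depends on how the reads are used afterwards.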

    • Sep 30, 2016 - PennSCAP-T v2.1
      Notable changes:
      • Renamed various internal and output files
      • BLAST: don’t resample reads if BLAST has already been run; numerous tweaks
      • BOWTIE: allow for single-end reads
      • PIPELINE: added more parameters to simplify run types; bug fixes
      • POST: allow setting of GUID
      • STAR: updated for STAR 2.4.0 or later; tweaks and bug fixes
      • HTSEQ: exon and intron counting using intersection-strict mode
      • STATS: added simple (i.e. quiet) mode for output
      • TRIM: allow for parallelized trimming
      • VERSE: added VERSE module
      • ngs.sh: bug fixes, minor tweaks
      • parseBlast.py: included more target species, bug fixes
      • trimReads.py: allow for parallelized trimming
      • combineGeneCounts.py: added to merge gene count files based on intersection or union operators
      Detailed release notes are available here.
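The merge behavior described for combineGeneCounts.py in the list above (combining gene count files by intersection or union) can be sketched as follows. This is a minimal illustration, not the script itself: the function name, the in-memory dictionaries standing in for count files, and the choice to fill missing genes with 0 under union are assumptions.

```python
from typing import Dict, List

def combine_gene_counts(samples: List[Dict[str, int]],
                        how: str = "union") -> Dict[str, List[int]]:
    """Merge per-sample gene counts into one table.

    how="union" keeps every gene seen in any sample (missing counts
    become 0, an assumption); how="intersection" keeps only genes
    present in all samples.
    """
    if how not in ("union", "intersection"):
        raise ValueError(f"unsupported operator: {how}")
    if not samples:
        return {}
    key_sets = [set(s) for s in samples]
    if how == "union":
        genes = set.union(*key_sets)
    else:
        genes = set.intersection(*key_sets)
    # One row per gene, one column per sample, in input order.
    return {g: [s.get(g, 0) for s in samples] for g in sorted(genes)}
```

For example, with two samples that share only one gene, the union result has a row for every gene, while the intersection result has a row only for the shared gene.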

    • Jan 20, 2015 - PennSCAP-T v2.0
      Notable changes:
      • updated STAR parameters
      • updated HTSeq parameters
      • improved data provenance
      • improved QC output
      Detailed release notes are available here.

    • Feb 26, 2014 - PennSCAP-T v1.0
      Initial release.