Penn SCAP-T Pipeline
Maintained by Stephen Fisher | License (pdf)
The PennSCAP-T Pipeline automates the processing of next-generation sequencing data. Each processing step in the primary analysis pipeline (e.g. sequence trimming, aligning) is managed as a separate module file. The master script 'ngs.sh' runs the module files. Module files can be run on their own, or the 'ngs_PIPELINE.sh' module file can be used to link processing steps into a complete workflow.
This pipeline was designed in part through the efforts of the NIH Single Cell Analysis Program - Transcriptome (SCAP-T) project and was used to process the SCAP-T RNAseq data.
Documentation...
Update Log
- Sep 14, 2017 - PennSCAP-T v2.2
Notable changes:
- Adds an option for a pipeline that keeps a smaller, less comprehensive set of output files
- kmerFind.py: locates reads with a specific barcode, allowing for mismatches. Barcoded reads can either be trimmed or masked with N
- BARCODE: pipeline module wrapper for kmerFind.py
- TRIM: added poly-base trimming, which removes long runs of the same base from either end of the read; added parallelized trimming
- BLASTQC: allow for short reads (shorter than 50bp)
- launcher.sh: allow for running pre-configured pipeline instances in parallel using a csv config file
- run_bcl2fastq.sh: run Illumina bcl2fastq v2.17
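The barcode handling described for kmerFind.py in v2.2 can be illustrated with a minimal sketch. This is not the code from kmerFind.py: the function names, the use of Hamming distance (substitutions only), the leftmost-match rule, and the choice to trim the barcode together with everything upstream of it are all assumptions made for illustration.

```python
def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_barcode(read, barcode, max_mismatches=1):
    """Return the leftmost index where `barcode` matches `read` with at
    most `max_mismatches` substitutions, or -1 if there is no such match.
    (Hypothetical helper; indels are not considered.)"""
    k = len(barcode)
    for i in range(len(read) - k + 1):
        if hamming(read[i:i + k], barcode) <= max_mismatches:
            return i
    return -1


def mask_barcode(read, barcode, max_mismatches=1):
    """Replace the matched barcode with N, keeping the read length."""
    i = find_barcode(read, barcode, max_mismatches)
    if i < 0:
        return read  # no barcode found; leave the read unchanged
    return read[:i] + "N" * len(barcode) + read[i + len(barcode):]


def trim_barcode(read, barcode, max_mismatches=1):
    """Remove the matched barcode and everything before it.
    (Assumption: the real script may trim differently.)"""
    i = find_barcode(read, barcode, max_mismatches)
    if i < 0:
        return read
    return read[i + len(barcode):]
```

Masking preserves read length and coordinates, which can matter for downstream tools that expect fixed-length reads, while trimming yields shorter reads that contain no barcode sequence.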
- Sep 30, 2016 - PennSCAP-T v2.1
Notable changes:
- Renamed various internal and output files
- BLAST: don’t resample reads if BLAST has already been run; numerous tweaks
- BOWTIE: allow for single-end reads
- PIPELINE: added more parameters to simplify run types; bug fixes
- POST: allow setting of GUID
- STAR: updated for STAR 2.4.0 or later; tweaks and bug fixes
- HTSEQ: exon and intron counting by intersection-strict
- STATS: added simple (i.e. quiet) mode for output
- TRIM: allow for parallelized trimming
- VERSE: added Verse module
- ngs.sh: bug fixes, minor tweaks
- parseBlast.py: included more target species, bug fixes
- trimReads.py: allow for parallelized trimming
- combineGeneCounts.py: added to merge gene count files based on intersection or union operators
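The intersection and union merge modes mentioned for combineGeneCounts.py can be sketched as follows. This is an illustration only, not the script's actual code: the function name, the in-memory dictionary input, and the choice to fill genes missing from a sample with a count of 0 in union mode are assumptions.

```python
def combine_gene_counts(samples, how="union"):
    """Merge per-sample gene counts into one table.

    samples: dict mapping sample name -> dict mapping gene -> count
    how:     "union" keeps every gene seen in any sample;
             "intersection" keeps only genes present in all samples.
    Returns a dict mapping gene -> dict mapping sample name -> count.
    (Hypothetical interface; missing counts are assumed to be 0.)
    """
    gene_sets = [set(counts) for counts in samples.values()]
    if not gene_sets:
        return {}
    if how == "union":
        genes = set().union(*gene_sets)
    elif how == "intersection":
        genes = set.intersection(*gene_sets)
    else:
        raise ValueError("how must be 'union' or 'intersection'")
    return {
        gene: {name: counts.get(gene, 0) for name, counts in samples.items()}
        for gene in sorted(genes)
    }
```

Union mode is useful when samples were counted against slightly different annotations and no gene should be dropped; intersection mode yields a table in which every gene has a count reported in every sample.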
- Jan 20, 2015 - PennSCAP-T v2.0
Notable changes:
- updated STAR parameters
- updated HTSeq parameters
- improved data provenance
- improved QC output
- Feb 26, 2014 - PennSCAP-T v1.0
Initial release.