Penn SCAP-T Pipeline: Documentation


Module: BARCODE

Runs kmerFind.py to select reads with a specific barcode sequence, optionally trimming with -c.

Usage:
	ngs.sh barcode [-i inputDir] [-t numProc] [-p] [-c contaminantsFile] [-m minLen] [-q phredThreshold] [-rN] [-rAT numBases] [-se] sampleID
Input:
	/lab/repo/resources/trim/contaminants.fa (file containing contaminants)
	sampleID/inputDir/unaligned_1.fq
	sampleID/inputDir/unaligned_2.fq (paired-end reads)
Output:
	sampleID/trim/unaligned_1.fq
	sampleID/trim/unaligned_2.fq (paired-end reads)
	sampleID/trim/sampleID.trim.stats.txt
	sampleID/trim/contaminants.fa (contaminants file)
Requires:
	trimReads.py ( https://github.com/safisher/ngs )
Options:
	-i inputDir - location of source files (default: init).
	-t numProc - maximum number of CPUs to use.
	-c - Cut reads at the barcode point. Read 1 will be cut from the barcode to the 3' end; read 2 will be cut from the barcode to the 5' end.
	-m - Mask the barcode portion of reads. Sequences on both sides of the barcode will remain unchanged.
	-l length - length of reads
	-pre prefix - sequence to be inserted before barcode sequence (default: none)
	-suf suffix - sequence to be inserted after barcode sequence (default: none)
	-m [AND|OR] - if both a prefix and a suffix are given, whether to look for reads with [prefix][barcode][suffix] (AND), or reads with either [prefix][barcode] or [barcode][suffix] (OR).
	-se - single-end reads (default: paired-end)
Selected/trimmed data is placed in 'sampleID/barcode.trim'.
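
The prefix/suffix selection rule and the masking option described above can be illustrated with a minimal Python sketch. This is not the code of kmerFind.py; the function names (build_patterns, locate_barcode, mask_barcode) are hypothetical, matching is assumed to be exact substring search, and the mask character 'N' is an assumption, since the documentation does not say what masked bases are replaced with.

```python
def build_patterns(barcode, prefix="", suffix="", mode="AND"):
    """Return (sequence, barcode_offset) pairs to search for.

    Hypothetical helper: with both prefix and suffix given, AND yields
    [prefix][barcode][suffix]; OR yields [prefix][barcode] and
    [barcode][suffix]. barcode_offset is where the barcode starts
    inside the returned sequence.
    """
    if prefix and suffix:
        if mode == "AND":
            return [(prefix + barcode + suffix, len(prefix))]
        if mode == "OR":
            return [(prefix + barcode, len(prefix)), (barcode + suffix, 0)]
        raise ValueError("mode must be 'AND' or 'OR'")
    # With at most one flanking sequence there is a single candidate.
    return [(prefix + barcode + suffix, len(prefix))]


def locate_barcode(read, barcode, patterns):
    """Return (start, end) of the barcode in read, or None if no pattern hits.

    Assumption: exact matching, first hit wins.
    """
    for pattern, offset in patterns:
        hit = read.find(pattern)
        if hit != -1:
            start = hit + offset
            return start, start + len(barcode)
    return None


def mask_barcode(read, span, mask_char="N"):
    """Replace only the barcode bases; flanking sequence is unchanged.

    Assumption: 'N' as the mask character.
    """
    start, end = span
    return read[:start] + mask_char * (end - start) + read[end:]


if __name__ == "__main__":
    barcode, prefix, suffix = "ACGT", "GG", "TT"
    reads = ["AAGGACGTTTCC", "AAGGACGTAACC", "AACCACGTTTCC", "AAAAAAAAAAAA"]
    for mode in ("AND", "OR"):
        patterns = build_patterns(barcode, prefix, suffix, mode)
        for read in reads:
            span = locate_barcode(read, barcode, patterns)
            result = mask_barcode(read, span) if span else "(not selected)"
            print(mode, read, result)
```

In this sketch, a read carrying only [prefix][barcode] or only [barcode][suffix] is rejected under AND and selected under OR, which is the distinction the option text draws.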