Getting Started with CHAMPAGNE¶
Installation¶
For Biowulf users, CHAMPAGNE is installed in the ccbrpipeliner module. There's no need to perform any other installation steps.
If you'd like to run the pipeline in a different execution environment, see how to run the Nextflow pipeline directly.
CHAMPAGNE depends on Nextflow version 25 or later and Singularity or Docker.
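Outside Biowulf, it can help to confirm these dependencies are on your PATH before going further. The following is a minimal sketch, not part of CHAMPAGNE itself; `have_cmd` is a hypothetical helper name, and the version check is left to the reader by printing what `nextflow -version` reports.

```shell
# Hypothetical helper: succeed if a command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Nextflow is required; print its version so you can confirm it is 25 or later.
if have_cmd nextflow; then
  nextflow -version || echo "could not query the nextflow version"
else
  echo "nextflow not found on PATH"
fi

# At least one container runtime (Singularity or Docker) is needed.
if have_cmd singularity || have_cmd docker; then
  echo "container runtime found"
else
  echo "neither singularity nor docker found on PATH"
fi
```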
Biowulf¶
CHAMPAGNE is available on Biowulf in the ccbrpipeliner module. You'll first need to start an interactive session, then load the module:
# start an interactive node
sinteractive --mem=2g --cpus-per-task=2 --gres=lscratch:200
# load the ccbrpipeliner module
module load ccbrpipeliner
Help¶
Run champagne --help to see the available commands and options.
Usage: champagne [OPTIONS] COMMAND [ARGS]...
CHromAtin iMmuno PrecipitAtion sequencinG aNalysis pipEline
docs: https://ccbr.github.io/CHAMPAGNE
For more options, run: champagne [command] --help
Options:
-v, --version Show the version and exit.
--citation Print the citation in bibtex format and exit.
-h, --help Show this message and exit.
Commands:
run Run the workflow
init Initialize the launch directory
Initialize¶
Initialize your project directory:
champagne init --output /data/$USER/champagne_project
If you omit --output, your current working directory is used by default:
champagne init
Prepare input files¶
Sample manifest¶
This CSV file contains information about the samples to be processed. It is passed to the input parameter.
The following columns are required:
sample: sample ID; values in this column do not need to be unique.
rep: replicate number of the sample ID; values in this column do not need to be unique.
fastq_1: absolute path to R1 of the sample ID.
fastq_2: absolute path to R2 of the sample ID (optional, only for paired-end reads).
antibody: name of the antibody used for the sample.
input: the sample ID of the input control; this must match a sample in the sheet.
Example for a single-end project:
samplesheet.csv
sample,rep,fastq_1,fastq_2,antibody,input
sampleA,1,/path/to/sample_1.R1.fastq.gz,,Ab,inputA
sampleA,2,/path/to/sample_2.R1.fastq.gz,,Ab,inputA
inputA,1,/path/to/input_1.R1.fastq.gz,,,
inputA,2,/path/to/input_2.R1.fastq.gz,,,
Example for a paired-end project:
samplesheet.csv
sample,rep,fastq_1,fastq_2,antibody,input
sample1,1,/path/to/sample_1.R1.fastq.gz,/path/to/sample_1.R2.fastq.gz,Ab,input1
sample1,2,/path/to/sample_2.R1.fastq.gz,/path/to/sample_2.R2.fastq.gz,Ab,input1
input1,1,/path/to/input_1.R1.fastq.gz,/path/to/input_1.R2.fastq.gz,,
input1,2,/path/to/input_2.R1.fastq.gz,/path/to/input_2.R2.fastq.gz,,
For more examples, view the sample sheet files in the assets/ directory on GitHub.
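Because every value in the input column must match a sample in the sheet, a quick check before launching can catch typos early. The sketch below is illustrative only and is not part of CHAMPAGNE: `missing_inputs` is a hypothetical helper, and it assumes a simple CSV with no quoted commas, a header row, and the input control in the sixth column.

```shell
# Hypothetical helper: list input-control IDs that match no sample ID in the sheet.
# Assumes column 1 is the sample ID and column 6 is the input control.
missing_inputs() {
  awk -F',' '
    # first pass over the file: collect sample IDs, skipping the header
    NR == FNR { if (FNR > 1) samples[$1] = 1; next }
    # second pass: report non-empty controls that are not known samples
    FNR > 1 && $6 != "" && !($6 in samples) { print $6 }
  ' "$1" "$1" | sort -u
}

# Demonstration on a small temporary sheet with placeholder paths.
sheet=$(mktemp)
cat > "$sheet" <<'EOF'
sample,rep,fastq_1,fastq_2,antibody,input
sample1,1,/path/to/sample_1.R1.fastq.gz,/path/to/sample_1.R2.fastq.gz,Ab,input1
input1,1,/path/to/input_1.R1.fastq.gz,/path/to/input_1.R2.fastq.gz,,
EOF

# Prints nothing when every control is present in the sheet.
missing_inputs "$sheet"
```

Any ID this prints is referenced as a control but never defined as a sample.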
Contrasts (optional)¶
Contrasts are specified in a TSV file that is passed to the contrasts parameter. Each row is a unique contrast for differential analysis.
Columns:
contrast_name: name of the contrast. Must be unique and contain no spaces.
group1: comma-separated list of sample IDs in group 1.
group2: comma-separated list of sample IDs in group 2.
The following is an example contrast file with two contrasts:
assets/contrasts_full_mm10.tsv
contrast_name group1 group2
antibody CTCF_ChIP_macrophage_p20,CTCF_ChIP_MEF_p20 CTCF_ChIP_macrophage_p3
celltype_macrophage_vs_fibroblast CTCF_ChIP_macrophage_p20,CTCF_ChIP_macrophage_p3 CTCF_ChIP_MEF_p20
The sample sheet for this dataset contains all of the sample IDs specified in the contrasts file.
assets/samplesheet_full_mm10.csv
sample,rep,fastq_1,fastq_2,antibody,input
CTCF_ChIP_macrophage_p20,1,/data/CCBR_Pipeliner/testdata/chipseq/SRR3081748_1.fastq.gz,,CTCF,WCE_p20
CTCF_ChIP_macrophage_p20,2,/data/CCBR_Pipeliner/testdata/chipseq/SRR3081749_1.fastq.gz,,CTCF,WCE_p20
CTCF_ChIP_macrophage_p3,1,/data/CCBR_Pipeliner/testdata/chipseq/SRR3081750_1.fastq.gz,,CTCF,WCE_p3
CTCF_ChIP_macrophage_p3,2,/data/CCBR_Pipeliner/testdata/chipseq/SRR3081751_1.fastq.gz,,CTCF,WCE_p3
CTCF_ChIP_MEF_p20,1,/data/CCBR_Pipeliner/testdata/chipseq/SRR3081752_1.fastq.gz,,CTCF,WCE_p20
CTCF_ChIP_MEF_p20,2,/data/CCBR_Pipeliner/testdata/chipseq/SRR3081753_1.fastq.gz,,CTCF,WCE_p20
WCE_p3,,/data/CCBR_Pipeliner/testdata/chipseq/SRR3081772_1.fastq.gz,,,
WCE_p20,,/data/CCBR_Pipeliner/testdata/chipseq/SRR3081773_1.fastq.gz,,,
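Since the contrasts file refers to sample IDs defined in the sample sheet, the two files can be cross-checked before a run. This is a rough sketch under stated assumptions, not a CHAMPAGNE feature: `missing_contrast_samples` is a hypothetical helper, the contrasts file is assumed to be tab-separated with a header row, and the sample sheet is assumed to be a simple CSV with the sample ID in the first column.

```shell
# Hypothetical helper: print "contrast: sample" for each contrast sample ID
# that is absent from the sample sheet.
# Usage: missing_contrast_samples contrasts.tsv samplesheet.csv
missing_contrast_samples() {
  awk '
    NR == FNR {                      # first file: the sample sheet (CSV)
      split($0, f, ",")
      if (FNR > 1) samples[f[1]] = 1
      next
    }
    FNR > 1 {                        # second file: the contrasts (TSV)
      split($0, cols, "\t")
      n = split(cols[2] "," cols[3], ids, ",")
      for (i = 1; i <= n; i++)
        if (ids[i] != "" && !(ids[i] in samples)) print cols[1] ": " ids[i]
    }
  ' "$2" "$1"
}

# Demonstration with small temporary files and placeholder paths.
sheet=$(mktemp)
contrasts=$(mktemp)
printf 'sample,rep,fastq_1,fastq_2,antibody,input\nA,1,/path/to/A.fastq.gz,,Ab,\nB,1,/path/to/B.fastq.gz,,Ab,\n' > "$sheet"
printf 'contrast_name\tgroup1\tgroup2\nA_vs_B\tA\tB\n' > "$contrasts"

# Prints nothing here because both A and B appear in the sheet.
missing_contrast_samples "$contrasts" "$sheet"
```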
Parameters file (optional)¶
You can create a YAML file with the parameters you want to set. This is useful for managing multiple parameters or for sharing configurations with others. Here's an example YAML file with some common parameters:
assets/params.yml
input: './assets/samplesheet_full_mm10.csv'
contrasts: './assets/contrasts_full_mm10.tsv'
genome: mm10
run_gem: false
run_chipseeker: false
run_qc: true
You will then pass this file to the -params-file option when running the pipeline:
champagne run --output /data/$USER/champagne_project \
-params-file assets/params.yml
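A mistyped path in the parameters file is easy to miss, so it may be worth confirming that the referenced files exist before launching. The sketch below is only an illustration: `yaml_value` is a hypothetical helper that handles flat `key: value` lines and nothing more, and checking relative paths against the current directory is an assumption rather than documented CHAMPAGNE behavior.

```shell
# Hypothetical helper: print the value of a top-level key in a flat YAML file.
# Only handles simple "key: value" lines with optional single or double quotes.
yaml_value() {
  sed -n "s/^$2:[[:space:]]*//p" "$1" \
    | sed "s/^['\"]//; s/['\"][[:space:]]*\$//" \
    | head -n 1
}

# Demonstration on a temporary params file modeled on the example above.
params=$(mktemp)
cat > "$params" <<'EOF'
input: './assets/samplesheet_full_mm10.csv'
contrasts: './assets/contrasts_full_mm10.tsv'
genome: mm10
EOF

# Warn about any referenced file that cannot be found from the current directory.
for key in input contrasts; do
  path=$(yaml_value "$params" "$key")
  [ -f "$path" ] || echo "warning: $key file not found: $path"
done
```

A real YAML parser is the safer choice for anything beyond flat files like this one.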
Run¶
champagne run is the main command to run the pipeline. Here's the output of champagne run --help:
Usage: champagne run [OPTIONS] [NEXTFLOW_ARGS]...
Run the workflow
Note: you must first run `champagne init --output <output_dir>` to
initialize the output directory.
docs: https://ccbr.github.io/CHAMPAGNE
Options:
--output DIRECTORY Output directory path for champagne init & run.
Equivalent to nextflow launchDir. Defaults to your
current working directory.
--mode TEXT Run mode (slurm, local) [default: slurm]
-F, --forceall Force all processes to run (i.e. do not use nextflow
-resume)
-h, --help Show this message and exit.
Nextflow options:
-profile <profile> Nextflow profile to use (e.g. test)
-params-file <file> Nextflow params file to use (e.g. assets/params.yml)
-preview Preview the processes that will run without executing them
EXAMPLES:
Execute with slurm:
champagne run --output path/to/outdir --mode slurm
Preview the processes that will run:
champagne run --output path/to/outdir --mode local -preview
Add nextflow args (anything supported by `nextflow run`):
champagne run --output path/to/outdir --mode slurm -profile test
champagne run --output path/to/outdir --mode slurm -profile test -params-file assets/params.yml
Any Nextflow argument can also be passed to champagne run, such as -profile, -preview, or -params-file. These are always prefixed with a single hyphen.
Pipeline parameters can also be passed via the command line. These are always prefixed with a double hyphen.
Preview¶
Run a local preview:
champagne run \
--output /data/$USER/champagne_project \
--input assets/samplesheet_test_mm10.csv \
--contrasts assets/contrasts_test_mm10.tsv \
--genome mm10 \
--mode local \
-preview
Stub run¶
Launch a local stub run to view processes that will run, output blank files, and download containers:
champagne run \
--output /data/$USER/champagne_project \
--input assets/samplesheet_test_mm10.csv \
--contrasts assets/contrasts_test_mm10.tsv \
--genome mm10 \
--mode local \
-stub
Run with slurm¶
Launch a pipeline run with slurm:
champagne run \
--output /data/$USER/champagne_project \
--input assets/samplesheet_test_mm10.csv \
--contrasts assets/contrasts_test_mm10.tsv \
--genome mm10 \
--mode slurm