Pipeline Tutorial

Welcome to the CARLISLE Pipeline Tutorial! This guide walks you through running the CARLISLE pipeline using the provided test dataset on the NIH Biowulf HPC environment.


Getting Started

Before beginning, review the Getting Started Guide for installation, environment setup, and dependency loading instructions.


Step 1. Set Your Working Directory

Navigate to your project directory on Biowulf:

cd /path/to/your/project/directory

Step 2. Initialize the Pipeline

Load the ccbrpipeliner module, which provides the carlisle command, and initialize your working directory:

module load ccbrpipeliner
carlisle --runmode=init --workdir=/path/to/output/dir

This command copies the required configuration, manifest, and Snakefiles into your chosen output directory (WORKDIR). Initialization must be done before any other CARLISLE operation.
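As a quick sanity check after initialization, the sketch below tests that WORKDIR exists and is not empty. It assumes only what is stated above, namely that initialization copies files into the directory. The function name workdir_is_populated is hypothetical and is not part of CARLISLE.

```shell
# Hypothetical helper (not part of CARLISLE): succeeds only if the given
# directory exists and contains at least one entry, hidden files included.
workdir_is_populated() {
    dir="$1"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Example call, using the placeholder path from this tutorial:
# workdir_is_populated /path/to/output/dir \
#     && echo "WORKDIR looks initialized" \
#     || echo "WORKDIR is missing or empty; run --runmode=init first"
```

The check does not inspect individual files, so it cannot tell a complete initialization from a partial one.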


Submitting the Test Data

The test dataset provided with CARLISLE enables you to validate the installation and confirm correct execution. The test includes minimal FASTQ files, configurations, and manifests.

Step 3. Run the Test Command

Execute the built-in test run to validate pipeline functionality:

carlisle --runmode=runtest --workdir=/path/to/output/dir

This command prepares the test data, performs a dry-run to validate workflow dependencies, and then submits the pipeline to the Biowulf SLURM cluster.
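Once the pipeline has been submitted, you may want to confirm that jobs are in the queue. The sketch below wraps the standard SLURM squeue command. The wrapper name show_my_jobs is hypothetical, and the fallback message is only an illustration of how to handle a node where SLURM is unavailable.

```shell
# Hypothetical wrapper (not part of CARLISLE): list the current user's
# SLURM jobs, or explain why that is not possible on this machine.
show_my_jobs() {
    if command -v squeue >/dev/null 2>&1; then
        # -u restricts the listing to one user's jobs
        squeue -u "${USER:-$(id -un)}"
    else
        echo "squeue not found: run this on a SLURM login node" >&2
        return 1
    fi
}

# Example call:
# show_my_jobs
```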


Expected Output

During a successful test run, you should see a job summary similar to the one below, listing the number of jobs per Snakemake rule and the minimum and maximum threads assigned to each:

Job stats:
job                                 count    min threads    max threads
--------------------------------  -------  -------------  -------------
DESeq                                  24              1              1
align                                   9             56             56
alignstats                              9              2              2
all                                     1              1              1
bam2bg                                  9              4              4
create_contrast_data_files             24              1              1
create_contrast_peakcaller_files       12              1              1
create_reference                        1             32             32
create_replicate_sample_table           1              1              1
diffbb                                 24              1              1
filter                                 18              2              2
findMotif                              96              6              6
gather_alignstats                       1              1              1
go_enrichment                          12              1              1
gopeaks_broad                          16              2              2
gopeaks_narrow                         16              2              2
macs2_broad                            16              2              2
macs2_narrow                           16              2              2
make_counts_matrix                     24              1              1
multiqc                                 2              1              1
qc_fastqc                               9              1              1
rose                                   96              2              2
seacr_relaxed                          16              2              2
seacr_stringent                        16              2              2
spikein_assessment                      1              1              1
trim                                    9             56             56
total                                 478              1             56

💡 Tip: This job summary confirms successful rule execution, resource allocation, and workflow orchestration.
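
One way to cross-check a job summary like the one above is to add up the per-rule counts and compare the result with the total row. The sketch below does this with awk, assuming the table has been saved to a text file in the layout shown. The function name check_job_stats and the file name job_stats.txt are hypothetical.

```shell
# Hypothetical helper (not part of CARLISLE): read a "Job stats" table on
# stdin, sum the per-rule counts, and compare the sum with the "total" row.
check_job_stats() {
    awk '
        # Data rows have exactly four fields and a numeric count in field 2;
        # this skips the "Job stats:" line, the header, and the dashed rule.
        NF == 4 && $2 ~ /^[0-9]+$/ {
            if ($1 == "total") reported = $2
            else sum += $2
        }
        END {
            printf "sum=%d total=%d\n", sum, reported
            # Exit status 0 when the counts agree, 1 otherwise
            exit (sum == reported ? 0 : 1)
        }
    '
}

# Example call:
# check_job_stats < job_stats.txt
```

For the table shown above, the per-rule counts add up to 478, which matches the total row.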