Output

When you initialize champagne, the project output directory (specified with --output) is populated with assets/, conf/, and nextflow.config. After running the pipeline, the output directory will also contain results/, log/, work/, and submit_slurm.sh (if using --mode slurm).

/data/$USER/champagne_project
├── assets/
├── conf/
├── log/
├── nextflow.config
├── results/
├── submit_slurm.sh
└── work/

Path	Description
`assets/`	Example sample sheets, contrasts, and other input files.
`conf/`	Configuration files for the pipeline.
`log/`	Log files for the pipeline run, including a SLURM log if `--mode slurm` is used.
`nextflow.config`	The Nextflow configuration file for the pipeline.
`results/`	Result files from the pipeline run.
`submit_slurm.sh`	A script to submit the pipeline run to SLURM (only created if `--mode slurm` is used).
`work/`	The Nextflow working directory, which holds intermediate files.
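The layout above can be checked programmatically. The following is a minimal sketch and is not part of champagne: the function names `summarize_output_dir` and `project_state` are hypothetical, and the only thing taken from this page is the list of entry names.

```python
from pathlib import Path

# Entries created at initialization vs. after a pipeline run, as listed above.
INIT_ENTRIES = ["assets", "conf", "nextflow.config"]
RUN_ENTRIES = ["results", "log", "work"]
OPTIONAL_ENTRIES = ["submit_slurm.sh"]  # only present with --mode slurm


def summarize_output_dir(output_dir):
    """Return a dict mapping each expected entry name to whether it exists."""
    root = Path(output_dir)
    names = INIT_ENTRIES + RUN_ENTRIES + OPTIONAL_ENTRIES
    return {name: (root / name).exists() for name in names}


def project_state(output_dir):
    """Classify a directory as 'uninitialized', 'initialized', or 'run'."""
    status = summarize_output_dir(output_dir)
    if not all(status[name] for name in INIT_ENTRIES):
        return "uninitialized"
    if all(status[name] for name in RUN_ENTRIES):
        return "run"
    return "initialized"
```

Calling `project_state` on your `--output` directory returns `"initialized"` once the initialization entries exist and `"run"` once `results/`, `log/`, and `work/` are also present. `submit_slurm.sh` is reported but never required, because it only exists when `--mode slurm` is used.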

Results

By default, workflow output files are written to the results/ directory inside your pipeline run output directory.

For example, if you ran champagne with champagne run --output /data/$USER/champagne_project, the results files will be in /data/$USER/champagne_project/results/.

All paths listed below are relative to the results/ directory.

Pipeline information

pipeline_info/ contains information about the pipeline run.

pipeline_info/bco.json is the BioCompute Object (BCO) for the pipeline run, which is a standardized format for sharing computational workflows and results.

pipeline_info/dag.html is the DAG (Directed Acyclic Graph) of the pipeline run, which shows the workflow execution order and dependencies.

The execution report, timeline, and trace file generated by Nextflow are also included in this directory and are named with the timestamp of the pipeline run.
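To inspect these files from a script, something like the following sketch may help. It assumes only the `pipeline_info/bco.json` path described above; the helper names `load_bco` and `list_pipeline_info` are hypothetical, and no particular structure inside the BCO is assumed.

```python
import json
from pathlib import Path


def load_bco(results_dir):
    """Load pipeline_info/bco.json from a results directory as a dict."""
    bco_path = Path(results_dir) / "pipeline_info" / "bco.json"
    with bco_path.open(encoding="utf-8") as handle:
        return json.load(handle)


def list_pipeline_info(results_dir):
    """Return the sorted names of the files found in pipeline_info/."""
    info_dir = Path(results_dir) / "pipeline_info"
    return sorted(p.name for p in info_dir.iterdir() if p.is_file())
```

Listing the directory rather than hard-coding names avoids guessing the timestamped file names of the execution report, timeline, and trace.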

Quality Control

qc/ contains all quality control files.

MultiQC

qc/multiqc/multiqc_report.html is the MultiQC report for the entire run, which summarizes the quality control results for all samples.

View an example MultiQC report here.

Input files passed to MultiQC are in qc/multiqc/input/.

FastQC

qc/fastqc/raw and qc/fastqc/trimmed contain FastQC reports for raw and trimmed reads, respectively.

deepTools

qc/deeptools/ contains plots and intermediate files generated by deepTools, including QC metrics, PCA plots, scatter plots, and correlation heatmaps.

Phantom Peak Qual Tools (PPQT)

qc/phantompeakqualtools/ contains the PPQT reports and fragment lengths for each sample.

Alignments

align/bam/ contains the sorted and deduplicated BAM files for each sample.

Bigwigs

bigwigs/ contains two bigWig files for each sample: one before removing input reads ({id}.bw) and one with input reads removed ({id}.inputnorm.bw).

These are normalized with the method set by the deeptools_normalize_samples parameter.

If you use a spike-in genome, these are normalized according to the spike-in parameters you set. See the Spike-in normalization doc for more details.
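The two file names per sample can be grouped by sample ID with a short script. This is a sketch under stated assumptions: it relies only on the `{id}.bw` and `{id}.inputnorm.bw` naming described above, assumes that no sample ID itself ends in `.inputnorm`, and uses a hypothetical helper name, `pair_bigwigs`.

```python
from pathlib import Path

INPUTNORM_SUFFIX = ".inputnorm.bw"
PLAIN_SUFFIX = ".bw"


def pair_bigwigs(bigwig_dir):
    """Group bigWig files by sample ID.

    Returns {sample_id: {"raw": Path, "inputnorm": Path}}; a key is
    missing when the corresponding file is not present.
    """
    pairs = {}
    for path in sorted(Path(bigwig_dir).glob("*.bw")):
        name = path.name
        # Check the longer suffix first, since it also ends in ".bw".
        if name.endswith(INPUTNORM_SUFFIX):
            sample_id = name[: -len(INPUTNORM_SUFFIX)]
            kind = "inputnorm"
        else:
            sample_id = name[: -len(PLAIN_SUFFIX)]
            kind = "raw"
        pairs.setdefault(sample_id, {})[kind] = path
    return pairs
```

A sample whose entry lacks the `"inputnorm"` key has no input-subtracted track in the directory, which is a quick way to spot missing outputs.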

Peaks

peaks/{peak_caller}/ directories are created for each peak caller. Available peak callers are macs2_narrow, macs2_broad, gem, and sicer. See the run parameters and peak caller parameters for customization options.

peaks/{peak_caller}/replicates/ contains called peaks for each sample.

Differential analysis

If any sample has only one replicate, MAnorm is used for differential analysis. Otherwise, DiffBind is used.

MAnorm results are in peaks/{peak_caller}/diff/manorm/{contrast}/.

DiffBind results are in peaks/{peak_caller}/diff/diffbind/{contrast}/.
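The selection rule and the resulting path can be sketched as follows. This illustrates the rule as stated on this page and is not the pipeline's own code: the function names and the `replicates_per_sample` mapping (sample name to replicate count) are hypothetical.

```python
from pathlib import PurePosixPath


def choose_diff_tool(replicates_per_sample):
    """Return 'manorm' if any sample has exactly one replicate, else 'diffbind'."""
    if any(count == 1 for count in replicates_per_sample.values()):
        return "manorm"
    return "diffbind"


def diff_results_dir(peak_caller, contrast, replicates_per_sample):
    """Build the differential analysis path, relative to results/."""
    tool = choose_diff_tool(replicates_per_sample)
    return PurePosixPath("peaks") / peak_caller / "diff" / tool / contrast
```

For example, with two replicates for every sample, `diff_results_dir("macs2_narrow", "A_vs_B", {"A": 2, "B": 2})` yields `peaks/macs2_narrow/diff/diffbind/A_vs_B`. The contrast name `A_vs_B` is a placeholder.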

Consensus Peaks

peaks/{peak_caller}/consensus/{consensus_method} contains the consensus peaks for each peak caller.

Motifs

peaks/{peak_caller}/consensus/{consensus_method}/motifs contains HOMER and MEME AME results.

HOMER results are in motifs/homer/, organized into one directory per sample (e.g. {id}_homer/). Each directory contains a background FASTA, a target FASTA, a results HTML file, and a subdirectory containing motif files and logos.

If the target and background FASTA files from HOMER are not empty, they are passed to MEME AME for motif enrichment analysis. These result files are in motifs/meme/.

Annotations

peaks/{peak_caller}/consensus/{consensus_method}/annotations contains peak annotations from ChIPseeker.
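Because the consensus, motif, and annotation outputs share one path prefix, a small helper can build all of them at once. This is a sketch using only the path templates given on this page: `consensus_paths` is a hypothetical name, and the consensus method value in the usage note is a placeholder rather than a confirmed option.

```python
from pathlib import PurePosixPath


def consensus_paths(peak_caller, consensus_method):
    """Return consensus-related output paths, relative to results/."""
    base = PurePosixPath("peaks") / peak_caller / "consensus" / consensus_method
    return {
        "consensus": base,
        "homer": base / "motifs" / "homer",
        "meme": base / "motifs" / "meme",
        "annotations": base / "annotations",
    }
```

For instance, `consensus_paths("macs2_narrow", "union")["homer"]` gives `peaks/macs2_narrow/consensus/union/motifs/homer`, where `union` stands in for whichever consensus method your run used.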

Custom Genome

If you specified a custom genome with custom genome parameters, the prepared genome files will be in genome/{genome}. These include the genome BWA index, blacklist BWA index, and a Nextflow config file to reuse the genome in future runs.

Working Directory

work/ is where Nextflow stores intermediate files during the pipeline run. If you resubmit the pipeline, the working directory will be reused so the pipeline can resume from where it previously stopped. After successfully completing the pipeline run, you can delete the working directory to save disk space.
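Before deleting the working directory, it can be useful to see how much space it occupies. The following is a minimal sketch with a hypothetical helper name, `directory_size_bytes`. It does not follow symbolic links, on the assumption that staged inputs in `work/` may be links to files stored elsewhere.

```python
import os


def directory_size_bytes(path):
    """Return the total size in bytes of all files under path.

    Symbolic links are not followed: os.walk skips linked directories
    by default, and os.lstat reports the size of a link itself rather
    than the size of its target.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total
```

Dividing the result by `1024 ** 3` gives the size in GiB. Only delete `work/` once the run has completed successfully, since removing it prevents the pipeline from resuming.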