SPOOKER 👻
This command is designed to be used in the OnComplete/OnSuccess/OnError handlers of Snakemake and Nextflow pipelines. It collects metadata about the pipeline run, bundles it into a tarball, and saves it to a common location for later retrieval.
Run `spooker --help` for more information.
See `spooker` for the main function.
Functions
Name | Description |
---|---|
cli | spooker 👻 |
get_spooker_dict | Generates a metadata dictionary summarizing the state and logs of a pipeline run. |
spooker | Processes a pipeline output directory to generate metadata, tree JSON, and SLURM job log JSON, then stages the file on an HPC cluster. |
cli
`spooker.cli(outdir, name, version, path, debug)`
spooker 👻
This command is designed to be used in the OnComplete/OnSuccess/OnError handlers of Snakemake and Nextflow pipelines. It collects metadata about the pipeline run, bundles it into a tarball, and saves it to a common location for later retrieval.
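As an illustration, the sketch below wires the command into Snakemake run handlers. The long-form flags are inferred from the `cli(outdir, name, version, path, debug)` signature above rather than taken from the tool itself, and every value shown is a hypothetical placeholder; run `spooker --help` to confirm the real interface.

```python
# Snakefile (excerpt) -- hypothetical wiring of spooker into run handlers.
# Flag names are inferred from cli(outdir, name, version, path, debug);
# verify them with `spooker --help` before use.
onsuccess:
    shell("spooker --outdir results --name my_pipeline "
          "--version 1.0.0 --path " + str(workflow.snakefile))

onerror:
    # Also capture metadata for failed runs
    shell("spooker --outdir results --name my_pipeline "
          "--version 1.0.0 --path " + str(workflow.snakefile))
```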
get_spooker_dict
`spooker.get_spooker_dict(pipeline_outdir, pipeline_name, pipeline_version, pipeline_path)`
Generates a metadata dictionary summarizing the state and logs of a pipeline run.
Parameters
Name | Type | Description | Default |
---|---|---|---|
pipeline_outdir | pathlib.Path | Path to the pipeline output directory. | required |
pipeline_name | str | Name of the pipeline. | required |
pipeline_version | str | Version of the pipeline. | required |
pipeline_path | str | Path to the pipeline definition or script. | required |
Returns
Name | Type | Description |
---|---|---|
dict | A dictionary containing: "outdir_tree" (string representation of the output directory tree), "pipeline_metadata" (metadata about the pipeline run), "jobby" (JSON-formatted job log records), "master_job_log" (contents of the main job log file), and "failed_jobs" (logs of failed jobs). |
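A minimal usage sketch, assuming the function is importable via the qualified name shown in the signature above; the output directory, pipeline name, version, and path are hypothetical placeholders:

```python
from pathlib import Path

import spooker  # assumes the qualified name spooker.get_spooker_dict shown above

# Hypothetical values describing a finished pipeline run
meta = spooker.get_spooker_dict(
    pipeline_outdir=Path("results"),
    pipeline_name="my_pipeline",
    pipeline_version="1.0.0",
    pipeline_path="/path/to/Snakefile",
)

# Keys as documented in the Returns table above
print(meta["pipeline_metadata"])
print(meta["failed_jobs"])
```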
spooker
`spooker.spooker(pipeline_outdir, pipeline_name, pipeline_version, pipeline_path, clean=True, debug=False)`
Processes a pipeline output directory to generate metadata, tree JSON, and SLURM job log JSON, then stages the file on an HPC cluster.
Parameters
Name | Type | Description | Default |
---|---|---|---|
pipeline_outdir | pathlib.Path | Path to the pipeline output directory. | required |
pipeline_version | str | Version of the pipeline being processed. | required |
pipeline_name | str | Name of the pipeline being processed. | required |
pipeline_path | str | Path to the pipeline source code or configuration. | required |
clean | bool | Whether to delete the generated metadata file after staging. Defaults to True. | True |
debug | bool | Whether to enable debug mode for the HPC cluster. Defaults to False. | False |
Returns
Name | Type | Description |
---|---|---|
pathlib.Path | Path to the staged metadata file on the HPC cluster. |
Raises
Name | Type | Description |
---|---|---|
FileNotFoundError | If the pipeline output directory does not exist. |
Notes
- The function collects metadata, generates a tree JSON representation of the pipeline directory, and extracts job log information.
- The metadata is written to a compressed JSON file and staged on an HPC cluster.
- If `clean` is True, the local metadata file is deleted after staging.
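The sketch below calls `spooker` from Python and handles the documented `FileNotFoundError`. As above, the import path follows the qualified name in the signature, and all argument values are hypothetical placeholders:

```python
from pathlib import Path

import spooker  # assumes the qualified name spooker.spooker shown above

try:
    staged = spooker.spooker(
        pipeline_outdir=Path("results"),   # hypothetical output directory
        pipeline_name="my_pipeline",
        pipeline_version="1.0.0",
        pipeline_path="/path/to/Snakefile",
        clean=True,    # delete the local metadata file after staging
        debug=False,   # leave HPC debug mode off
    )
    print(f"Metadata staged at: {staged}")
except FileNotFoundError:
    # Raised when pipeline_outdir does not exist (see Raises above)
    print("Pipeline output directory does not exist")
```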