paths
Functions
| Name | Description |
|---|---|
| create_tar_archive | Creates a compressed tar archive (.tar.gz) containing the specified files. |
| get_tree | Generate a directory tree structure using the tree command-line utility |
| glob_files | Collects files from a specified directory and its subdirectories that match a list of patterns. |
| load_tree | Load a tree structure from a string, attempting to parse it as JSON or as a Python literal. |
| run_du | Calculates the total size of a directory in bytes using the du shell command. |
create_tar_archive
paths.create_tar_archive(files, tar_filename)
Creates a compressed tar archive (.tar.gz) containing the specified files.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| files | list of pathlib.Path | A list of file paths to include in the archive. | required |
| tar_filename | str | The name of the output tar.gz file. | required |
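The behavior described above can be approximated with the standard-library tarfile module. This is a minimal sketch, not the package's actual implementation: the name create_tar_archive_sketch and the choice to store only each file's base name (arcname) are assumptions made for illustration.

```python
import tarfile
import tempfile
from pathlib import Path


def create_tar_archive_sketch(files, tar_filename):
    """Write each path in `files` into a gzip-compressed tar archive."""
    # "w:gz" opens the archive for writing with gzip compression.
    with tarfile.open(tar_filename, "w:gz") as tar:
        for path in files:
            # Assumption: store only the file name, not the full path on disk.
            tar.add(path, arcname=Path(path).name)


# Usage: archive two small files created in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    logs = [tmp_dir / "snakemake.log", tmp_dir / "master.log"]
    for log in logs:
        log.write_text("example\n")
    archive = tmp_dir / "logs.tar.gz"
    create_tar_archive_sketch(logs, str(archive))
    # Read the archive back to list what it contains.
    with tarfile.open(archive, "r:gz") as tar:
        archived_names = sorted(tar.getnames())
```

Storing base names keeps the archive free of machine-specific directory prefixes, at the cost of collisions when two input files share a name.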
get_tree
paths.get_tree(pipeline_outdir, args='-aJ --du')
Generate a directory tree structure using the tree command-line utility.
Note: when -J is combined with --du, the output is not valid JSON because it contains extra trailing commas. It can be parsed with ast.literal_eval rather than json.loads.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| pipeline_outdir | str | The path to the directory for which the tree structure will be generated. | required |
| args | str | Additional arguments to pass to the tree command. Defaults to '-aJ --du': -a includes hidden files, -J formats the output as JSON, and --du reports accumulated directory sizes. | '-aJ --du' |
Returns
| Name | Type | Description |
|---|---|---|
| | str | The directory tree structure as a string, stripped of any leading or trailing whitespace. |
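One way to picture how the args string reaches the tree utility is sketched below. It is an illustration under stated assumptions, not the package's code: build_tree_command and get_tree_sketch are hypothetical names, and running get_tree_sketch requires the tree utility to be installed.

```python
import shlex
import subprocess


def build_tree_command(pipeline_outdir, args="-aJ --du"):
    """Return the argument list for invoking `tree` on a directory."""
    # shlex.split turns the option string into separate arguments.
    return ["tree", *shlex.split(args), str(pipeline_outdir)]


def get_tree_sketch(pipeline_outdir, args="-aJ --du"):
    """Run `tree` and return its output without surrounding whitespace."""
    result = subprocess.run(
        build_tree_command(pipeline_outdir, args),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tree exits with an error
    )
    return result.stdout.strip()


# Only the command is built here, so this line runs without tree installed.
command = build_tree_command("results")
```

Passing an argument list instead of a single shell string avoids quoting problems when the directory path contains spaces.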
glob_files
paths.glob_files(
pipeline_outdir,
patterns=['snakemake.log', '.nextflow.log', '*.jobby*', 'master.log', 'runtime_statics*'],
)
Collects files from a specified directory and its subdirectories that match a list of patterns.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| pipeline_outdir | str | The base directory to search for files. | required |
| patterns | list of str | A list of glob patterns to match files. | ['snakemake.log', '.nextflow.log', '*.jobby*', 'master.log', 'runtime_statics*'] |
Returns
| Name | Type | Description |
|---|---|---|
| | set of pathlib.Path | A set of pathlib.Path objects representing the matched files. |
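A recursive pattern search of this kind can be sketched with pathlib.Path.rglob. The function name glob_files_sketch and the is_file filter are assumptions for illustration; the package's own implementation may differ.

```python
import tempfile
from pathlib import Path

DEFAULT_PATTERNS = (
    "snakemake.log",
    ".nextflow.log",
    "*.jobby*",
    "master.log",
    "runtime_statics*",
)


def glob_files_sketch(pipeline_outdir, patterns=DEFAULT_PATTERNS):
    """Return the set of files under `pipeline_outdir` matching any pattern."""
    matched = set()
    for pattern in patterns:
        # rglob searches the base directory and all of its subdirectories.
        for path in Path(pipeline_outdir).rglob(pattern):
            # Assumption: keep regular files only, skipping directories.
            if path.is_file():
                matched.add(path)
    return matched


# Usage: two of the three files below match the default patterns.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "logs").mkdir()
    (base / "snakemake.log").write_text("")
    (base / "logs" / ".nextflow.log").write_text("")
    (base / "logs" / "notes.txt").write_text("")
    matched_names = sorted(p.name for p in glob_files_sketch(base))
```

Collecting into a set means a file that matches more than one pattern appears only once.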
load_tree
paths.load_tree(tree_str)
Load a tree structure from a string, attempting to parse it as JSON or as a Python literal.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| tree_str | str | The string representation of the tree structure. | required |
Returns
| Name | Type | Description |
|---|---|---|
| | dict | The parsed tree structure as a dictionary. |
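The JSON-then-literal fallback can be sketched as follows. This is an assumed ordering of the two parsers, not a confirmed copy of the package's logic, and load_tree_sketch and the sample string are hypothetical.

```python
import ast
import json


def load_tree_sketch(tree_str):
    """Parse `tree_str` as JSON, falling back to a Python literal."""
    try:
        return json.loads(tree_str)
    except json.JSONDecodeError:
        # Python literal syntax tolerates trailing commas that JSON rejects.
        return ast.literal_eval(tree_str)


# A hand-written sample with a trailing comma inside "contents".
tree_with_trailing_comma = (
    '{"type": "directory", "name": "results", "size": 12, '
    '"contents": [{"type": "file", "name": "a.txt", "size": 12},]}'
)
tree = load_tree_sketch(tree_with_trailing_comma)
```

Note that ast.literal_eval does not understand the JSON keywords true, false, and null, so the fallback only helps when the string avoids them.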
run_du
paths.run_du(dirpath)
Calculates the total size of a directory in bytes using the du shell command.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| dirpath | str | Path to the directory whose size is to be calculated. | required |
Returns
| Name | Type | Description |
|---|---|---|
| | int or float | The size of the directory in bytes. Returns NaN if the size cannot be determined. |
Raises
Issues a warning if the directory size cannot be parsed or if the du command fails.
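The warn-and-return-NaN behavior can be sketched like this. The names run_du_sketch and parse_du_output are hypothetical, and the -sb flags assume GNU du (-b is not available in BSD or macOS du); the package may invoke du differently.

```python
import math
import subprocess
import warnings


def parse_du_output(output):
    """Return the byte count from the first field of du output, or NaN."""
    try:
        return int(output.split()[0])
    except (IndexError, ValueError):
        warnings.warn(f"Could not parse du output: {output!r}")
        return math.nan


def run_du_sketch(dirpath):
    """Return the size of `dirpath` in bytes, or NaN if du fails."""
    # Assumption: GNU du, where -s summarizes and -b reports bytes.
    result = subprocess.run(
        ["du", "-sb", str(dirpath)], capture_output=True, text=True
    )
    if result.returncode != 0:
        warnings.warn(f"du failed for {dirpath}: {result.stderr.strip()}")
        return math.nan
    return parse_du_output(result.stdout)


# The parser can be exercised without calling du at all.
parsed_size = parse_du_output("4096\tresults\n")
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the expected warning
    unparsed_size = parse_du_output("")
```

Because NaN is a float, callers should test the result with math.isnan rather than comparing it with ==.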