jobby
Display job information for past slurm job IDs
ABOUT
jobby will take your past jobs and display their job information. Why? We have pipelines running on several different clusters and job schedulers. jobby is an attempt to centralize and abstract the process of querying different job schedulers. On each supported target system, jobby will attempt to determine the best method for getting job information to return to the user in a standardized format and unified CLI.
Many thanks to the original author: Skyler Kuhn (@skchronicles)
Original source: OpenOmics/mr-seek
REQUIRES
- python>=3.5
DISCLAIMER
PUBLIC DOMAIN NOTICE
NIAID Collaborative Bioinformatics Resource (NCBR)
National Institute of Allergy and Infectious Diseases (NIAID)
This software/database is a "United States Government Work" under
the terms of the United States Copyright Act. It was written as
part of the author's official duties as a United States Government
employee and thus cannot be copyrighted. This software is freely
available to the public for use.
Although all reasonable efforts have been taken to ensure the
accuracy and reliability of the software and data, NCBR do not and
cannot warrant the performance or results that may be obtained by
using this software or data. NCBR and NIH disclaim all warranties,
express or implied, including warranties of performance,
merchantability or fitness for any particular purpose.
Please cite the author and NIH resources like the "Biowulf Cluster"
in any work or product based on this material.
USAGE
$ jobby [OPTIONS] JOB_ID [JOB_ID …]
EXAMPLE
$ jobby 18627545 15627516 58627597
Classes
Name | Description |
---|---|
Colors | Class encoding for ANSI escape sequences for styling terminal text. |
Colors
jobby.Colors()
Class encoding for ANSI escape sequences for styling terminal text. Any string that is formatted with these styles must be terminated with the escape sequence, i.e. Colors.end.
Functions
Name | Description |
---|---|
add_missing | Adds missing information to a list. |
convert_size | Converts bytes to a human readable format. |
dashboard_cli | Biowulf-specific tool to get SLURM job information. |
err | Prints any provided args to standard error. |
fatal | Prints any provided args to standard error and exits with an exit code of 1. |
get_toolkit | Finds the best suited tool from a list of possible choices. |
jobby | Wrapper to each supported job scheduler: slurm, etc. |
parsed_arguments | Parses user-provided command-line arguments. |
sacct | Generic tool to get SLURM job information. |
sge | Displays SGE job information to standard output. |
slurm | Displays SLURM job information to standard output. |
to_bytes | Convert a human readable size unit into bytes. |
uge | Displays UGE job information to standard output. |
which | Checks if an executable is in $PATH. |
add_missing
jobby.add_missing(linelist, insertion_dict)
Adds missing information to a list. This can be used to add missing job information fields to the results of a job querying tool.
Parameters
Name | Type | Description | Default |
---|---|---|---|
linelist | list[str] | List containing job information for each field of interest. | required |
insertion_dict | dict[int, Union[str, list[str]]] | Dictionary used to insert missing information at a given index, where the keys are indices of the linelist and the values are information to add. The indices should be zero-based. Multiple consecutive values should be inserted at once as a list. | required |
Returns
Name | Type | Description |
---|---|---|
list[str] | The updated list with the missing information added. |
Example
add_missing([0,1,2,3,4], {3:['+','++'], 1:'-', 4:'@'}) >> [0, '-', 1, 2, '+', '++', 3, '@', 4]
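The insertion behaviour shown in the example can be reproduced with a short sketch. This is not necessarily how jobby implements it: `add_missing_sketch` is a hypothetical name, and the sketch assumes the keys refer to positions in the original list, which is why it inserts from the highest index down so that earlier positions do not shift.

```python
def add_missing_sketch(linelist, insertion_dict):
    """Hypothetical re-implementation of the documented add_missing behaviour."""
    updated = list(linelist)  # work on a copy, leave the input untouched
    # Insert at the highest index first so lower indices stay valid.
    for index in sorted(insertion_dict, reverse=True):
        values = insertion_dict[index]
        if not isinstance(values, list):
            values = [values]  # a single value is treated as a one-item list
        updated[index:index] = values  # slice assignment inserts in place
    return updated


result = add_missing_sketch([0, 1, 2, 3, 4], {3: ['+', '++'], 1: '-', 4: '@'})
print(result)
```

Processing the keys in descending order is a design choice of this sketch; processing them in ascending order would require tracking an offset for every value already inserted.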
convert_size
jobby.convert_size(size_bytes)
Converts bytes to a human readable format.
Parameters
Name | Type | Description | Default |
---|---|---|---|
size_bytes | int | Size in bytes to convert. | required |
Returns
Name | Type | Description |
---|---|---|
str | Human readable size in the format ‘X.YZUNIT’. |
Example
convert_size(1024) >> '1.0KiB'
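A minimal sketch of this kind of conversion follows. It assumes binary (1024-based) units, which is what the `KiB` suffix in the example suggests; the function name `convert_size_sketch`, the unit list, and the rounding to two decimals are assumptions rather than jobby's actual code.

```python
def convert_size_sketch(size_bytes):
    """Hypothetical bytes-to-human-readable conversion using 1024-based units."""
    units = ("B", "KiB", "MiB", "GiB", "TiB", "PiB")
    size = float(size_bytes)
    index = 0
    # Divide by 1024 until the value fits the current unit or units run out.
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    return "{}{}".format(round(size, 2), units[index])


print(convert_size_sketch(1024))
```

Repeated division is used here instead of a logarithm to avoid floating-point rounding at exact powers of 1024.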
dashboard_cli
jobby.dashboard_cli(jobs, threads=1, tmp_dir=None)
Biowulf-specific tool to get SLURM job information. HPC staff recommend using this over the default slurm sacct command for performance reasons. By default, the dashboard_cli returns information for the following fields:
jobid state submit_time partition nodes cpus mem timelimit gres dependency queued_time state_reason start_time elapsed_time end_time cpu_max mem_max eval
Runs command:
$ dashboard_cli jobs --joblist 12345679,12345680 --fields FIELD,FIELD,FIELD --tab --archive
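The documented command line can be assembled from a job list as in the sketch below. It only builds the argument list and does not run anything; the helper name `build_dashboard_cli_command` and its `fields` parameter are hypothetical, and the flags are taken from the "Runs command" text above rather than verified against the Biowulf tool.

```python
def build_dashboard_cli_command(jobs, fields):
    """Hypothetical helper: build the argument list for the documented call."""
    return [
        "dashboard_cli", "jobs",
        "--joblist", ",".join(str(job) for job in jobs),  # comma-separated job IDs
        "--fields", ",".join(fields),                      # comma-separated field names
        "--tab",
        "--archive",
    ]


command = build_dashboard_cli_command([12345679, 12345680], ["jobid", "state"])
print(" ".join(command))
```

Returning a list rather than a single string means the result could be passed to `subprocess.run` without shell quoting concerns.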
Parameters
Name | Type | Description | Default |
---|---|---|---|
jobs | list | List of job identifiers. | required |
threads | int | Number of threads to use. | 1 |
tmp_dir | str | Temporary directory to use. | None |
Returns
Name | Type | Description |
---|---|---|
None |
err
jobby.err(*message, **kwargs)
Prints any provided args to standard error. kwargs can be provided to modify print function’s behavior.
Parameters
Name | Type | Description | Default |
---|---|---|---|
*message | | Values printed to standard error. | () |
**kwargs | | Key words to modify print function behavior. | {} |
fatal
jobby.fatal(*message, **kwargs)
Prints any provided args to standard error and exits with an exit code of 1.
Parameters
Name | Type | Description | Default |
---|---|---|---|
*message | | Values printed to standard error. | () |
**kwargs | | Key words to modify print function behavior. | {} |
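The behaviour described for err and fatal can be sketched with the built-in `print` function and `sys.exit`. The names `err_sketch` and `fatal_sketch` are hypothetical, and this shows the described behaviour rather than jobby's exact source.

```python
import sys


def err_sketch(*message, **kwargs):
    """Print the given values to standard error; kwargs go straight to print()."""
    print(*message, file=sys.stderr, **kwargs)


def fatal_sketch(*message, **kwargs):
    """Print the given values to standard error, then exit with code 1."""
    err_sketch(*message, **kwargs)
    sys.exit(1)


err_sketch("warning:", "job not found", sep=" ")
```

Because the keyword arguments are forwarded unchanged, options such as `sep` and `end` behave exactly as they do for `print`.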
get_toolkit
jobby.get_toolkit(tool_list)
Finds the best suited tool from a list of possible choices. Assumes tool list is already ordered from the best to worst choice. The first tool found in a user’s $PATH is returned.
Parameters
Name | Type | Description | Default |
---|---|---|---|
tool_list | list[str] | List of ordered tools to find. | required |
Returns
Name | Type | Description |
---|---|---|
str | First tool found in tool_list. |
Raises
Name | Type | Description |
---|---|---|
SystemExit | If no tools are found in the user’s $PATH. |
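A minimal sketch of the selection logic described above, assuming the standard library's `shutil.which` is an acceptable stand-in for jobby's own `which` helper. The name `get_toolkit_sketch` is hypothetical.

```python
import shutil
import sys


def get_toolkit_sketch(tool_list):
    """Return the first tool in tool_list that is found on the user's $PATH."""
    for tool in tool_list:  # the list is assumed to be ordered best to worst
        if shutil.which(tool) is not None:
            return tool
    # Nothing was found: report on standard error and exit, raising SystemExit.
    print("Error: none of these tools were found:", ", ".join(tool_list), file=sys.stderr)
    sys.exit(1)
```

For SLURM, a call such as `get_toolkit_sketch(["dashboard_cli", "sacct"])` would prefer dashboard_cli when it is available and fall back to sacct otherwise.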
jobby
jobby.jobby(args)
Wrapper to each supported job scheduler: slurm, etc. Each scheduler has a custom handler to most effectively get and parse job information.
Parameters
Name | Type | Description | Default |
---|---|---|---|
sub_args | argparse.Namespace | Parsed command-line arguments. | required |
Returns
Name | Type | Description |
---|---|---|
None |
parsed_arguments
jobby.parsed_arguments(name, description)
Parses user-provided command-line arguments. This requires the argparse and textwrap packages. To create custom help formatting, a text-wrapped docstring is used to create the help message for required options. As such, the help message for required options must be suppressed. If a new required argument is added to the CLI, it must also be added to the usage statement docstring.
Parameters
Name | Type | Description | Default |
---|---|---|---|
name | str | Name of the pipeline or command-line tool. | required |
description | str | Short description of pipeline or command-line tool. | required |
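The pattern described above, a hand-written help text combined with suppressed per-argument help, can be sketched as follows. The argument name `JOB_ID` comes from the USAGE line; everything else, including the function name `parsed_arguments_sketch`, the extra `argv` parameter, and the wording of the help text, is an assumption for illustration.

```python
import argparse
import textwrap


def parsed_arguments_sketch(name, description, argv=None):
    """Hypothetical parser with a custom help message for required options."""
    # Hand-written help text; required arguments must be kept in sync here.
    help_text = textwrap.dedent("""\
        {name}: {description}

        USAGE:
          $ {name} [OPTIONS] JOB_ID [JOB_ID ...]

        REQUIRED:
          JOB_ID    One or more job identifiers.
        """).format(name=name, description=description)
    parser = argparse.ArgumentParser(
        prog=name,
        usage=argparse.SUPPRESS,  # the usage line lives in help_text instead
        description=help_text,
        formatter_class=argparse.RawDescriptionHelpFormatter,  # keep line breaks
    )
    # Help is suppressed because the custom text above already documents it.
    parser.add_argument("JOB_ID", nargs="+", help=argparse.SUPPRESS)
    return parser.parse_args(argv)


args = parsed_arguments_sketch("jobby", "Display job information", ["18627545", "15627516"])
print(args.JOB_ID)
```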
sacct
jobby.sacct(jobs, threads=1, tmp_dir=None)
Generic tool to get SLURM job information. sacct should be available on all SLURM clusters. The dashboard_cli is prioritized over sacct for performance reasons; however, this method is portable across different SLURM clusters. Returns job information for the following fields:
jobid jobname state partition reqtres alloccpus reqmem maxrss timelimit reserved start end elapsed nodelist user workdir
To get maximum memory usage for a job, we need to parse the MaxRSS field from the $SLURM_JOBID.batch lines.
Runs command:
$ sacct -j 12345679,12345680 --fields FIELD,FIELD,FIELD -P --delimiter $'\t'
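The step of pulling MaxRSS from the `.batch` lines can be illustrated with a small parser. This is a sketch under assumptions: the output is taken to be delimited text with a header row containing `JobID` and `MaxRSS` columns, the delimiter is assumed to be a tab, the helper name `max_rss_by_job` is hypothetical, and the sample text is made up for illustration rather than real sacct output.

```python
def max_rss_by_job(sacct_text, delimiter="\t"):
    """Map each job ID to the MaxRSS value reported on its '.batch' line."""
    lines = [line for line in sacct_text.splitlines() if line.strip()]
    header = [column.lower() for column in lines[0].split(delimiter)]
    jobid_col = header.index("jobid")
    maxrss_col = header.index("maxrss")
    result = {}
    for line in lines[1:]:
        fields = line.split(delimiter)
        jobid = fields[jobid_col]
        # Only the batch step carries the memory usage of interest here.
        if jobid.endswith(".batch"):
            result[jobid[:-len(".batch")]] = fields[maxrss_col]
    return result


# Made-up sample in the assumed layout (not real job data).
sample = (
    "JobID\tState\tMaxRSS\n"
    "123\tCOMPLETED\t\n"
    "123.batch\tCOMPLETED\t2048K\n"
)
print(max_rss_by_job(sample))
```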
Parameters
Name | Type | Description | Default |
---|---|---|---|
jobs | list | List of job identifiers. | required |
threads | int | Number of threads to use. | 1 |
tmp_dir | str | Temporary directory to use. | None |
Returns
Name | Type | Description |
---|---|---|
None |
sge
jobby.sge(jobs, threads, tmp_dir)
Displays SGE job information to standard output.
Parameters
Name | Type | Description | Default |
---|---|---|---|
jobs | list | List of job objects to be processed. | required |
threads | int | Number of threads to be used. | required |
tmp_dir | str | Temporary directory for job processing. | required |
Returns
Name | Type | Description |
---|---|---|
None |
slurm
jobby.slurm(jobs, threads, tmp_dir)
Displays SLURM job information to standard output.
Parameters
Name | Type | Description | Default |
---|---|---|---|
jobs | list | List of job identifiers. | required |
threads | int | Number of threads to use. | required |
tmp_dir | str | Temporary directory to use. | required |
Returns
Name | Type | Description |
---|---|---|
None |
to_bytes
jobby.to_bytes(size)
Convert a human readable size unit into bytes. Returns None if the provided size cannot be parsed or converted.
Parameters
Name | Type | Description | Default |
---|---|---|---|
size | str | Human readable size unit to convert. | required |
Returns
Name | Type | Description |
---|---|---|
int | Size in bytes. |
Example
to_bytes('1.0KiB') >> 1024
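A minimal sketch of the reverse conversion, including the documented `None` result for input that cannot be parsed. The name `to_bytes_sketch`, the accepted unit spellings, and the regular expression are assumptions; jobby may accept other formats.

```python
import re

# Assumed 1024-based units, expressed as powers of 1024.
_UNIT_POWERS = {"B": 0, "KiB": 1, "MiB": 2, "GiB": 3, "TiB": 4, "PiB": 5}


def to_bytes_sketch(size):
    """Convert a string such as '1.0KiB' into bytes, or return None."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([A-Za-z]+)\s*", str(size))
    if match is None:
        return None  # not in the '<number><unit>' form
    value, unit = match.groups()
    if unit not in _UNIT_POWERS:
        return None  # unknown unit
    return int(float(value) * 1024 ** _UNIT_POWERS[unit])


print(to_bytes_sketch("1.0KiB"))
```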
uge
jobby.uge(jobs, threads, tmp_dir)
Displays UGE job information to standard output.
Parameters
Name | Type | Description | Default |
---|---|---|---|
jobs | list | A list of job identifiers. | required |
threads | int | The number of threads to use. | required |
tmp_dir | str | The temporary directory to use. | required |
Returns
Name | Type | Description |
---|---|---|
None |
which
jobby.which(cmd, path=None)
Checks if an executable is in $PATH.
Parameters
Name | Type | Description | Default |
---|---|---|---|
cmd | str | Name of the executable to check. | required |
path | list | Optional list of PATHs to check. Defaults to $PATH. | None |
Returns
Name | Type | Description |
---|---|---|
bool | True if the executable is in PATH, False otherwise. |
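The check can be sketched on top of the standard library's `shutil.which`, which searches `$PATH` when no explicit path is given. The name `which_sketch` is hypothetical, and this is an illustration of the documented behaviour rather than jobby's own implementation.

```python
import os
import shutil


def which_sketch(cmd, path=None):
    """Return True if cmd is an executable found in path (default: $PATH)."""
    # shutil.which expects a single string, so join the list of directories.
    search_path = os.pathsep.join(path) if path is not None else None
    return shutil.which(cmd, path=search_path) is not None


print(which_sketch("sacct"))
```

The printed value depends on the machine: it is True only where a `sacct` executable is on the `$PATH`.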