jobby

Display job information for past slurm job IDs

ABOUT

jobby takes past job IDs and displays their job information. Why? We have pipelines running on several different clusters and job schedulers. jobby is an attempt to centralize and abstract the process of querying different job schedulers. On each supported target system, jobby attempts to determine the best method for getting job information and returns it to the user in a standardized format through a unified CLI.

Many thanks to the original author: Skyler Kuhn (@skchronicles)

Original source: OpenOmics/mr-seek

REQUIRES

  • python>=3.5

DISCLAIMER

PUBLIC DOMAIN NOTICE

        NIAID Collaborative Bioinformatics Resource (NCBR)

   National Institute of Allergy and Infectious Diseases (NIAID)

This software/database is a "United  States Government Work" under
the terms of the United  States Copyright Act.  It was written as
part of the author's official duties as a United States Government
employee and thus cannot be copyrighted. This software is freely
available to the public for use.

Although all  reasonable  efforts have been taken  to ensure  the
accuracy and reliability of the software and data, NCBR do not and
cannot warrant the performance or results that may  be obtained by
using this software or data. NCBR and NIH disclaim all warranties,
express  or  implied,  including   warranties   of   performance,
merchantability or fitness for any particular purpose.

Please cite the author and NIH resources like the "Biowulf Cluster"
in any work or product based on this material.

USAGE

$ jobby [OPTIONS] JOB_ID [JOB_ID …]

EXAMPLE

$ jobby 18627545 15627516 58627597

Classes

Name Description
Colors Class encoding for ANSI escape sequences for styling terminal text.

Colors

jobby.Colors()

Class encoding for ANSI escape sequences for styling terminal text. Any string that is formatted with these styles must be terminated with the escape sequence, i.e. Colors.end.
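The termination rule can be illustrated with a minimal sketch. ColorsSketch is a hypothetical stand-in for jobby.Colors; only the end attribute is named on this page, so the red and bold attribute names, and the styled helper, are assumptions made for illustration.

```python
class ColorsSketch:
    """Hypothetical stand-in for jobby.Colors (only `end` is documented)."""
    red = "\033[31m"   # assumed attribute name: red foreground
    bold = "\033[1m"   # assumed attribute name: bold text
    end = "\033[0m"    # resets all styling


def styled(text, *styles):
    """Prefix text with the given escape sequences and always terminate it."""
    return "".join(styles) + text + ColorsSketch.end


message = styled("FAILED", ColorsSketch.bold, ColorsSketch.red)
print(message)
```

Without the trailing end sequence, the styling would carry over into whatever the terminal prints next.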

Functions

Name Description
add_missing Adds missing information to a list.
convert_size Converts bytes to a human readable format.
dashboard_cli Biowulf-specific tool to get SLURM job information.
err Prints any provided args to standard error.
fatal Prints any provided args to standard error and exits with an exit code of 1.
get_toolkit Finds the best suited tool from a list of possible choices.
jobby Wrapper to each supported job scheduler: slurm, etc.
parsed_arguments Parses user-provided command-line arguments.
sacct Generic tool to get SLURM job information.
sge Displays SGE job information to standard output.
slurm Displays SLURM job information to standard output.
to_bytes Convert a human readable size unit into bytes.
uge Displays UGE job information to standard output.
which Checks if an executable is in $PATH.

add_missing

jobby.add_missing(linelist, insertion_dict)

Adds missing information to a list. This can be used to add missing job information fields to the results of a job querying tool.

Parameters

Name Type Description Default
linelist list[str] List containing job information for each field of interest. required
insertion_dict dict[int, Union[str, list[str]]] Dictionary used to insert missing information to a given index, where the keys are indices of the linelist and the values are information to add. The indices should be zero-based. Multiple consecutive values should be inserted at once as a list. required

Returns

Name Type Description
list[str] The updated list with the missing information added.

Example

>>> add_missing([0,1,2,3,4], {3:['+','++'], 1:'-', 4:'@'})
[0, '-', 1, 2, '+', '++', 3, '@', 4]
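One way to get this behavior is to insert from the largest index down to the smallest, so that earlier insertions never shift the positions of later ones. The sketch below is an assumption about how such a function could be written, not the jobby implementation itself; add_missing_sketch is a hypothetical name.

```python
def add_missing_sketch(linelist, insertion_dict):
    """Insert values before the given zero-based indices of the original list."""
    updated = list(linelist)  # work on a copy, leave the input untouched
    # Highest index first: inserting there cannot move lower positions.
    for index in sorted(insertion_dict, reverse=True):
        value = insertion_dict[index]
        # A list is spliced in as consecutive items, a scalar as one item.
        values = value if isinstance(value, list) else [value]
        updated[index:index] = values
    return updated


result = add_missing_sketch([0, 1, 2, 3, 4], {3: ["+", "++"], 1: "-", 4: "@"})
print(result)
```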

convert_size

jobby.convert_size(size_bytes)

Converts bytes to a human readable format.

Parameters

Name Type Description Default
size_bytes int Size in bytes to convert. required

Returns

Name Type Description
str Human readable size in the format ‘X.YZUNIT’.

Example

>>> convert_size(1024)
'1.0KiB'
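A minimal sketch of this kind of conversion is shown below. It assumes binary (1024-based) units from B up to YiB and rounds to two decimal places; the loop-based approach and the name convert_size_sketch are illustrative assumptions, not necessarily what jobby does internally.

```python
def convert_size_sketch(size_bytes):
    """Format a byte count with the largest binary unit that fits."""
    units = ("B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB")
    size = float(size_bytes)
    index = 0
    # Divide by 1024 until the value drops below one unit step.
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    return "{0}{1}".format(round(size, 2), units[index])


print(convert_size_sketch(1024))
print(convert_size_sketch(1536))
```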

dashboard_cli

jobby.dashboard_cli(jobs, threads=1, tmp_dir=None)

Biowulf-specific tool to get SLURM job information. HPC staff recommend using this over the default slurm sacct command for performance reasons.

By default, the dashboard_cli returns information for the following fields:

jobid state submit_time partition nodes cpus mem timelimit gres dependency queued_time state_reason start_time elapsed_time end_time cpu_max mem_max eval

Runs command:

$ dashboard_cli jobs --joblist 12345679,12345680 --fields FIELD,FIELD,FIELD --tab --archive

Parameters

Name Type Description Default
jobs list List of job identifiers. required
threads int Number of threads to use. 1
tmp_dir str Temporary directory to use. None

Returns

Name Type Description
None
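The command described for dashboard_cli can be assembled as an argument list before it is handed to a subprocess. This is only a sketch: build_dashboard_cmd is a hypothetical helper, the long options are assumed to use the usual double-hyphen spelling, and the field names passed in are taken from the default field list above.

```python
def build_dashboard_cmd(jobs, fields):
    """Build the dashboard_cli argument list for the given jobs and fields."""
    return [
        "dashboard_cli", "jobs",
        "--joblist", ",".join(str(job) for job in jobs),  # comma-separated IDs
        "--fields", ",".join(fields),                     # comma-separated fields
        "--tab", "--archive",
    ]


cmd = build_dashboard_cmd([12345679, 12345680], ["jobid", "state", "mem_max"])
print(" ".join(cmd))
```

On a system where dashboard_cli is installed, a list like this could be passed to subprocess.run; the sketch stops short of that so it runs anywhere.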

err

jobby.err(*message, **kwargs)

Prints any provided args to standard error. kwargs can be provided to modify print function’s behavior.

Parameters

Name Type Description Default
*message Values printed to standard error. ()
**kwargs Keyword arguments that modify the print function’s behavior. {}

fatal

jobby.fatal(*message, **kwargs)

Prints any provided args to standard error and exits with an exit code of 1.

Parameters

Name Type Description Default
*message Values printed to standard error. ()
**kwargs Keyword arguments that modify the print function’s behavior. {}
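The relationship between err and fatal can be sketched as follows. The function names err_sketch and fatal_sketch are hypothetical, and the bodies are an assumption based only on the descriptions above: both forward their arguments to print with standard error as the target, and the fatal variant then exits with code 1.

```python
import sys


def err_sketch(*message, **kwargs):
    """Print the given values to standard error; kwargs go to print()."""
    print(*message, file=sys.stderr, **kwargs)


def fatal_sketch(*message, **kwargs):
    """Print the given values to standard error, then exit with code 1."""
    err_sketch(*message, **kwargs)
    sys.exit(1)


# Catch the exit here so the example itself keeps running.
try:
    fatal_sketch("Error:", "no job IDs were provided", sep=" ")
except SystemExit as exc:
    exit_code = exc.code

print(exit_code)
```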

get_toolkit

jobby.get_toolkit(tool_list)

Finds the best suited tool from a list of possible choices. Assumes tool list is already ordered from the best to worst choice. The first tool found in a user’s $PATH is returned.

Parameters

Name Type Description Default
tool_list list[str] List of ordered tools to find. required

Returns

Name Type Description
str First tool found in tool_list.

Raises

Name Type Description
SystemExit If no tools are found in the user’s $PATH.
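A minimal sketch of this first-match lookup is shown below. It swaps in the standard library's shutil.which for the PATH check, which may differ from what jobby uses internally, and get_toolkit_sketch is a hypothetical name.

```python
import shutil
import sys


def get_toolkit_sketch(tool_list):
    """Return the first tool in tool_list that is found in $PATH."""
    for tool in tool_list:  # assumed to be ordered from best to worst choice
        if shutil.which(tool):
            return tool
    # sys.exit raises SystemExit; a string message yields exit code 1.
    sys.exit("Error: none of these tools are in $PATH: " + ", ".join(tool_list))


# Catch the exit so the example also runs on machines without SLURM.
try:
    tool = get_toolkit_sketch(["dashboard_cli", "sacct"])
except SystemExit:
    tool = None

print(tool)
```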

jobby

jobby.jobby(args)

Wrapper to each supported job scheduler: slurm, etc. Each scheduler has a custom handler to most effectively get and parse job information.

Parameters

Name Type Description Default
args argparse.Namespace Parsed command-line arguments. required

Returns

Name Type Description
None

parsed_arguments

jobby.parsed_arguments(name, description)

Parses user-provided command-line arguments. This requires the argparse and textwrap packages. To create custom help formatting, a text-wrapped docstring is used to build the help message for required options. As such, the automatically generated help message for required options must be suppressed. If a new required argument is added to the CLI, it must also be added to the usage statement docstring.

Parameters

Name Type Description Default
name str Name of the pipeline or command-line tool. required
description str Short description of pipeline or command-line tool. required
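The help-suppression approach described for parsed_arguments can be sketched with argparse.SUPPRESS. Everything here is illustrative: parsed_arguments_sketch, the wording of the usage text, and the extra argv parameter (added so the example can be run without a real command line) are assumptions, not the actual jobby code.

```python
import argparse
import textwrap


def parsed_arguments_sketch(name, description, argv=None):
    """Parse job IDs while showing a hand-written usage block in --help."""
    usage_text = textwrap.dedent("""\
        {0}: {1}

        USAGE:
          $ {0} [OPTIONS] JOB_ID [JOB_ID ...]

        REQUIRED:
          JOB_ID    One or more past job IDs to look up.
        """).format(name, description)
    parser = argparse.ArgumentParser(
        prog=name,
        usage=argparse.SUPPRESS,  # hide the auto-generated usage line
        description=usage_text,   # show the hand-written text instead
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    # help=SUPPRESS keeps the required argument out of the generated help,
    # since it is already documented in usage_text.
    parser.add_argument("JOB_ID", nargs="+", help=argparse.SUPPRESS)
    return parser.parse_args(argv)


args = parsed_arguments_sketch(
    "jobby", "Display job information for past job IDs", ["18627545", "15627516"]
)
print(args.JOB_ID)
```

The trade-off is the one noted above: because the required argument is hidden from the generated help, the usage text has to be kept in sync by hand.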

sacct

jobby.sacct(jobs, threads=1, tmp_dir=None)

Generic tool to get SLURM job information. sacct should be available on all SLURM clusters. The dashboard_cli is prioritized over sacct for performance reasons; however, this method is portable across different SLURM clusters. To get the maximum memory usage for a job, the MaxRSS field must be parsed from the $SLURM_JOBID.batch lines.

Returns job information for the following fields:

jobid jobname state partition reqtres alloccpus reqmem maxrss timelimit reserved start end elapsed nodelist user workdir

Runs command:

$ sacct -j 12345679,12345680 --fields FIELD,FIELD,FIELD -P --delimiter $'\t'

Parameters

Name Type Description Default
jobs list List of job identifiers. required
threads int Number of threads to use. 1
tmp_dir str Temporary directory to use. None

Returns

Name Type Description
None
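The MaxRSS step for sacct can be sketched as a small parser over delimited output. This is an assumption-laden illustration: max_rss_by_job is a hypothetical helper, the output is assumed to be tab-delimited with a header row containing JobID and MaxRSS columns, and the sample rows are made up for the example rather than taken from a real cluster.

```python
def max_rss_by_job(sacct_output, delimiter="\t"):
    """Map each job ID to the MaxRSS value reported on its .batch line."""
    lines = [line for line in sacct_output.splitlines() if line.strip()]
    header = lines[0].split(delimiter)
    jobid_col = header.index("JobID")
    maxrss_col = header.index("MaxRSS")
    suffix = ".batch"
    usage = {}
    for line in lines[1:]:
        fields = line.split(delimiter)
        jobid = fields[jobid_col]
        # Only the batch step carries the memory usage of interest here.
        if jobid.endswith(suffix):
            usage[jobid[:-len(suffix)]] = fields[maxrss_col]
    return usage


# Made-up sample rows, for illustration only.
sample_rows = [
    ("JobID", "State", "MaxRSS"),
    ("12345679", "COMPLETED", ""),
    ("12345679.batch", "COMPLETED", "2048K"),
    ("12345679.extern", "COMPLETED", "0"),
]
sample_output = "\n".join("\t".join(row) for row in sample_rows)
peak = max_rss_by_job(sample_output)
print(peak)
```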

sge

jobby.sge(jobs, threads, tmp_dir)

Displays SGE job information to standard output.

Parameters

Name Type Description Default
jobs list List of job objects to be processed. required
threads int Number of threads to be used. required
tmp_dir str Temporary directory for job processing. required

Returns

Name Type Description
None

slurm

jobby.slurm(jobs, threads, tmp_dir)

Displays SLURM job information to standard output.

Parameters

Name Type Description Default
jobs list List of job identifiers. required
threads int Number of threads to use. required
tmp_dir str Temporary directory to use. required

Returns

Name Type Description
None

to_bytes

jobby.to_bytes(size)

Converts a human readable size unit into bytes. Returns None if the provided size cannot be parsed or converted.

Parameters

Name Type Description Default
size str Human readable size unit to convert. required

Returns

Name Type Description
int Size in bytes.

Example

>>> to_bytes('1.0KiB')
1024
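A minimal sketch of the reverse conversion is shown below. It assumes sizes are written as a number followed by a binary unit (B through YiB) and returns None for anything else; the regular expression, the unit table, and the name to_bytes_sketch are illustrative assumptions.

```python
import re

# Exponent of 1024 for each supported binary unit.
_POWERS = {
    "B": 0, "KiB": 1, "MiB": 2, "GiB": 3, "TiB": 4,
    "PiB": 5, "EiB": 6, "ZiB": 7, "YiB": 8,
}


def to_bytes_sketch(size):
    """Convert a string such as '1.0KiB' to bytes, or None if unparseable."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", str(size))
    if match is None:
        return None
    number, unit = match.groups()
    if unit not in _POWERS:
        return None
    return int(float(number) * 1024 ** _POWERS[unit])


print(to_bytes_sketch("1.0KiB"))
print(to_bytes_sketch("not a size"))
```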

uge

jobby.uge(jobs, threads, tmp_dir)

Displays UGE job information to standard output.

Parameters

Name Type Description Default
jobs list A list of job identifiers. required
threads int The number of threads to use. required
tmp_dir str The temporary directory to use. required

Returns

Name Type Description
None

which

jobby.which(cmd, path=None)

Checks if an executable is in $PATH.

Parameters

Name Type Description Default
cmd str Name of the executable to check. required
path list Optional list of PATHs to check. Defaults to $PATH. None

Returns

Name Type Description
bool True if the executable is in PATH, False otherwise.
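The check can be sketched with os.access, assuming a POSIX-style notion of an executable file. which_sketch is a hypothetical name, and treating path as a list of directories that defaults to the entries of $PATH follows the parameter description above.

```python
import os


def which_sketch(cmd, path=None):
    """Return True if an executable named cmd exists in one of the directories."""
    if path is None:
        # Fall back to the directories listed in $PATH.
        path = os.environ.get("PATH", "").split(os.pathsep)
    for directory in path:
        candidate = os.path.join(directory, cmd)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return True
    return False


# The result depends on the machine, e.g. True on a SLURM login node.
print(which_sketch("sacct"))
```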