BashJob

class hailtop.batch.job.BashJob(batch, token, *, name=None, attributes=None, shell=None)

Bases: hailtop.batch.job.Job

Object representing a single bash job to execute.

Examples

Create a batch object:

>>> b = Batch()

Create a new bash job that prints hello to a temporary file j.ofile:

>>> j = b.new_job()
>>> j.command(f'echo "hello" > {j.ofile}')

Write the temporary file j.ofile to a permanent location:

>>> b.write_output(j.ofile, 'hello.txt')

Execute the DAG:

>>> b.run()

Notes

This class should never be created directly by the user. Use Batch.new_job() or Batch.new_bash_job() instead.

Methods

command – Set the job’s command to execute.

declare_resource_group – Declare a resource group for a job.

image – Set the job’s docker image.

command(command)

Set the job’s command to execute.

Examples

Simple job with no output files:

>>> b = Batch()
>>> j = b.new_job()
>>> j.command(f'echo "hello"')
>>> b.run()

Simple job with one temporary file j.ofile that is written to a permanent location:

>>> b = Batch()
>>> j = b.new_job()
>>> j.command(f'echo "hello world" > {j.ofile}')
>>> b.write_output(j.ofile, 'output/hello.txt')
>>> b.run()

Two jobs with a file interdependency:

>>> b = Batch()
>>> j1 = b.new_job()
>>> j1.command(f'echo "hello" > {j1.ofile}')
>>> j2 = b.new_bash_job()
>>> j2.command(f'cat {j1.ofile} > {j2.ofile}')
>>> b.write_output(j2.ofile, 'output/cat_output.txt')
>>> b.run()

Specify multiple commands in the same job:

>>> b = Batch()
>>> j = b.new_job()
>>> j.command(f'echo "hello" > {j.tmp1}')
>>> j.command(f'echo "world" > {j.tmp2}')
>>> j.command(f'echo "!" > {j.tmp3}')
>>> j.command(f'cat {j.tmp1} {j.tmp2} {j.tmp3} > {j.ofile}')
>>> b.write_output(j.ofile, 'output/concatenated.txt')
>>> b.run()

Notes

This method can be called more than once. Each call appends its command to the list of previously defined commands rather than overriding an existing command.
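The append behavior can be pictured with a small stand-in class. This is a minimal sketch, not Hail's implementation: AppendingJob and its _commands list are hypothetical names used only for illustration.

```python
class AppendingJob:
    """Hypothetical stand-in for a job; not part of hailtop."""

    def __init__(self):
        self._commands = []  # commands in the order they were added

    def command(self, command):
        # Append rather than overwrite, then return the same object
        # so that calls can be chained.
        self._commands.append(command)
        return self


j = AppendingJob()
returned = j.command('echo "hello"').command('echo "world"')

# Both commands are kept, in call order.
script = '\n'.join(j._commands)
print(script)
```

Because the sketch returns the same object from command(), chained calls and repeated calls on the same variable build up the same list.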

To declare a resource file of type JobResourceFile, use either the attribute syntax job.{identifier} or the item syntax job['identifier']. If no object exists for that identifier, one is created automatically (this is only allowed inside the command() method). The identifier can be any valid Python identifier, such as ofile5000.
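One way to picture this create-on-first-access behavior is with Python's __getattr__ and __getitem__ hooks. The sketch below is an illustration under stated assumptions, not Hail's code: LazyResourceJob, _get_resource, and the /tmp path layout are all hypothetical.

```python
class LazyResourceJob:
    """Hypothetical stand-in showing create-on-first-access resources."""

    def __init__(self, name):
        self._name = name
        self._resources = {}  # identifier -> placeholder path

    def _get_resource(self, identifier):
        # Create the resource the first time the identifier is seen,
        # then hand back the same object on every later access.
        if identifier not in self._resources:
            self._resources[identifier] = f'/tmp/{self._name}/{identifier}'
        return self._resources[identifier]

    def __getattr__(self, identifier):
        # Only called when normal attribute lookup fails, so real
        # attributes such as _name are unaffected.
        if identifier.startswith('_'):
            raise AttributeError(identifier)
        return self._get_resource(identifier)

    def __getitem__(self, identifier):
        return self._get_resource(identifier)


j = LazyResourceJob('j1')
command = f'echo "hello" > {j.ofile5000}'
same_resource = j.ofile5000 == j['ofile5000']
```

In this sketch both syntaxes resolve to the same entry, so j.ofile5000 and j['ofile5000'] can be used interchangeably.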

All JobResourceFile objects are temporary files and must be written to a permanent location using Batch.write_output() if the output needs to be saved.

Only resources can be referred to in commands. Referencing a Batch or Job object in a command results in an error.

Parameters

command (str) – A bash command.

Return type

BashJob

Returns

Same job object with command appended.

declare_resource_group(**mappings)

Declare a resource group for a job.

Examples

Declare a resource group:

>>> b = Batch()
>>> input = b.read_input_group(bed='data/example.bed',
...                            bim='data/example.bim',
...                            fam='data/example.fam')
>>> j = b.new_job()
>>> j.declare_resource_group(tmp1={'bed': '{root}.bed',
...                                'bim': '{root}.bim',
...                                'fam': '{root}.fam',
...                                'log': '{root}.log'})
>>> j.command(f'plink --bfile {input} --make-bed --out {j.tmp1}')
>>> b.run()

Warning

Be careful when specifying the expressions for each file as this is Python code that is executed with eval!

Parameters

mappings (Dict[str, Any]) – Keywords (in the above example tmp1) are the name(s) of the resource group(s). File names may contain arbitrary Python expressions, which will be evaluated by Python eval. To use the keyword as the file name, use {root} (in the above example {root} will be replaced with tmp1).
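To see why the warning above matters, the sketch below models the {root} substitution by evaluating each file name as an f-string with root in scope. This is an assumption about the mechanism made for illustration only; expand_template is a hypothetical helper, not a hailtop function.

```python
def expand_template(template, root):
    """Toy model: evaluate `template` as an f-string with `root` in scope."""
    # eval runs whatever expression appears inside the braces, which is
    # why file-name expressions from untrusted sources are dangerous.
    return eval(f'f"""{template}"""', {}, {'root': root})


mappings = {'bed': '{root}.bed',
            'bim': '{root}.bim',
            'fam': '{root}.fam',
            'log': '{root}.log'}

# With the keyword tmp1 as the root, each entry becomes tmp1.<extension>.
expanded = {name: expand_template(t, 'tmp1') for name, t in mappings.items()}

# Any Python expression inside the braces is executed, not just {root}.
arbitrary = expand_template('{root.upper()}_{1 + 1}.txt', 'tmp1')
```

Under this model, a template such as '{root.upper()}_{1 + 1}.txt' is executed as code, so file-name expressions should only come from trusted input.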

Return type

BashJob

Returns

Same job object with resource groups set.

image(image)

Set the job’s docker image.

Examples

Set the job’s docker image to ubuntu:20.04:

>>> b = Batch()
>>> j = b.new_job()
>>> (j.image('ubuntu:20.04')
...   .command(f'echo "hello"'))
>>> b.run()

Parameters

image (str) – Docker image to use.

Return type

BashJob

Returns

Same job object with docker image set.