Python API
This is the API documentation for Batch. It provides detailed information on the Python programming interface.
Use import hailtop.batch to access this functionality.
Batches
A Batch is an object that represents the set of jobs to run and the order of, or dependencies between, those jobs. Each Job has an image in which to execute commands and settings for storage, memory, and CPU. A BashJob is a subclass of Job that runs bash commands, while a PythonJob executes Python functions.
Batch | Object representing the directed acyclic graph (DAG) of jobs to run.
Job | Object representing a single job to execute.
BashJob | Object representing a single bash job to execute.
PythonJob | Object representing a single Python job to execute.
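As a rough illustration of how jobs and their dependencies fit together, the sketch below builds a two-job batch for the local backend. It assumes Hail is installed (for example via pip install hail) and that new_job, command, depends_on, and run behave as in recent releases. The function name build_hello_batch and the job names are invented for this example.

```python
def build_hello_batch():
    """Build a two-job batch; nothing executes until run() is called on it."""
    # Imported here so the sketch can be loaded even where Hail is not installed.
    import hailtop.batch as hb

    # LocalBackend runs the batch on this machine.
    b = hb.Batch(backend=hb.LocalBackend(), name='hello-batch')

    first = b.new_job(name='first')    # new_job creates a bash job
    first.command('echo "hello"')

    second = b.new_job(name='second')
    second.command('echo "world"')
    second.depends_on(first)           # explicit edge in the DAG: first, then second

    return b
```

Calling build_hello_batch().run() would then execute the two jobs in dependency order.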
Resources
A Resource is an abstract class that represents files in a Batch and has two subtypes: ResourceFile and ResourceGroup.
A single file is represented by a ResourceFile, which has two subtypes: InputResourceFile and JobResourceFile. An InputResourceFile is used to specify files that are inputs to a Batch. These files are not generated as outputs from a Job. Likewise, a JobResourceFile is a file that is produced by a job. JobResourceFiles generated by one job can be used in subsequent jobs, creating a dependency between the jobs.
A ResourceGroup represents a collection of files that should be treated as one unit. All files share a common root, but each file has its own extension.
A PythonResult stores the output from running a PythonJob.
Resource | Abstract class for resources.
ResourceFile | Class representing a single file resource.
InputResourceFile | Class representing a resource from an input file.
JobResourceFile | Class representing an intermediate file from a job.
ResourceGroup | Class representing a mapping of identifiers to a resource file.
PythonResult | Class representing a result from a Python job.
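The resource types above can be seen together in a small pipeline. This is a minimal sketch, assuming Hail is installed and that read_input, declare_resource_group, and write_output work as in recent releases. The function name, job names, and the head/tail identifiers are hypothetical, and input_path and output_path are whatever local paths the caller supplies.

```python
def build_resource_batch(input_path, output_path):
    """Build a batch that exercises input files, job files, and a resource group."""
    # Imported here so the sketch can be loaded even where Hail is not installed.
    import hailtop.batch as hb

    b = hb.Batch(backend=hb.LocalBackend(), name='resources')

    # InputResourceFile: exists before the batch runs, not produced by any job.
    text = b.read_input(input_path)

    # JobResourceFile: created the first time an attribute such as ofile is used.
    upper = b.new_job(name='uppercase')
    upper.command(f'tr a-z A-Z < {text} > {upper.ofile}')

    # Reading upper.ofile here makes this job depend on the uppercase job.
    count = b.new_job(name='count')
    count.command(f'wc -l < {upper.ofile} > {count.ofile}')

    # ResourceGroup: files sharing one root, each with its own extension.
    split = b.new_job(name='split')
    split.declare_resource_group(parts={'head': '{root}.head', 'tail': '{root}.tail'})
    split.command(f'head -n 1 {upper.ofile} > {split.parts.head}')
    split.command(f'tail -n 1 {upper.ofile} > {split.parts.tail}')

    # Copy a job's output to a permanent location once the batch has run.
    b.write_output(count.ofile, output_path)
    return b
```

No depends_on call is needed in this sketch: passing one job's file into another job's command is what creates the dependency.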
Batch Pool Executor
A BatchPoolExecutor provides roughly the same interface as the Python standard library’s concurrent.futures.Executor. It facilitates executing arbitrary Python functions in the cloud.
BatchPoolExecutor | An executor which executes Python functions in the cloud.
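Because the interface mirrors concurrent.futures.Executor, code written against the standard library's executors shows the general calling pattern. The sketch below uses ThreadPoolExecutor as a local stand-in, not BatchPoolExecutor itself. Swapping in BatchPoolExecutor would also need Batch Service configuration that is not shown here, and since the interface is only "roughly the same", some behavior may differ. The helper names square and run_with are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def run_with(executor_factory):
    """Run the same work through any Executor-style context manager."""
    with executor_factory() as executor:
        # submit schedules one call and returns a future.
        future = executor.submit(square, 7)
        single = future.result()
        # map applies the function to each input and yields results in order.
        many = list(executor.map(square, [1, 2, 3]))
    return single, many


# Local stand-in; a cloud executor with the same interface could be passed instead.
single, many = run_with(ThreadPoolExecutor)
print(single, many)
```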
Backends
A Backend is an abstract class that can execute a Batch. Currently, there are two types of backends: LocalBackend and ServiceBackend. The local backend executes a batch on your local computer by running a shell script. The service backend executes a batch on Google Compute Engine VMs operated by the Hail team (Batch Service). You can access the UI for the Batch Service at https://batch.hail.is.
The type of value returned by |
|
Backend | Abstract class for backends.
LocalBackend | Backend that executes batches on a local computer.
ServiceBackend | Backend that executes batches on Hail's Batch Service on Google Cloud.
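Since a batch is handed its backend when it is constructed, switching between local and service execution can be isolated in one place. The helper below is a hypothetical sketch, not part of the library. It assumes Hail is installed and, for the service case, that a billing project and remote temporary directory have already been configured so that ServiceBackend can be created without arguments.

```python
def make_backend(use_service=False):
    """Return a backend for a Batch: local by default, Batch Service on request."""
    # Imported here so the sketch can be loaded even where Hail is not installed.
    import hailtop.batch as hb

    if use_service:
        # Assumption: credentials, billing project, and remote tmpdir are
        # already configured for this user; otherwise construction will fail.
        return hb.ServiceBackend()
    # Runs the batch on this machine.
    return hb.LocalBackend()
```

A batch would then be created with something like Batch(backend=make_backend(), name='example'), leaving the job definitions unchanged whichever backend is chosen.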
Utilities
build_python_image | Build a new Python image with dill and the specified pip packages installed.
concatenate | Concatenate files using tree aggregation.
plink_merge | Merge binary PLINK files using tree aggregation.
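Tree aggregation, as named in the two utilities above, generally means combining inputs in small groups over several rounds instead of all at once. The pure-Python sketch below shows that general idea only and is not Hail's implementation. The names tree_aggregate, combine, and join are invented for the example, and strings stand in for files.

```python
def tree_aggregate(items, combine, branching_factor=2):
    """Reduce items by combining groups of at most branching_factor per round."""
    if not items:
        raise ValueError('items must be non-empty')
    if branching_factor < 2:
        raise ValueError('branching_factor must be at least 2')

    level = list(items)
    rounds = 0
    while len(level) > 1:
        # Groups within one round are independent of each other, so in a
        # batch setting each group could be handled by its own job.
        level = [combine(level[i:i + branching_factor])
                 for i in range(0, len(level), branching_factor)]
        rounds += 1
    return level[0], rounds


def join(chunks):
    # Stand-in for concatenating files: joins strings in order.
    return ''.join(chunks)


merged, rounds = tree_aggregate(['a', 'b', 'c', 'd', 'e'], join, branching_factor=2)
print(merged, rounds)
```

With five inputs and a branching factor of 2, the rounds produce three, then two, then one intermediate result. A larger branching factor means fewer rounds but more work per combine step.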