ServiceBackend
- class hailtop.batch.backend.ServiceBackend(*args, billing_project=None, bucket=None, remote_tmpdir=None, google_project=None, token=None, regions=None)
Bases: Backend[Batch]

Attributes
- ANY_REGION = ['any_region']

Methods
- _run – Execute a batch.
- supported_regions – Get the supported cloud regions.

Backend that executes batches on Hail's Batch Service on Google Cloud.
Examples
>>> service_backend = ServiceBackend(billing_project='my-billing-account', remote_tmpdir='gs://my-bucket/temporary-files/')
>>> b = Batch(backend=service_backend)
>>> b.run()
>>> service_backend.close()
If the Hail configuration parameters batch/billing_project and batch/remote_tmpdir were previously set with hailctl config set, then one may elide the billing_project and remote_tmpdir parameters.

>>> service_backend = ServiceBackend()
>>> b = Batch(backend=service_backend)
>>> b.run()
>>> service_backend.close()
- Parameters:
billing_project – Name of billing project to use.
bucket – Name of bucket to use. Should not include the gs:// prefix. Cannot be used with remote_tmpdir. Temporary data will be stored in the "/batch" folder of this bucket. This argument is deprecated. Use remote_tmpdir instead.
remote_tmpdir – Temporary data will be stored in this cloud storage folder. Cannot be used with deprecated argument bucket. Paths should match a GCS URI like gs://<BUCKET_NAME>/<PATH> or an ABS URI of the form https://<ACCOUNT_NAME>.blob.core.windows.net/<CONTAINER_NAME>/<PATH>.
google_project – If specified, the project to use when authenticating with Google Storage. Google Storage is used to transfer serialized values between this computer and the cloud machines that execute Python jobs.
token – The authorization token to pass to the batch client. Should only be set for user delegation purposes.
regions – Cloud region(s) to run jobs in. Use ServiceBackend.supported_regions() to list the available regions to choose from. Use ServiceBackend.ANY_REGION to signify that jobs may run in any available region. The default is that jobs may run in any region unless a default value has been set with hailctl. An example invocation is hailctl config set batch/regions "us-central1,us-east1".
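The two remote_tmpdir formats described above can be illustrated with a small validation sketch. The regexes and helper name below are illustrative assumptions, not the library's actual validation:

```python
import re

# Illustrative patterns for the two accepted URI shapes; these are NOT
# Hail's own validation, just a sketch of the formats described above.
GCS_URI = re.compile(r'^gs://[^/]+(/.*)?$')
ABS_URI = re.compile(r'^https://[^.]+\.blob\.core\.windows\.net/[^/]+(/.*)?$')

def looks_like_remote_tmpdir(uri: str) -> bool:
    """Hypothetical helper: True if `uri` matches a GCS or ABS URI shape."""
    return bool(GCS_URI.match(uri) or ABS_URI.match(uri))

print(looks_like_remote_tmpdir('gs://my-bucket/temporary-files/'))                       # True (GCS)
print(looks_like_remote_tmpdir('https://myacct.blob.core.windows.net/mycontainer/tmp'))  # True (ABS)
print(looks_like_remote_tmpdir('my-bucket'))  # False: a bare bucket name belongs to the deprecated `bucket` argument
```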
- _run(batch, dry_run, verbose, delete_scratch_on_exit, wait=True, open=False, disable_progress_bar=False, callback=None, token=None, **backend_kwargs)
Execute a batch.
Warning
This method should not be called directly. Instead, use batch.Batch.run() and pass ServiceBackend-specific arguments as keyword arguments.
- Parameters:
batch (Batch) – Batch to execute.
dry_run (bool) – If True, don't execute code.
verbose (bool) – If True, print debugging output.
delete_scratch_on_exit (bool) – If True, delete temporary directories with intermediate files.
wait (bool) – If True, wait for the batch to finish executing before returning.
open (bool) – If True, open the UI page for the batch.
disable_progress_bar (bool) – If True, disable the progress bar.
callback (Optional[str]) – If not None, a URL that will receive at most one POST request after the entire batch completes.
token (Optional[str]) – If not None, a string used for idempotency of batch submission.
- Return type:
Batch
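The relationship between Batch.run() and the backend's _run() can be sketched with a toy forwarding pattern. The classes below are hypothetical stand-ins, not Hail's implementation; they only show how backend-specific keyword arguments (wait, open, disable_progress_bar, and so on) pass through **backend_kwargs:

```python
# Toy sketch of how run() forwards backend-specific keyword arguments to
# the backend's _run(); class names are hypothetical, not Hail's code.
class ToyBackend:
    def _run(self, batch, dry_run, verbose, delete_scratch_on_exit, **backend_kwargs):
        # Backend-specific options such as `wait` or `disable_progress_bar`
        # arrive here untouched.
        return {'dry_run': dry_run, **backend_kwargs}

class ToyBatch:
    def __init__(self, backend):
        self._backend = backend

    def run(self, dry_run=False, verbose=False, delete_scratch_on_exit=True, **backend_kwargs):
        # Callers never invoke _run() directly; run() hands any extra
        # keyword arguments through to the backend.
        return self._backend._run(self, dry_run, verbose, delete_scratch_on_exit, **backend_kwargs)

b = ToyBatch(ToyBackend())
result = b.run(wait=False, disable_progress_bar=True)
print(result)  # {'dry_run': False, 'wait': False, 'disable_progress_bar': True}
```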
- static supported_regions()
Get the supported cloud regions
Examples
>>> regions = ServiceBackend.supported_regions()
- Returns:
A list of the supported cloud regions
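One plausible use of the returned list is filtering a preferred region set before constructing a backend, falling back to the ANY_REGION sentinel when nothing matches. The helper and fallback logic below are assumptions for illustration, not Hail behavior, and the region names are examples only:

```python
# Hypothetical helper: keep preferred regions that the service supports,
# otherwise fall back to the ANY_REGION sentinel. Not part of hailtop.
ANY_REGION = ['any_region']  # mirrors ServiceBackend.ANY_REGION

def choose_regions(supported, preferred):
    picked = [r for r in preferred if r in supported]
    return picked or ANY_REGION

print(choose_regions(['us-central1', 'us-east1'], ['us-central1']))  # ['us-central1']
print(choose_regions(['us-central1'], ['europe-west1']))             # ['any_region']
```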