hailtop.batch.utils.concatenate

hailtop.batch.utils.concatenate(b, files, image=None, branching_factor=100)

Concatenate files using tree aggregation.

Examples

Create and execute a batch that concatenates output files:

>>> b = Batch()
>>> j1 = b.new_job()
>>> j1.command(f'touch {j1.ofile}')
>>> j2 = b.new_job()
>>> j2.command(f'touch {j2.ofile}')
>>> j3 = b.new_job()
>>> j3.command(f'touch {j3.ofile}')
>>> files = [j1.ofile, j2.ofile, j3.ofile]
>>> ofile = concatenate(b, files, branching_factor=2)
>>> b.run()
Parameters:
  • b (Batch) – Batch to add concatenation jobs to.

  • files (List[ResourceFile]) – List of files to concatenate.

  • branching_factor (int) – Grouping factor when concatenating files.

  • image (Optional[str]) – Image to use. Must have the cat command.

Return type:

ResourceFile

Returns:

Concatenated output file.