CheckpointerABC#

class coffea.processor.CheckpointerABC[source]#

Bases: object

ABC for a generalized checkpointer

Checkpointers are used to save chunk outputs to disk, and reload them if the same chunk is processed again. This is useful for long-running jobs that may be interrupted (resumable processing).

Examples

>>> from datetime import datetime
>>> from coffea import processor
>>> from coffea.processor import SimpleCheckpointer

# create a checkpointer that stores checkpoints in a directory with the current date/time # (you may want to use a more specific directory in practice) >>> datestring = datetime.now().strftime(“%Y%m%d%H”) >>> checkpointer = SimpleCheckpointer(checkpoint_dir=f”checkpoints/{datestring}”, verbose=True)

# pass the checkpointer to a Runner >>> run = processor.Runner(…, checkpointer=checkpointer) >>> output = run(…)

After the run, the checkpoints will be stored in the directory checkpoints/{datestring}. On a subsequent run, if the same chunks are processed (and the same checkpointer, or rather checkpoint_dir is used), the results will be loaded from disk instead of being recomputed.

Methods Summary

load(metadata, processor_instance)

save(output, metadata, processor_instance)

Methods Documentation

abstract load(metadata: Any, processor_instance: ProcessorABC) Accumulatable | None[source]#
abstract save(output: Accumulatable, metadata: Any, processor_instance: ProcessorABC) None[source]#