get_failed_steps_for_fileset

coffea.dataset_tools.get_failed_steps_for_fileset(fileset: DataGroupSpec, report_dict: dict[str, awkward.Array])

Modify the input fileset to only contain the files and row-ranges for failed processing jobs as specified in the supplied report.

Parameters:
  • fileset (DataGroupSpec) – The set of datasets to be reduced to only contain files and row-ranges that have previously encountered failed file access.

  • report_dict (dict[str, awkward.Array]) – The computed file-access error reports from dask-awkward, indexed by dataset name.

Returns:

out – The reduced dataset with only the row-ranges and files that failed processing, according to the input report.

Return type:

DataGroupSpec
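
Example:

The reduction described above can be illustrated with a minimal, self-contained sketch. It does not call coffea and is not the library's implementation. It assumes a plain-dict fileset layout (`files`, `object_path`, `steps`, `metadata`) and replaces the dask-awkward report with a hypothetical, simplified list of `(path, start, stop)` tuples per dataset. The helper name `reduce_to_failed_steps`, the file names, and the row ranges are all invented for illustration. Unlike the description above, the sketch returns a new mapping and leaves its input unchanged.

```python
import copy


def reduce_to_failed_steps(fileset, failures_by_dataset):
    """Hypothetical stand-in: keep only the files and row-ranges that failed.

    fileset: {dataset: {"files": {path: {"object_path": ..., "steps": [[start, stop], ...]}}, ...}}
    failures_by_dataset: {dataset: [(path, start, stop), ...]}  # simplified report (assumption)
    """
    reduced = {}
    for name, dataset in fileset.items():
        failed_files = {}
        for path, start, stop in failures_by_dataset.get(name, []):
            if path not in failed_files:
                # Copy the file entry so per-file settings (e.g. object_path) survive,
                # then start with an empty list of steps.
                entry = copy.deepcopy(dataset["files"][path])
                entry["steps"] = []
                failed_files[path] = entry
            failed_files[path]["steps"].append([start, stop])
        if failed_files:
            # Datasets without any failure are dropped from the result.
            new_dataset = copy.deepcopy(dataset)
            new_dataset["files"] = failed_files
            reduced[name] = new_dataset
    return reduced


# Invented example data: two datasets, one failed row-range in "ZJets".
fileset = {
    "ZJets": {
        "files": {
            "zjets_1.root": {"object_path": "Events", "steps": [[0, 100], [100, 200]]},
            "zjets_2.root": {"object_path": "Events", "steps": [[0, 150]]},
        },
        "metadata": {"is_mc": True},
    },
    "Data": {
        "files": {
            "data_1.root": {"object_path": "Events", "steps": [[0, 50]]},
        },
        "metadata": {"is_mc": False},
    },
}
failures = {"ZJets": [("zjets_1.root", 100, 200)], "Data": []}

retry_fileset = reduce_to_failed_steps(fileset, failures)
print(retry_fileset)
```

In this sketch, `retry_fileset` contains only the `ZJets` dataset, with the single file `zjets_1.root` restricted to the row-range `[100, 200]`. A reduced fileset of this kind can then be resubmitted so that only the failed chunks are reprocessed.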