apply_to_fileset
- coffea.dataset_tools.apply_to_fileset(data_manipulation: ProcessorABC | Callable[[Array], DaskMethodsMixin | Dict[Hashable, DaskMethodsMixin] | Set[DaskMethodsMixin] | List[DaskMethodsMixin] | Tuple[DaskMethodsMixin] | Tuple[DaskMethodsMixin | Dict[Hashable, DaskMethodsMixin] | Set[DaskMethodsMixin] | List[DaskMethodsMixin] | Tuple[DaskMethodsMixin], ...]], fileset: Dict[str, DatasetSpec] | Dict[str, DatasetSpecOptional], schemaclass: BaseSchema = <class 'coffea.nanoevents.schemas.nanoaod.NanoAODSchema'>, uproot_options: dict[str, Any] = {}) → dict[str, DaskMethodsMixin | Dict[Hashable, DaskMethodsMixin] | Set[DaskMethodsMixin] | List[DaskMethodsMixin] | Tuple[DaskMethodsMixin] | Tuple[DaskMethodsMixin | Dict[Hashable, DaskMethodsMixin] | Set[DaskMethodsMixin] | List[DaskMethodsMixin] | Tuple[DaskMethodsMixin], ...]] | tuple[dict[str, DaskMethodsMixin | Dict[Hashable, DaskMethodsMixin] | Set[DaskMethodsMixin] | List[DaskMethodsMixin] | Tuple[DaskMethodsMixin] | Tuple[DaskMethodsMixin | Dict[Hashable, DaskMethodsMixin] | Set[DaskMethodsMixin] | List[DaskMethodsMixin] | Tuple[DaskMethodsMixin], ...]], Array]
Apply the supplied function or processor to the supplied fileset (set of datasets).
- Parameters:
data_manipulation (ProcessorABC or GenericHEPAnalysis) – The user analysis code to run on each dataset in the fileset.
fileset (FilesetSpec | FilesetSpecOptional) – The data to be acted upon by the data manipulation passed in. Metadata within the fileset should be dask-serializable.
schemaclass (BaseSchema, default NanoAODSchema) – The nanoevents schema to interpret the input dataset with.
uproot_options (dict[str, Any], default {}) – Options to pass to uproot. Pass at least {"allow_read_errors_with_report": True} to turn on file access reports.
- Returns:
out (dict[str, DaskOutputType]) – The output of the analysis workflow applied to the datasets, keyed by dataset name.
report (dask_awkward.Array, optional) – The file access report for running the analysis on the input dataset. Returned only when file access reports are turned on through uproot_options. To be accurate, it must be computed simultaneously with the analysis output.
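
The following is a minimal sketch of how the two documented return shapes might be handled. It is not taken from the coffea sources. The helper names make_uproot_options, split_result, and run_analysis are hypothetical and exist only for illustration. run_analysis assumes that coffea and dask are installed and that the fileset is already in the form the function expects.

```python
from typing import Any, Callable, Mapping


def make_uproot_options(with_report: bool = True) -> dict[str, Any]:
    """Build uproot_options; the flag name is the one given in the parameter docs."""
    return {"allow_read_errors_with_report": True} if with_report else {}


def split_result(result: Any) -> tuple[Any, Any]:
    """Normalize the return value to an (out, report) pair.

    The documented return is either a dict keyed by dataset name,
    or a (dict, report) tuple; report is None in the first case.
    """
    if isinstance(result, tuple):
        out, report = result
        return out, report
    return result, None


def run_analysis(analysis: Callable, fileset: Mapping[str, Any]):
    """Hypothetical driver; imports are local so the helpers above work without coffea."""
    import dask
    from coffea.dataset_tools import apply_to_fileset
    from coffea.nanoevents import NanoAODSchema

    result = apply_to_fileset(
        analysis,
        fileset,
        schemaclass=NanoAODSchema,
        uproot_options=make_uproot_options(with_report=True),
    )
    out, report = split_result(result)
    if report is None:
        (computed_out,) = dask.compute(out)
        return computed_out, None
    # A single compute call, so the report reflects the same file reads as the output.
    computed_out, computed_report = dask.compute(out, report)
    return computed_out, computed_report
```

Passing out and report to one dask.compute call is one way to satisfy the requirement that the report be computed together with the analysis; computing them in separate calls would read the files twice, and the report could then describe a different set of reads than the output.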