apply_to_dataset
- coffea.dataset_tools.apply_to_dataset(data_manipulation: ProcessorABC | GenericHEPAnalysis, dataset: DatasetSpec | dict, schemaclass: BaseSchema = NanoAODSchema, metadata: dict[Hashable, Any] = {}, uproot_options: dict[str, Any] = {}) → DaskOutputType | tuple[DaskOutputType, dask_awkward.Array]
Here GenericHEPAnalysis abbreviates Callable[[dask_awkward.Array], DaskOutputType]. DaskOutputType abbreviates a dask.base.DaskMethodsMixin; a dict (with Hashable keys), set, list, or tuple of DaskMethodsMixin objects; or a variable-length tuple of any of these.
Apply the supplied function or processor to the supplied dataset.
- Parameters:
data_manipulation (ProcessorABC or GenericHEPAnalysis) – The user analysis code to run on the input dataset.
dataset (DatasetSpec | dict) – The data to be acted upon by the data manipulation passed in.
schemaclass (BaseSchema, default NanoAODSchema) – The nanoevents schema to interpret the input dataset with.
metadata (dict[Hashable, Any], default {}) – Metadata for the dataset that is accessible by the input analysis. Should also be dask-serializable.
uproot_options (dict[str, Any], default {}) – Options to pass to uproot. Pass at least {"allow_read_errors_with_report": True} to turn on file access reports.
- Returns:
out (DaskOutputType) – The output of the analysis workflow applied to the dataset.
report (dask_awkward.Array, optional) – The file access report for running the analysis on the input dataset. Must be computed simultaneously with the analysis to be accurate.
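The following is a minimal sketch of how a call might look, not a definitive recipe. It assumes that the report is returned alongside the output whenever "allow_read_errors_with_report" is set in uproot_options, and that a plain dict dataset uses a {"files": {...}} layout. The file name, tree name, metadata, and the helpers count_events, unpack, and run are hypothetical. Imports of coffea, dask, and awkward are placed inside the functions so that the sketch can be loaded without them installed.

```python
def count_events(events):
    """Analysis callable: takes the lazy dask_awkward.Array of events and
    returns a dict of dask collections (one form of DaskOutputType)."""
    import awkward as ak  # lazy import; requires awkward and dask_awkward at call time

    return {"n_events": ak.num(events, axis=0)}


def unpack(result, uproot_options):
    """Split the return value into (out, report).

    Assumption: the report is only returned when
    "allow_read_errors_with_report" is set; otherwise report is None.
    """
    if uproot_options.get("allow_read_errors_with_report"):
        out, report = result
        return out, report
    return result, None


def run(dataset):
    """Apply count_events to one dataset and compute output and report together."""
    import dask
    from coffea.dataset_tools import apply_to_dataset
    from coffea.nanoevents import NanoAODSchema

    uproot_options = {"allow_read_errors_with_report": True}
    result = apply_to_dataset(
        count_events,
        dataset,
        schemaclass=NanoAODSchema,
        metadata={"label": "example"},  # hypothetical metadata
        uproot_options=uproot_options,
    )
    out, report = unpack(result, uproot_options)
    # A single dask.compute call evaluates both in the same graph execution,
    # so the report reflects the file accesses made while producing the output.
    return dask.compute(out, report)


# Hypothetical input; the {"files": {path: tree_name}} layout is an assumption.
example_dataset = {"files": {"nano_dy.root": "Events"}}
```

Calling run(example_dataset) needs coffea installed and a readable file at the given path. Computing out and report in separate dask.compute calls would read the files twice, so the report from the second call may not describe the run that produced the output.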