DataGroupSpec

class coffea.dataset_tools.DataGroupSpec(root: RootModelRootType = PydanticUndefined)

Bases: RootModel[dict[str, DatasetSpec]], MutableMapping

Attributes Summary

model_config

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

num_entries

Compute the total number of entries across all files in all datasets, if available.

num_selected_entries

Compute the total number of selected entries across all files (calculated from steps), if available.

steps

Get the steps per dataset file, if available.

Methods Summary

filter_datasets([filter_name, filter_callable])

Filter datasets by a regex pattern on the dataset names (filter_name) or by a callable applied to each DatasetSpec (filter_callable).

filter_files([filter_name, filter_callable])

Filter files by a regex pattern on the file names (filter_name) or by a callable applied to each file spec (filter_callable).

limit_files(max_files[, per_dataset])

Limit the number of files.

limit_steps(max_steps[, per_file])

Limit the steps.

preprocess_data(data)

Attributes Documentation

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

num_entries

Compute the total number of entries across all files in all datasets, if available.

num_selected_entries

Compute the total number of selected entries across all files (calculated from steps), if available.
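As a rough illustration of how a selected-entry count can follow from steps, the sketch below assumes each step is a [start, stop] pair of entry indices and that the count is unavailable when any file has no steps. The nested dict layout and the helper name count_selected_entries are hypothetical stand-ins, not coffea's data model or implementation.

```python
# Hypothetical stand-in: dataset name -> file name -> list of [start, stop] steps.
# The layout and the helper below are assumptions for illustration only.
steps_by_dataset = {
    "ZJets": {"zjets_1.root": [[0, 100], [100, 250]]},
    "Data": {"data_1.root": [[0, 50]], "data_2.root": None},  # None: steps unknown
}


def count_selected_entries(steps_by_dataset):
    """Sum stop - start over all steps; return None if any file lacks steps."""
    total = 0
    for files in steps_by_dataset.values():
        for steps in files.values():
            if steps is None:
                return None  # assumed reading of "if available"
            total += sum(stop - start for start, stop in steps)
    return total


known_only = {"ZJets": steps_by_dataset["ZJets"]}
print(count_selected_entries(known_only))        # 250
print(count_selected_entries(steps_by_dataset))  # None
```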

steps

Get the steps per dataset file, if available.

Methods Documentation

filter_datasets(filter_name: str | None = None, filter_callable: Callable[[DatasetSpec], bool] | None = None) → Self

Filter datasets by a regex pattern on the dataset names (filter_name) or by a callable applied to each DatasetSpec (filter_callable).
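The two filtering modes can be pictured with a minimal pure-Python sketch. It uses a plain dict in place of a DataGroupSpec, and it assumes that the pattern is applied with re.search and that both filters must pass when both are given; neither detail is confirmed by this page. The function name filter_datasets_sketch and the sample dataset names are hypothetical.

```python
import re

# Hypothetical stand-in for a data group: dataset name -> dataset description.
group = {
    "ZJets_2017": {"files": ["a.root", "b.root"]},
    "ZJets_2018": {"files": ["c.root"]},
    "TTbar_2018": {"files": []},
}


def filter_datasets_sketch(group, filter_name=None, filter_callable=None):
    """Keep datasets whose name matches the regex and whose value passes the callable."""
    kept = {}
    for name, dataset in group.items():
        # Assumption: the pattern may match anywhere in the name (re.search).
        if filter_name is not None and not re.search(filter_name, name):
            continue
        if filter_callable is not None and not filter_callable(dataset):
            continue
        kept[name] = dataset
    return kept


by_name = filter_datasets_sketch(group, filter_name=r"_2018$")
non_empty = filter_datasets_sketch(group, filter_callable=lambda d: len(d["files"]) > 0)
print(sorted(by_name))    # ['TTbar_2018', 'ZJets_2018']
print(sorted(non_empty))  # ['ZJets_2017', 'ZJets_2018']
```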

filter_files(filter_name: str | None = None, filter_callable: Callable[[CoffeaROOTFileSpec | CoffeaParquetFileSpec | CoffeaROOTFileSpecOptional | CoffeaParquetFileSpecOptional], bool] | None = None) → Self

Filter files by a regex pattern on the file names (filter_name) or by a callable applied to each file spec (filter_callable).
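File-level filtering works one level deeper than dataset-level filtering: the test is applied to each file inside each dataset. The sketch below shows that idea on plain dicts. The file URLs, the num_entries key, and the function filter_files_sketch are hypothetical; whether a dataset left with no files is kept or dropped is not stated here, and the sketch simply keeps it.

```python
import re

# Hypothetical stand-in: dataset name -> file name -> per-file metadata.
group = {
    "ZJets": {
        "root://site-a//store/z_1.root": {"num_entries": 100},
        "root://site-b//store/z_2.root": {"num_entries": 0},
    },
    "Data": {"root://site-b//store/d_1.root": {"num_entries": 40}},
}


def filter_files_sketch(group, filter_name=None, filter_callable=None):
    """Apply the name regex and/or the callable to every file of every dataset."""
    out = {}
    for dataset, files in group.items():
        out[dataset] = {
            fname: spec
            for fname, spec in files.items()
            if (filter_name is None or re.search(filter_name, fname))
            and (filter_callable is None or filter_callable(spec))
        }
    return out


site_a = filter_files_sketch(group, filter_name=r"site-a")
non_empty = filter_files_sketch(group, filter_callable=lambda s: s["num_entries"] > 0)
print({k: len(v) for k, v in site_a.items()})     # {'ZJets': 1, 'Data': 0}
print({k: len(v) for k, v in non_empty.items()})  # {'ZJets': 1, 'Data': 1}
```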

limit_files(max_files: int | slice | None, per_dataset: bool = True) → Self

Limit the number of files.
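One way to read the int | slice | None parameter type is that every value is normalized to a Python slice: slice(n) keeps the first n items and slice(None) keeps everything. The sketch below applies that reading to plain lists. The meaning given to per_dataset=False (one limit across the flattened list of files) is an assumption, and limit_files_sketch and as_slice are hypothetical helpers rather than coffea functions.

```python
def as_slice(limit):
    """Normalize int | slice | None to a slice; None keeps everything."""
    return limit if isinstance(limit, slice) else slice(limit)


def limit_files_sketch(group, max_files, per_dataset=True):
    """Limit files per dataset, or (assumed) across the whole group."""
    sel = as_slice(max_files)
    if per_dataset:
        return {name: files[sel] for name, files in group.items()}
    # Assumption: with per_dataset=False the limit spans all datasets in order.
    flat = [(name, f) for name, files in group.items() for f in files][sel]
    out = {}
    for name, f in flat:
        out.setdefault(name, []).append(f)
    return out


# Hypothetical stand-in: dataset name -> ordered list of file names.
group = {"ZJets": ["z1.root", "z2.root", "z3.root"], "Data": ["d1.root", "d2.root"]}

print(limit_files_sketch(group, 2))
# {'ZJets': ['z1.root', 'z2.root'], 'Data': ['d1.root', 'd2.root']}
print(limit_files_sketch(group, slice(1, None)))
# {'ZJets': ['z2.root', 'z3.root'], 'Data': ['d2.root']}
print(limit_files_sketch(group, 4, per_dataset=False))
# {'ZJets': ['z1.root', 'z2.root', 'z3.root'], 'Data': ['d1.root']}
```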

limit_steps(max_steps: int | slice, per_file: bool = False) → Self

Limit the steps.

classmethod preprocess_data(data: Any) → Any