filter_files

coffea.dataset_tools.filter_files(fileset: coffea.dataset_tools.filespec.DataGroupSpec, thefilter: collections.abc.Callable[[tuple[str, coffea.dataset_tools.filespec.CoffeaROOTFileSpec | coffea.dataset_tools.filespec.CoffeaParquetFileSpec] | coffea.dataset_tools.filespec.InputFiles | coffea.dataset_tools.filespec.PreprocessedFiles], bool] = <function _default_filter>) -> DataGroupSpec

Modify the input fileset so that only the files of each dataset that pass the filter remain.

Parameters:
  • fileset (DataGroupSpec) – The set of datasets to be filtered.

  • thefilter (Callable[[tuple[str, CoffeaROOTFileSpec | CoffeaParquetFileSpec] | InputFiles | PreprocessedFiles], bool], default: a filter that drops empty files) – Predicate that decides which files in each dataset are kept.

Returns:

out – The reduced fileset, containing only the files that pass thefilter.

Return type:

DataGroupSpec
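
Example:

The filtering described above can be illustrated with a minimal pure-Python sketch. It does not call coffea and is not coffea's implementation: the names filter_files_sketch and default_filter_sketch are hypothetical, the fileset is assumed to be a plain dict of the form {dataset: {"files": {path: spec}}}, and each file spec is assumed to carry a "num_entries" key that marks empty files. The predicate receives a (name, spec) tuple, matching the first member of the thefilter argument type in the signature.

```python
import copy


def default_filter_sketch(name_and_spec):
    """Keep a file only if it reports a positive entry count.

    Assumption: "num_entries" is the key that identifies empty files.
    """
    _name, spec = name_and_spec
    num_entries = spec.get("num_entries")
    return num_entries is not None and num_entries > 0


def filter_files_sketch(fileset, thefilter=default_filter_sketch):
    """Return a copy of fileset keeping only the files that pass thefilter."""
    out = copy.deepcopy(fileset)  # leave the caller's fileset untouched
    for dataset, entry in fileset.items():
        # thefilter is called once per (file name, file spec) pair.
        out[dataset]["files"] = dict(filter(thefilter, entry["files"].items()))
    return out


# Hypothetical fileset with two datasets and made-up file names.
fileset = {
    "ZJets": {
        "files": {
            "a.root": {"num_entries": 100},
            "b.root": {"num_entries": 0},
        }
    },
    "Data": {
        "files": {
            "c.root": {"num_entries": None},
            "d.root": {"num_entries": 40},
        }
    },
}

# Default behaviour in this sketch: drop empty or unknown-size files.
filtered = filter_files_sketch(fileset)

# Custom predicate: keep only files with at least 50 entries.
big_only = filter_files_sketch(
    fileset, lambda item: (item[1].get("num_entries") or 0) >= 50
)
```

With coffea installed, a custom predicate of the same shape would presumably be passed as filter_files(fileset, thefilter=my_predicate), per the signature above. Whether the real function copies its input or modifies it in place is not stated here; the sketch copies.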