NanoEventsFactory
- class coffea.nanoevents.NanoEventsFactory(schema, mapping, partition_key, cache=None, is_dask=False)[source]
Bases: object
A factory class to build NanoEvents objects.
For most users, it is advisable to construct instances via methods like from_root so that the constructor args are properly set.
Methods Summary
events() – Build events
from_parquet(file[, treepath, entry_start, ...]) – Quickly build NanoEvents from a parquet file
from_preloaded(array_source[, entry_start, ...]) – Quickly build NanoEvents from a pre-loaded array source
from_root(file[, treepath, entry_start, ...]) – Quickly build NanoEvents from a root file
Methods Documentation
- events()[source]
Build events
- Returns:
If the NanoEventsFactory is running in delayed mode (Dask), this is a Dask awkward array of the events. If the mapping also produces a report, the output will be a tuple (events, report). If the factory is not running in delayed mode, this is an awkward array of the events.
- Return type:
out
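The two possible return shapes described above can be handled uniformly with a small wrapper. This is a minimal sketch, not part of coffea: `split_events_and_report` is a hypothetical helper, and it assumes the report case is a plain 2-tuple, as the description states.

```python
def split_events_and_report(out):
    """Normalize the result of NanoEventsFactory.events().

    Hypothetical helper (not a coffea API). Returns (events, report),
    where report is None if events() returned only the events array.
    """
    # Assumption: when a report is produced, events() returns a 2-tuple.
    if isinstance(out, tuple) and len(out) == 2:
        events, report = out
        return events, report
    return out, None
```

Calling `events, report = split_events_and_report(factory.events())` then works whether or not the mapping produces a report.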
- classmethod from_parquet(file, treepath=uproot._util.unset, entry_start=None, entry_stop=None, runtime_cache=None, persistent_cache=None, schemaclass=<class 'coffea.nanoevents.schemas.nanoaod.NanoAODSchema'>, metadata=None, parquet_options={}, skyhook_options={}, access_log=None, delayed=True)[source]
Quickly build NanoEvents from a parquet file
- Parameters:
file (str, pathlib.Path, pyarrow.NativeFile, or python file-like) – The filename or already opened file using e.g. uproot.open()
treepath (str, optional) – Name of the tree to read in the file
entry_start (int, optional) – Start at this entry offset in the tree (default 0)
entry_stop (int, optional) – Stop at this entry offset in the tree (default end of tree)
runtime_cache (dict, optional) – A dict-like interface to a cache object. This cache is expected to last the duration of the program only, and will be used to hold references to materialized awkward arrays, etc.
persistent_cache (dict, optional) – A dict-like interface to a cache object. Only bare numpy arrays will be placed in this cache, using globally-unique keys.
schemaclass (BaseSchema) – A schema class deriving from BaseSchema and implementing the desired view of the file
metadata (dict, optional) – Arbitrary metadata to add to the base.NanoEvents object
parquet_options (dict, optional) – Any options to pass to pyarrow.parquet.ParquetFile
access_log (list, optional) – Pass a list instance to record which branches were lazily accessed by this instance
delayed – Nanoevents will use dask as a backend to construct a delayed task graph representing your analysis.
- Returns:
out – A NanoEventsFactory instance built from the file at file.
- Return type:
NanoEventsFactory
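As an illustration of the entry_start and entry_stop parameters, the sketch below splits a file into consecutive entry ranges and reads one range at a time. The names `entry_ranges` and `load_parquet_chunk` are hypothetical helpers, not coffea functions, and treating entry_stop as exclusive is an assumption. Only keyword arguments that appear in the from_parquet signature above are used.

```python
def entry_ranges(num_entries, chunk_size):
    """Split [0, num_entries) into consecutive (entry_start, entry_stop) pairs.

    Hypothetical helper. Assumes entry_stop is exclusive.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, num_entries))
        for start in range(0, num_entries, chunk_size)
    ]


def load_parquet_chunk(path, entry_start, entry_stop):
    """Eagerly build events for one entry range of a parquet file."""
    # Imported lazily so entry_ranges() is usable without coffea installed.
    from coffea.nanoevents import NanoEventsFactory

    factory = NanoEventsFactory.from_parquet(
        path,
        entry_start=entry_start,
        entry_stop=entry_stop,
        delayed=False,  # eager mode: events() returns an awkward array
    )
    return factory.events()
```

Whether a given coffea release honors entry ranges for parquet input in eager mode should be checked against the installed version.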
- classmethod from_preloaded(array_source, entry_start=None, entry_stop=None, runtime_cache=None, persistent_cache=None, schemaclass=<class 'coffea.nanoevents.schemas.nanoaod.NanoAODSchema'>, metadata=None, access_log=None)[source]
Quickly build NanoEvents from a pre-loaded array source
- Parameters:
array_source (Mapping[str, awkward.Array]) – A mapping of names to awkward arrays. It must have a metadata attribute with uuid, num_rows, and path sub-items.
entry_start (int, optional) – Start at this entry offset in the tree (default 0)
entry_stop (int, optional) – Stop at this entry offset in the tree (default end of tree)
runtime_cache (dict, optional) – A dict-like interface to a cache object. This cache is expected to last the duration of the program only, and will be used to hold references to materialized awkward arrays, etc.
persistent_cache (dict, optional) – A dict-like interface to a cache object. Only bare numpy arrays will be placed in this cache, using globally-unique keys.
schemaclass (BaseSchema) – A schema class deriving from BaseSchema and implementing the desired view of the file
metadata (dict, optional) – Arbitrary metadata to add to the base.NanoEvents object
access_log (list, optional) – Pass a list instance to record which branches were lazily accessed by this instance
- Returns:
out – A NanoEventsFactory instance built from information in array_source.
- Return type:
NanoEventsFactory
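The array_source requirement (a mapping that also carries a metadata attribute) can be met with a small dict subclass. This is a sketch under stated assumptions: `PreloadedSource` is a hypothetical class, the "sub-items" are taken to be dictionary keys, and the key names follow the parameter description above. The keys coffea actually reads may differ between versions, so check the installed release before relying on it.

```python
import uuid


class PreloadedSource(dict):
    """Mapping of column name -> array with a metadata attribute attached.

    Hypothetical container (not a coffea class).
    """

    def __init__(self, columns, num_rows, path):
        super().__init__(columns)
        # Assumption: metadata is a dict keyed by the names in the docs above.
        self.metadata = {
            "uuid": str(uuid.uuid4()),  # unique identifier for this source
            "num_rows": num_rows,       # number of events in every column
            "path": path,               # where the data nominally came from
        }
```

In real use the column values would be awkward arrays of equal length, and the instance would be passed as the array_source argument of from_preloaded.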
- classmethod from_root(file, treepath=uproot._util.unset, entry_start=None, entry_stop=None, steps_per_file=uproot._util.unset, runtime_cache=None, persistent_cache=None, schemaclass=<class 'coffea.nanoevents.schemas.nanoaod.NanoAODSchema'>, metadata=None, uproot_options={}, access_log=None, iteritems_options={}, use_ak_forth=True, delayed=True, known_base_form=None, decompression_executor=None, interpretation_executor=None)[source]
Quickly build NanoEvents from a root file
- Parameters:
file (a string or dict input to uproot.open() or uproot.dask(), or a uproot.reading.ReadOnlyDirectory) – The filename or dict of filenames including the treepath (as it would be passed directly to uproot.open() or uproot.dask()), or an already opened file using e.g. uproot.open().
treepath (str, optional) – Name of the tree to read in the file. Used only if file is a uproot.reading.ReadOnlyDirectory.
entry_start (int, optional (eager mode only)) – Start at this entry offset in the tree (default 0)
entry_stop (int, optional (eager mode only)) – Stop at this entry offset in the tree (default end of tree)
steps_per_file (int, optional) – Partition files into this many steps (previously “chunks”)
runtime_cache (dict, optional) – A dict-like interface to a cache object. This cache is expected to last the duration of the program only, and will be used to hold references to materialized awkward arrays, etc.
persistent_cache (dict, optional) – A dict-like interface to a cache object. Only bare numpy arrays will be placed in this cache, using globally-unique keys.
schemaclass (BaseSchema) – A schema class deriving from BaseSchema and implementing the desired view of the file
metadata (dict, optional) – Arbitrary metadata to add to the base.NanoEvents object
uproot_options (dict, optional) – Any options to pass to uproot.open or uproot.dask
access_log (list, optional) – Pass a list instance to record which branches were lazily accessed by this instance
use_ak_forth – Toggle using awkward_forth to interpret branches in root file.
delayed – Nanoevents will use dask as a backend to construct a delayed task graph representing your analysis.
known_base_form – If the base form of the input file is known ahead of time we can skip opening a single file and parsing metadata.
decompression_executor (None or Executor with a submit method) – see: https://github.com/scikit-hep/uproot5/blob/main/src/uproot/_dask.py#L109
interpretation_executor (None or Executor with a submit method) – see: https://github.com/scikit-hep/uproot5/blob/main/src/uproot/_dask.py#L113
- Returns:
out – A NanoEventsFactory instance built from the file at file.
- Return type:
NanoEventsFactory
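A typical call passes the dict form of file, mapping each filename to its tree name, and then calls events() on the resulting factory. The sketch below is illustrative only: `build_file_spec` and `load_events` are hypothetical helpers, the tree name "Events" and the metadata value are placeholder assumptions, and only keyword arguments listed in the from_root signature above are used.

```python
def build_file_spec(paths, treepath="Events"):
    """Map each file path to the tree to read from it.

    Hypothetical helper producing the {filename: treepath} dict form
    that uproot.dask() accepts.
    """
    return {str(path): treepath for path in paths}


def load_events(paths, treepath="Events"):
    """Build delayed (Dask-backed) NanoEvents from one or more ROOT files."""
    # Imported lazily so build_file_spec() is usable without coffea installed.
    from coffea.nanoevents import NanoAODSchema, NanoEventsFactory

    factory = NanoEventsFactory.from_root(
        build_file_spec(paths, treepath),
        schemaclass=NanoAODSchema,
        metadata={"dataset": "example"},  # placeholder metadata
        delayed=True,  # build a Dask task graph instead of reading eagerly
    )
    # May be (events, report) if the mapping also produces a report.
    return factory.events()
```

In delayed mode nothing is read until the result is computed, so the files only need to be reachable when the task graph actually runs.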