coffea - Columnar Object Framework For Effective Analysis
Basic tools and wrappers for enabling not-too-alien syntax when running columnar Collider HEP analysis.
coffea is a prototype package for pulling together all the typical needs of a high-energy collider physics (HEP) experiment analysis using the scientific python ecosystem. It makes use of uproot and awkward-array to provide an array-based syntax for manipulating HEP event data in an efficient and numpythonic way. There are sub-packages that implement histogramming, plotting, and look-up table functionalities that are needed to convey scientific insight, apply transformations to data, and correct for discrepancies in Monte Carlo simulations compared to data.
coffea also supplies facilities for horizontally scaling an analysis in order to reduce time-to-insight in a way that is largely independent of the resource the analysis is being executed on. By making use of modern big-data technologies like Apache Spark, parsl, Dask, and Work Queue, it is possible with coffea to scale a HEP analysis from testing on a laptop to a large multi-core server, computing clusters, and supercomputers without the need to alter or otherwise adapt the analysis code itself.
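The idea of scaling without touching the analysis code can be sketched with the standard library alone. This is an illustration of the general map-and-merge pattern, not coffea's own API: the names `process` and `run`, and the event data, are hypothetical, and a thread pool stands in for a real distributed backend.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def process(chunk):
    """Hypothetical per-chunk analysis: count events by jet multiplicity."""
    return Counter(len(event_jets) for event_jets in chunk)


def run(chunks, executor_map=map):
    """Apply `process` to every chunk and merge the partial results.

    `executor_map` is any map-like callable; swapping it changes where the
    work runs, while `process` stays untouched.
    """
    total = Counter()
    for partial in executor_map(process, chunks):
        total += partial
    return total


# Made-up data: each chunk is a list of events, each event a list of jet pT values.
chunks = [
    [[50.0, 32.1], [], [41.0]],
    [[27.5], [60.2, 45.0, 30.3]],
]

# Serial execution, e.g. while testing on a laptop.
serial = run(chunks)

# The same analysis code, now dispatched through a thread pool.
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel = run(chunks, executor_map=pool.map)

print(serial == parallel)
```

Only the `executor_map` argument differs between the two runs, which is the property the paragraph above describes.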
coffea is a HEP community project collaborating with IRIS-HEP and is currently a prototype. We welcome input to improve its quality as we progress towards a sensible refactoring into the scientific Python ecosystem and a first release. Please feel free to contribute at our GitHub repo!
Installation
Install coffea like any other Python package:
pip install coffea
or similar (use sudo, --user, virtualenv, or pip-in-conda if you wish).
For more details, see the Installing coffea section of the documentation.
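After installing, one way to confirm that the package is visible to the current interpreter is to query its installed version. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical and not part of coffea.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


coffea_version = installed_version("coffea")
if coffea_version is None:
    print("coffea is not installed in this environment")
else:
    print(f"coffea {coffea_version} is installed")
```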
Strict dependencies
Python (3.8+)
The following are installed automatically when you install coffea with pip:
numpy (1.22+);
uproot for interacting with ROOT files and handling their data transparently;
awkward-array to manipulate complex-structured columnar data, such as jagged arrays;
numba for just-in-time compilation of Python functions;
scipy for many statistical functions;
matplotlib as a plotting backend;
and other utility packages, as enumerated in pyproject.toml.
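For readers unfamiliar with the term, a jagged array holds a variable number of entries per event. The sketch below illustrates the general offsets-plus-content layout in plain Python, on the assumption that this matches the concept meant above. It does not use awkward-array or reproduce its API, and the helper names and pT values are invented for illustration.

```python
# Three events with 2, 0, and 3 jets respectively (made-up pT values in GeV).
# All values live in one flat list; offsets mark where each event starts and stops.
content = [50.0, 32.1, 41.0, 27.5, 60.2]
offsets = [0, 2, 2, 5]


def event(i):
    """Return the values belonging to event i."""
    return content[offsets[i]:offsets[i + 1]]


def counts(offsets):
    """Number of entries per event."""
    return [stop - start for start, stop in zip(offsets, offsets[1:])]


def leading(content, offsets):
    """First entry of each event, or None for events with no entries."""
    return [
        content[start] if stop > start else None
        for start, stop in zip(offsets, offsets[1:])
    ]


print(counts(offsets))            # [2, 0, 3]
print(leading(content, offsets))  # [50.0, None, 41.0]
```

Keeping the values in one flat buffer is what lets a columnar library operate on all events at once instead of looping over them one by one.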
Documentation
All documentation is hosted at https://coffea-hep.readthedocs.io/
Citation
If you would like to cite this code in your work, you can use the Zenodo DOI indicated in CITATION.cff, or the latest DOI. You may also cite the proceedings:
“N. Smith et al 2020 EPJ Web Conf. 245 06012”
“L. Gray et al 2023 J. Phys.: Conf. Ser. 2438 012033”
- Installing coffea
- Coffea by Example
- Reading data with coffea NanoEvents
- Applying corrections to columnar data
- Coffea Processors
- Running inference tools
- Dataset discovery tools
- PackedSelection in Coffea 2023
- Coffea concepts
- API Reference Guide
- In coffea Namespace
- coffea.analysis_tools
- coffea.btag_tools
- Classes
- BTagScaleFactor
BTagScaleFactor
BTagScaleFactor.LOOSE
BTagScaleFactor.MEDIUM
BTagScaleFactor.TIGHT
BTagScaleFactor.RESHAPE
BTagScaleFactor.FLAV_B
BTagScaleFactor.FLAV_C
BTagScaleFactor.FLAV_UDSG
BTagScaleFactor.__call__()
BTagScaleFactor.eval()
BTagScaleFactor.readcsv()
- Class Inheritance Diagram
- coffea.dataset_tools
- coffea.jetmet_tools
- coffea.lookup_tools
- coffea.lumi_tools
- coffea.ml_tools
- coffea.nanoevents
- Classes
- NanoEventsFactory
- BaseSchema
- NanoAODSchema
NanoAODSchema
NanoAODSchema.all_cross_references
NanoAODSchema.error_missing_event_ids
NanoAODSchema.event_ids
NanoAODSchema.mixins
NanoAODSchema.nested_index_items
NanoAODSchema.nested_items
NanoAODSchema.special_items
NanoAODSchema.warn_missing_crossrefs
NanoAODSchema.behavior()
NanoAODSchema.v5()
NanoAODSchema.v6()
NanoAODSchema.v7()
- PFNanoAODSchema
- TreeMakerSchema
- PHYSLITESchema
- DelphesSchema
- PDUNESchema
- ScoutingNanoAODSchema
- FCC
- FCCSchema
- Class Inheritance Diagram
- coffea.nanoevents.methods.base
- coffea.nanoevents.methods.candidate
- coffea.nanoevents.methods.nanoaod
- coffea.nanoevents.methods.vector
- coffea.processor
- coffea.util
- Not in coffea Namespace