PackedSelection
- class coffea.analysis_tools.PackedSelection(dtype='uint32')
Bases: object
Store several boolean arrays in a compact manner
This class can store several boolean arrays in a memory-efficient manner and evaluate arbitrary combinations of boolean requirements in a CPU-efficient way. Supported inputs are 1D numpy or awkward arrays.
- Parameters:
dtype (numpy.dtype or str) – internal bit width of the packed array, which governs the maximum number of selections storable in this object. The default value is uint32, which allows up to 32 booleans to be stored. If fewer or more selections need to be stored, one can choose uint16 or uint64 instead.
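The relationship between dtype and the number of storable selections can be sketched in plain Python. This is an illustration of the bit-packing idea only, not coffea's implementation: the names `BITWIDTH` and `pack` are hypothetical, and Python lists and ints stand in for the numpy arrays the class actually uses.

```python
# Illustrative sketch only; BITWIDTH and pack are hypothetical names,
# and plain Python ints stand in for the packed numpy array.
BITWIDTH = {"uint16": 16, "uint32": 32, "uint64": 64}


def pack(columns, dtype="uint32"):
    """Pack equal-length boolean lists into one integer per entry."""
    maxitems = BITWIDTH[dtype]  # one bit per selection
    if len(columns) > maxitems:
        raise ValueError(f"{dtype} holds at most {maxitems} selections")
    n_entries = len(columns[0]) if columns else 0
    packed = [0] * n_entries
    for bit, column in enumerate(columns):
        for i, passed in enumerate(column):
            if passed:
                packed[i] |= 1 << bit  # set this selection's bit
    return packed


cut1 = [True, False, True]
cut2 = [True, True, False]
packed = pack([cut1, cut2])
print(packed)  # [3, 2, 1]
```

Each entry costs one integer regardless of how many selections are stored, which is where the memory saving comes from.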
Attributes Summary
delayed_mode – Is the PackedSelection in delayed mode?
maxitems – What is the maximum supported number of selections in this PackedSelection?
names – Current list of mask names available
Methods Summary
add(name, selection[, fill_value]) – Add a new boolean array
add_multiple(selections[, fill_value]) – Add multiple boolean arrays at once, see add for details
all(*names) – Shorthand for require, where all the values are True.
allfalse(*names) – Shorthand for require, where all the values are False.
any(*names) – Return a mask vector corresponding to an inclusive OR of requirements
cutflow(*names[, commonmask, weights, ...]) – Compute the cutflow for a set of selections
nminusone(*names[, commonmask, weights, ...]) – Compute the "N-1" style selection for a set of selections
require(**names) – Return a mask vector corresponding to specific requirements
Attributes Documentation
- delayed_mode
Is the PackedSelection in delayed mode?
- Returns:
True if the PackedSelection is in delayed mode.
- Return type:
- maxitems
What is the maximum supported number of selections in this PackedSelection?
- Returns:
Maximum number of selections that can be stored given the current dtype.
- Return type:
- names
Current list of mask names available
Methods Documentation
- add(name, selection, fill_value=False)
Add a new boolean array
- Parameters:
name (str) – name of the selection
selection (numpy.ndarray or awkward.Array) – a flat array of type bool or ?bool. If this is not the first selection added, it must also have the same shape as previously added selections. If the array is option-type, null entries will be filled with fill_value.
fill_value (bool, optional) – All masked entries will be filled as specified (default: False)
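The effect of fill_value on an option-type input can be sketched as follows. This is a plain-Python illustration under the assumption that None stands in for a null entry; `fill_nulls` is a hypothetical helper, not part of coffea.

```python
# Illustrative sketch only; fill_nulls is a hypothetical helper and
# None stands in for a null entry of an option-type (?bool) array.
def fill_nulls(selection, fill_value=False):
    """Replace null entries with fill_value and keep the rest as booleans."""
    return [fill_value if entry is None else bool(entry) for entry in selection]


option_type = [True, None, False, None]
print(fill_nulls(option_type))                   # [True, False, False, False]
print(fill_nulls(option_type, fill_value=True))  # [True, True, False, True]
```

With the default, an entry whose value is unknown counts as failing the selection; passing fill_value=True makes it count as passing instead.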
- add_multiple(selections, fill_value=False)
Add multiple boolean arrays at once, see add for details
- all(*names)
Shorthand for require, where all the values are True. If no arguments are given, all the added selections are required to be True.
- allfalse(*names)
Shorthand for require, where all the values are False. If no arguments are given, all the added selections are required to be False.
- any(*names)
Return a mask vector corresponding to an inclusive OR of requirements
- Parameters:
*names (args) – The named selections to allow
Examples
If
>>> selection.names
['cut1', 'cut2', 'cut3']
then
>>> selection.any("cut1", "cut2")
array([True, False, True, ...])
returns a boolean array where an entry is True if the corresponding entries satisfy cut1 == True or cut2 == True, with cut3 arbitrary.
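The inclusive-OR logic can be checked with a small plain-Python sketch. It illustrates the semantics only, not coffea's implementation; `any_of` and the `masks` dictionary are hypothetical names, with lists standing in for arrays.

```python
# Illustrative sketch only; any_of and masks are hypothetical names.
def any_of(masks, *names):
    """True where at least one of the named selections is True."""
    columns = [masks[name] for name in names]
    return [any(row) for row in zip(*columns)]


masks = {
    "cut1": [True, False, False],
    "cut2": [False, False, True],
    "cut3": [False, True, False],
}
# cut3 is not named, so its value does not affect the result.
print(any_of(masks, "cut1", "cut2"))  # [True, False, True]
```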
- cutflow(*names, commonmask=None, weights=None, weightsmodifier=None)
Compute the cutflow for a set of selections
Returns an object that can provide a list of event counts, one per named selection applied consecutively, where each count is the number of events that pass the current selection and all the previous ones. The first element of the list is the total number of events before any selections are applied. The last element is the final number of events that pass after all the selections are applied. The object can also return a cutflow histogram as a hist.Hist object whose bin heights are the event counts of the cutflow list. If the PackedSelection is in delayed mode, the elements of the list will be dask_awkward Arrays that can be computed whenever the user wants. If the histogram is requested, those delayed arrays will be computed in the process in order to set the bin heights.
- Parameters:
*names (args) – The named selections to use; they need to be a subset of the selections already added
commonmask (boolean numpy.ndarray or dask_awkward.lib.core.Array, optional) – A common mask which is applied for all the selections, including the initial one. Default is None.
weights (coffea.analysis_tools.Weights instance, optional) – The Weights object to use for the cutflow. If not provided, the cutflow will be unweighted.
weightsmodifier (str, optional) – The modifier to use for the weights. Default is None, which results in Weights.weight() being called without a modifier.
- Returns:
Wrapper class describing the cutflow results. See the Cutflow documentation for details.
- Return type:
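The counting rule described for the cutflow list can be sketched in plain Python. This is an unweighted illustration of the arithmetic only: it ignores commonmask, weights, and delayed mode, it does not reproduce the Cutflow wrapper, and `cutflow_counts` is a hypothetical name.

```python
# Illustrative sketch only; cutflow_counts is a hypothetical helper that
# ignores commonmask, weights, and delayed mode.
def cutflow_counts(masks, *names):
    """Counts after applying each named selection on top of the previous ones."""
    n_events = len(next(iter(masks.values())))
    passing = [True] * n_events
    counts = [n_events]  # first element: events before any selection
    for name in names:
        passing = [p and m for p, m in zip(passing, masks[name])]
        counts.append(sum(passing))
    return counts


masks = {
    "cut1": [True, True, True, False, True],
    "cut2": [True, False, True, True, True],
    "cut3": [True, True, False, True, False],
}
print(cutflow_counts(masks, "cut1", "cut2", "cut3"))  # [5, 4, 3, 1]
```

Because each step keeps only events that survived the earlier steps, the counts can never increase from one element to the next.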
- nminusone(*names, commonmask=None, weights=None, weightsmodifier=None)
Compute the “N-1” style selection for a set of selections
Returns an object that can provide a list of event counts in which each named selection is ignored in turn, so each count is the number of events that pass all the other selections. The first element of the list is the total number of events before any selections are applied. The last element is the final number of events that pass if all selections are applied. The object also provides a list of boolean mask vectors marking which events pass the N-1 selection each time. It can also return a histogram as a hist.Hist object whose bin heights are the event counts of the N-1 selection list. If the PackedSelection is in delayed mode, the elements of those lists will be dask_awkward Arrays that can be computed whenever the user wants. If the histogram is requested, the delayed arrays of the event-count list will be computed in the process in order to set the bin heights.
- Parameters:
*names (args) – The named selections to use; they need to be a subset of the selections already added
commonmask (boolean numpy.ndarray or dask_awkward.lib.core.Array, optional) – A common mask which is applied for all the selections, including the initial one. Default is None.
weights (coffea.analysis_tools.Weights instance, optional) – The Weights object to use for the N-1 selection. If not provided, the N-1 selection will be unweighted.
weightsmodifier (str, optional) – The modifier to use for the weights. Default is None, which results in Weights.weight() being called without a modifier.
- Returns:
Wrapper class describing the N-1 results. See the NminusOne documentation for details.
- Return type:
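The N-1 counting rule can be sketched the same way. This is an unweighted plain-Python illustration that ignores commonmask, weights, and delayed mode and does not reproduce the NminusOne wrapper; `nminusone_counts` is a hypothetical name.

```python
# Illustrative sketch only; nminusone_counts is a hypothetical helper that
# ignores commonmask, weights, and delayed mode.
def nminusone_counts(masks, *names):
    """Counts with each named selection ignored in turn, then with all applied."""
    n_events = len(next(iter(masks.values())))
    counts = [n_events]  # first element: events before any selection
    for skipped in names:
        counts.append(sum(
            all(masks[name][i] for name in names if name != skipped)
            for i in range(n_events)
        ))
    # last element: events passing every named selection
    counts.append(sum(
        all(masks[name][i] for name in names) for i in range(n_events)
    ))
    return counts


masks = {
    "cut1": [True, True, True, False, True],
    "cut2": [True, False, True, True, True],
    "cut3": [True, True, False, True, False],
}
print(nminusone_counts(masks, "cut1", "cut2", "cut3"))  # [5, 2, 2, 3, 1]
```

Comparing each middle element with the last one shows how many events a single selection removes once all the others are already applied.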
- require(**names)
Return a mask vector corresponding to specific requirements
Specify an exact requirement on an arbitrary subset of the masks
- Parameters:
**names (kwargs) – Each selection to require a specific value for, in the form arg=True or arg=False.
Examples
If
>>> selection.names
['cut1', 'cut2', 'cut3']
then
>>> selection.require(cut1=True, cut2=False)
array([True, False, True, ...])
returns a boolean array where an entry is True if the corresponding entries satisfy cut1 == True and cut2 == False, with cut3 arbitrary.
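One way to evaluate such a requirement on packed bits is to build two bit patterns, one marking which selections are considered and one giving their wanted values, then compare. The sketch below is an illustration under that assumption, not coffea's implementation; `require_bits` and its arguments are hypothetical, and Python ints stand in for the packed array.

```python
# Illustrative sketch only; require_bits is a hypothetical helper and
# plain Python ints stand in for the packed array.
def require_bits(packed, names, **requirements):
    """True where every named selection has exactly the requested value."""
    consider = 0  # bits of the selections that are constrained
    wanted = 0    # values those bits must take
    for name, value in requirements.items():
        bit = 1 << names.index(name)
        consider |= bit
        if value:
            wanted |= bit
    return [(entry & consider) == wanted for entry in packed]


names = ["cut1", "cut2", "cut3"]     # bit 0, bit 1, bit 2
packed = [0b001, 0b011, 0b101, 0b110]

print(require_bits(packed, names, cut1=True, cut2=False))
# [True, False, True, False]

# The all and allfalse shorthands correspond to requiring True or False
# for every listed name.
print(require_bits(packed, names, cut1=True, cut2=True, cut3=True))
# [False, False, False, False]
print(require_bits(packed, names, cut3=False))
# [True, True, False, False]
```

Unconstrained selections are masked out by `consider`, which is why cut3 can take any value in the first call.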