{ "cells": [ { "cell_type": "markdown", "id": "26b5b0d7-e935-4de9-aba7-eb8183f71fa4", "metadata": {}, "source": [ "# Getting Partial Results with `split_fileset`\n", "\n", "When processing large filesets, individual files or datasets can fail: a broken XRootD link, a network timeout, or a corrupt file. By default, one failure aborts the whole run and you lose all partial progress.\n", "\n", "This notebook shows two utilities from `coffea.dataset_tools` that let you work around this:\n", "\n", "- **`split_fileset`**: divides a fileset into independently processable chunks so a single failure only loses one chunk\n", "- **`hash_fileset`**: produces a stable content-based hash for a chunk, enabling on-disk caching so successfully processed chunks are never re-run\n", "\n", "These work alongside the `use_result_type=True` flag on `processor.Runner`, which changes the runner's return value from a plain accumulator to an `Ok(Accumulatable)`/`Err(Exception)` result object, letting you decide what to do with each failed chunk instead of catching exceptions." ] }, { "cell_type": "code", "execution_count": 1, "id": "cda0d64d-6b39-47ed-9ad2-0741ad6663ba", "metadata": {}, "outputs": [], "source": [ "import hist\n", "import coffea.processor as processor\n", "from coffea.nanoevents import schemas\n", "from coffea.dataset_tools import split_fileset, hash_fileset\n", "from coffea.util import load, save" ] }, { "cell_type": "markdown", "id": "0b1308e0-8735-467e-8666-e509136eff1a", "metadata": {}, "source": [ "## 1. Processor\n", "\n", "No changes are needed to the processor itself; `split_fileset` works with any existing `ProcessorABC` subclass." 
] }, { "cell_type": "code", "execution_count": 2, "id": "26495a49-c13a-472c-9d3f-d20a2aba36d2", "metadata": {}, "outputs": [], "source": [ "class Processor(processor.ProcessorABC):\n", "    def __init__(self):\n", "        dataset_axis = hist.axis.StrCategory(name=\"dataset\", label=\"\", categories=[], growth=True)\n", "        MET_axis = hist.axis.Regular(name=\"MET\", label=\"MET [GeV]\", bins=50, start=0, stop=100)\n", "        self.output = processor.dict_accumulator({\n", "            'MET': hist.Hist(dataset_axis, MET_axis),\n", "            'cutflow': processor.defaultdict_accumulator(int)\n", "        })\n", "\n", "    def process(self, events):\n", "        dataset = events.metadata[\"dataset\"]\n", "        MET = events.MET.pt\n", "        self.output['cutflow']['all events'] += len(MET)\n", "        self.output['cutflow']['number of chunks'] += 1\n", "        self.output['MET'].fill(dataset=dataset, MET=MET)\n", "        return self.output\n", "\n", "    def postprocess(self, accumulator):\n", "        pass" ] }, { "cell_type": "markdown", "id": "3de49799-5e69-46cf-9be0-3f946369bf4f", "metadata": {}, "source": [ "## 2. Setting up the Runner with `use_result_type=True`\n", "\n", "By default, `processor.Runner.__call__` returns the accumulator directly or raises on error.\n", "Setting `use_result_type=True` changes this: the runner returns an `Ok(output)` on success or `Err(exception)` on failure, without raising.\n", "\n", "`use_result_type` works as an extension of `skipbadfiles`: the `skipbadfiles` value picks which exception types are considered \"expected\" (broken file, missing tree, etc.), and `use_result_type=True` flips the handling of those exceptions from a silent skip + warning to an explicit `Err` return. Exceptions outside that set still propagate as real bugs.\n", "\n", "`skipbadfiles` is therefore **required** when `use_result_type=True`. The most common values are:\n", "\n", "- `skipbadfiles=True`: match `OSError` (the default file/network family). 
Good for most IO-failure cases.\n", "- `skipbadfiles=(FileNotFoundError, uproot.exceptions.KeyInFileError)`: restrict to a specific set.\n", "\n", "With `split_fileset` you process each chunk independently and decide per chunk what to do based on the result type:\n", "\n", "```python\n", "run_result = run(chunk, processor_instance=Processor())\n", "if run_result.is_ok():\n", "    output, metrics = run_result.unwrap()\n", "else:\n", "    print(f\"Chunk failed: {run_result.exception}\")\n", "```\n", "\n", "The `Result` API:\n", "- `result.is_ok()`: `True` if the chunk processed successfully\n", "- `result.unwrap()`: returns the value (`output` or `(output, metrics)` when `savemetrics=True`), or raises if `Err`\n", "- `result.exception`: the exception, if `Err`\n", "- `result.value`: when an executor with `recoverable=True` produced a partial accumulator before the failure, this holds the salvaged output (otherwise `None`)" ] }, { "cell_type": "code", "execution_count": 3, "id": "96becdbb-1cd6-426f-8b1f-188664422ecd", "metadata": {}, "outputs": [], "source": [ "executor = processor.FuturesExecutor()\n", "\n", "run = processor.Runner(\n", "    executor=executor,\n", "    schema=schemas.NanoAODSchema,\n", "    savemetrics=True,\n", "    skipbadfiles=True,  # which exception types count as \"expected\"\n", "    use_result_type=True,  # ... and turn those into Err instead of silent skip\n", ")" ] }, { "cell_type": "markdown", "id": "f28a6f96-c05f-4725-8f76-d137152e1959", "metadata": {}, "source": [ "## 3. Splitting a fileset into chunks\n", "\n", "`split_fileset` divides a fileset into a list of smaller filesets (chunks). 
Each chunk can be processed independently, so a failure in one chunk does not affect the others.\n", "\n", "> **Note:** These chunks are *fileset-level* splits — a grouping of files — not the row-level chunks that coffea's executor creates internally when processing a single file.\n", "\n", "**Parameters:**\n", "\n", "- `fileset` — accepts any of the schemas understood by `processor.Runner`:\n", " - `{dataset: {\"files\": {path: treename, ...}}}` (per-file treename — used in this notebook)\n", " - `{dataset: {\"files\": [path, ...], \"treename\": ...}}` (dataset-level treename)\n", " - `{dataset: [path, ...]}` (bare list — requires `treename=` below)\n", "- `strategy` — `\"by_dataset\"`: one dataset per chunk; `None` (default): all datasets together in each chunk\n", "- `percentage` — integer that divides 100 evenly (e.g. 20, 25, 50); each chunk gets this percentage of each dataset's files\n", "- `datasets` — list, callable, or tuple to restrict which datasets are included\n", "- `treename` — required when any dataset uses list-format files without its own `treename` field; the value is folded into each chunk so the chunks remain self-contained for caching\n", "\n", "**Splitting modes:**\n", "\n", "- `strategy=\"by_dataset\"` → one chunk per dataset\n", "- `percentage=20` → 5 mixed chunks, each containing 20% of *every* dataset\n", "- `strategy=\"by_dataset\", percentage=20` → `N_datasets × 5` chunks, each dataset split separately\n", "- `datasets=[\"X\"]` + any of the above → strategy applied only to the listed datasets\n", "\n", "File paths are sorted before slicing into bins, so chunk composition is deterministic regardless of input dict insertion order — important for stable cache keys via `hash_fileset`." 
] }, { "cell_type": "code", "execution_count": 4, "id": "4621c2ff-47f7-4d1c-98d8-0e02784a7b0c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "strategy='by_dataset': 2 chunks\n", "[{'SingleMu_0': {'files': {'root://eeeeeeospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/A41320F6-C9F9-574C-8DD2-BD98C200E4EE.root': 'Events',\n", " 'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/A7FEFB1C-387F-2B4D-A111-C53CC9371EC7.root': 'Events',\n", " 'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/AB10FBAB-92C0-C043-933D-117FCC5704BA.root': 'Events',\n", " 'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/C6E8BB7F-7F54-0C4C-9EDF-479C7DBB12E4.root': 'Events',\n", " 'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/CB90AA65-868A-F548-A291-3837A3113162.root': 'Events'}}},\n", " {'SingleMu_1': {'files': {'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/18B53494-657F-5744-8131-58ABA4EE00ED.root': 'Events',\n", " 'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/2CCE1139-F301-C341-AE1E-4D27AF294018.root': 
'Events',\n", " 'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/300C603C-F1DD-4A40-B4DD-F4E0B239A460.root': 'Events',\n", " 'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/39251543-EE21-9C4C-80D5-5D9178F55C71.root': 'Events',\n", " 'root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/3BD29D89-9C4A-E743-8616-C6806281BF12.root': 'Events'}}}]\n", "\n", "strategy='by_dataset', percentage=20: 10 chunks\n", "\n", "percentage=20 (mixed): 5 chunks\n" ] } ], "source": [ "import pprint\n", "\n", "fileset = {\n", " 'SingleMu_0': {\n", " \"files\": {\n", " # broken link (intentional — demonstrates error handling)\n", " \"root://eeeeeeospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/A41320F6-C9F9-574C-8DD2-BD98C200E4EE.root\": \"Events\",\n", " \"root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/A7FEFB1C-387F-2B4D-A111-C53CC9371EC7.root\": \"Events\",\n", " \"root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/AB10FBAB-92C0-C043-933D-117FCC5704BA.root\": \"Events\",\n", " 
\"root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/C6E8BB7F-7F54-0C4C-9EDF-479C7DBB12E4.root\": \"Events\",\n", " \"root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/CB90AA65-868A-F548-A291-3837A3113162.root\": \"Events\",\n", " }\n", " },\n", " 'SingleMu_1': {\n", " \"files\": {\n", " \"root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/18B53494-657F-5744-8131-58ABA4EE00ED.root\": \"Events\",\n", " \"root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/2CCE1139-F301-C341-AE1E-4D27AF294018.root\": \"Events\",\n", " \"root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/300C603C-F1DD-4A40-B4DD-F4E0B239A460.root\": \"Events\",\n", " \"root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/39251543-EE21-9C4C-80D5-5D9178F55C71.root\": \"Events\",\n", " \"root://eospublic.cern.ch//eos/opendata/cms/mc/RunIISummer20UL16NanoAODv9/GluGluHToWWTo2L2Nu_M125_TuneCP5_13TeV_powheg2_JHUGenV714_pythia8/NANOAODSIM/106X_mcRun2_asymptotic_v17-v2/260000/3BD29D89-9C4A-E743-8616-C6806281BF12.root\": \"Events\",\n", " }\n", " }\n", "}\n", "\n", "# strategy=\"by_dataset\": one chunk per dataset -> 2 chunks\n", "chunks_1 = split_fileset(fileset, strategy=\"by_dataset\")\n", 
"print(f\"strategy='by_dataset': {len(chunks_1)} chunks\")\n", "pprint.pprint(chunks_1)\n", "\n", "print()\n", "# strategy=\"by_dataset\", percentage=20: 5 chunks per dataset -> 10 chunks total\n", "chunks_2 = split_fileset(fileset, strategy=\"by_dataset\", percentage=20)\n", "print(f\"strategy='by_dataset', percentage=20: {len(chunks_2)} chunks\")\n", "\n", "print()\n", "# percentage=20: 5 mixed chunks — each chunk contains 20% of SingleMu_0 + 20% of SingleMu_1\n", "chunks_3 = split_fileset(fileset, percentage=20)\n", "print(f\"percentage=20 (mixed): {len(chunks_3)} chunks\")" ] }, { "cell_type": "markdown", "id": "6b8993c6-41f8-4f51-985a-06348e576cbd", "metadata": {}, "source": [ "## 4. Example 1: `strategy=\"by_dataset\"`\n", "\n", "The fileset has two datasets: `SingleMu_0` (which contains a broken link) and `SingleMu_1` (all valid files).\n", "With `strategy=\"by_dataset\"`, each dataset is its own chunk — so the failure in `SingleMu_0` leaves `SingleMu_1` unaffected.\n", "\n", "Expected result: the run for `SingleMu_0` returns `Err`, the run for `SingleMu_1` returns `Ok` and its output is accumulated." 
] }, { "cell_type": "code", "execution_count": 5, "id": "ba247645-de2a-4606-9b46-4ba3db86f40d", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "632a394952254ac6bf29c0258742ade1", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py:1277: UserWarning: Performed attempt 1 out of 4\n", " warnings.warn(\n", "/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py:1277: UserWarning: Performed attempt 2 out of 4\n", " warnings.warn(\n", "/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py:1277: UserWarning: Performed attempt 3 out of 4\n", " warnings.warn(\n", "/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py:1277: UserWarning: Performed attempt 4 out of 4\n", " warnings.warn(\n" ] }, { "data": { "text/html": [ "
\n" ], "text/plain": [] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "loky.process_executor._RemoteTraceback: \n", "\"\"\"\n", "Traceback (most recent call last):\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/loky/process_executor.py\", line 490, in _process_worker\n", " r = call_item()\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/loky/process_executor.py\", line 291, in __call__\n", " return self.fn(*self.args, **self.kwargs)\n", " ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py\", line 1289, in automatic_retries\n", " raise e\n", " File \"/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py\", line 1275, in automatic_retries\n", " return func(*args, **kwargs)\n", " File \"/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py\", line 1359, in metadata_fetcher_root\n", " with uproot.open(\n", " ~~~~~~~~~~~^\n", " {item.filename: None}, timeout=xrootdtimeout, **uproot_options\n", " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " ) as file:\n", " ^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/uproot/reading.py\", line 144, in open\n", " file = ReadOnlyFile(\n", " file_path,\n", " ...<5 lines>...\n", " **options,\n", " )\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/uproot/reading.py\", line 563, in __init__\n", " self._source = source_cls(file_path, **self._options)\n", " ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/uproot/source/fsspec.py\", line 63, in __init__\n", " self._open()\n", " ~~~~~~~~~~^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/uproot/source/fsspec.py\", line 70, in _open\n", " self._fo = self._open_file.__enter__()\n", " 
~~~~~~~~~~~~~~~~~~~~~~~~~^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/fsspec/core.py\", line 105, in __enter__\n", " f = self.fs.open(self.path, mode=mode)\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/fsspec_xrootd/xrootd.py\", line 802, in open\n", " f = self._open(\n", " path,\n", " ...<4 lines>...\n", " **kwargs,\n", " )\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/fsspec_xrootd/xrootd.py\", line 759, in _open\n", " return XRootDFile(\n", " self,\n", " ...<5 lines>...\n", " **kwargs,\n", " )\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/fsspec_xrootd/xrootd.py\", line 867, in __init__\n", " self._hosts = self._locate_sources(path)\n", " ~~~~~~~~~~~~~~~~~~~~^^^^^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/fsspec_xrootd/xrootd.py\", line 967, in _locate_sources\n", " raise OSError(\"XRootD error: \" + status.message)\n", "OSError: XRootD error: [FATAL] Invalid address\n", "\"\"\"\n", "\n", "The above exception was the direct cause of the following exception:\n", "\n", "Traceback (most recent call last):\n", " File \"/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py\", line 717, in _processwith\n", " merged = _watcher(FH, self, reducer, pool)\n", " File \"/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py\", line 498, in _watcher\n", " batch = FH.fetch(len(FH.completed))\n", " File \"/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py\", line 382, in fetch\n", " raise bad_futures[0].exception()\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/loky/process_executor.py\", line 490, in _process_worker\n", " r = call_item()\n", " ^^^^^^^^^^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/loky/process_executor.py\", line 291, in __call__\n", " return self.fn(*self.args, 
**self.kwargs)\n", " ^^^^^^^^^^^^^^^\n", " File \"/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py\", line 1289, in automatic_retries\n", " raise e\n", " \n", " File \"/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py\", line 1275, in automatic_retries\n", " return func(*args, **kwargs)\n", " ^^^^^^^\n", " File \"/home/iason/Dropbox/work/pyhep_dev/coffea/src/coffea/processor/executor.py\", line 1359, in metadata_fetcher_root\n", " with uproot.open(\n", " ^^^^^^^^^^^^^^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/uproot/reading.py\", line 144, in open\n", " file = ReadOnlyFile(\n", " ^^^^^^^^^^^^^^^^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/uproot/reading.py\", line 563, in __init__\n", " self._source = source_cls(file_path, **self._options)\n", " ^^^^^^^^^^^^^^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/uproot/source/fsspec.py\", line 63, in __init__\n", " self._open()\n", " ~~~~~~~~~~^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/uproot/source/fsspec.py\", line 70, in _open\n", " self._fo = self._open_file.__enter__()\n", " ^^^^^^^^^^^^^^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/fsspec/core.py\", line 105, in __enter__\n", " f = self.fs.open(self.path, mode=mode)\n", " ^^^^^^^^^^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/fsspec_xrootd/xrootd.py\", line 802, in open\n", " f = self._open(\n", " ^^^^^^^^^^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/fsspec_xrootd/xrootd.py\", line 759, in _open\n", " return XRootDFile(\n", " ^^^^^^^^^^^^^^^\n", " File \"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/fsspec_xrootd/xrootd.py\", line 867, in __init__\n", " self._hosts = self._locate_sources(path)\n", " ^^^^^^^^^^^\n", " File 
\"/home/iason/micromamba/envs/py3.14/lib/python3.14/site-packages/fsspec_xrootd/xrootd.py\", line 967, in _locate_sources\n", " raise OSError(\"XRootD error: \" + status.message)\n", " ^^^^^^^^^^^\n", "OSError: XRootD error: [FATAL] Invalid address\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Error processing chunk: XRootD error: [FATAL] Invalid address\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0115067f6e9048ed915b82a3b4902d37", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n" ], "text/plain": [] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "3f717169494a4ac2b598bd07aa86a68a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n" ], "text/plain": [] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "result = None\n", "\n", "for chunk in chunks_1:\n", "    run_result = run(chunk, processor_instance=Processor())\n", "    if run_result.is_ok():\n", "        output, metrics = run_result.unwrap()\n", "    else:\n", "        # user can implement their own logic on how to treat failed chunks\n", "        print(f\"Error processing chunk: {run_result.exception}\")\n", "        continue\n", "    if result is None:\n", "        result = output\n", "    else:\n", "        result += output" ] }, { "cell_type": "code", "execution_count": 6, "id": "5b4a5e6b-2a85-46c4-b77b-625535fb8549", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[StairsArtists(stairs=