Going fast on a small-size computing cluster

Niclas Steve Eich (RWTH Aachen U.), Martin Erdmann (RWTH Aachen U.), Svenja Diekmann (RWTH Aachen U.), Manfred Peter Fackeldey (RWTH Aachen U.), Benjamin Fischer (RWTH Aachen U.)

2023

6 pages

Published in: J.Phys.Conf.Ser. 2438 (2023) 1, 012042

Contribution to: ACAT 2021

Published: 2023

DOI: 10.1088/1742-6596/2438/1/012042

Abstract: (IOP)

Fast turnaround times for LHC physics analyses are essential for scientific success. The ability to quickly perform optimizations and consolidation studies is critical. At the same time, computing demands and complexities are rising with the upcoming data-taking periods and new technologies, such as deep learning. We present a showcase of the HH→bbWW analysis at the CMS experiment, where we process 𝒪(1 − 10) TB of data on 100 threads in a few hours. This analysis is based on the columnar NanoAOD data format and makes use of the NumPy ecosystem and HEP-specific tools, in particular Coffea and Dask. Data locality, especially IO latency, is optimized by employing a multi-level caching structure using local file storage and on-worker SSD caches. We process thousands of events simultaneously within a single thread, thus enabling straightforward use of vectorized operations. Resource-intensive computing tasks, such as GPU-accelerated DNN inference and histogram aggregation in the 𝒪(10) GB regime, are offloaded to dedicated workers. The analysis consists of hundreds of distinctly different workloads and is steered through a workflow management tool, ensuring reproducibility throughout the development process up to journal publication.
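The columnar, vectorized processing mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function name `fill_chunk`, the column names `pt` and `eta`, the cut values, and the randomly generated data are all hypothetical stand-ins. It only shows the general pattern of selecting thousands of events at once with a boolean mask and accumulating per-chunk histograms.

```python
import numpy as np


def fill_chunk(pt, eta, edges, pt_min=25.0, abs_eta_max=2.4):
    """Select events in one chunk with a boolean mask and histogram their pt.

    All names and cut values are illustrative assumptions, not taken from
    the analysis described in the abstract.
    """
    # One vectorized expression evaluates the selection for every event.
    mask = (pt > pt_min) & (np.abs(eta) < abs_eta_max)
    counts, _ = np.histogram(pt[mask], bins=edges)
    return counts


# Stand-in for reading chunks of columnar data: random toy columns.
rng = np.random.default_rng(seed=0)
edges = np.linspace(0.0, 200.0, 21)
total = np.zeros(len(edges) - 1, dtype=np.int64)

for _ in range(4):  # four chunks of 10,000 events each
    pt = rng.exponential(scale=40.0, size=10_000)
    eta = rng.uniform(-3.0, 3.0, size=10_000)
    # Histograms add bin by bin, so per-chunk results can be merged later.
    total += fill_chunk(pt, eta, edges)

print(total.sum(), "selected events in", len(total), "bins")
```

Because each chunk produces an independent histogram that is simply summed, the same pattern lends itself to being distributed across many workers, which is the kind of workload a scheduler such as Dask would handle.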
