Abstract: (IOP)
The need for nested data structures and combinatorial operations on arbitrary-length lists has prevented particle physicists from adopting array-based data analysis frameworks, such as R, MATLAB, NumPy, and Pandas. These array frameworks work well for purely rectangular tables and hypercubes, but arrays of variable-length arrays, called “jagged arrays,” are out of their scope. However, jagged arrays are a fundamental feature of particle physics data, as is the need to combine them to search for particle decays. To bridge this gap, we developed the awkward-array library, and in this paper we present feedback from some of the first physics groups using it for their analyses. They report computational performance similar to that of analysis code written in C++, but are split on the ease of use of array syntax. In a series of four phone interviews, all users noted how different array programming is from imperative programming, but whereas some found it easier in all aspects, others said it was more difficult to write, yet easier to read.
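To make the term "jagged array" concrete, the following is a minimal sketch in plain NumPy. It does not use the awkward-array API; it only illustrates one common way to store variable-length lists as a flat `content` array plus an `offsets` array. All names (`content`, `offsets`, `event`) and the toy values are assumptions chosen for illustration.

```python
import numpy as np
from itertools import combinations

# Hypothetical toy data: one value per particle (e.g. a momentum), for three
# events containing 2, 0, and 3 particles respectively.
content = np.array([50.0, 30.0, 40.0, 20.0, 10.0])  # all particles, flattened
offsets = np.array([0, 2, 2, 5])  # event i spans content[offsets[i]:offsets[i + 1]]

# Number of particles per event, computed without a Python loop.
counts = np.diff(offsets)

# Per-event sums in array style: label each particle with its event index,
# then accumulate. Empty events correctly get a sum of 0.
event_index = np.repeat(np.arange(len(counts)), counts)
sums = np.bincount(event_index, weights=content, minlength=len(counts))

def event(i):
    """Return the variable-length slice of particles belonging to event i."""
    return content[offsets[i]:offsets[i + 1]]

# Combinatorial step, written imperatively for clarity: every unordered pair
# of particles within the same event (the kind of pairing used when
# searching for two-body decays).
pair_sums = [
    [float(a + b) for a, b in combinations(event(i), 2)]
    for i in range(len(counts))
]

print(counts.tolist())  # [2, 0, 3]
print(sums.tolist())    # [80.0, 0.0, 70.0]
print(pair_sums)        # [[80.0], [], [60.0, 50.0, 30.0]]
```

The per-event sum is expressed as whole-array operations, while the pairing step is written as an explicit loop. That contrast mirrors the array-versus-imperative distinction the interviewed users described.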