Columnar data processing for HEP analysis
2019
8 pages
Published in:
- EPJ Web Conf. 214 (2019) 06026
Contribution to:
- Published: 2019
Abstract: (EDP Sciences)
In the last stages of data analysis, physicists are often forced to choose between simplicity and execution speed. In High Energy Physics (HEP), high-level languages like Python are known for ease of use but also for very slow execution. However, Python is used in speed-critical data analysis in other fields of science and industry. In those fields, most operations are performed on NumPy arrays in an array programming style; this style can be adopted for HEP by introducing variable-sized, nested data structures. We describe how array programming may be extended for HEP use-cases and an implementation known as awkward-array. We also present integration with ROOT, Apache Arrow, and Parquet, as well as preliminary performance results.
- programming
- performance
- data analysis method
- numerical calculations
- numerical methods
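The abstract's idea of variable-sized, nested data in an array programming style can be illustrated with a minimal sketch. It uses plain NumPy only and does not reproduce the awkward-array API. The flat `content` array, the `offsets` array, and the toy values are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

# Hypothetical toy data: one value per particle (for example a transverse
# momentum) for four events, stored as a single flat array rather than a
# list of lists.
content = np.array([50.0, 30.0, 20.0, 45.0, 10.0, 5.0])

# offsets[i]:offsets[i + 1] delimits the particles of event i.
# Event 1 is empty, which a rectangular NumPy array could not express.
offsets = np.array([0, 3, 3, 4, 6])

# Number of particles per event, computed without a Python loop.
counts = np.diff(offsets)

# Per-event sums from a prefix sum; this form also handles empty events.
prefix = np.concatenate(([0.0], np.cumsum(content)))
event_sums = prefix[offsets[1:]] - prefix[offsets[:-1]]

# Event-level selection: keep events with at least two particles.
has_pair = counts >= 2

print(counts)      # particles per event
print(event_sums)  # summed value per event
print(has_pair)    # boolean mask over events
```

In this sketch every operation acts on whole arrays, so the per-event loop stays inside compiled NumPy code instead of running in the Python interpreter.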