Using Binary File Format Description Languages for Documenting, Parsing, and Verifying Raw Data in TAIGA Experiment

A. Demichev; A. Hmelnov; A. Kryukov; A. Mikhailov; A. Shigarov; D. Kostunin; D. Shipilov; D. Zhurov; E. Korosteleva; E. Postnikov

arxiv: 1812.01324 · v1 · pith:6GY3QZYYnew · submitted 2018-12-04 · 🌌 astro-ph.IM · cs.DC


I. Bychkov, A. Demichev, J. Dubenskaya, O. Fedorov, A. Hmelnov, Y. Kazarina, E. Korosteleva, D. Kostunin, A. Kryukov, A. Mikhailov, M.D. Nguyen, S. Polyakov, E. Postnikov, A. Shigarov, D. Shipilov, D. Zhurov


classification 🌌 astro-ph.IM cs.DC

keywords data astroparticle binary experiment format languages taiga description


The paper is devoted to documenting, parsing, and verifying raw binary data in the astroparticle data lifecycle. The long-term preservation of raw data of astroparticle experiments as originally generated is essential for re-running analyses and reproducing research results. The selected high-quality raw data should have detailed documentation and be accompanied by open software tools for accessing them. We consider the applicability of binary file format description languages to specifying, parsing, and verifying raw data of the Tunka Advanced Instrument for cosmic rays and Gamma Astronomy (TAIGA) experiment. Formal specifications are implemented for five data formats of the experiment and provide automatic generation of source code for data reading libraries in target programming languages (e.g. C++, Java, and Python). These libraries were tested on TAIGA data. They showed good performance and helped us locate the parts with corrupted data. The format specifications can be used as metadata for exchanging astroparticle raw data. They can also simplify software development for aggregating data from various sources for multi-messenger analysis.
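To make the idea in the abstract concrete, here is a minimal Python sketch of how a declarative format description can drive both parsing and verification. It is not taken from the paper: the record layout, the `MAGIC` value, and the names `HEADER_FIELDS`, `parse_records`, and `make_record` are all hypothetical, and no real TAIGA format is implied. A format description language would typically generate a reader like this from the specification instead of having it written by hand.

```python
import struct

# Hypothetical record layout (NOT a real TAIGA format). The field list
# plays the role of the declarative format description.
HEADER_FIELDS = [
    ("magic", "H"),         # uint16, expected to equal MAGIC
    ("payload_size", "H"),  # uint16, payload length in bytes
    ("event_number", "I"),  # uint32
]
MAGIC = 0xCAFE  # arbitrary marker chosen for this illustration
# "<" selects little-endian byte order without padding (an assumption).
HEADER = struct.Struct("<" + "".join(fmt for _, fmt in HEADER_FIELDS))


def parse_records(data):
    """Parse consecutive records; return (records, errors).

    Each error is (byte_offset, message), so the corrupted part of the
    input can be located. Parsing stops at the first error.
    """
    records, errors = [], []
    offset = 0
    while offset < len(data):
        if offset + HEADER.size > len(data):
            errors.append((offset, "truncated header"))
            break
        values = HEADER.unpack_from(data, offset)
        record = dict(zip((name for name, _ in HEADER_FIELDS), values))
        if record["magic"] != MAGIC:
            errors.append((offset, "bad magic"))
            break
        end = offset + HEADER.size + record["payload_size"]
        if end > len(data):
            errors.append((offset, "truncated payload"))
            break
        record["payload"] = data[offset + HEADER.size:end]
        records.append(record)
        offset = end
    return records, errors


def make_record(event_number, payload):
    """Build one well-formed record (helper for the usage example)."""
    return HEADER.pack(MAGIC, len(payload), event_number) + payload


# Two valid records followed by three stray bytes.
data = make_record(1, b"abcd") + make_record(2, b"xy") + b"\x00\x00\x00"
records, errors = parse_records(data)
print(len(records), errors)
```

Keeping the layout in a data structure rather than in parsing code is what lets the same description serve as documentation, as parser input, and as a verification rule set.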


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Development of a data infrastructure for a global data and analysis center in astroparticle physics
astro-ph.IM 2019-07 unverdicted novelty 3.0

GRADLCI develops a distributed data management system for open access to KASCADE and Tunka-133 cosmic-ray data to support joint analysis.

Metadata Extraction from Raw Astroparticle Data of TAIGA Experiment
astro-ph.IM 2019-07 unverdicted novelty 2.0

An extensible metadata extractor concept is presented to automatically collect and unify descriptive metadata from all TAIGA raw data formats.