arxiv: 1807.10056 · v2 · pith:QZKOYRDMnew · submitted 2018-07-26 · 💻 cs.DC

FINJ: A Fault Injection Tool for HPC Systems

classification 💻 cs.DC
keywords finj, fault, injection, systems, tool, conditions, experiments, allowing

We present FINJ, a high-level fault injection tool for High-Performance Computing (HPC) systems, with a focus on the management of complex experiments. FINJ supports custom workloads and generates anomalous conditions by running fault-triggering executable programs. FINJ can also be integrated seamlessly with most lower-level fault injection tools, allowing users to create and monitor a variety of highly complex and diverse fault conditions in HPC systems that would be difficult to recreate in practice. FINJ is suitable for experiments involving many, potentially interacting nodes, making it a versatile design and evaluation tool.
