pith. sign in

arxiv: 1702.02313 · v1 · pith:57RY6GJ2new · submitted 2017-02-08 · 💻 cs.AR

FASHION: Fault-Aware Self-Healing Intelligent On-chip Network

classification 💻 cs.AR
keywords routerfaultsfashionnetworkallowscomponentconnectedconnectivity
0
0 comments X
read the original abstract

To avoid packet loss and deadlock scenarios that arise due to faults or power gating in multicore and many-core systems, the network-on-chip needs to possess resilient communication and load-balancing properties. In this work, we introduce the Fashion router, a self-monitoring and self-reconfiguring design that allows for the on-chip network to dynamically adapt to component failures. First, we introduce a distributed intelligence unit, called Self-Awareness Module (SAM), which allows the router to detect permanent component failures and build a network connectivity map. Using local information, SAM adapts to faults, guarantees connectivity and deadlock-free routing inside the maximal connected subgraph and keeps routing tables up-to-date. Next, to reconfigure network links or virtual channels around faulty/power-gated components, we add bidirectional link and unified virtual channel structure features to the Fashion router. This version of the router, named Ex-Fashion, further mitigates the negative system performance impacts, leads to larger maximal connected subgraph and sustains a relatively high degree of fault-tolerance. To support the router, we develop a fault diagnosis and recovery algorithm executed by the Built-In Self-Test, self-monitoring, and self-reconfiguration units at runtime to provide fault-tolerant system functionalities. The Fashion router places no restriction on topology, position or number of faults. It drops 54.3-55.4% fewer nodes for same number of faults (between 30 and 60 faults) in an 8x8 2D-mesh over other state-of-the-art solutions. It is scalable and efficient. The area overheads are 2.311% and 2.659% when implemented in 8x8 and 16x16 2D-meshes using the TSMC 65nm library at 1.38GHz clock frequency.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.