Majid Dadashi, Layali Rashid and Karthik Pattabiraman, Poster presentation at the 9th Workshop on Silicon Errors in Logic – System Effects (SELSE), 2013. [PDF file]
This paper is superseded by the following conference paper.
Abstract: Recent studies have shown that intermittent faults are increasingly responsible for computer system failures. This category of faults is harder to diagnose than permanent faults. Full hardware diagnosis techniques incur significant power and area overheads. Software-layer diagnosis techniques have zero area overhead but limited visibility into many micro-architectural structures, and hence cannot diagnose faults in them. We propose SCRIBE, a simple hardware infrastructure to enable fine-grained software-layer diagnosis. SCRIBE records the detailed micro-architectural resource usage of each instruction in the processor and exposes it to the software diagnosis layer. Our evaluation indicates that SCRIBE has an overhead of 12 to 23%, depending on the processor type.