Paruj Ratanaworabhan, Martin Burtscher, Darko Kirovski, Rahul Nagpal, Benjamin Zorn and Karthik Pattabiraman, Proceedings of the International Symposium on the Principles and Practice of Parallel Programming (PPoPP), 2009. [ PDF File | Talk ]
You can find the technical report version here.
Abstract: Because data races represent a hard-to-manage class of errors in concurrent programs, numerous approaches to detect them have been proposed and evaluated. We specifically consider asymmetric races, a subclass of all race conditions, where a programmer’s thread correctly acquires and releases a lock for a given variable, while another thread causes a race by improperly accessing this variable. We introduce ToleRace, a runtime system that allows programs to either tolerate or detect asymmetric races based on local replication of shared state. ToleRace provides an approximation of atomicity in critical sections by creating local copies of shared variables when a critical section is entered and propagating the appropriate copy when the critical section is exited. We characterize the possible interleavings that can cause races and precisely describe the effect of ToleRace in each case. We study the theoretical aspects of an oracle that knows exactly what type of interleaving has occurred. Then, we present a software implementation of ToleRace on top of a dynamic instrumentation tool. We evaluate our implementation on multithreaded applications from the SPLASH2 and PARSEC suites and show that its overhead is acceptable, i.e., a factor of two on average.
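The replication idea in the abstract can be illustrated with a minimal, single-threaded sketch. This is not the ToleRace implementation: the `ReplicatedSection` class, its `raced` field, and the exit policy (write back any value the section changed, and flag a race whenever the shared value differs from the entry snapshot) are simplifying assumptions made here for illustration. The racy write from "another thread" is simulated inline so the example runs deterministically.

```python
class ReplicatedSection:
    """Hypothetical helper: runs a critical section on private copies of shared state."""

    def __init__(self, shared, names):
        self.shared = shared        # dict standing in for shared memory
        self.names = list(names)    # variables protected by this section
        self.raced = []             # variables changed by someone else meanwhile

    def __enter__(self):
        # Snapshot the shared values on entry and hand out a private working copy.
        self.snapshot = {n: self.shared[n] for n in self.names}
        self.local = dict(self.snapshot)
        return self.local

    def __exit__(self, exc_type, exc, tb):
        for n in self.names:
            # Shared value differs from the entry snapshot: an unprotected write
            # happened while the section ran (assumed detection rule).
            if self.shared[n] != self.snapshot[n]:
                self.raced.append(n)
            # Assumed propagation policy: publish only values the section changed.
            if self.local[n] != self.snapshot[n]:
                self.shared[n] = self.local[n]
        return False  # never suppress exceptions


def demo():
    shared = {"balance": 100}
    section = ReplicatedSection(shared, ["balance"])
    with section as local:
        seen = local["balance"]
        shared["balance"] = 0          # simulated racy write that bypasses the lock
        local["balance"] = seen + 50   # the section only ever sees its own copy
    return shared["balance"], section.raced
```

Because the section works on its own copy, the simulated racy write cannot corrupt the read-modify-write inside it. Comparing values against a snapshot misses a racy write that restores the original value, which is one reason this sketch should be read as an approximation only.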
This paper is superseded by the following journal paper.