DPOR in Practice¶
This is a practical guide to using explore_dpor() for systematic
concurrency testing. For the underlying algorithm and theory, see
DPOR: Dynamic Partial Order Reduction.
What DPOR does¶
DPOR and the invariant have separate jobs:
DPOR decides which interleavings to run. It watches every shared-memory access, detects which operations conflict (access the same object with at least one write), and uses that information to skip redundant orderings. Two interleavings that only differ in the order of independent operations (e.g. two reads, or accesses to different objects) are equivalent, so DPOR runs only one representative from each equivalence class.
The invariant decides whether a bug occurred. After all threads finish,
explore_dpor()calls your invariant on the final state. DPOR has no built-in notion of “correct” — it doesn’t know thatcounter == 1is wrong andcounter == 2is right. You supply that judgement via the invariant.
(The name “invariant” is standard in tools like this — loom, CHESS, etc. — even though it is technically a postcondition checked once after the threads finish, not a continuously-monitored loop invariant.)
Putting the two together:
from frontrun.dpor import explore_dpor
result = explore_dpor(
setup=MyState, # called fresh each execution
threads=[thread_a, thread_b], # each receives the state
invariant=lambda s: s.is_consistent(), # checked after all threads finish
)
if not result.property_holds:
print(f"Bug found after {result.num_explored} executions")
DPOR picks an interleaving (based on conflict analysis).
setup()creates fresh state; the threads run under that interleaving.The invariant checks the final state.
Repeat until all distinct interleavings are covered.
What it can and cannot find¶
DPOR explores alternative interleavings only where it detects a conflict — two threads accessing the same object with at least one write. It detects conflicts by instrumenting Python bytecode, so only operations that are visible at the bytecode level register as conflicts.
Operations DPOR sees (and will explore reorderings of):
Attribute reads and writes (
self.x,obj.field = ...)Subscript reads and writes (
d[key],lst[i] = ...)Lock acquire and release (
threading.Lock,threading.RLock)Thread spawn and join
Socket I/O (
connect,send,sendall,sendto,recv,recv_into,recvfrom) — detected via automatic monkey-patching whendetect_io=True(the default). Two threads accessing the samehost:portendpoint conflict; different endpoints are independent.File opens (
builtins.open) — read vs write determined by mode, resource identity from the resolved file path.
Beyond invariant violations, DPOR also detects deadlocks (via wait-for-graph cycle detection) and crashes (unhandled exceptions in any thread are re-raised after the execution completes).
Operations DPOR does not see (and will therefore not explore):
Database operations. Two threads calling
cursor.execute("UPDATE ...")on the same row look like independent C function calls to the tracer — DPOR sees no conflict between them and only runs one interleaving.Opaque C-extension I/O. Database drivers, Redis clients, and other libraries that manage sockets entirely in C code bypass the monkey-patches, so their operations appear independent.
C-extension shared state. Shared state modified inside C code (NumPy arrays, etc.) is not tracked at the bytecode level.
The consequence is not that DPOR “can’t run” on such code — it will run fine, it just won’t explore the interesting schedules. Because the external operations look independent, DPOR concludes that reordering them cannot change the outcome and skips all the alternative interleavings where the bugs hide.
For these cases, use trace markers with explicit scheduling instead — you annotate the points where interleaving matters and enumerate the orderings by hand.
Thread functions should also avoid external side effects (writing to files,
sending network requests, modifying global state outside the setup object).
DPOR replays each interleaving from scratch, so side effects from one replay
will interfere with the next.
Basic usage¶
The explore_dpor() function is the main entry point:
from frontrun.dpor import explore_dpor
class Counter:
def __init__(self):
self.value = 0
def increment(self):
temp = self.value
self.value = temp + 1
result = explore_dpor(
setup=Counter,
threads=[lambda c: c.increment(), lambda c: c.increment()],
invariant=lambda c: c.value == 2,
)
assert not result.property_holds
assert result.num_explored == 2 # only 2 of 6 interleavings needed
Parameters:
setupA callable that creates fresh shared state. Called once per execution so that each interleaving starts from a clean slate.
threadsA list of callables, each receiving the state returned by
setup. The length of this list determines the number of threads.invariantA predicate over the shared state that defines what “correct” means. Called after all threads finish each execution. Return
Trueif the state is valid,Falseif there is a bug. DPOR decides which interleavings to try; the invariant decides whether each one passed.preemption_bound(default: 2)Maximum number of preemptions (context switches away from a runnable thread) per execution. A bound of 2 catches the vast majority of real bugs. Set to
Nonefor unbounded exploration, but be aware that this can be exponentially slower.max_executions(default: None)Safety cap on total executions. Useful for CI where you want a time bound.
max_branches(default: 100,000)Maximum scheduling points per execution. Prevents runaway on programs with very long traces.
timeout_per_run(default: 5.0)Timeout in seconds for each individual execution.
reproduce_on_failure(default: 10)When a counterexample is found, replay the same schedule this many times to measure how reproducible the failure is. The reproduction count and percentage appear in
result.explanation. Set to 0 to skip.
Interpreting results¶
explore_dpor() returns an InterleavingResult (the same type used by
explore_interleavings):
@dataclass
class InterleavingResult:
property_holds: bool # True if invariant held everywhere
num_explored: int = 0 # total interleavings tried
counterexample: list[int] | None = None # first failing schedule
failures: list[tuple[int, list[int]]] = ... # all (execution_num, schedule) pairs
explanation: str | None = None # human-readable trace of the race
reproduction_attempts: int = 0 # number of replay attempts
reproduction_successes: int = 0 # how many replays reproduced the failure
counterexample is a list of thread IDs representing the order in
which threads were scheduled. For example, [0, 0, 1, 1] means thread 0
ran for two steps, then thread 1 ran for two steps.
When a race is found, explanation contains a formatted trace showing the
interleaved source lines, the conflict pattern (lost update, write-write, etc.),
and reproduction statistics. This is the same output for both explore_dpor
and explore_interleavings.
If num_explored is 1, your threads probably don’t share any
state — the engine saw no conflicts and skipped everything. This is a sign
that either the test is trivially correct or the shared state is not being
accessed in a way the tracer can see (e.g. through a C extension).
Example: verifying that a lock fixes a race¶
A common pattern is to first show that a race exists, then show that adding a lock eliminates it:
import threading
from frontrun.dpor import explore_dpor
class UnsafeCounter:
def __init__(self):
self.value = 0
def increment(self):
temp = self.value
self.value = temp + 1
class SafeCounter:
def __init__(self):
self.value = 0
self.lock = threading.Lock()
def increment(self):
with self.lock:
temp = self.value
self.value = temp + 1
def test_unsafe_counter_has_race():
result = explore_dpor(
setup=UnsafeCounter,
threads=[lambda c: c.increment(), lambda c: c.increment()],
invariant=lambda c: c.value == 2,
)
assert not result.property_holds
def test_safe_counter_is_correct():
result = explore_dpor(
setup=SafeCounter,
threads=[lambda c: c.increment(), lambda c: c.increment()],
invariant=lambda c: c.value == 2,
)
assert result.property_holds