How Frontrun Works

Frontrun provides four approaches for controlling thread interleaving. They operate at different levels of granularity and make different trade-offs between interpretability, coverage, and the kinds of bugs they can find.

DPOR (Systematic Exploration)

DPOR (Dynamic Partial Order Reduction) systematically explores every meaningfully different thread interleaving. Where the bytecode explorer samples randomly, DPOR guarantees completeness: every distinct interleaving is tried exactly once and redundant orderings are never re-run.

Like the bytecode explorer, DPOR instruments at the opcode level and needs no manual markers. It automatically detects attribute reads/writes, subscript accesses, and lock operations. The exploration engine is written in Rust for performance.

from frontrun.dpor import explore_dpor

class Counter:
    def __init__(self):
        self.value = 0

    def increment(self):
        temp = self.value
        self.value = temp + 1

result = explore_dpor(
    setup=Counter,
    threads=[lambda c: c.increment(), lambda c: c.increment()],
    invariant=lambda c: c.value == 2,
)
assert result.property_holds, result.explanation

When a race is found, result.explanation contains an interleaved source-line trace showing which threads accessed which shared state and the conflict pattern. DPOR’s traces are more interpretable than the bytecode explorer’s because DPOR knows why the interleaving matters — it detected specific conflicting accesses to the same object and chose to explore the alternative ordering.

Scope: DPOR explores alternative schedules where it detects a conflict — either two threads accessing the same Python object (with at least one write) or two threads performing I/O on the same network endpoint or file path. When run under the frontrun CLI, a native LD_PRELOAD library intercepts C-level I/O operations (send, recv, read, write, etc.) and feeds them into the DPOR engine, so even opaque database drivers (libpq, mysqlclient) and Redis clients (hiredis) are covered — see C-Level I/O Interception below.

DPOR does detect many C-level operations: list.append, dict.__setitem__, and similar mutating methods are seen via sys.setprofile; builtins like sorted(), sum(), min(), max(), and str.join() are registered as passthrough reads on their arguments; and container constructors (list(), dict(), enumerate(), zip(), etc.) are tracked as reads on their inputs.
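
For a concrete sketch of what this covers, consider a conflict between a C-level read (sum()) and a C-level write (list.append). The Ledger class below is illustrative only; the explore_dpor call is the same API shown above.

from frontrun.dpor import explore_dpor

class Ledger:
    def __init__(self):
        self.entries = []

    def add(self, amount):
        # sum() is tracked as a passthrough read on self.entries and
        # list.append as a write, so DPOR sees the conflict and explores
        # both orderings of the two add() calls.
        total = sum(self.entries)
        self.entries.append(total + amount)

result = explore_dpor(
    setup=Ledger,
    threads=[lambda l: l.add(1), lambda l: l.add(1)],
    invariant=lambda l: l.entries[-1] == 2,
)
# result.property_holds is False when the lost update is found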

The gap is C-level iteration interleaving. DPOR treats each C call as a single atomic operation, but under PEP 703 (free-threaded Python), C functions that iterate via PyIter_Next — such as list(od.keys()) — acquire and release the per-object lock on each element, allowing another thread to mutate the collection between iterations. When both sides of a race are single C opcodes (e.g. list(od.keys()) vs od.move_to_end("a")), no bytecode-level tool can expose the interleaving. See PEP-703-REPORT.md for details.
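
For concreteness, the racing pair looks like the sketch below (illustrative code, not a frontrun test). Each side is a single C call, so no opcode-level tool can force a thread switch in the middle of the iteration.

from collections import OrderedDict

od = OrderedDict(a=1, b=2, c=3)

def reader():
    # Iterates via PyIter_Next, acquiring and releasing the per-object
    # lock once per element under free-threaded Python.
    return list(od.keys())

def mover():
    # A single C opcode mutating the same OrderedDict.
    od.move_to_end("a")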

DPOR also cannot see shared state managed entirely inside a C extension without any I/O or Python-visible operations — for example, in-process mutations of NumPy arrays or C-level caches with no Python API calls.

For a practical guide see DPOR in Practice. For the algorithm details and theory see DPOR: Dynamic Partial Order Reduction.

Bytecode Instrumentation

Bytecode instrumentation hooks functions at the opcode level automatically; no markers are needed.

How It Works:

Each thread is run with a sys.settrace callback that sets f_trace_opcodes = True on every frame, so the callback fires at every bytecode instruction rather than every source line. At each opcode the thread calls scheduler.wait_for_turn(), which blocks until the schedule says it’s that thread’s turn. Only user code is traced — stdlib and threading internals are skipped.
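
The tracing itself uses only standard CPython hooks. A minimal sketch of the mechanism (plain sys.settrace APIs, not frontrun's actual callback, which also filters out non-user code):

import sys

def make_tracer(on_opcode):
    def global_trace(frame, event, arg):
        if event == "call":
            # Enable per-opcode events for this frame, then hand off to
            # the local trace function.
            frame.f_trace_opcodes = True
            return local_trace
        return None

    def local_trace(frame, event, arg):
        if event == "opcode":
            on_opcode(frame)  # e.g. scheduler.wait_for_turn()
        return local_trace

    return global_trace

# Each worker thread would install this with sys.settrace(make_tracer(...)).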

Because the scheduler controls which thread runs each opcode, any blocking call that happens in C code (like threading.Lock.acquire()) would deadlock — the blocked thread holds a scheduler turn but can’t make progress. To prevent this, all standard threading and queue primitives (Lock, RLock, Semaphore, BoundedSemaphore, Event, Condition, Queue, LifoQueue, PriorityQueue) are monkey-patched with cooperative versions that spin-yield via the scheduler instead of blocking. The patching is scoped to each test run: primitives are replaced before setup() and restored afterwards.
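
The shape of such a cooperative primitive is a non-blocking acquire loop that yields back to the scheduler. The sketch below is a stand-in, not frontrun's actual patched Lock, and the scheduler.yield_turn() hook is hypothetical.

import threading

class CooperativeLock:
    def __init__(self, scheduler):
        self._lock = threading.Lock()
        self._scheduler = scheduler

    def acquire(self):
        # Never block inside C: try the lock, and if it is held, give the
        # scheduler a chance to run another thread before retrying.
        while not self._lock.acquire(blocking=False):
            self._scheduler.yield_turn()
        return True

    def release(self):
        self._lock.release()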

explore_interleavings() does property-based exploration in the style of Hypothesis: it generates random opcode-level schedules and checks that an invariant holds under each one, returning any counterexample schedule.

from frontrun.bytecode import explore_interleavings

class Counter:
    def __init__(self, value=0):
        self.value = value

    def increment(self):
        temp = self.value
        self.value = temp + 1

def test_counter_is_atomic():
    result = explore_interleavings(
        setup=lambda: Counter(value=0),
        threads=[
            lambda c: c.increment(),
            lambda c: c.increment(),
        ],
        invariant=lambda c: c.value == 2,
        max_attempts=200,
        max_ops=200,
        seed=42,
    )

    assert result.property_holds, result.explanation

explore_interleavings() often finds races very quickly — sometimes on the first attempt — because even a single random schedule has a reasonable chance of interleaving the critical section. It can also catch races that are invisible to DPOR: if a C extension mutates shared state without any I/O (e.g. in-process C-level mutations), bytecode exploration may stumble into the bad interleaving through random scheduling even though neither tool can see the C-level conflict directly.

The trade-off is interpretability. When a race is found, result.explanation contains an interleaved source-line trace and a best-effort conflict classification, but the bytecode explorer doesn’t know why the interleaving matters the way DPOR does. The reproduce_on_failure parameter (default 10) controls how many times the counterexample schedule is replayed to measure reproducibility.

Controlled Interleaving (Internal/Advanced):

The controlled_interleaving context manager and run_with_schedule function allow running threads under a specific opcode-level schedule. These are primarily intended for debugging this library or building tooling on top of it, rather than for general use in tests.

Note

Opcode-level schedules are not stable across Python versions. CPython does not guarantee that the same source code will compile to the same bytecode between minor releases, so a specific schedule that reproduces a race on Python 3.12 may not reproduce the same interleaving on 3.13. Counterexample schedules returned by explore_interleavings are likewise best treated as ephemeral debugging artifacts rather than long-lived test fixtures.

The async variant (frontrun.async_shuffler) uses natural await boundaries rather than opcodes, so its schedules are stable — see that module for details.

Trace Markers

Trace Markers use lightweight comment-based markers to define synchronization points in your code, requiring no semantic code changes.

How It Works:

Each thread is run with a sys.settrace callback that fires on every source line. The callback scans each line for # frontrun: <name> comments using a MarkerRegistry that caches marker locations per file. When a marker is hit, the thread calls ThreadCoordinator.wait_for_turn() which blocks until the schedule says it’s that thread’s turn to proceed past that marker. This gives deterministic control over the order threads reach each synchronization point, without changing any executable code — markers are just comments.
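
The lookup itself needs nothing beyond the standard library. A rough sketch of the idea (not the actual MarkerRegistry, which also caches marker locations per file):

import linecache
import re

_MARKER_RE = re.compile(r"#\s*frontrun:\s*(\w+)")

def marker_on_line(filename, lineno):
    # Read the source line the trace event points at and extract the
    # marker name, if any.
    line = linecache.getline(filename, lineno)
    match = _MARKER_RE.search(line)
    return match.group(1) if match else None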

A marker gates the code that follows it. Name markers after the operation they gate (e.g. read_value, write_balance) rather than with temporal prefixes like before_ or after_.

from frontrun.common import Schedule, Step
from frontrun.trace_markers import TraceExecutor

class Counter:
    def __init__(self):
        self.value = 0

    def increment(self):
        temp = self.value  # frontrun: read_value
        temp += 1
        self.value = temp  # frontrun: write_value
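
A schedule then pins the order in which the two threads pass those markers. The sketch below assumes Schedule(steps=[...]) takes Step(thread, marker) entries in execution order; check frontrun.common for the exact constructors. The executor calls mirror the Hypothesis example later in this section.

schedule = Schedule(steps=[
    Step("t1", "read_value"),
    Step("t2", "read_value"),   # t2 reads before t1 writes: lost update
    Step("t1", "write_value"),
    Step("t2", "write_value"),
])

counter = Counter()
executor = TraceExecutor(schedule)
executor.run("t1", counter.increment)
executor.run("t2", counter.increment)
executor.wait(timeout=5.0)
assert counter.value == 1  # the forced interleaving drops one increment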

Async Support:

Async trace markers use the same comment-based syntax. Each async task runs in its own thread (via asyncio.run), with the same sys.settrace mechanism controlling interleaving between tasks.

The synchronization contract:

  • A marker gates the next await expression (or the line it’s on if inline). When a task reaches a marker, it pauses until the scheduler grants it a turn. Only then does the gated await execute.

  • Between two markers, the task runs without interruption from other scheduled tasks. Any intermediate await calls within that span complete normally.

  • Because async code can only interleave at await points, markers should be placed to gate the await expressions whose ordering you want to control.

from frontrun import TraceExecutor
from frontrun.common import Schedule, Step

class AsyncCounter:
    def __init__(self):
        self.value = 0

    async def get_count(self):
        return self.value

    async def set_count(self, value):
        self.value = value

    async def increment(self):
        # frontrun: read_counter
        current = await self.get_count()
        # frontrun: write_counter
        await self.set_count(current + 1)

Marker Schedule Exploration

Marker schedule exploration bridges the gap between manual trace markers (which require knowing the bug-triggering interleaving in advance) and bytecode exploration (which searches an enormous opcode-level space). It uses # frontrun: comments as the vocabulary for schedule generation, then systematically or randomly explores all valid orderings.

Search space comparison:

Approach    Scenario                       Search space
Bytecode    2 threads, 100 opcodes each    2^200 ≈ 10^60
Markers     2 threads, 5 markers each      C(10, 5) = 252

For two threads with five markers each, the marker-level search space is just 252 valid orderings — small enough to explore exhaustively with completeness guarantees.
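
The arithmetic is easy to sanity-check with the standard library:

import math

assert math.comb(10, 5) == 252   # orderings of 5 + 5 markers across two threads
assert 2 ** 200 > 10 ** 60       # the opcode-level space dwarfs that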

Exhaustive exploration:

explore_marker_interleavings() generates every valid interleaving of thread markers (preserving per-thread order), runs each one against real code via TraceExecutor, and checks an invariant. When it passes, the invariant is proven correct for all marker-level interleavings.

from frontrun.trace_markers import explore_marker_interleavings

class Counter:
    def __init__(self):
        self.value = 0

    def increment(self):
        temp = self.value  # frontrun: read_value
        self.value = temp + 1  # frontrun: write_value

result = explore_marker_interleavings(
    setup=Counter,
    threads={
        "t1": (lambda c: c.increment(), ["read_value", "write_value"]),
        "t2": (lambda c: c.increment(), ["read_value", "write_value"]),
    },
    invariant=lambda c: c.value == 2,
)
# result.property_holds is False — the lost-update race is found
# result.counterexample is a Schedule that reproduces the bug

When a violation is found, result.counterexample is a Schedule object that can be replayed directly with TraceExecutor for deterministic reproduction.
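
A replay might look like the sketch below; the executor calls match the Hypothesis example that follows, and Counter and the thread names are the ones used in the exploration above.

from frontrun.trace_markers import TraceExecutor

counter = Counter()
executor = TraceExecutor(result.counterexample)
executor.run("t1", counter.increment)
executor.run("t2", counter.increment)
executor.wait(timeout=5.0)
assert counter.value != 2  # deterministically reproduces the lost update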

Hypothesis integration:

marker_schedule_strategy() is a Hypothesis strategy that generates valid Schedule objects. It integrates with Hypothesis’s shrinking to produce minimal counterexamples:

from hypothesis import given
from frontrun.trace_markers import marker_schedule_strategy, TraceExecutor

@given(schedule=marker_schedule_strategy(
    threads={"t1": ["read_value", "write_value"],
             "t2": ["read_value", "write_value"]},
))
def test_counter_is_atomic(schedule):
    counter = Counter()
    executor = TraceExecutor(schedule)
    executor.run("t1", counter.increment)
    executor.run("t2", counter.increment)
    executor.wait(timeout=5.0)
    assert counter.value == 2

Enumeration:

all_marker_schedules() returns all valid interleavings as a list of Schedule objects. The count equals the multinomial coefficient (k₁ + k₂ + ⋯ + kₙ)! / (k₁! · k₂! · ⋯ · kₙ!) where each kᵢ is the number of markers for thread i.

from frontrun.trace_markers import all_marker_schedules

schedules = all_marker_schedules(
    threads={"t1": ["a", "b"], "t2": ["x", "y"]},
)
assert len(schedules) == 6  # C(4,2) = 4! / (2! · 2!)

Complementary workflow:

The three marker-level tools are complementary:

  1. Use bytecode exploration or DPOR to discover a race automatically.

  2. Add # frontrun: markers at the identified race window for regression testing.

  3. Use explore_marker_interleavings() to verify the fix eliminates all problematic interleavings, not just the one counterexample.

Automatic I/O Detection

Both the bytecode explorer and DPOR automatically detect socket and file I/O operations. This is enabled by default (detect_io=True) and works by monkey-patching socket.socket methods and builtins.open to report resource accesses to the scheduler.

Python-level detection (monkey-patching):

  • Sockets: connect, send, sendall, sendto, recv, recv_into, recvfrom

  • Files: open() (read vs write determined by mode)

Resource identity is derived from the socket’s peer address (host:port) or the file’s resolved path. Two threads accessing the same endpoint or file are treated as conflicting; different endpoints are independent.
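
As an illustrative sketch (not from the frontrun test suite), a read-modify-write cycle on a shared file path is enough for the detector to flag a conflict, using the same explore_dpor API as above:

import tempfile
from pathlib import Path

from frontrun.dpor import explore_dpor

def setup():
    # Each exploration run gets a fresh counter file containing "0".
    f = tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False)
    f.write("0")
    f.close()
    return f.name

def bump(path):
    with open(path) as fh:        # reported as a read on the resolved path
        n = int(fh.read())
    with open(path, "w") as fh:   # reported as a write on the same path
        fh.write(str(n + 1))

result = explore_dpor(
    setup=setup,
    threads=[bump, bump],
    invariant=lambda path: Path(path).read_text() == "2",
)
# result.property_holds is False when both threads read "0" before either writes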

C-Level I/O Interception

When run under the frontrun CLI, a native LD_PRELOAD library (libfrontrun_io.so) intercepts libc I/O functions directly. This covers opaque C extensions — database drivers (libpq, mysqlclient), Redis clients, HTTP libraries, and anything else that calls libc’s send(), recv(), read(), write(), etc.

Intercepted functions: connect, send, sendto, sendmsg, write, writev, recv, recvfrom, recvmsg, read, readv, close

The library maintains a process-global file-descriptor → resource map:

connect(fd, sockaddr{127.0.0.1:5432}, ...)  →  fd=7 → "socket:127.0.0.1:5432"
send(fd=7, ...)                              →  report write to "socket:127.0.0.1:5432"
recv(fd=7, ...)                              →  report read from "socket:127.0.0.1:5432"
close(fd=7)                                  →  remove fd=7 from map

Events are communicated to the Python side via a pipe (FRONTRUN_IO_FD). An IOEventDispatcher reads the pipe on a background thread and delivers events to registered listeners. When DPOR is active, a _PreloadBridge listener routes events to the DPOR engine for conflict analysis.

Building:

make build-io    # builds and copies libfrontrun_io.so into the frontrun package

Usage:

frontrun pytest -vv tests/           # I/O interception + monkey-patching
frontrun python examples/orm_race.py  # same, for scripts