Skip to content

fix(profiler): snapshot c.p.pids under RLock before Flush — resolve data race#102

Open
KorsarOfficial wants to merge 1 commit intoyandex:mainfrom
KorsarOfficial:fix/pids-flush-mutex
Open

fix(profiler): snapshot c.p.pids under RLock before Flush — resolve data race#102
KorsarOfficial wants to merge 1 commit intoyandex:mainfrom
KorsarOfficial:fix/pids-flush-mutex

Conversation

@KorsarOfficial
Copy link

Problem

The Flush method on continuousProfilingSampleConsumer iterates c.p.pids without holding c.p.pidsmu. The code has an explicit // TODO: hold mutex here comment acknowledging the race.

Concurrent addTracedPid / removeTracedPid calls mutate the map while Flush iterates it, which is undefined behavior in Go (potential panic or silent corruption).

Solution

Snapshot-then-iterate: copy the map under a short RLock, release the lock, then flush each consumer without holding any lock.

c.p.pidsmu.RLock() pidsSnapshot := make(map[...]*trackedProcess, len(c.p.pids)) for k, v := range c.p.pids { pidsSnapshot[k] = v } c.p.pidsmu.RUnlock() for pid, process := range pidsSnapshot { ... }

Why snapshot instead of holding RLock for the full loop: Each process.sampleConsumer.Flush(ctx) may be slow (network I/O, serialization). Holding the lock would block addTracedPid/removeTracedPid for the duration of all flushes.

Complexity

Lock hold time: O(n) pointer copies vs O(n * T_flush) before. Map snapshot cost is negligible (PID + pointer per entry).

@KorsarOfficial
Copy link
Author

📄 Full analysis report (PDF): 08-perforator-optimizations.pdf

Covers complexity analysis, concurrency audit, and verification for all 7 optimizations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

1 participant