Replace synchronized blocks in OkHttpCall with atomic CAS by Amir-yazdanmanesh · Pull Request #4656 · square/retrofit

Amir-yazdanmanesh · 2026-02-15T13:03:03Z

Summary

Replace synchronized(this) blocks in OkHttpCall with lock-free atomic operations to avoid pinning virtual threads (Project Loom)
Merge rawCall and creationFailure fields into a single AtomicReference<Object> rawCallState with state transitions null → Call or null → Throwable
(write-once semantics)
Replace boolean executed with AtomicBoolean using compareAndSet for the execute-once guard

Motivation

synchronized blocks pin virtual threads to platform threads, defeating the purpose of Project Loom. Since the guarded code protects single-instance creation with
minimal contention, atomic CAS is a better fit than ReentrantLock. See #4297.

Design

The key insight is merging rawCall + creationFailure into one AtomicReference. This avoids the race condition from #4301 where separate references could get out of
sync. Once the reference is set to a non-null value (either a Call or a Throwable), it never changes.

If two threads race to create the call, both call createRawCall() but only one CAS wins. The loser uses the winner's result. This is safe because createRawCall() is
side-effect-free.

Test plan

./gradlew :retrofit:java-test:check — all tests pass across JDK 8, 10-24 (including concurrency tests in CallTest.java)

…e#4072) The testJdkN tasks copied classpath and testClassesDirs from the built-in test task, which broke the Gradle task dependency graph. Re-running a test task would not recompile main/test sources because no dependency on compilation tasks was established. Wire directly to sourceSets.test outputs instead, giving Gradle implicit task dependencies through file collection producers. This also eliminates the tasks.getByName("test") eager realization anti-pattern.

1. Imports: Added AtomicBoolean and AtomicReference; removed GuardedBy 2. Fields: Replaced rawCall + creationFailure + executed with AtomicBoolean executed and AtomicReference<Object> rawCallState (merged call/failure into one atomic ref) 3. getRawCall(): Rewrote with CAS — reads state atomically, creates call on miss, uses compareAndSet to publish. Loser of a race uses the winner's result 4. throwStateFailure(): New static helper to rethrow stored failures 5. request() / timeout(): Removed synchronized keyword (now lock-free via getRawCall()) 6. execute(): Uses executed.compareAndSet(false, true) instead of synchronized block 7. enqueue(): Same CAS for executed check; delegates to getRawCall() instead of inlining creation 8. isExecuted(): Returns executed.get() instead of synchronized read 9. cancel() / isCanceled(): Read rawCallState.get() instead of synchronized block

JakeWharton · 2026-02-15T16:02:42Z

The locks don't guard long-running work, only state changes. I don't think this is a big deal. Has anyone demonstrated an actual problem or are we just speculating?

Amir-yazdanmanesh · 2026-02-16T09:05:19Z

The locks don't guard long-running work, only state changes. I don't think this is a big deal. Has anyone demonstrated an actual problem or are we just speculating?

You're right, the synchronized blocks here are short and don't guard long-running work, so pinning should be brief. I don't have benchmarks demonstrating an actual
throughput problem.

The change was motivated by the general guidance to avoid synchronized with virtual threads, but I understand that without evidence of a real issue, the added complexity
of CAS may not be justified.

Happy to close this if you'd prefer to keep the current approach until a concrete problem is reported.

JakeWharton · 2026-02-18T03:28:41Z

I need to review it more closely. The last attempt at this had subtle races because it used two atomic states. This one is different, but it still warrants exhaustively thinking through the state transitions in the context of concurrency.

Amir-yazdanmanesh · 2026-02-18T07:36:22Z

I need to review it more closely. The last attempt at this had subtle races because it used two atomic states. This one is different, but it still warrants exhaustively thinking through the state transitions in the context of concurrency.

Happy to walk through the state transitions while you review:
The main thing I'd flag is the cancel window, if cancel() is called after executed CAS succeeds but before rawCallState is set, it reads null and silently does nothing. The original synchronized approach probably had the same gap, but worth confirming there wasn't a canceled boolean that got checked post-creation.
The two-thread race in getRawCall() should be fine as long as createRawCall() has no side effects, both threads may call it, but only the CAS winner's result survives. I believe that holds, but interceptors/event listeners are worth a glance.
Let me know if a test for the cancel-before-create scenario would help

Amir-yazdanmanesh added 4 commits February 10, 2026 00:09

Merge branch 'square:trunk' into trunk

40de70a

Merge remote-tracking branch 'origin/trunk' into trunk

213893c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace synchronized blocks in OkHttpCall with atomic CAS #4656

Replace synchronized blocks in OkHttpCall with atomic CAS #4656
Amir-yazdanmanesh wants to merge 4 commits intosquare:trunkfrom
Amir-yazdanmanesh:trunk

Amir-yazdanmanesh commented Feb 15, 2026

JakeWharton commented Feb 15, 2026

Amir-yazdanmanesh commented Feb 16, 2026

JakeWharton commented Feb 18, 2026

Amir-yazdanmanesh commented Feb 18, 2026

Labels

2 participants

Conversation