Skip to content

Conversation

@SylvainJuge
Copy link
Member

@SylvainJuge SylvainJuge commented Jan 5, 2021

What does this PR do?

The WeakConcurrentMap implementation throws NullPointerException in the following cases:

  • trying to retrieve entry with a null key, or test if null key entry exists
  • trying to store a null value

While trying to perform any of the above operations is a bug and should be avoided, we can't ensure that any instrumentation plugin will try to, causing errors and verbose log.

This PR provides a wrapper implementation for WeakConcurrentMap that will improve current implementation:

  • avoids throwing NPE + provide no-op implementation when trying to perform any of the above operations with null keys/values
  • logs a single line in log with WARN level
  • provides full error stack trace in DEBUG level to allow easy identification of the root cause.

Few caveats:

  • we could not easily wrap the map with our own type as there are numerous (~22) usages of co.elastic.apm.agent.sdk.weakmap.WeakMapSupplier#createMap

Checklist

  • This is a bugfix
    • I have updated CHANGELOG.asciidoc
    • I have added tests that would fail without this fix
    • manual testing with log_level=debug with packaged agent
    • manual testing with log_level=warn with packaged agent
@codecov-io
Copy link

codecov-io commented Jan 5, 2021

Codecov Report

Merging #1597 (9f17f7d) into master (44f3b70) will increase coverage by 0.07%.
The diff coverage is 73.52%.

Impacted file tree graph

@@ Coverage Diff @@ ## master #1597 +/- ## ============================================ + Coverage 58.61% 58.69% +0.07%  Complexity 92 92 ============================================ Files 396 399 +3 Lines 17805 17854 +49 Branches 2472 2474 +2 ============================================ + Hits 10437 10480 +43  + Misses 6653 6650 -3  - Partials 715 724 +9 
Impacted Files Coverage Δ Complexity Δ
.../bci/bytebuddy/SoftlyReferencingTypePoolCache.java 93.75% <ø> (ø) 0.00 <0.00> (ø)
...elastic/apm/agent/sdk/weakmap/WeakMapSupplier.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...elastic/apm/agent/sdk/state/GlobalThreadLocal.java 41.66% <25.00%> (+41.66%) 0.00 <0.00> (ø)
...va/co/elastic/apm/agent/sdk/weakmap/NullCheck.java 72.72% <50.00%> (ø) 0.00 <0.00> (?)
...m/agent/sdk/weakmap/NullSafeWeakConcurrentMap.java 86.95% <86.95%> (ø) 0.00 <0.00> (?)
...a/co/elastic/apm/agent/bci/HelperClassManager.java 83.50% <100.00%> (ø) 0.00 <0.00> (ø)
...m/agent/sdk/weakmap/NullSafeWeakConcurrentSet.java 45.45% <100.00%> (ø) 0.00 <0.00> (?)
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 44f3b70...9f17f7d. Read the comment docs.

@ghost
Copy link

ghost commented Jan 5, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: Pull request #1597 updated

  • Start Time: 2021-01-06T20:04:06.742+0000

  • Duration: 44 min 52 sec

Test stats 🧪

Test Results
Failed 0
Passed 1685
Skipped 12
Total 1697

💚 Flaky test report

Tests succeeded.

Expand to view the summary

Test stats 🧪

Test Results
Failed 0
Passed 1685
Skipped 12
Total 1697

Copy link
Contributor

@eyalkoren eyalkoren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two usages of WeakConcurrentMap that deliberately not use the factory (so they can use a dedicated cleaner thread) - let's replace those as well.

Also, worth looking into the already reported stack trace and see if we can find what's wrong:

2020-12-22 12:17:40,576 ERROR [stderr] (default task-1) java.lang.NullPointerException 2020-12-22 12:17:40,577 ERROR [stderr] (default task-1) at co.elastic.apm.agent.shaded.weaklockfree.WeakConcurrentMap.containsKey(WeakConcurrentMap.java:146) 2020-12-22 12:17:40,579 ERROR [stderr] (default task-1) at co.elastic.apm.agent.process.ProcessHelper.doStartProcess(ProcessHelper.java:66) 2020-12-22 12:17:40,580 ERROR [stderr] (default task-1) at co.elastic.apm.agent.process.ProcessHelper.startProcess(ProcessHelper.java:51) 2020-12-22 12:17:40,580 ERROR [stderr] (default task-1) at co.elastic.apm.agent.process.ProcessStartInstrumentation$ProcessBuilderStartAdvice.onExit(ProcessStartInstrumentation.java:77) 2020-12-22 12:17:40,581 ERROR [stderr] (default task-1) at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1071) 2020-12-22 12:17:40,581 ERROR [stderr] (default task-1) at java.base/java.lang.Runtime.exec(Runtime.java:591) 2020-12-22 12:17:40,582 ERROR [stderr] (default task-1) at java.base/java.lang.Runtime.exec(Runtime.java:450) ... 
@eyalkoren
Copy link
Contributor

Let's also handle the NPEs coming from the WeakConcurrentMap used by com.blogspot.mydailyjava.weaklockfree.DetachedThreadLocal (see #1598) - we don't use it directly, we extend it with GlobalThreadLocal, so I think all we need to do is override set(), because keys are guaranteed to be non null, but take another look.

@SylvainJuge SylvainJuge changed the title null-safe weakmap to avoid excessive logging null-safe weak set/map + detached thread local to avoid excessive logging Jan 6, 2021
@SylvainJuge SylvainJuge changed the title null-safe weak set/map + detached thread local to avoid excessive logging null-safe weak set/map + detached thread local Jan 6, 2021
@SylvainJuge SylvainJuge requested a review from eyalkoren January 6, 2021 15:27
Comment on lines 31 to 37
public static <T> boolean isNullKey(@Nullable T key){
return true;
}

public static <T> boolean isNullValue(@Nullable T value) {
return true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, forgot to update those, it's not really surprising that test fails with that :-)

@SylvainJuge SylvainJuge merged commit 56cc3b8 into elastic:master Jan 7, 2021
@SylvainJuge SylvainJuge deleted the nullsafe-weakmap branch January 7, 2021 09:17
@ghost
Copy link

ghost commented Jan 7, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: Pull request #1597 updated

  • Start Time: 2021-01-07T09:12:56.482+0000

  • Duration: 41 min 27 sec

Test stats 🧪

Test Results
Failed 0
Passed 1685
Skipped 12
Total 1697

💚 Flaky test report

Tests succeeded.

Expand to view the summary

Test stats 🧪

Test Results
Failed 0
Passed 1685
Skipped 12
Total 1697

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

3 participants