
In my application, I am using JNA to call native code written in C. The native library notifies my application through a callback.

In the callback, I receive a pointer and some other data to process. Inside the JNA callback, I have to pass this pointer to another native library function and then return from the callback.

If I don't make that intermediate native call from the callback, everything works fine. If I add the call, my application crashes intermittently (mostly after processing a few hundred callbacks, though sometimes it runs successfully for thousands).

The NotificationHook object registered as the hook in native code is held in a static variable, as there is only one hook for the application. The native library invokes it for one notification at a time.

public interface INotificationHook extends Callback {
    public int NotificationHook(TRANX htrans, NOTIFICATION.ByReference notification);
}

public class NotificationHook implements INotificationHook {
    @Override
    public int NotificationHook(final TRANX tranx, final NOTIFICATION.ByReference notification) {
        System.out.println("Enter Java Callback");
        // notification contains actual data to process
        library.SendSrvcResponse(tranx); // if I put this method call, application crashes intermittently
        System.out.println("Exit Java Callback");
        return 0;
    }
}

TRANX Native Structure:

typedef struct tagTRANX { int unused; } *TRANX; 

TRANX.java

import com.sun.jna.Pointer;
import com.sun.jna.Structure;
import com.sun.jna.Structure.FieldOrder;

@FieldOrder({"unused"})
public class TRANX extends Structure {
    public static class ByReference extends TRANX implements Structure.ByReference { }
    public static class ByValue extends TRANX implements Structure.ByValue { }

    public int unused = 0;

    public TRANX() {
        super();
    }

    public TRANX(final int unused) {
        super();
        this.unused = unused;
    }

    public TRANX(final Pointer peer) {
        super(peer);
    }
}

Library Definition in JNA (java):

int SendSrvcResponse(TRANX tranx); 

This pointer is actually a structure pointer. I tried mapping it as a Structure instead of a Pointer, but the application still crashes.

I added a few entry/exit logs. Here is what I observed:

printf("Enter to JNA from c library for callback");
hs = (*piHook->hookProc) ( tranx, notification);
printf("Callback completed from JNA and returned to c library");

Intermediate call in the C library:

SendSrvcResponse(TRANX tranx) {
    printf("Enter to send response in c library");
    // do some operation
    printf("Enter to send response in c library");
}

On every successful execution, this is what the console shows (JNA and C library output combined):

  1. Enter to JNA from c library // C library calls the JNA callback
  2. Enter Java Callback // Java callback is called
  3. Enter to send response in c library // method in native library is called to send response
  4. Enter to send response in c library // method in native library exits after sending response
  5. Exit Java Callback // Java callback completes its operation
  6. Callback completed from JNA and returned to c library // after completing the JNA callback, returning the flow back to c library code

When the application crashes, this last statement is missing from the log: the Java callback returns, but control never arrives back in the C code.

Here, notification is a pointer.

Crash Logs:

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0000ffff6c78f610, pid=6396, tid=0x0000ffff6d9f0ac8
#
# JRE version: OpenJDK Runtime Environment (8.0_272-b10) (build 1.8.0_272-b10)
# Java VM: OpenJDK 64-Bit Server VM (25.272-b10 mixed mode linux-aarch64 compressed oops)
# Derivative: IcedTea 3.17.0
# Distribution: Custom build (Fri Dec 11 01:04:16 UTC 2020)
# Problematic frame:
# C [jna791751318727750086.tmp+0xa610] Java_com_sun_jna_Native_setByte+0x50
#
# Core dump written. Default location: **************
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
# https://icedtea.classpath.org/bugzilla
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

--------------- T H R E A D ---------------

Current thread (0x0000ffff6c7b7000): JavaThread "Thread-5356" [_thread_in_native, id=6642, stack(0x0000ffff6d9d0000,0x0000ffff6d9f0ac8)]

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000ffff6c2ec018

Registers:
R0=0x0000000000000000 R1=0x0000000000000000 R2=0x0000ffff6c2ec018 R3=0x0000000000000000
R4=0x0000000000000000 R5=0x0000000000000000 R6=0x0000ffff66b49458 R7=0x0000ffff66b49458
R8=0x0000ffff6c78f5c0 R9=0x0000ffff6c7b72c8 R10=0x0000000020002230 R11=0x00000000f212dde8
R12=0x0000ffff691637f0 R13=0x0000000020002230 R14=0x00000000f212e4e0 R15=0x00000000f213acd0
R16=0x00000001006c7770 R17=0x0000000000000000 R18=0x0000000000000001 R19=0x0000ffff6c7aa250
R20=0x0000000000000000 R21=0x00000000f212e2b0 R22=0x0000000100781ed8 R23=0x00000000f212fd60
R24=0x00000000f213a868 R25=0x00000001000016d0 R26=0x00000000f213a880 R27=0x0000000000000000
R28=0x0000ffff6c7b7000 R29=0x0000ffff6d9efcd0 R30=0x0000ffff78fea394

Top of Stack: (sp=0x0000ffff6d9efcd0)
0x0000ffff6d9efcd0: 0000ffff6d9efd60 0000ffff78fea394
0x0000ffff6d9efce0: 00000000200f03db 0000000000000000
0x0000ffff6d9efcf0: 0000ffff6c2ec018 0000000000000000
0x0000ffff6d9efd00: 0000ffff6c7b7250 00aebfc8f212dfd8
0x0000ffff6d9efd10: 00000000f212dde8 00000000f21343b0
0x0000ffff6d9efd20: 00000000f21342f0 00000000f2134350
0x0000ffff6d9efd30: 00000000f2134368 00000000f2134380
0x0000ffff6d9efd40: 00000000f2134320 00000000f2134398
0x0000ffff6d9efd50: 00000000fca7ebc8 0000ffff77adb000
0x0000ffff6d9efd60: 0000ffff6d9f0220 0000ffff7988acb4
0x0000ffff6d9efd70: 00000000f21343c8 00000000f213a978
0x0000ffff6d9efd80: 0000ffff6d9f0220 0000ffff79952fbc
0x0000ffff6d9efd90: 00000000f213aa28 00000000f213aa68
0x0000ffff6d9efda0: 0000000000000000 00000000f213aae8
0x0000ffff6d9efdb0: 0000ffff6d9efdd0 0000ffff80d78678
0x0000ffff6d9efdc0: 0000000000000000 00000001000115f0
0x0000ffff6d9efdd0: 00000000fc391290 00000000f212e4e0
0x0000ffff6d9efde0: 00000000f212dfd8 00000000fc3a3960
0x0000ffff6d9efdf0: 00000000200eada6 0000000000000000
0x0000ffff6d9efe00: 00000000f21342e0 0000ffff6c7b7000
0x0000ffff6d9efe10: 00000000fc43c670 00000000d024ce28
0x0000ffff6d9efe20: 0000ffff6d9f0220 0000ffff799275fc
0x0000ffff6d9efe30: 00000000f212e4e0 00000000f212fb00
0x0000ffff6d9efe40: 00000000d0d14760 00000000d0d14760
0x0000ffff6d9efe50: 00000000f2131bc0 f212f75000000000
0x0000ffff6d9efe60: 0000ffff6d9f0220 0000ffff78eccdd8
0x0000ffff6d9efe70: 00000000f2131c20 00000000f212e398
0x0000ffff6d9efe80: 0000ffff6d9f0220 0000ffff79463d2c
0x0000ffff6d9efe90: 200f03db00000000 0000000100781ed8
0x0000ffff6d9efea0: 00000000f213acb0 00000000f21342e0
0x0000ffff6d9efeb0: 00000000f212e398 0000000000000000
0x0000ffff6d9efec0: 00000000f212e4e0 0000000000000000

Instructions: (pc=0x0000ffff6c78f610)
0x0000ffff6c78f5f0: e1 ff 00 91 1f 00 01 eb a3 02 00 54 e2 0f 42 a9
0x0000ffff6c78f600: c0 00 00 f0 00 40 09 91 e1 ff 40 39 00 20 42 b9
0x0000ffff6c78f610: 41 68 23 38 e0 00 00 34 e0 1b 40 f9 22 00 00 b0
0x0000ffff6c78f620: 21 00 00 b0 42 80 30 91 21 e0 30 91 45 e5 ff 97

Register to memory mapping:
R0=0x0000000000000000 R1=0x0000000000000000 R2=0x0000ffff6c2ec018 R3=0x0000000000000000
R4=0x0000000000000000 R5=0x0000000000000000 R6=0x0000ffff66b49458 R7=0x0000ffff66b49458
R8=0x0000ffff6c78f5c0 R9=0x0000ffff6c7b72c8 R10=0x0000000020002230 R11=0x00000000f212dde8
R12=0x0000ffff691637f0 R13=0x0000000020002230 R14=0x00000000f212e4e0 R15=0x00000000f213acd0
R16=0x00000001006c7770 R17=0x0000000000000000 R18=0x0000000000000001 R19=0x0000ffff6c7aa250
R20=0x0000000000000000 R21=0x00000000f212e2b0 R22=0x0000000100781ed8 R23=0x00000000f212fd60
R24=0x00000000f213a868 R25=0x00000001000016d0 R26=0x00000000f213a880 R27=0x0000000000000000
R28=0x0000ffff6c7b7000 R29=0x0000ffff6d9efcd0 R30=0x0000ffff78fea394

Stack: [0x0000ffff6d9d0000,0x0000ffff6d9f0ac8], sp=0x0000ffff6d9efcd0, free space=127k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [jna791751318727750086.tmp+0xa610] Java_com_sun_jna_Native_setByte+0x50
J 5174 com.sun.jna.Native.setByte(Lcom/sun/jna/Pointer;JJB)V (0 bytes) @ 0x0000ffff78fea394 [0x0000ffff78fea300+0x94]
V [libjvm.so+0x40f678]
C 0x00000000f212e4e0

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 5174 com.sun.jna.Native.setByte(Lcom/sun/jna/Pointer;JJB)V (0 bytes) @ 0x0000ffff78fea398 [0x0000ffff78fea300+0x98]
J 9695 C2 com.sun.jna.Pointer.setByte(JB)V (11 bytes) @ 0x0000ffff7988acb4 [0x0000ffff7988ac80+0x34]
J 9266 C2 com.sun.jna.Pointer.setValue(JLjava/lang/Object;Ljava/lang/Class;)V (607 bytes) @ 0x0000ffff79952fbc [0x0000ffff79951ec0+0x10fc]
J 9230 C2 com.sun.jna.Structure.writeField(Lcom/sun/jna/Structure$StructField;)V (427 bytes) @ 0x0000ffff799275fc [0x0000ffff79927500+0xfc]
J 9423 C2 com.sun.jna.Structure.write()V (126 bytes) @ 0x0000ffff79463d2c [0x0000ffff79463840+0x4ec]
J 9434 C2 com.sun.jna.Structure.autoWrite()V (45 bytes) @ 0x0000ffff7904e274 [0x0000ffff7904e240+0x34]
J 9905 C1 com.sun.jna.CallbackReference$DefaultCallbackProxy.invokeCallback([Ljava/lang/Object;)Ljava/lang/Object; (238 bytes) @ 0x0000ffff78ee17c4 [0x0000ffff78ede580+0x3244]
J 9904 C1 com.sun.jna.CallbackReference$DefaultCallbackProxy.callback([Ljava/lang/Object;)Ljava/lang/Object; (22 bytes) @ 0x0000ffff78fa18b0 [0x0000ffff78fa1800+0xb0]
v ~StubRoutines::call_stub

Things I have tried:

  1. Replacing TRANX with a Pointer in the JNA callback input.
  2. Calling clear() on the structure in the last line of the Java callback method.
  3. Commenting out all processing code in the JNA callback, keeping only the call to SendSrvcResponse and the return.

All of the above still led to a crash; none of them helped.

One strange observation: when this callback is implemented in native C code, the application does not crash. It only crashes when integrated with JNA.

Can someone please help me understand what the possible cause could be, or how I can investigate it further?

  • What is library? How do you set the value of tranx? Are you sure the function definition of SendSrvcResponse is correct? Commented Jan 10, 2022 at 17:13
  • @Robert This is a proprietary C library. The function takes a structure pointer. I tried both ways: mapping it as a Structure (which is how JNA represents a structure pointer) and as a plain Pointer. Updated the code. Commented Jan 10, 2022 at 17:27
  • @Robert I have added some more details to code, please do let me know, if you need any more detail here. I will highly appreciate your help, as I am stuck on this for very long. Commented Jan 10, 2022 at 17:55
  • 1. What is the native structure typedef of TRANX? 2. In your pointer constructor for TRANX you would normally have a read() to copy the native value to your local unused. But if it's really unused this shouldn't matter. Is this ever used? 3. It looks like you're declaring the TRANX structure (and thus its memory allocation) yourself and passing it to native. How are you preventing it from being GC'd? If native is providing you the address to TRANX your mappings are wrong. Commented Jan 10, 2022 at 20:33
  • @GauravJeswani the key question here is "when is TRANX initialized". Whoever initializes it controls when it is released. Commented Jan 11, 2022 at 16:34

1 Answer

Based on the code you've provided, the problem is the same as other Callback-related questions here: you're losing the native allocation of TRANX due to Java's garbage collection.

A JNA Structure consists of two parts: a pointer (to data), and the data itself. You have not provided the native typedef for TRANX to confirm your JNA mapping, but an instantiated object will have an internal pointer reference, pointing to a 4-byte allocation of memory (the int unused).

You only show the callback code where TRANX is already an argument, meaning you've already instantiated it to pass to the callback.

If you allocated it yourself using new TRANX() or new TRANX(int unused), then JNA has

  • allocated 4 bytes of native memory
  • stored the pointer to it internally

In JNA, the native memory attached to a Structure is automatically freed as a part of the garbage collection process. This is a common problem with callbacks, as you generally don't control the timing of the callback return, so the following sequence occurs:

  • You create the object in Java (allocating the native 4 bytes which the TRANX structure tracks the pointer to internally)
  • You pass the TRANX object to the callback
  • Immediately after passing the object, Java no longer has need for its own object; it is unreachable and thus eligible for garbage collection
  • When GC occurs the native 4 bytes are freed as part of the process
  • The TRANX object in the callback still has the pointer internally, but it now points to memory that is no longer allocated, resulting in the SIGSEGV (or Invalid Memory Access error, or strange symptoms if the memory is allocated by another thread, or other undefined behavior).

The solution to the problem is to track the memory associated with TRANX.

  • If you are allocating it yourself, keep a reference to the TRANX object to prevent it from being unreachable.
    • This generally requires accessing the TRANX structure at some later point after you are sure the callback will have been processed
    • In JDK 9+, Reference.reachabilityFence() can be used for this.
    • In JDK 8 you should manipulate the object in some way (e.g., read a value from it, or call toString on it, etc.).
  • If you are using a native allocation and creating the pointer from the peer value returned from the native API, then read the API to determine when that memory is freed.
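
For the "allocating it yourself" case, the idea of keeping the owning object reachable until the native call has finished can be sketched as follows. This is only an illustration in plain Java, not JNA code: FakeStruct, peer() and fakeNativeCall are hypothetical stand-ins for a JNA Structure, its internal pointer, and a native function, so the sketch can run without the native library. The only real API used is java.lang.ref.Reference.reachabilityFence, which requires JDK 9 or later.

```java
import java.lang.ref.Reference;

final class KeepAliveSketch {

    /** Hypothetical stand-in for a JNA Structure that owns a small native allocation. */
    static final class FakeStruct {
        // Plays the role of the 4-byte native allocation (the int "unused").
        private final int[] backing = new int[1];

        FakeStruct(int unused) {
            backing[0] = unused;
        }

        // Plays the role of the raw pointer handed to native code.
        int[] peer() {
            return backing;
        }
    }

    /** Hypothetical stand-in for a native function that only sees the raw pointer. */
    static int fakeNativeCall(int[] peer) {
        return peer[0];
    }

    /** The fence keeps 'struct' strongly reachable until the call has returned. */
    static int callWithFence(FakeStruct struct) {
        try {
            return fakeNativeCall(struct.peer());
        } finally {
            Reference.reachabilityFence(struct);
        }
    }

    public static void main(String[] args) {
        System.out.println(callWithFence(new FakeStruct(7))); // prints 7
    }
}
```

With a real JNA Structure the fence would go after the last point at which native code may still touch the memory, which for a callback can be well after the Java method that created the object has returned. In that situation, holding the object in a long-lived field is often the simpler option.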

4 Comments

Yes, it's a native allocation: TRANX is passed to the callback (NotificationHook > NotificationHook) by native code, and the native library frees that memory after the callback returns. That's why I'm confused about why this memory would be GC'd. I have added more details about my callback hook. Please let me know if you need more.
I really appreciate your explanation. Following your input, I changed my code to call SendSrvcResponse from a different thread, delayed by 50 ms, so the callback returns first and SendSrvcResponse runs afterwards. With that change the application does not crash. So it looks like the native library modifies TRANX inside SendSrvcResponse, which causes the crash.
@GauravJeswani glad you solved it.
Yes, it looks like I'm now close to the root cause of the crash. I'll work on a solution. Thanks for your help.
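
The workaround described in the comments above (let the callback return first and call SendSrvcResponse about 50 ms later from another thread) could look roughly like the sketch below. It is plain Java with no JNA dependency: sendSrvcResponse and the long handle are hypothetical stand-ins for library.SendSrvcResponse(tranx) and the TRANX pointer, and the event list exists only to make the ordering visible. Whether the real handle is still valid once the callback has returned depends on the native API, so treat this as an experiment for narrowing down the crash, not a confirmed fix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class DeferredResponseSketch {

    // Records events so the ordering can be inspected.
    static final List<String> LOG = new CopyOnWriteArrayList<>();

    /** Hypothetical stand-in for library.SendSrvcResponse(tranx). */
    static void sendSrvcResponse(long tranxHandle) {
        LOG.add("response sent for " + tranxHandle);
    }

    /** Stand-in for the JNA callback: schedule the response, then return immediately. */
    static int notificationHook(long tranxHandle, ScheduledExecutorService scheduler) {
        LOG.add("Enter Java Callback");
        scheduler.schedule(() -> sendSrvcResponse(tranxHandle), 50, TimeUnit.MILLISECONDS);
        LOG.add("Exit Java Callback");
        return 0;
    }

    /** Runs one simulated notification and returns the recorded events. */
    static List<String> runOnce() {
        LOG.clear();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        notificationHook(1L, scheduler);
        scheduler.shutdown(); // already-scheduled delayed tasks still run by default
        try {
            scheduler.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        runOnce().forEach(System.out::println);
    }
}
```

In a real application the scheduler would be created once and reused across callbacks, not created per notification as in runOnce().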
