
I'm working on a Transparent Dirty Detection Agent (tdd-agent). It works really well: it redefines the target classes to implement setDirty()/isDirty() and sets the flag when it detects a putfield. Now I want to extend it to detect calls such as add() on collections. How could I detect such a call?

When the call is made with complex parameters, the generated bytecode separates the field access from the final method invocation, and I can't figure out how to detect and handle it.

For example:

    private List<Foo> listaFoo = new ArrayList<>();

    public void testNativeCollections() {
        this.listaFoo.add(new Foo("text").setS("otro text"));
    }

This block generates the following bytecode:

    aload 0 // reference to self
    getfield test/OuterTarget.listaFoo:java.util.List
    new test/Foo
    dup
    ldc "text" (java.lang.String)
    invokespecial test/Foo.<init>(Ljava/lang/String;)V
    ldc "otro text" (java.lang.String)
    invokevirtual test/Foo.setS(Ljava/lang/String;)Ltest/Foo;
    invokeinterface java/util/List.add(Ljava/lang/Object;)Z
    pop

and I need to pair the first getfield with the last invokeinterface to detect that this call will modify the object, and then insert a call to setDirty() after it. The main issue is that the code between the getfield and the invoke can be arbitrarily long and complex.

  • Calling add does not necessarily modify the collection. In the case of a Set, for example, a return value of false implies that the set has not been modified. Further, it seems you are focusing on the easier scenario, where the code modifying the collection is inside the class whose “dirty” flag you want to set. But this doesn’t have to be the case. The collection might be returned by a method, and the caller may invoke one of the dozens of modification methods, perhaps one not yet defined, as the API can evolve. You should probably rethink your approach… Commented Mar 4, 2024 at 16:14
  • Just one step at a time. First, detect and instrument collections in a simple way. Commented Mar 5, 2024 at 19:25
  • Ignoring the big picture does not necessarily lead to a simple solution. For example, when you consider the complexity of detecting all modifications of a collection, you may find that replacing a collection with a delegating collection designed for this purpose (and implemented in ordinary Java code) is simpler than instrumenting every code pattern that could lead to a modification. Then, the instrumentation task reduces to finding the creation sites and injecting the code to wrap the created collection. If I get your use case correctly, it would affect the constructors of the target only. Commented Mar 6, 2024 at 9:23
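
The delegating-collection idea from the last comment could look roughly like the sketch below. DirtyTrackingList and its onModified callback are hypothetical names chosen for illustration; nothing like this appears in the question, and a real agent would still have to inject the wrapping at the creation sites.

```java
import java.util.AbstractList;
import java.util.List;
import java.util.Objects;

/** Hypothetical wrapper: forwards to a backing list and reports every modification. */
final class DirtyTrackingList<E> extends AbstractList<E> {
    private final List<E> delegate;
    private final Runnable onModified; // e.g. owner::setDirty (assumed callback)

    DirtyTrackingList(List<E> delegate, Runnable onModified) {
        this.delegate = Objects.requireNonNull(delegate);
        this.onModified = Objects.requireNonNull(onModified);
    }

    // Read-only operations just delegate.
    @Override
    public E get(int index) {
        return delegate.get(index);
    }

    @Override
    public int size() {
        return delegate.size();
    }

    // AbstractList routes add(E), addAll, clear, iterator removal and sort
    // through these three methods, so they are the only places to notify.
    @Override
    public E set(int index, E element) {
        E previous = delegate.set(index, element);
        onModified.run();
        return previous;
    }

    @Override
    public void add(int index, E element) {
        delegate.add(index, element);
        onModified.run();
    }

    @Override
    public E remove(int index) {
        E removed = delegate.remove(index);
        onModified.run();
        return removed;
    }
}
```

With a wrapper like this, a field initialised as `new DirtyTrackingList<>(new ArrayList<>(), this::setDirty)` would report modifications no matter which class performs them, which is the point the comment is making.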

2 Answers


You'd have to rebuild the kind of static stack analysis that Java does internally (it is private API, but Java is open source, so you can read it), which is extremely complicated. One upside is that bytecode generation has been changed to be more palatable to this analysis (e.g. RET is no longer generated by javac, mostly for this reason).

You need to map each and every opcode that exists (and there are a lot) to the effect it has on the stack. For example, ALOAD_0 has the effect of popping nothing and pushing 1 object onto the stack. ASM at least helpfully flattens ALOAD_0, ALOAD_1, ALOAD_2, ALOAD_3, and ALOAD with an explicit index into just aload, which helps a bit.

Doing that analysis:

    // stack at 0
    aload 0 // stack at 1
    getfield // pops 1, then pushes value. stack at 1, remember: stack[0] = listaFoo
    new test/Foo // stack at 2
    dup // stack at 3
    ldc "text" (java.lang.String) // stack at 4
    invokespecial test/Foo.<init>(Ljava/lang/String;)V // stack at 2
    ldc "otro text" (java.lang.String) // stack at 3
    invokevirtual test/Foo.setS(Ljava/lang/String;)Ltest/Foo; // stack at 2
    invokeinterface java/util/List.add(Ljava/lang/Object;)Z // stack at 1
    pop

All the flavours of invoke require analysing the method descriptor to know what impact the call has on the stack: -1 for every argument it has, an additional -1 for the implicit receiver for all but invokestatic (and invokedynamic, which has no receiver either), and then +1 if its return type is anything but V.
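
To make that bookkeeping concrete, here is a minimal sketch of deriving an invoke's stack effect from its descriptor. InvokeStackEffect and its method names are made up for illustration and are not part of ASM or the JDK. It counts values rather than slots, so a real analyser would additionally have to account for long and double occupying two slots each.

```java
/** Hypothetical helper: net operand-stack change of an invoke, counted in values. */
final class InvokeStackEffect {

    /** Counts the parameters in a JVM method descriptor such as "(ILjava/lang/String;[J)V". */
    static int argumentCount(String descriptor) {
        int count = 0;
        int i = 1; // skip the opening '('
        while (descriptor.charAt(i) != ')') {
            while (descriptor.charAt(i) == '[') {
                i++; // array dimensions belong to the same argument
            }
            if (descriptor.charAt(i) == 'L') {
                i = descriptor.indexOf(';', i); // an object type ends at ';'
            }
            i++; // step past the primitive letter or the ';'
            count++;
        }
        return count;
    }

    /** Net change in the number of stack values caused by the invocation. */
    static int netEffect(String descriptor, boolean hasReceiver) {
        int popped = argumentCount(descriptor) + (hasReceiver ? 1 : 0);
        int pushed = descriptor.endsWith(")V") ? 0 : 1;
        return pushed - popped;
    }

    public static void main(String[] args) {
        // List.add(Object): pops receiver + 1 argument, pushes the boolean result.
        System.out.println(netEffect("(Ljava/lang/Object;)Z", true)); // -1
        // Foo.<init>(String): pops receiver + 1 argument, pushes nothing.
        System.out.println(netEffect("(Ljava/lang/String;)V", true)); // -2
    }
}
```

These two results match the annotated listing above: the constructor call takes the stack from 4 to 2, and the add call takes it from 2 to 1.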

Here your analyser should be able to register that the last invokeinterface is the one that pops the field you are interested in off the stack as the receiver of an add method, thus qualifying that field for a 'dirty' flag.

I don't think a library exists that is public and maintained in a way that it is intended for analysis like this, but, I haven't looked all that much - perhaps now you know what to look for.

The JVM does this kind of analysis itself when loading a class. For example, if you have this bytecode:

    ALOAD_0
    DUP
    GETFIELD someIntField
    IADD

The verifier will abort with a VerifyError and this entire class will never even be loaded. It works out that the IADD instruction, which pops 2 things off the stack, either [A] adds them if they are both ints and pushes the result back on, or [B] lets all hell break loose if they aren't both int values - and it determines that this is a B situation, because what's on the stack is this and an int, not 2 ints.

How does it do that? By applying the same principle: analyse every instruction and keep track of what it does to the stack. In its case, it registers the type of each thing on the stack (ALOAD_0 - okay, there is an object in stack[0]. DUP - okay, there is also an object in stack[1]. GETFIELD someIntField - okay, there is now an int in stack[1]. IADD - okay, check the types of stack[current] and stack[current-1]. Uh-oh, one of them isn't an int - VerifyError!).
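
A toy version of that type tracking, as a sketch only: TinyTypeChecker is a hypothetical class, it knows just five instructions, and GETFIELD_INT is invented shorthand for "GETFIELD of an int field". The real verifier is far more involved (locals, branches, subtyping).

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical toy checker that tracks only "int" versus "object reference". */
final class TinyTypeChecker {
    enum Kind { INT, REF }

    /** Returns null if the sequence is consistent, otherwise a description of the problem. */
    static String check(String... instructions) {
        Deque<Kind> stack = new ArrayDeque<>();
        for (String insn : instructions) {
            switch (insn) {
                case "ALOAD_0":
                    stack.push(Kind.REF);
                    break;
                case "ICONST_1":
                    stack.push(Kind.INT);
                    break;
                case "DUP":
                    if (stack.isEmpty()) {
                        return "DUP on an empty stack";
                    }
                    stack.push(stack.peek());
                    break;
                case "GETFIELD_INT":
                    if (stack.isEmpty() || stack.pop() != Kind.REF) {
                        return "GETFIELD needs an object reference";
                    }
                    stack.push(Kind.INT);
                    break;
                case "IADD": {
                    if (stack.size() < 2) {
                        return "IADD needs two operands";
                    }
                    Kind right = stack.pop();
                    Kind left = stack.pop();
                    if (left != Kind.INT || right != Kind.INT) {
                        return "IADD needs two ints, found " + left + " and " + right;
                    }
                    stack.push(Kind.INT);
                    break;
                }
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return null;
    }
}
```

Calling `TinyTypeChecker.check("ALOAD_0", "DUP", "GETFIELD_INT", "IADD")` reports the mismatch described above, because at the IADD the stack holds a reference below an int.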

You'd do the same thing, except instead of tracking 'int', 'object', etcetera, you track specifically: "Ah, field this-or-that", and once you hit an invoke instruction (INVOKEVIRTUAL or INVOKEINTERFACE) for a method signature you know 'dirties' its receiver (such as j/u/List.add(...)Z), you can use your stack analysis to know exactly what it is dirtying.
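
As a rough sketch of that idea, the code below labels each stack entry with the field it was loaded from and looks up the receiver when a dirtying invoke is reached. FieldOriginTracker, Insn and Kind are hypothetical names, and the pop/push counts are supplied by hand here; a real implementation would derive them from the opcodes and descriptors and would also have to follow local variables and branches.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stack simulation that tracks which field each stack entry came from. */
final class FieldOriginTracker {
    enum Kind { PLAIN, GETFIELD, DUP, DIRTYING_INVOKE }

    /** Simplified instruction model: the stack effect is given explicitly. */
    static final class Insn {
        final Kind kind;
        final int pops;
        final int pushes;
        final String field; // only set for GETFIELD

        Insn(Kind kind, int pops, int pushes, String field) {
            this.kind = kind;
            this.pops = pops;
            this.pushes = pushes;
            this.field = field;
        }
    }

    static Insn plain(int pops, int pushes) {
        return new Insn(Kind.PLAIN, pops, pushes, null);
    }

    /** Returns the fields that are receivers of a dirtying invoke. */
    static List<String> dirtiedFields(List<Insn> code) {
        List<String> stack = new ArrayList<>(); // last element is the top; null = no field origin
        List<String> dirtied = new ArrayList<>();
        for (Insn insn : code) {
            if (insn.kind == Kind.DUP) {
                stack.add(stack.get(stack.size() - 1)); // the copy keeps the origin label
                continue;
            }
            if (insn.kind == Kind.DIRTYING_INVOKE) {
                // pops = receiver + arguments, so the receiver is the deepest popped entry.
                String receiverOrigin = stack.get(stack.size() - insn.pops);
                if (receiverOrigin != null) {
                    dirtied.add(receiverOrigin);
                }
            }
            for (int i = 0; i < insn.pops; i++) {
                stack.remove(stack.size() - 1);
            }
            String pushedOrigin = insn.kind == Kind.GETFIELD ? insn.field : null;
            for (int i = 0; i < insn.pushes; i++) {
                stack.add(pushedOrigin);
            }
        }
        return dirtied;
    }

    /** The instruction sequence from the question, with stack effects filled in by hand. */
    static List<Insn> questionExample() {
        return List.of(
                plain(0, 1),                                   // aload 0
                new Insn(Kind.GETFIELD, 1, 1, "listaFoo"),     // getfield listaFoo
                plain(0, 1),                                   // new test/Foo
                new Insn(Kind.DUP, 0, 1, null),                // dup
                plain(0, 1),                                   // ldc "text"
                plain(2, 0),                                   // invokespecial Foo.<init>
                plain(0, 1),                                   // ldc "otro text"
                plain(2, 1),                                   // invokevirtual Foo.setS
                new Insn(Kind.DIRTYING_INVOKE, 2, 1, null),    // invokeinterface List.add
                plain(1, 0));                                  // pop
    }

    public static void main(String[] args) {
        System.out.println(dirtiedFields(questionExample())); // [listaFoo]
    }
}
```

However long the argument-building code between the getfield and the invoke gets, the label stays attached to the same stack entry, which is what lets the two instructions be paired.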


2 Comments

After a lot of reading, I landed on AnalyzerAdapter. Now I'm looking at how to use it correctly. It seems to have a stack simulation that could do what I want.
@MarceloD.Ré if you want to go that route, you may have a look at "Java ASM Bytecode - Find all instructions belonging to a specific method-call". But as said in the comment at your question, I’d consider taking a different, simpler route…

And finally I got it! As I said, one step at a time.

The trick was to use the AnalyzerAdapter class and keep track of the field name associated with each stack position, so that at any time you know which field a stack entry refers to.

Here is the main code:

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.logging.Level;
    import java.util.logging.Logger;

    import org.objectweb.asm.Handle;
    import org.objectweb.asm.Label;
    import org.objectweb.asm.MethodVisitor;
    import org.objectweb.asm.Opcodes;
    import org.objectweb.asm.commons.AnalyzerAdapter;
    import org.objectweb.asm.util.Printer;

    /**
     *
     * @author Marcelo D. Ré {@literal <[email protected]>}
     */
    public class WriteAccessActivatorAdapter extends AnalyzerAdapter
            implements ITransparentDirtyDetectorDef, IJavaCollections {

        private final static Logger LOGGER = Logger.getLogger(WriteAccessActivatorAdapter.class.getName());
        private boolean activate = false;
        private String owner;
        private List<String> ignoreFields;
        private List<String> collectionFields;
        HashSet<String> lastCollectionModifiedFields = new HashSet<>();

        // maps a stack position to the name of the associated field
        private Map<String, String> stackToField = new HashMap<>();

        static {
            if (LOGGER.getLevel() == null) {
                LOGGER.setLevel(LogginProperties.WriteAccessActivatorAdapter);
            }
        }

        public WriteAccessActivatorAdapter(int api, String owner, int access, String name, String descriptor,
                MethodVisitor methodVisitor, List<String> ignoreFields, List<String> collectionFields) {
            super(api, owner, access, name, descriptor, methodVisitor);
            this.ignoreFields = ignoreFields;
            this.collectionFields = collectionFields;
            this.owner = owner;
        }

        /**
         * Add a call to setDirty in every method that has a PUTFIELD in its code.
         * @param opcode opcode to analyze
         */
        @Override
        public synchronized void visitInsn(int opcode) {
            LOGGER.log(Level.FINEST, "Activate: {0} - opcode: {1} ",
                    new Object[]{this.activate, Printer.OPCODES[opcode]});
            // analyze the lists
            if ((this.activate) && ((opcode >= Opcodes.IRETURN && opcode <= Opcodes.RETURN)
                    || opcode == Opcodes.ATHROW)) {
                // if collections were modified, mark them dirty before returning.
                if (!lastCollectionModifiedFields.isEmpty()) {
                    insertDirtyCollectionsFields();
                    lastCollectionModifiedFields.clear();
                }

                LOGGER.log(Level.FINEST, "Adding call to setDirty...");
                mv.visitVarInsn(Opcodes.ALOAD, 0);
                // mv.visitInsn(Opcodes.ICONST_1);
                mv.visitMethodInsn(Opcodes.INVOKEVIRTUAL, owner, SETDIRTY, "()V", false);
                //mv.visitFieldInsn(Opcodes.PUTFIELD, owner, "__ogm__dirtyMark", "Z");
            }
            super.visitInsn(opcode);
            LOGGER.log(Level.FINEST, "end --------------------------------------------------");
        }

        @Override
        public synchronized void visitFieldInsn(int opcode, String owner, String name, String desc) {
            super.visitFieldInsn(opcode, owner, name, desc);
            LOGGER.log(Level.FINEST, "opcode: {0} - owner: {1} - name: {2} - desc: {3} - transient: {4}",
                    new Object[]{Printer.OPCODES[opcode], owner, name, desc, ignoreFields.contains(name)});
            printStack();
            if ((opcode == Opcodes.GETFIELD) || (opcode == Opcodes.GETSTATIC)) {
                // a field is being read: remember its name for later reference.
                this.stackToField.put("" + (this.stack == null ? 0 : (this.stack.size() - 1)), name);
                // this.owner = owner;
            }
            if ((opcode == Opcodes.PUTFIELD) && (!ignoreFields.contains(name))) {
                LOGGER.log(Level.FINEST, "Modification detected!! Adding field \"{0}\" to the list.",
                        new Object[]{name});
                this.activate = true;
                // this.owner = owner;
                printStack();
                insertDirtyField(name);
            }
            LOGGER.log(Level.FINEST, "end --------------------------------------------------");
        }

        @Override
        public void visitMethodInsn(int opcode, String owner, String name, String descriptor, boolean isInterface) {
            LOGGER.log(Level.FINEST, "opcode: {0} - owner: {1} - name: {2} - desc: {3} - isInterface: {4}",
                    new Object[]{Printer.OPCODES[opcode], owner, name, descriptor, isInterface});
            printStack();
            // if the method matches one of the monitored classes and methods, check the stack
            // to verify that the receiver is a field.
            LOGGER.log(Level.FINEST, "activable object?: " + getJavaCollections().contains("L" + owner + ";")
                    + " - method: " + name + "> activable? : " + getJavaCollectionsDirtyMethods().contains(name));
            if ((getJavaCollections().contains("L" + owner + ";"))
                    && (getJavaCollectionsDirtyMethods().contains(name))) {
                // compute the stack position to look at.
                // Note: splitting on ';' counts object-typed arguments only, so descriptors that mix
                // primitive and object arguments, or "()X" descriptors that return a value, are miscounted.
                int stackOffset = descriptor.equals("()V") ? 0
                        : descriptor.substring(1, descriptor.indexOf(")")).split(";").length;
                int stackIdx = this.stack == null ? 0 : this.stack.size() - 1 - stackOffset;

                String field = this.stackToField.get("" + stackIdx);
                LOGGER.log(Level.FINEST, "collection modification detected! stack idx: " + stackIdx
                        + " field: " + field);
                if (this.collectionFields.contains(field)) {
                    lastCollectionModifiedFields.add(field);
                    this.activate = true;
                    // this.owner = owner;
                }
            }
            super.visitMethodInsn(opcode, owner, name, descriptor, isInterface);
        }

        @Override
        public void visitInvokeDynamicInsn(String name, String descriptor, Handle bootstrapMethodHandle,
                Object... bootstrapMethodArguments) {
            super.visitInvokeDynamicInsn(name, descriptor, bootstrapMethodHandle, bootstrapMethodArguments);
            LOGGER.log(Level.FINEST, "\n\n\n\n\nname: " + name + " - desc: " + descriptor + " bs: "
                    + Arrays.toString(bootstrapMethodArguments));
            printStack();
            for (Object bsMthArg : bootstrapMethodArguments) {
                String bsMth = bsMthArg.toString();
                int dot = bsMth.indexOf('.');
                int bracket = bsMth.indexOf("(");
                if (dot > 0 && bracket > 0) {
                    String cls = "L" + bsMth.substring(0, dot) + ";";
                    String mth = bsMth.substring(dot + 1, bracket);
                    LOGGER.log(Level.FINEST, "cls: " + cls + " - method: " + mth);
                    if (getJavaCollections().contains(cls) && getJavaCollectionsDirtyMethods().contains(mth)) {
                        int stackIdx = this.stack.size() - 1;

                        String field = this.stackToField.get("" + stackIdx);
                        LOGGER.log(Level.FINEST, "collection modification detected! stack idx: " + stackIdx
                                + " field: " + field);
                        if (this.collectionFields.contains(field)) {
                            lastCollectionModifiedFields.add(field);
                            this.activate = true;
                        }
                    }
                }
            }
            LOGGER.log(Level.FINEST, "\n\n\n\n\n");
        }

        @Override
        public void visitLabel(Label label) {
            LOGGER.log(Level.FINEST, "Label: " + label);
            if (!lastCollectionModifiedFields.isEmpty()) {
                // if a modified collection field was recorded, instrument the add of the field
                LOGGER.log(Level.FINEST, "Modifications detected!! Adding the fields to the list.");
                printStack();
                insertDirtyCollectionsFields();
                // reset the set
                lastCollectionModifiedFields.clear();
                LOGGER.log(Level.FINEST, " --------------------------------------------------");
            }
            super.visitLabel(label);
        }

        @Override
        public void visitJumpInsn(int opcode, Label label) {
            if (this.activate && opcode == Opcodes.GOTO) {
                // if collections were modified, mark them dirty before jumping.
                if (!lastCollectionModifiedFields.isEmpty()) {
                    insertDirtyCollectionsFields();
                    lastCollectionModifiedFields.clear();
                }
            }
            super.visitJumpInsn(opcode, label);
        }

        @Override
        public void visitEnd() {
            LOGGER.log(Level.FINEST, "end MethodVisitor -------------------------------------");
            // mv.visitMaxs(0, 0);
            super.visitEnd();
        }

        private void printStack() {
            if (LOGGER.isLoggable(Level.FINEST)) {
                if (this.stack != null) {
                    System.out.println("stack size:" + this.stack.size());
                    for (int i = 0; i < this.stack.size(); i++) {
                        Object o = this.stack.get(i);
                        System.out.println("" + o.getClass() + " : " + o + " --> " + this.stackToField.get("" + i));
                    }
                    System.out.println("--------------");
                } else {
                    System.out.println("stack size: NULL <<<<<<<<<<<<<<<<<<<<<<< ");
                }
            }
        }

        /**
         * Insert all fields registered in the lastCollectionModifiedFields hashset.
         */
        private void insertDirtyCollectionsFields() {
            for (String lastCollectionModifiedField : lastCollectionModifiedFields) {
                mv.visitVarInsn(Opcodes.ALOAD, 0);
                mv.visitFieldInsn(Opcodes.GETFIELD, owner, MODIFIEDFIELDS, "Ljava/util/Set;");
                mv.visitLdcInsn(lastCollectionModifiedField);
                mv.visitMethodInsn(Opcodes.INVOKEINTERFACE, "java/util/Set", "add", "(Ljava/lang/Object;)Z", true);
                mv.visitInsn(Opcodes.POP); // discard the boolean result of add
            }
        }

        private void insertDirtyField(String name) {
            mv.visitVarInsn(Opcodes.ALOAD, 0);
            mv.visitFieldInsn(Opcodes.GETFIELD, owner, MODIFIEDFIELDS, "Ljava/util/Set;");
            mv.visitLdcInsn(name);
            mv.visitMethodInsn(Opcodes.INVOKEINTERFACE, "java/util/Set", "add", "(Ljava/lang/Object;)Z", true);
            mv.visitInsn(Opcodes.POP); // discard the boolean result of add
        }
    }

Of course, there are a lot of situations where it will not work. You must respect the "Tell, don't ask" rule and the "Law of Demeter"!

The full code is on GitHub.
