
I'm trying to code a program where I can:

  1. Load a file
  2. Input start and end offset addresses that bound the range to scan
  3. Scan that offset range for a specific sequence of bytes (such as "05805A6C")
  4. Retrieve the offset of every match and write them to a .txt file

i66.tinypic.com/2zelef5.png

As the picture shows I need to search the file for "05805A6C" and then print to a .txt file the offset "0x21F0".

I'm using Java Swing for this. So far I've been able to load the file into a byte[] array, but I haven't found a way to search for a specific sequence of bytes, or to restrict that search to a range of offsets.

This is my code that opens the file and reads it into a byte[] array:

    import java.io.IOException;
    import java.nio.file.Files;
    import javax.swing.JFileChooser;

    public class Read {
        // Lets the user pick a file and returns its contents, or null on cancel/error.
        static public byte[] readBytesFromFile() {
            try {
                JFileChooser chooser = new JFileChooser();
                int returnVal = chooser.showOpenDialog(null);
                if (returnVal == JFileChooser.APPROVE_OPTION) {
                    // readAllBytes reads the whole file and closes it, unlike
                    // available() + read(), which may return fewer bytes.
                    return Files.readAllBytes(chooser.getSelectedFile().toPath());
                }
                return null;
            } catch (IOException e) {
                System.out.println("Unable to read bytes: " + e.getMessage());
                return null;
            }
        }
    }

And this is my code where I try to search among the bytes:

    byte[] model = Read.readBytesFromFile();
    String x = new String(model);
    boolean found = false;
    for (int i = 0; i < model.length; i++) {
        if (x.contains("05805A6C")) {
            found = true;
        }
    }
    if (found == true) {
        System.out.println("Yes");
    } else {
        System.out.println("No");
    }
  • "pmdl" is a String (not a byte[]). Commented Feb 7, 2016 at 2:18
  • "05805A6C" is a String, not a sequence of bytes Commented Feb 7, 2016 at 2:28
  • Right, but how should I go to search for that sequence? Commented Feb 7, 2016 at 2:34

2 Answers


Here's a bomb-proof [1] way to search for a sequence of bytes in a byte array:

    public boolean find(byte[] buffer, byte[] key) {
        for (int i = 0; i <= buffer.length - key.length; i++) {
            int j = 0;
            while (j < key.length && buffer[i + j] == key[j]) {
                j++;
            }
            if (j == key.length) {
                return true;
            }
        }
        return false;
    }
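A variation on the method above that collects the offset of every match and limits the scan to a range of offsets, as the question asks, might look like the following sketch. The names `ByteSearch`, `findAll`, and `hexToBytes` are made up for illustration, and the sketch assumes that `end` is exclusive and that a match must lie entirely inside the range.

```java
import java.util.ArrayList;
import java.util.List;

public class ByteSearch {
    /** Converts a hex string such as "05805A6C" into the bytes it spells out. */
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Returns the offset of every occurrence of key lying entirely within [start, end). */
    static List<Integer> findAll(byte[] buffer, byte[] key, int start, int end) {
        List<Integer> offsets = new ArrayList<>();
        if (key.length == 0) {
            return offsets; // assumption: an empty key matches nothing
        }
        int from = Math.max(0, start);
        int to = Math.min(end, buffer.length);
        for (int i = from; i <= to - key.length; i++) {
            int j = 0;
            while (j < key.length && buffer[i + j] == key[j]) {
                j++;
            }
            if (j == key.length) {
                offsets.add(i); // record the match and keep scanning
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Small in-memory stand-in for the bytes read from the file.
        byte[] data = {0x00, 0x05, (byte) 0x80, 0x5A, 0x6C, 0x05, (byte) 0x80, 0x5A, 0x6C};
        byte[] key = hexToBytes("05805A6C");
        for (int offset : findAll(data, key, 0, data.length)) {
            System.out.println("0x" + Integer.toHexString(offset).toUpperCase());
        }
    }
}
```

Searching for several different sequences is then a matter of calling `findAll` once per key on the same buffer.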

There are more efficient ways to do this for large-scale searching; e.g. using the Boyer-Moore algorithm. However:

  • converting the byte array to a String and using Java string search is NOT more efficient, and it is potentially fragile depending on what encoding you use when converting the bytes to a string.

  • converting the byte array to a hexadecimal encoded String is even less efficient ... and memory hungry ... though not fragile if you have enough memory. (You may need up to 5 times as much memory as the file size while doing the conversion ...)
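The fragility of the byte-to-String route can be seen with a small round trip. This is only a sketch; `EncodingDemo` and `roundTrip` are hypothetical names. It decodes the four bytes from the question with a charset and encodes them back, then checks whether the original bytes survived.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingDemo {
    /** Decodes the bytes with the given charset, then encodes the String back. */
    static byte[] roundTrip(byte[] bytes, Charset charset) {
        return new String(bytes, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        byte[] original = {0x05, (byte) 0x80, 0x5A, 0x6C};

        // 0x80 on its own is not valid UTF-8, so decoding replaces it and
        // the round trip does not reproduce the original bytes.
        System.out.println(Arrays.equals(original, roundTrip(original, StandardCharsets.UTF_8)));      // false

        // ISO-8859-1 maps every byte value to exactly one char, so nothing is lost.
        System.out.println(Arrays.equals(original, roundTrip(original, StandardCharsets.ISO_8859_1))); // true
    }
}
```

Note also that `new String(bytes)` without a charset argument uses the platform default, which is why results can differ between machines.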


1 - bomb-proof, modulo any bugs :-)
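As a rough illustration of the more efficient family of algorithms mentioned above, here is a sketch of Boyer-Moore-Horspool, a simplified relative of Boyer-Moore (it is not the full algorithm). The class and method names are invented for this example. It uses a 256-entry table so the search window can skip ahead by more than one byte after a mismatch.

```java
import java.util.Arrays;

public class HorspoolSearch {
    /** Returns the index of the first occurrence of key in buffer, or -1 if there is none. */
    static int indexOf(byte[] buffer, byte[] key) {
        int m = key.length;
        if (m == 0) {
            return 0; // assumption: an empty key matches at offset 0
        }
        // shift[b] is how far the window may move when its last byte is b.
        int[] shift = new int[256];
        Arrays.fill(shift, m);
        for (int i = 0; i < m - 1; i++) {
            shift[key[i] & 0xFF] = m - 1 - i;
        }
        int pos = 0;
        while (pos <= buffer.length - m) {
            // Compare the window against the key from right to left.
            int j = m - 1;
            while (j >= 0 && buffer[pos + j] == key[j]) {
                j--;
            }
            if (j < 0) {
                return pos;
            }
            pos += shift[buffer[pos + m - 1] & 0xFF];
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x05, (byte) 0x80, 0x5A, 0x6C};
        byte[] key = {0x05, (byte) 0x80, 0x5A, 0x6C};
        System.out.println(indexOf(data, key));
    }
}
```

For a four-byte key on a file of modest size, the simple loop is likely fast enough; the table-driven version tends to pay off with longer keys and larger inputs.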


6 Comments

How would I use this? Is this a class?
1) You call it from your code, passing in the buffer and the key as parameters. 2) No. It is a method. You can put it into a class. Or you copy the code / method body into something else, refactoring as required.
I don't understand how to use this. What are buffer and key supposed to be? I only have one byte[] array, which is the data from the file (model).
Buffer is the data you got from the file. Key is a byte array containing the byte sequence that you are searching for.
Yeah, I got that method working, and yes, it does the search. However, how can I get the offset address of every match? Also, how can I run many searches with different byte sequences from top to bottom?

EDIT: The default charset seems to differ from system to system, so you may get different results. I therefore took another approach:

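    // Assumes HexBin.encode emits uppercase hex, so the pattern is uppercase too.
    String x = HexBin.encode(model);
    String b = "05805A6C"; // the pattern from the question
    int index = 0;
    while ((index = x.indexOf(b, index)) != -1) {
        // Two hex digits per byte: an odd index would straddle a byte boundary.
        if (index % 2 == 0) {
            System.out.println("0x" + Integer.toHexString(index / 2));
        }
        index++;
    }
    ...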
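The loop above prints each offset to the console. To write the offsets to a .txt file instead, as the question asks, something along these lines could work. This is a sketch under assumptions: `OffsetWriter`, `format`, `writeOffsets`, and the file name `offsets.txt` are all made up for illustration, and the offsets are assumed to have been collected into a list first.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OffsetWriter {
    /** Formats an offset the way a hex editor displays it, e.g. 8688 becomes "0x21F0". */
    static String format(int offset) {
        return "0x" + Integer.toHexString(offset).toUpperCase();
    }

    /** Writes one formatted offset per line, replacing the file if it already exists. */
    static void writeOffsets(List<Integer> offsets, Path target) throws IOException {
        List<String> lines = new ArrayList<>();
        for (int offset : offsets) {
            lines.add(format(offset));
        }
        Files.write(target, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in offsets; in the real program these come from the search.
        List<Integer> offsets = Arrays.asList(0x21F0, 0x3000);
        writeOffsets(offsets, Paths.get("offsets.txt"));
    }
}
```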

15 Comments

I get an error of "possible lossy conversion from int to byte"
@Midori_hige check my edit, I tested it with eclipse and worked fine
It does work, surprisingly. Now, what about getting the offset of each match?
@Midori_hige: my edit now prints the indices at which the sequence occurred
Bad idea. Unless you get the encoding right, converting a byte array to a string like that is lossy.
