2

I have been given a byte array [97, 98, 0, 99, 100] which is GSM 7-bit encoded. It should decode to ab@cd. When I appended each byte of this array to a StringBuilder, the @ sign was not converted.

Here is my code:

byte[] byteFinal = {97, 98, 0, 99, 100};
char ch;
StringBuilder str = new StringBuilder();
for (byte b : byteFinal) {
    ch = (char) b;
    System.out.println("ch:" + ch);
    str.append(ch);
}
System.out.println(str.toString());
10
  • 5
    Pretty sure 0 is the char code for NUL, the non-printable string terminator, not the @ character. Commented Dec 17, 2018 at 6:50
  • Exactly, but how then can I convert this kind of byte array into coherent text? Commented Dec 17, 2018 at 6:53
  • Do you want to convert 0 to @? If so, use an if statement. Commented Dec 17, 2018 at 6:55
  • I see no attempt in your code at converting '\0' to '@'. Commented Dec 17, 2018 at 7:02
  • @Dummy a bit off topic: I don't think NUL is a string terminator in Java. Commented Dec 17, 2018 at 7:03

6 Answers

7

Based on your comments in other answers, the problem is caused by missing handling of GSM 7-bit encoding.

You can treat GSM 7-bit as a separate character encoding, so you shouldn't take a byte array in that encoding as-is and cast each byte to char. Casting byte to char only works if your bytes are in ASCII or an ASCII-compatible encoding such as UTF-8, and every character is below code point 128.
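To make the difference concrete, here is a minimal sketch contrasting a plain cast with a table lookup for the value 0. The class name CastVersusLookup, its method names, and the three-entry table are illustrative only and come from no library; the three entries are the first code points of the GSM 7-bit default alphabet ('@', '£', '$').

```java
final class CastVersusLookup {
    // Hypothetical excerpt: the first three entries of the GSM 7-bit default alphabet.
    static final char[] FIRST_GSM7_CHARS = {0x0040, 0x00A3, 0x0024};

    // Plain cast: the byte value is reinterpreted as a UTF-16 code unit.
    static char byCast(byte b) {
        return (char) b;
    }

    // Table lookup: the byte value is an index into the GSM alphabet.
    static char byLookup(byte b) {
        return FIRST_GSM7_CHARS[b];
    }

    public static void main(String[] args) {
        byte b = 0;
        System.out.println((int) byCast(b)); // prints 0, i.e. the NUL character
        System.out.println(byLookup(b));     // prints @
    }
}
```

The cast is only correct when the byte value happens to equal the Unicode code point, which is not the case for GSM value 0.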

It seems Java does not provide a built-in Charset for GSM 7-bit (else, you could have done something like String result = new String(byteFinal, GSM_7_BIT_CHARSET);).
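If you want to check this on your own JVM, a small probe along the following lines may help. It is only a sketch: the class and method names are made up for illustration, and the result depends on the JDK and on any charset providers on the class path. Charset.availableCharsets() lists canonical names only, so a charset registered solely under an alias would not show up here.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class CharsetProbe {
    // Returns the canonical names of all installed charsets whose name
    // contains the given fragment (case-insensitive).
    static List<String> charsetsContaining(String fragment) {
        String needle = fragment.toUpperCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String name : Charset.availableCharsets().keySet()) {
            if (name.toUpperCase(Locale.ROOT).contains(needle)) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // An empty list means no installed charset has "GSM" in its canonical name.
        System.out.println(charsetsContaining("GSM"));
    }
}
```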

You need to handcraft the logic, along the lines of the code at https://mnujali.wordpress.com/2011/12/01/gsm-7-bit-encodingdecoding-used-for-sms-and-ussd-strings-java-code/:

// GSM 7-bit default alphabet: index = septet value, element = Unicode code point.
static final char[] GSM7CHARS = {
    0x0040, 0x00A3, 0x0024, 0x00A5, 0x00E8, 0x00E9, 0x00F9, 0x00EC,
    0x00F2, 0x00E7, 0x000A, 0x00D8, 0x00F8, 0x000D, 0x00C5, 0x00E5,
    0x0394, 0x005F, 0x03A6, 0x0393, 0x039B, 0x03A9, 0x03A0, 0x03A8,
    0x03A3, 0x0398, 0x039E, 0x00A0, 0x00C6, 0x00E6, 0x00DF, 0x00C9,
    0x0020, 0x0021, 0x0022, 0x0023, 0x00A4, 0x0025, 0x0026, 0x0027,
    0x0028, 0x0029, 0x002A, 0x002B, 0x002C, 0x002D, 0x002E, 0x002F,
    0x0030, 0x0031, 0x0032, 0x0033, 0x0034, 0x0035, 0x0036, 0x0037,
    0x0038, 0x0039, 0x003A, 0x003B, 0x003C, 0x003D, 0x003E, 0x003F,
    0x00A1, 0x0041, 0x0042, 0x0043, 0x0044, 0x0045, 0x0046, 0x0047,
    0x0048, 0x0049, 0x004A, 0x004B, 0x004C, 0x004D, 0x004E, 0x004F,
    0x0050, 0x0051, 0x0052, 0x0053, 0x0054, 0x0055, 0x0056, 0x0057,
    0x0058, 0x0059, 0x005A, 0x00C4, 0x00D6, 0x00D1, 0x00DC, 0x00A7,
    0x00BF, 0x0061, 0x0062, 0x0063, 0x0064, 0x0065, 0x0066, 0x0067,
    0x0068, 0x0069, 0x006A, 0x006B, 0x006C, 0x006D, 0x006E, 0x006F,
    0x0070, 0x0071, 0x0072, 0x0073, 0x0074, 0x0075, 0x0076, 0x0077,
    0x0078, 0x0079, 0x007A, 0x00E4, 0x00F6, 0x00F1, 0x00FC, 0x00E0};

// Characters reached through the escape septet (27); 0x0000 means "no mapping".
static final char[] ESCAPE = {
    0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,  //   0-7
    0x0000, 0x0000, '\n'  , 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,  //   8-15
    0x0000, 0x0000, 0x0000, 0x0000, '^'   , 0x0000, 0x0000, 0x0000,  //  16-23
    0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,  //  24-31
    0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,  //  32-39
    '{'   , '}'   , 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, '\\'  ,  //  40-47
    0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,  //  48-55
    0x0000, 0x0000, 0x0000, 0x0000, '['   , '~'   , ']'   , 0x0000,  //  56-63
    '|'   , 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,  //  64-71
    0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,  //  72-79
    0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,  //  80-87
    0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,  //  88-95
    0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x20AC, 0x0000, 0x0000,  //  96-103
    0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,  // 104-111
    0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,  // 112-119
    0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000}; // 120-127
// or use -1 instead of 0x0000, depending on your preference

//...

byte[] byteFinal = {97, 98, 0, 99, 100};
StringBuilder sb = new StringBuilder();
boolean escape = false;
for (byte b : byteFinal) {
    if (b >= 0) { // skip values outside the 7-bit range
        if (escape) {
            // Use the extension table; fall back to the basic table if unmapped.
            sb.append(ESCAPE[b] > 0 ? ESCAPE[b] : GSM7CHARS[b]);
            escape = false;
        } else if (b == 27) { // escape: the next septet selects from ESCAPE
            escape = true;
        } else {
            sb.append(GSM7CHARS[b]);
        }
    }
}
System.out.println(sb.toString());

Update 1:

With some searching, it seems GSM 7-bit encoding is a bit more complicated than what is implemented above (e.g. escaping); see https://www.developershome.com/sms/gsmAlphabet.asp

However, this at least gives you an idea of why you need a handcrafted lookup instead of just casting each byte to char.
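One of those complications is septet packing: the asker mentions in the comments that the payload arrives Base64-encoded and GSM 7-bit packed, and that a lone @ arrives as AA==. The following is a rough sketch of the usual unpacking step, assuming the plain SMS layout where septets are stored back to back, least significant bit first, with no user data header or fill bits. SeptetUnpacker and unpack are hypothetical names, not part of any library.

```java
import java.util.Base64;

final class SeptetUnpacker {
    // Extracts septetCount 7-bit values from packed octets (LSB-first packing).
    static byte[] unpack(byte[] packed, int septetCount) {
        if (septetCount < 0 || (long) septetCount * 7 > (long) packed.length * 8) {
            throw new IllegalArgumentException("not enough packed data");
        }
        byte[] septets = new byte[septetCount];
        for (int i = 0; i < septetCount; i++) {
            int bitPos = i * 7;
            int byteIdx = bitPos / 8;
            int shift = bitPos % 8;
            // Low part of the septet comes from the current octet.
            int value = (packed[byteIdx] & 0xFF) >> shift;
            // If the septet straddles two octets, take the rest from the next one.
            if (shift > 1 && byteIdx + 1 < packed.length) {
                value |= (packed[byteIdx + 1] & 0xFF) << (8 - shift);
            }
            septets[i] = (byte) (value & 0x7F);
        }
        return septets;
    }

    public static void main(String[] args) {
        byte[] packed = Base64.getDecoder().decode("AA=="); // a single 0x00 octet
        byte[] septets = unpack(packed, 1);
        System.out.println(septets[0]); // prints 0, the GSM index of '@'
    }
}
```

The unpacked values are then the indices to feed into a lookup table such as GSM7CHARS.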


Update 2:

It seems someone has implemented a charset for GSM 7-bit: https://github.com/OpenSmpp/opensmpp/blob/master/charset/src/main/java/org/smpp/charset/Gsm7BitCharset.java

By using it, you can simply do something like String result = new String(byteFinal, GSM_7_BIT_CHARSET); without struggling with all the internals of GSM 7-bit.


Update 3:

Thanks to a comment from @user432024, there seems to be yet another charset library that contains GSM 7-bit: https://github.com/brake/telecom-charsets


12 Comments

This is the most efficient answer. I looked at the link you added and saw that there are also many -1s that you omitted?
Your need is slightly different. Anyway, given that GSM 7-bit is going to give you bytes in the range [0, 127], the above logic should work. Those -1s seem to serve only as markers for invalid values.
Your example above works perfectly in a real-world app (so far).
Fortunately, right now I only need to unpack the received SMS and convert the received byte array to a 7-bit byte array. I have already done that part, so the last piece of the puzzle was converting the 7-bit byte array to text, and static final char[] GSM7CHARS solves this issue.
There is also this lib: github.com/brake/telecom-charsets
3

Change array to:

byte[] byteFinal = {97, 98, 64, 99, 100};

The ASCII code of '@' is 64. Incidentally, the caret notation for the NUL character (ASCII code 0) is ^@, which seems to have confused you here.
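For reference, caret notation is obtained by flipping bit 6 of the control character's code (XOR with 0x40), which is why code 0 is shown as ^@ (0 XOR 64 = 64 = '@'). A tiny sketch, with a made-up class and method name:

```java
final class CaretNotation {
    // Returns the caret notation for an ASCII control character (0-31 or 127).
    static String caret(int code) {
        if (code < 0 || (code > 31 && code != 127)) {
            throw new IllegalArgumentException("not an ASCII control character: " + code);
        }
        return "^" + (char) (code ^ 0x40);
    }

    public static void main(String[] args) {
        System.out.println(caret(0));  // prints ^@
        System.out.println(caret(10)); // prints ^J
    }
}
```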

5 Comments

Good answer mentioning caret notation. I was initially clueless why OP could confuse @ with NUL :P
OK, for the full picture of the issue: I'm actually receiving Base64 data that was originally GSM 7-bit packed, and I need to unpack it into readable text. Now, if I isolate only the @ sign, its Base64 representation is AA==, whose hex value is 00 (double zero), and the byte representation of hex 00 is 0. Thus I'm receiving this annoying zero...
@HaimKlainman That's strange: a single @ should be QA== in Base64. I think there is something wrong with your data.
@Adrian Shum It is true, but to decode Base64 properly you need to know how the original message was encoded. In this case the original message was encoded to 7-bit hex and then to Base64, so the hex value decoded back from Base64 is 00, which is [0] in bytes -> which is @ in italics
Oh! I think I got what you mean by GSM 7-bit. You can treat that as a totally different character encoding scheme. It is just like converting an EBCDIC byte array to a string as-is and then complaining that Java gives you the wrong result. The question actually has nothing to do with Base64.
1

You are using ASCII values of characters in your byte array.

Here, 64 is the ASCII value of the '@' character that you are after.

Hence your array should be:

byte[] byteFinal = {97, 98, 64, 99, 100};
                            ^^

Looking at the wiki, the ASCII value 0 corresponds to the null character.

Also, to create the String, you could construct it directly as below instead of using a StringBuilder:

System.out.println(new String(byteFinal)); 

So all you need is two lines of code like:

byte[] byteFinal = {97, 98, 64, 99, 100};
System.out.println(new String(byteFinal));
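Note that new String(byte[]) decodes with the platform default charset. If the bytes really are ASCII, a variant that names the charset explicitly behaves the same on every machine. The class and method names below are only for illustration:

```java
import java.nio.charset.StandardCharsets;

final class ExplicitCharset {
    // Decodes the bytes as US-ASCII regardless of the platform default charset.
    static String decodeAscii(byte[] bytes) {
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] byteFinal = {97, 98, 64, 99, 100};
        System.out.println(decodeAscii(byteFinal)); // prints ab@cd
    }
}
```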


0

The ASCII value of @ is 64; see Wikipedia.

Rest of your code is fine!

byte[] byteFinal = {97, 98, 64, 99, 100};
char ch;
StringBuilder str = new StringBuilder();
for (byte b : byteFinal) {
    ch = (char) b;
    System.out.println("ch:" + ch);
    str.append(ch);
}
System.out.println(str.toString());


0

You can also install the charset in the lib and use getBytes("SCGSM")


0

There is the library jCharset. When the library is on the class path it will be automatically added to the available charsets.

import java.io.UnsupportedEncodingException;

class Scratch {
    public static void main(String[] args) throws UnsupportedEncodingException {
        byte[] encoded = "something".getBytes("GSM7");
        System.out.println(new String(new byte[]{97, 98, 0, 99, 100}, "GSM7"));
    }
}

ab@cd

Here are the Maven coordinates.
