0

I'm a Java dev trying to find my way into C#, and I'm confused. I've read about the string type and that it is immutable, which is not much different from Java except that it doesn't seem to be an object like it is there, but I'm getting weird behavior regardless. I have the following ToString method on a class:

    public override string ToString()
    {
        StringBuilder builder = new StringBuilder();
        builder.Append("BlockType: ");
        builder.Append(BlockType + "\n");
        //builder.Append(System.Text.ASCIIEncoding.ASCII.GetChars(Convert.FromBase64String("dHh0AA==")));
        //builder.Append("\n");
        builder.Append("BlockName: ");
        builder.Append(BlockName + "\n");
        //builder.Append(System.Text.ASCIIEncoding.ASCII.GetChars(Convert.FromBase64String(this.BlockName)));
        //builder.Append("\n");
        builder.Append("BlockLength: " + this.BlockLength + "\n");
        builder.Append("pBlockData: " + this.pBlockData + "\n");
        return builder.ToString();
    }

When I fill it with data (keeping in mind that BlockType and BlockName each contain a Base64 string), I get the following result:

    FileVersionNo: 0
    nx: 1024
    ny: 512
    TileSize: 256
    HorizScale: 10
    Precis: 0,01
    ExtHeaderLength: 35
    nExtHeaderBlocks: 1
    pExtHeaderBlocks: System.Collections.Generic.LinkedList`1[LibFhz.HfzExtHeaderBlock]
    BlockType: dHh0AA==
    BlockName: YXBwLW5hbWUAAAAAAAAAAA==
    BlockLength: 11
    pBlockData: System.Byte[]

That is exactly what I want. However, when I try to get the ASCII (or UTF-8, I tried both) text of those Base64 values, I get the following result:

    FileVersionNo: 0
    nx: 1024
    ny: 512
    TileSize: 256
    HorizScale: 10
    Precis: 0,01
    ExtHeaderLength: 35
    nExtHeaderBlocks: 1
    pExtHeaderBlocks: System.Collections.Generic.LinkedList`1[LibFhz.HfzExtHeaderBlock]
    BlockType: txt

The code just seems to stop, without an error or stack trace. I have no idea what is going on. At first I thought a \0 was missing, so I added one to the string; then I thought I needed a \r\n. Neither was the solution. Googling only turned up people wanting to know how to do a Base64 to UTF-8 conversion, but that part seems easy. This silent stop isn't.

Any insights or links to decent articles about string handling in .NET would be appreciated.

3 Comments
  • Convert.FromBase64String() is likely returning binary zero values, which are getting converted to ASCII NUL characters. I imagine that might mess up the output. What output are you expecting? Commented Mar 14, 2014 at 14:32
  • A Base64 string generally won't have a meaningful ASCII value. If it did, there would be no reason to encode it in the first place, so this seems very odd. Commented Mar 14, 2014 at 14:37
  • I'm expecting "BlockType: txt" and "BlockName: app-name", which is the readable text value of the Base64 content. Your binary zero values suggestion might be the culprit, though. Small note: the variables were filled with string blockName = Convert.ToBase64String(reader.ReadBytes(16));, i.e. bytes read from a stream. Commented Mar 14, 2014 at 14:38

2 Answers

1

I've had a look at what you get from this:

    var test = Convert.FromBase64String("YXBwLW5hbWUAAAAAAAAAAA==");
    var builder = new StringBuilder();
    builder.Append(System.Text.Encoding.ASCII.GetChars(test));

The result is the string "app-name" followed by eight null (\0) characters, one for each of the remaining bytes in the 16-byte value.

You could try removing all the null characters by adding this line just before you return builder.ToString():

builder.Replace("\0", null); 

That may or may not help, depending on what you're doing with the returned string.
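To make the effect visible, here is a minimal sketch using the two Base64 values from the question. The class name NullStripDemo and the helper DecodeAndStrip are made up for illustration; only the Convert, Encoding, and StringBuilder calls come from the framework. The usual explanation for the "stopping" output is that some consoles, debuggers, and UI controls treat the first \0 as the end of the text, although that depends on where the string is displayed.

```csharp
using System;
using System.Text;

static class NullStripDemo
{
    // Hypothetical helper: decodes a Base64 string as ASCII and drops every '\0'.
    public static string DecodeAndStrip(string base64)
    {
        byte[] raw = Convert.FromBase64String(base64);
        var builder = new StringBuilder();
        builder.Append(Encoding.ASCII.GetChars(raw));
        builder.Replace("\0", null);   // a null replacement removes the matches
        return builder.ToString();
    }

    static void Main()
    {
        byte[] raw = Convert.FromBase64String("YXBwLW5hbWUAAAAAAAAAAA==");
        string decoded = Encoding.ASCII.GetString(raw);

        Console.WriteLine(raw.Length);             // 16: the full fixed-width field
        Console.WriteLine(decoded.IndexOf('\0'));  // 8: position of the first null character
        Console.WriteLine(DecodeAndStrip("YXBwLW5hbWUAAAAAAAAAAA=="));  // app-name
        Console.WriteLine(DecodeAndStrip("dHh0AA=="));                  // txt
    }
}
```

Note that Replace removes every null character, not only trailing ones, which is fine for a debug representation but would alter real data.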


1 Comment

Thanks, this was the solution. I'm using the returned string just as a ToString so people can read the class object in debug when they use the class, not for any other meaningful processing (otherwise I would just keep the Base64).
1

First

builder.Append("pBlockData: " + this.pBlockData + "\n"); 

doesn't do what you think it does. Specifically, if pBlockData is a byte array, the concatenation calls its ToString(), which returns the type name rather than the contents, so you will get something like this (output from scriptcs):

    > byte[] data = new byte[11];
    > StringBuilder sb = new StringBuilder();
    > sb.Append("data = ")
    {Capacity:16,MaxCapacity:2147483647,Length:7}
    > sb.Append(data);
    {Capacity:32,MaxCapacity:2147483647,Length:20}
    > sb.ToString()
    data = System.Byte[]
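If the goal is to show the bytes themselves in a debug string, one option is BitConverter.ToString, which formats a byte array as hyphen-separated hex pairs. This is only a sketch of one possible representation, not something the question asked for; the class name ByteArrayDisplayDemo and the helper Describe are hypothetical.

```csharp
using System;

static class ByteArrayDisplayDemo
{
    // Hypothetical helper: renders a byte array as hex pairs instead of its type name.
    public static string Describe(byte[] data)
    {
        return BitConverter.ToString(data);
    }

    static void Main()
    {
        byte[] data = { 0x61, 0x70, 0x70 };

        // Concatenation calls data.ToString(), which yields the type name.
        Console.WriteLine("data = " + data);            // data = System.Byte[]

        // BitConverter.ToString shows the actual contents.
        Console.WriteLine("data = " + Describe(data));  // data = 61-70-70
    }
}
```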

Second, C# strings (.NET strings in general) are UTF-16, so the runtime doesn't really know how to handle displaying bytes. It doesn't matter whether the data is Base64 encoded, ASCII, or French pickles ;-) the runtime just treats it as binary. Also, null termination is not required: the length of the string is kept as a property of the string object.

So you need to turn the byte array you have into a UTF-16 character array or string before you output it. If the byte array contains valid ASCII, you can look into the System.Text.ASCIIEncoding.ASCII.GetDecoder().Convert method as one way to accomplish this.
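As a simpler alternative to the decoder route, Encoding.ASCII.GetString converts the whole byte array in one call. The sketch below assumes a fixed-width, null-padded ASCII field like the 16-byte block name described in the question and keeps only the text before the first null character. The class name FixedFieldDemo and the helper ReadPaddedAscii are hypothetical names for illustration.

```csharp
using System;
using System.Text;

static class FixedFieldDemo
{
    // Hypothetical helper: decodes a fixed-width, null-padded ASCII field
    // and keeps only the text before the first '\0'.
    public static string ReadPaddedAscii(byte[] field)
    {
        string text = Encoding.ASCII.GetString(field);
        int end = text.IndexOf('\0');
        return end >= 0 ? text.Substring(0, end) : text;
    }

    static void Main()
    {
        // Build a 16-byte field: "app-name" followed by 8 zero bytes.
        byte[] field = new byte[16];
        Encoding.ASCII.GetBytes("app-name", 0, 8, field, 0);

        string name = ReadPaddedAscii(field);
        Console.WriteLine(name);          // app-name
        Console.WriteLine(name.Length);   // 8
    }
}
```

Cutting at the first null mirrors how a null-terminated field is normally read; the raw bytes stay untouched, so the original data can still be written back unchanged.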

3 Comments

Thanks, this might help me do it right at the source: string blockName = Convert.ToBase64String(reader.ReadBytes(16)); is the current code, so I could read it into UTF-16 right there. This might still keep the \0 though: this part of the file is always 16 bytes long and null terminated, as I just remembered from the file spec!
If 16 bytes are available, the reader will read them, and if those include one or more \0 values it will read those too. Since Unicode also includes the null character as a valid value, these should appear in the 'blockName' string and will be counted as part of the string's total length.
OK, so in the case of data I shouldn't touch the \0 values, since they are part of the data. But for representation there is nothing wrong with stripping them. Thanks, this has been enlightening.
