I'm struggling to find an effective way to serialize a string that may contain both unicode and non-unicode characters into a byte array, which I then write to a file that has to be deserialized from C++.
I have already implemented a serializer/deserializer in C++, which I use for most of my serialization and which handles both unicode and non-unicode characters. Basically, I convert non-unicode characters into their unicode equivalents and serialize everything as a unicode string. It's not the most efficient approach, since every string now takes 2 bytes per character, but it works.
What I'm trying to achieve is to transform an arbitrary string into a 2-byte-per-character string that I can then deserialize from C++.
What would be the most effective way to achieve this?
Also, any suggestions regarding the way I'm serializing strings are welcome, of course.
You can use Encoding.Unicode.GetBytes("my string"). However, I would suggest using UTF-8 instead (Encoding.Unicode in .NET is UTF-16), because UTF-8 encodes the ASCII range as one byte per character, and that range is quite common. For that you would need to adjust the C++ part, of course.
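To illustrate what adjusting the C++ part might look like, here is a minimal sketch of a reader for both encodings. It assumes a hypothetical file layout in which each string is stored as a 4-byte little-endian byte count followed by the encoded bytes; that layout and the function names (readLength, readUtf8String, readUtf16LeString) are made up for this example, so the writer would have to emit the same prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Assumed (hypothetical) layout: [uint32 little-endian byte count][payload bytes].

// Reads the 4-byte little-endian length prefix at `offset` and advances past it.
inline std::uint32_t readLength(const std::vector<std::uint8_t>& buf, std::size_t& offset) {
    assert(offset <= buf.size());  // caller must pass an offset inside the buffer
    if (buf.size() - offset < 4) throw std::runtime_error("truncated length prefix");
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n |= static_cast<std::uint32_t>(buf[offset + i]) << (8 * i);
    offset += 4;
    return n;
}

// UTF-8 payload: the bytes are copied unchanged into a std::string.
inline std::string readUtf8String(const std::vector<std::uint8_t>& buf, std::size_t& offset) {
    const std::uint32_t n = readLength(buf, offset);
    if (buf.size() - offset < n) throw std::runtime_error("truncated string payload");
    const auto first = buf.begin() + static_cast<std::ptrdiff_t>(offset);
    std::string s(first, first + static_cast<std::ptrdiff_t>(n));
    offset += n;
    return s;
}

// UTF-16LE payload (the byte order Encoding.Unicode.GetBytes produces):
// every code unit is two bytes, low byte first.
inline std::u16string readUtf16LeString(const std::vector<std::uint8_t>& buf, std::size_t& offset) {
    const std::uint32_t n = readLength(buf, offset);
    if (n % 2 != 0) throw std::runtime_error("odd byte count for UTF-16 data");
    if (buf.size() - offset < n) throw std::runtime_error("truncated string payload");
    std::u16string s;
    s.reserve(n / 2);
    for (std::uint32_t i = 0; i < n; i += 2)
        s.push_back(static_cast<char16_t>(buf[offset + i] | (buf[offset + i + 1] << 8)));
    offset += n;
    return s;
}
```

With UTF-8 the length prefix counts bytes, not characters, so the reader never needs to know how many bytes a given character takes. A string such as "aé" occupies 3 payload bytes in UTF-8 and 4 in UTF-16LE; for mostly-ASCII text the UTF-8 form approaches half the size.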