How to convert to keyboard characters

Question

I have a text written with all kinds of weird characters, like ŸŞşȘș€ÀÈÉÌÒÓÙàèéìòóùºª«»€, and I am trying to convert them to their normal equivalents, SAEIOUaeiou etc. I have tried this in a number of ways, but I keep getting mixed results: some work, some don't. This is what I've done so far:

byteArray1 = UnicodeEncoding.GetEncoding(1250).GetBytes(charArray);
byteArray2 = UnicodeEncoding.GetEncoding(852).GetBytes(charArray);
byteArray3 = UnicodeEncoding.GetEncoding(737).GetBytes(charArray);

resultArray1 = UTF7Encoding.GetEncoding(1250).GetChars(byteArray1);
resultArray2 = UTF7Encoding.GetEncoding(852).GetChars(byteArray2);
resultArray3 = UTF7Encoding.GetEncoding(737).GetChars(byteArray3);

Is there something simple and obvious (I doubt it) that I'm missing? Also, if I'm doing something really the wrong way, do tell.

Why are you creating encodings from specific subclasses? This will only confuse a reader. Just use Encoding.GetEncoding(). – Joey, Commented Feb 3, 2012 at 15:24
I've tried in lots of ways, and this was the only one that partially worked. – Adrian Marinica, Commented Feb 3, 2012 at 15:41

madd0 · Accepted Answer · 2012-02-03 15:03:05Z

If what you want to do is simply remove the diacritic marks from characters, I recommend you take a look at this blog post which describes how to do so.
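
The linked post is not reproduced here, so the following is only a sketch of one common way to write such a helper in .NET, not necessarily the post's exact code. It assumes the approach of decomposing the string with `Normalize(NormalizationForm.FormD)` and dropping the combining marks; the class name `TextCleaner` is hypothetical, and `RemoveDiacritics` is named to match the call used in this answer.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class TextCleaner
{
    // Hypothetical stand-in for the RemoveDiacritics helper referenced in the answer.
    public static string RemoveDiacritics(string text)
    {
        // FormD (canonical decomposition) splits an accented letter such as "É"
        // into its base letter "E" followed by a combining mark.
        string decomposed = text.Normalize(NormalizationForm.FormD);

        var builder = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Drop the combining marks (category NonSpacingMark); keep everything else.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                builder.Append(c);
            }
        }

        // Recompose anything that is left into the usual composed form.
        return builder.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

With this sketch, `TextCleaner.RemoveDiacritics("ÀÈÉ")` returns `"AEE"`. Characters that have no canonical decomposition into a base letter plus combining marks, such as `ø`, `ß` or `€`, pass through unchanged.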

It will not do anything about characters such as ºª«»€, but you can get rid of those after removing the diacritics with a simple regular expression if you want:

var noDiac = RemoveDiacritics("ŸŞşȘș€ÀÈÉÌÒÓÙàèéìòóùºª«»€");
var cleanTxt = Regex.Replace(noDiac, "[^A-Z]", string.Empty, RegexOptions.IgnoreCase);
// outputs: YSsSsAEEIOOUaeeioou
