HttpWebRequest returns broken characters

Question

I'm reading a Dutch webpage:

HttpWebRequest oReq = (HttpWebRequest)WebRequest.Create(website);
oReq.Method = "GET";
HttpWebResponse resp = (HttpWebResponse)oReq.GetResponse();
HtmlDocument doc = new HtmlDocument();
doc.Load(resp.GetResponseStream(), Encoding.GetEncoding("iso-8859-1"));

When I get the text of some random element within the page, I get weird characters instead of the Dutch ones I see in Chrome:

HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
if (node != null)
{
    MessageBox.Show(node.InnerText, "--- just scrapped some xpath ---");
}

Instead of café I get cafÃ©

How do I solve this? I get the same broken text when writing it to a file, when assigning it to a RichTextBox, and so on.

Thanks! It's a big codebase I'm working with. I had tried that in another code path that wasn't being executed, and thought I had really ruled out that possibility. Big thanks again! Put this as an answer and I'll accept it.
– kawa, Commented Jun 15, 2014 at 15:15

krivtom · Accepted Answer · 2014-06-15 15:17:10Z

1

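Change the encoding to a Unicode one, e.g. UTF-8. The page is most likely served as UTF-8; decoding those bytes as ISO-8859-1 turns the two-byte sequence for é into the two characters Ã©.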
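To see why the choice of encoding matters, here is a minimal sketch that reproduces the symptom without any network access. It assumes the server sends UTF-8 bytes; the `EncodingDemo` class and its `ReadAll` helper are hypothetical names used only for illustration, and a `MemoryStream` stands in for `resp.GetResponseStream()`.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: reads a whole stream as text using the given encoding.
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Stand-in for the response body: "café" as a UTF-8 server would send it.
        // In UTF-8, é (U+00E9) is the two bytes 0xC3 0xA9.
        byte[] body = Encoding.UTF8.GetBytes("caf\u00e9");

        // Decoding as ISO-8859-1 maps each byte to one character,
        // so 0xC3 0xA9 becomes the two characters Ã and ©.
        string wrong = ReadAll(new MemoryStream(body), Encoding.GetEncoding("iso-8859-1"));

        // Decoding with the encoding that was actually used restores the text.
        string right = ReadAll(new MemoryStream(body), Encoding.UTF8);

        Console.WriteLine(wrong.Length); // 5 characters: c, a, f, Ã, ©
        Console.WriteLine(right.Length); // 4 characters: c, a, f, é
    }
}
```

Applied to the code in the question, this would presumably mean passing `Encoding.UTF8` instead of `Encoding.GetEncoding("iso-8859-1")` as the second argument of `doc.Load`, assuming the page really is UTF-8.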

