I'm reading a Dutch webpage:

    // Request the page and load the response stream into an HtmlDocument.
    HttpWebRequest oReq = (HttpWebRequest)WebRequest.Create(website);
    oReq.Method = "GET";
    HttpWebResponse resp = (HttpWebResponse)oReq.GetResponse();
    HtmlDocument doc = new HtmlDocument();
    doc.Load(resp.GetResponseStream(), Encoding.GetEncoding("iso-8859-1"));

When I get the text of some element within the page, I get weird characters instead of the Dutch ones I see in Chrome:

    HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
    if (node != null)
    {
        MessageBox.Show(node.InnerText, "--- just scrapped some xpath ---");
    }

Instead of café I get cafÃ©.

How do I solve this? I get the same broken text when writing it to a file, when assigning it to a RichTextBox, and so on.

  • Try changing the encoding to Unicode, e.g. UTF-8. Commented Jun 15, 2014 at 15:11
  • Thanks! I'm working with a big codebase; I had tried that in another code path that wasn't being executed, so I thought I had ruled that possibility out. Big thanks again! Post this as an answer and I'll accept it. Commented Jun 15, 2014 at 15:15

1 Answer

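Change the encoding to Unicode, e.g. UTF-8: pass Encoding.UTF8 to doc.Load instead of Encoding.GetEncoding("iso-8859-1"). The page is evidently served as UTF-8, so decoding it as ISO-8859-1 turns each multi-byte character into two unrelated characters.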
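To see why the encoding matters, here is a minimal sketch that reproduces this kind of garbling without any network access. It assumes the page bytes are UTF-8, which the question does not state outright. The class name EncodingDemo and the helper Decode are hypothetical names made up for this illustration.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: decodes the same bytes with the named encoding.
    public static string Decode(byte[] bytes, string encodingName) =>
        Encoding.GetEncoding(encodingName).GetString(bytes);

    public static void Main()
    {
        // Assumption: the server sends UTF-8. "\u00e9" is the letter e with
        // an acute accent, which UTF-8 stores as the two bytes 0xC3 0xA9.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes("caf\u00e9");

        // ISO-8859-1 maps every byte to exactly one character, so the two
        // bytes become two characters (U+00C3 and U+00A9) instead of one.
        string wrong = Decode(utf8Bytes, "iso-8859-1");

        // Decoding with the encoding that produced the bytes restores the text.
        string right = Decode(utf8Bytes, "utf-8");

        Console.WriteLine(wrong.Length); // 5 characters
        Console.WriteLine(right.Length); // 4 characters
    }
}
```

In the question's code, the equivalent change would be to pass Encoding.UTF8 as the second argument to doc.Load.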
