Can I use Html Agility Pack To Parse HTML Fragment?

Question

Can Html Agility Pack be used to parse an HTML string fragment?

Such as:

var fragment = "<b>Some code </b>";

And then extract all the <b> tags? All the examples I have seen so far load full HTML documents.

It could be done even more simply with HAP, in one line: var text = HtmlNode.CreateNode("<b>Some code </b>").InnerText;
– Oleks, Commented Mar 4, 2012 at 15:31

Mike Koder · Accepted Answer · 2010-03-29 05:49:23Z

11

If it's HTML, then yes.

    using HtmlAgilityPack;

    string str = "<b>Some code</b>";
    // not sure if this wrapper is needed
    string html = string.Format("<html><head></head><body>{0}</body></html>", str);
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(html);
    // see XPath tutorials for how to select elements
    // select the first <b> element; "//" searches the whole document, whereas
    // a bare "b[1]" only looks at direct children of the document node and returns null here
    HtmlNode bNode = doc.DocumentNode.SelectSingleNode("//b[1]");
    string boldText = bNode.InnerText;
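
The question asks for all <b> tags rather than just the first one. A minimal sketch of that, assuming the HtmlAgilityPack NuGet package is referenced; the class and method names (FragmentDemo, ExtractBoldText) are made up for illustration. It loads the bare fragment without the wrapper document and guards against SelectNodes returning null when nothing matches:

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class FragmentDemo
{
    // Hypothetical helper: returns the inner text of every <b> element in the fragment.
    public static List<string> ExtractBoldText(string fragment)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(fragment); // the fragment is loaded as-is, without an <html> wrapper

        var result = new List<string>();
        // "//b" matches <b> elements at any depth.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//b");
        if (nodes == null)
        {
            // SelectNodes returns null, not an empty collection, when nothing matches.
            return result;
        }

        foreach (HtmlNode node in nodes)
        {
            result.Add(node.InnerText);
        }
        return result;
    }

    public static void Main()
    {
        foreach (string text in ExtractBoldText("<b>Some code </b> and <b>more</b>"))
        {
            Console.WriteLine(text);
        }
    }
}
```

The null check is the part that matters: it is the same failure mode as the null reference exception that a non-matching SelectSingleNode call produces.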

edited Mar 29, 2010 at 5:49

answered Mar 29, 2010 at 5:13

4 Comments

chobo2 Over a year ago

OK, then what would I do with it? How would I do some parsing?

chobo2 Over a year ago

Hmm, thanks, but I copied and pasted that code into a console app and imported Html Agility Pack, and on the HtmlNode line I get a null reference exception.

Mike Koder Over a year ago

Maybe it's HtmlNode bNode = doc.DocumentNode.SelectSingleNode("/b[1]");

Rohit Agarwal Over a year ago

Try HtmlNode bNode = doc.DocumentNode.SelectSingleNode("//b[1]");

rtpHarry · Accepted Answer · 2010-04-04 14:34:40Z

I don't think this is really the best use of HtmlAgilityPack.

Normally I see people trying to parse large amounts of HTML using regular expressions, and I point them towards HtmlAgilityPack, but in this case I think it would be better to use a regex.

Roy Osherove has a blog post describing how you can strip out all the html from a snippet:

http://weblogs.asp.net/rosherove/archive/2003/05/13/6963.aspx

Even if you did get the correct XPath with Mika Kolari's sample, it would only work for a snippet with a <b> tag in it and would break if the markup changed.
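
For reference, a rough sketch of the regex approach using only System.Text.RegularExpressions. The pattern is a deliberately naive one and is not necessarily the one from the linked post; TagStripper and StripTags are hypothetical names. It removes anything that looks like a tag and does not handle comments, script blocks, or a literal '>' inside an attribute value:

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Naive pattern: a '<', then any run of characters that are not '>', then '>'.
    private static readonly Regex TagPattern = new Regex("<[^>]*>", RegexOptions.Compiled);

    // Hypothetical helper: removes every tag-like span and keeps the text between tags.
    public static string StripTags(string html)
    {
        return TagPattern.Replace(html, string.Empty);
    }

    public static void Main()
    {
        Console.WriteLine(StripTags("<b>Some code </b>"));
    }
}
```

This keeps the text regardless of which tags surround it, which is the point being made about the <b>-specific XPath.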

Dennis Rutherford · Accepted Answer · 2021-08-31 17:33:07Z

This question came up when I searched for the same thing. I don't know if the features have changed since it was answered, but the approach below should work better.

    $string = '<b>Some code </b>'
    [HtmlAgilityPack.HtmlNode]::CreateNode($string)
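
The same call from C#, as a small sketch assuming the HtmlAgilityPack package is referenced (CreateNodeDemo and InnerTextOf are illustrative names only). HtmlNode.CreateNode is meant for a fragment with a single root element, and how it treats a fragment with several top-level nodes may depend on the library version, so LoadHtml is probably the safer choice for those:

```csharp
using System;
using HtmlAgilityPack;

public static class CreateNodeDemo
{
    // Hypothetical helper: parses a single-root fragment and returns its inner text.
    public static string InnerTextOf(string fragment)
    {
        HtmlNode node = HtmlNode.CreateNode(fragment);
        return node.InnerText;
    }

    public static void Main()
    {
        HtmlNode node = HtmlNode.CreateNode("<b>Some code </b>");
        Console.WriteLine(node.Name);      // element name of the parsed root
        Console.WriteLine(node.InnerText); // text between the tags
    }
}
```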
