
I need to use a JavaScript RegExp to extract the string bimbo999 from this a tag: <a href="/game.php?village=828&amp;screen=info_player&amp;id=29956" >bimbo999</a>

The numbers in the URL variables (village and id) change every time, so the RegExp has to match them generically rather than literally.

</tr> <tr><td>Sent</td><td >Oct 22, 2011 17:00:31</td></tr> <tr> <td colspan="2" valign="top" height="160" style="border: solid 1px black; padding: 4px;"> <table width="100%"> <tr><th width="60">Supported player:</th><th> <a href="/game.php?village=828&amp;screen=info_player&amp;id=29956" >bimbo999</a></th></tr> <tr><td>Village:</td><td><a href="/game.php?village=828&amp;screen=info_village&amp;id=848" >bimbo999s village (515|520) K55</a></td></tr> <tr><td>Origin of the troops:</td><td><a href="/game.php?village=828&amp;screen=info_village&amp;id=828" >KaLa I (514|520) K55</a></td></tr> </table><br /> <h4>Units:</h4> <table class="vis"> 

I tried with this:

var match = h.match(/Supported player:</th>(.*)<\/a><\/th></i); 

but it is not working. Can you help me?

  • Why are you manipulating the HTML directly? It's much safer (and usually easier) to work through the DOM. Find the right <table>, then find the appropriate <a> tags in the table using jQuery or a cross-browser selector library like Sizzle, and then read the innerHTML of the <a> tag to get bimbo999. Commented Oct 23, 2011 at 6:40
  • Using regex to traverse HTML tags is not good practice. Have you tried making a DOM element from the tag and reading its innerHTML? Commented Oct 23, 2011 at 6:41

2 Answers

Try this:

/<a[^>]*>([\s\S]*?)<\/a>/ 
  • <a[^>]*> matches the opening a tag
  • ([\s\S]*?) matches any characters before the closing tag, as few as possible
  • <\/a> matches the closing tag

The ([\s\S]*?) group captures the text between the tags as element 1 of the array returned by an exec or match call (for match, only when the regex has no g flag).
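As a rough illustration of reading that captured element, here is a minimal sketch run against a fragment of the markup from the question. The function names firstAnchorText and supportedPlayerName are made up for this example, and the second pattern, which anchors on the "Supported player:" label, is an assumption about how the surrounding table cells are laid out rather than something the answer itself proposes.

```javascript
// Sample markup copied from the question's HTML fragment.
var sampleHtml = '<tr><th width="60">Supported player:</th><th> ' +
  '<a href="/game.php?village=828&amp;screen=info_player&amp;id=29956" >bimbo999</a></th></tr>';

// Returns the text of the first <a> element, or null when there is none.
function firstAnchorText(html) {
  var result = /<a[^>]*>([\s\S]*?)<\/a>/.exec(html);
  // result[0] is the whole match; result[1] is the captured link text.
  return result ? result[1] : null;
}

// Hypothetical variant: only accept the link that follows the
// "Supported player:" header cell (assumes the </th><th> layout shown above).
function supportedPlayerName(html) {
  var pattern = /Supported player:<\/th>\s*<th>\s*<a[^>]*>([\s\S]*?)<\/a>/i;
  var result = pattern.exec(html);
  return result ? result[1] : null;
}

console.log(firstAnchorText(sampleHtml));     // bimbo999
console.log(supportedPlayerName(sampleHtml)); // bimbo999
```

Because [^>]* skips over the whole attribute list, the changing village and id numbers never need to be matched explicitly.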

This is really only good for finding the text within a elements. It's not especially safe or reliable, but if you've got a big page of links and you just need their text, this will do it.
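For the "big page of links" case, one possible sketch is to add the g flag and call exec in a loop. The helper name allAnchorTexts is hypothetical, and note that <a[^>]*> would also match other tags whose names start with "a" (such as <abbr>), which is part of why this is not reliable on arbitrary HTML.

```javascript
// Collects the text of every <a>...</a> pair found in an HTML string.
function allAnchorTexts(html) {
  // The g flag makes exec resume after the previous match on each call.
  var pattern = /<a[^>]*>([\s\S]*?)<\/a>/g;
  var texts = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    texts.push(match[1]); // element 1 is the captured link text
  }
  return texts;
}

var rows = '<td><a href="/game.php?village=828&amp;screen=info_player&amp;id=29956" >bimbo999</a></td>\n' +
  '<td><a href="/game.php?village=828&amp;screen=info_village&amp;id=828" >KaLa I (514|520) K55</a></td>';

console.log(allAnchorTexts(rows));
```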


A much safer way to do this without RegExp would be:

function getAnchorTexts(htmlStr) {
    var div, anchors, i, texts;

    // Let the browser parse the markup inside a detached element.
    div = document.createElement('div');
    div.innerHTML = htmlStr;

    // Collect the text of every <a> element in the parsed fragment.
    anchors = div.getElementsByTagName('a');
    texts = [];
    for (i = 0; i < anchors.length; i += 1) {
        texts.push(anchors[i].text);
    }
    return texts;
}

2 Comments

/<a[^>]*>((?:.|\r?\n)*?)<\/a>/ is also handy for matching to the next closing tag over multiple lines.
It would already match over multiple lines: \s matches any whitespace character, including \r and \n, so [\s\S] matches every character.

I don't have experience with regex, but I think you can use jQuery's .text() instead!

jQuery API - .text()

I mean if you use :

var hrefText = $("a").text(); 

You will get the text without using regex! Note that if several a elements match, .text() returns their combined text.

Alternatively, .find("a") gives you a list of the a elements; loop over that list with .each() and call .text() on each one to get its text.

Or you can use a class selector, an id, or any other selector you want!

2 Comments

This could also be done with plain JavaScript using getElementsByTagName('a'). Not a bad idea.
As a side note, it's not a good idea to use regex to parse HTML :)
