
I have spent a considerable amount of time searching for a solution and trying to write one myself, but I have not found one. My use case is:

I have text that can contain plain URLs (with or without http/https) as well as URLs that are already hyperlinked.

What the regex should do

It should leave hyperlinked URLs as they are and convert non-hyperlinked URLs into hyperlinks.

Example Text

I am learning regex from www.codeburst.com and trying regex at <a href="https://regexr.com">Regexr</a>.

Expected Solution

I am learning regex from <a href="www.codeburst.com">www.codeburst.com</a> and trying regex at <a href="https://regexr.com">Regexr</a>.

I have tried

this regex, but it is not working as expected.

/((?!href).((https?:\/\/)||(www\.)|(mailto:)).+)/gi 
  • I don't see a difference between your source text and your 'Expected Solution'. In what way does your regex not do what you want? Commented May 2, 2019 at 9:52
  • www.codeburst.com should also be wrapped in an anchor tag with an href. Commented May 2, 2019 at 9:53
  • I edited the question; the HTML part now displays correctly. Commented May 2, 2019 at 9:56
  • Possible duplicate of What is a good regular expression to match a URL? Commented May 2, 2019 at 9:58
  • No, the two questions are different. There the asker wants a regex that only matches URLs starting with http or https. My URLs can also start with www or mailto. Commented May 2, 2019 at 10:00

2 Answers

1
  1. You probably need a negative lookbehind (?<!href="), which was added to ECMAScript recently; see this answer.
  2. Be careful with the double ||: the empty alternative between the two pipes always matches, which makes the tokens after it effectively useless.
  3. Also be careful with .+, which matches everything up to the end of the line (and across newlines with the /s regex flag).
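To make points 2 and 3 concrete, here is a small sketch. The variable names and the sample strings are made up for illustration, and the patterns are anchored with ^ only so the effect of the empty alternative is easy to see.

```javascript
// The double pipe leaves an empty alternative between "https?://" and "www.".
// An empty alternative matches at any position, so the prefix check is lost.
const broken = /^(?:(https?:\/\/)||(www\.)|(mailto:))/i;
const fixed = /^(?:(https?:\/\/)|(www\.)|(mailto:))/i;

console.log(broken.test("plain text"));        // true, via the empty alternative
console.log(fixed.test("plain text"));         // false
console.log(fixed.test("www.codeburst.com"));  // true

// ".+" runs to the end of the line, while "\S+" stops at the first whitespace.
const sample = "see www.codeburst.com and more";
const greedy = sample.match(/www\..+/)[0];
const bounded = sample.match(/www\.\S+/)[0];

console.log(greedy);   // "www.codeburst.com and more"
console.log(bounded);  // "www.codeburst.com"
```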

I would start with

(?<!href=")(((https?:\/\/)|(www\.)|(mailto:))\S+) 
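As a rough sketch of how this starting pattern could be plugged into String.prototype.replace, using the example text from the question. The variable names are assumptions for illustration, and lookbehind needs an engine that supports ES2018.

```javascript
// Starting pattern from above, with the g and i flags for a global replace.
const urlPattern = /(?<!href=")(((https?:\/\/)|(www\.)|(mailto:))\S+)/gi;

const input =
  'I am learning regex from www.codeburst.com and trying regex at ' +
  '<a href="https://regexr.com">Regexr</a>.';

// "$&" in the replacement string stands for the whole matched text.
const output = input.replace(urlPattern, '<a href="$&">$&</a>');

console.log(output);
// I am learning regex from <a href="www.codeburst.com">www.codeburst.com</a>
// and trying regex at <a href="https://regexr.com">Regexr</a>.   (one line)
```

This is only a starting point. For an href such as "https://www.regexr.com" the lookbehind blocks the match at https, but the www. alternative can still match further inside the same attribute. \S+ also keeps any punctuation that directly follows a URL.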

0
(https?:\/\/)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*) 

The breakdown of the regex is as follows:

  1. (https?:\/\/)? matches http:// or https://, or nothing if the scheme is absent
  2. (www\.)? matches www., or nothing if it is absent

I have checked the above regex against the following test cases:

  1. href="https://www.regexr.com"
  2. href="http://www.regexr.com"
  3. href="mailto:[email protected]"

2 Comments

Your solution works correctly for URLs without a hyperlink, but if the text is 'with hyperlink url <a href="example.com">ABC</a>', then it gets converted to 'with hyperlink url <a href="<a class="red" href="example.com">https://www.example.com</…>'. URLs inside an href should be left as they are.
You can find all the strings that match the above regex, and then programmatically pick out the ones that do not start with href.
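One way to do that filtering in code is to let the regex also match complete existing anchors and skip them in a replace callback. This is a minimal sketch, not a full HTML-aware solution: the function name linkifyOutsideAnchors is hypothetical, and the bare-URL part uses the prefixes from the question (http, https, www, mailto) rather than the longer pattern above.

```javascript
// Hypothetical helper, named here only for illustration.
function linkifyOutsideAnchors(text) {
  // Alternative 1 consumes a whole existing <a ...>...</a> element.
  // Alternative 2 (capture group 1) is a bare URL with one of the prefixes.
  const pattern =
    /<a\b[^>]*>[\s\S]*?<\/a>|((?:https?:\/\/|www\.|mailto:)[^\s<>"]+)/gi;

  return text.replace(pattern, (match, url) => {
    // Group 1 is undefined when alternative 1 matched: keep the anchor as is.
    if (url === undefined) return match;
    return `<a href="${url}">${url}</a>`;
  });
}

const questionText =
  'I am learning regex from www.codeburst.com and trying regex at ' +
  '<a href="https://regexr.com">Regexr</a>.';

console.log(linkifyOutsideAnchors(questionText));
console.log(linkifyOutsideAnchors('with hyperlink url <a href="example.com">ABC</a>'));
```

Because an existing anchor is consumed as a single match, nothing inside its href or link text is examined again. The sketch assumes anchors are not nested, and a bare URL followed directly by punctuation will keep that punctuation.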
