In the article "A Liberal, Accurate Regex Pattern for Matching URLs" I found the following regex:
\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))
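To get a feel for what the pattern accepts, here is a small sketch that runs it against two sample strings. The sample inputs and the `~` delimiter are my own choices for illustration (the delimiter just avoids escaping the slashes); the pattern itself is the one quoted above.

```php
<?php
// The quoted pattern, wrapped in "~" delimiters so the slashes need no escaping.
$pattern = '~\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))~';

// Hypothetical inputs that exercise the end-of-URL rules.
$samples = [
    'Visit http://example.com/path.',
    '(see http://en.wikipedia.org/wiki/Rome_(city))',
];

foreach ($samples as $text) {
    preg_match_all($pattern, $text, $matches);
    // $matches[0] holds the full matches for this input.
    echo implode(', ', $matches[0]), PHP_EOL;
}
// Prints:
// http://example.com/path
// http://en.wikipedia.org/wiki/Rome_(city)
```

The trailing period and the closing parenthesis of the surrounding sentence are left out of the match, while the balanced `(city)` inside the URL is kept. That comes from the last group, which requires the URL to end in a parenthesized word, a non-punctuation character, or a slash.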
Solution
/**
 * @param string $str the string to encode and parse for URLs
 */
function preventXssAndParseAnchors(string $str): string
{
    $url_regex = "/\b((https?:\/\/?|www\.)[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|\/)))/";

    // Encode HTML special characters to prevent XSS
    // before parsing the URLs into anchors
    $str = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');

    preg_match_all($url_regex, $str, $urls);

    foreach ($urls[0] as $url) {
        $str = str_replace($url, "<a href='$url'>$url</a>", $str);
    }

    return $str;
}
Example
<?php

$str = " apply here https://ph.dbsd.com/job/dfvdfg/5444 <script> console.log('this is a hacking attempt hacking'); </script> and www.google.com also http://somesite.net ";

echo preventXssAndParseAnchors($str);
The output (the script tag comes back HTML-escaped, so a browser shows it as text instead of running it)
apply here <a href='https://ph.dbsd.com/job/dfvdfg/5444'>https://ph.dbsd.com/job/dfvdfg/5444</a> &lt;script&gt; console.log(&#039;this is a hacking attempt hacking&#039;); &lt;/script&gt; and <a href='www.google.com'>www.google.com</a> also <a href='http://somesite.net'>http://somesite.net</a>
Test https://3v4l.org/85lsl
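Two caveats with the loop in the solution: str_replace() rewrites every occurrence of a URL on each pass, so a URL that appears twice in the text gets wrapped twice, and a bare www. link ends up with a relative href. Below is one possible variant, as a sketch only. The function name `escapeAndLinkUrls` and the choice of `http://` as the default scheme are my assumptions, not part of the solution above.

```php
<?php
/**
 * Hypothetical variant of preventXssAndParseAnchors(): same regex, but each
 * match is rewritten in place and scheme-less "www." links get a scheme.
 */
function escapeAndLinkUrls(string $str): string
{
    $url_regex = "/\b((https?:\/\/?|www\.)[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|\/)))/";

    // Escape first, as in the solution, so no raw markup survives.
    $str = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');

    // preg_replace_callback visits every match once, at its own position,
    // so a repeated URL is not wrapped a second time.
    return preg_replace_callback($url_regex, function (array $m): string {
        $url = $m[0];
        // Assumption: a bare "www." host should be linked over http://.
        $href = strncmp($url, 'www.', 4) === 0 ? "http://$url" : $url;
        return "<a href='$href'>$url</a>";
    }, $str);
}

echo escapeAndLinkUrls("see www.google.com and www.google.com");
// Prints:
// see <a href='http://www.google.com'>www.google.com</a> and <a href='http://www.google.com'>www.google.com</a>
```

The escaping step is unchanged, so the XSS protection works the same way. Only the replacement step differs.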