
For example, I have text like this:

<p> Quis vel accusantium libero. Suscipit officiis culpa <img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"> libero quia ad. </p> 

I want to check whether the string contains any data:image URI and, if so, truncate only that part to a maximum of 50 characters, so the result becomes:

<p> Quis vel accusantium libero. Suscipit officiis culpa <img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH..."> libero quia ad. </p> 

I'm not sure how exactly to achieve that with preg_replace and a "data:image.+?" pattern.

  • If you delete most of a base64 image string, then what's the point in even having the base64 image? It is not going to work after that. Commented Dec 1, 2017 at 10:23
  • @dreftymac It's not exactly like the case there, because he has a pattern to check, but I would use php.net/manual/en/function.preg-split.php and give the pattern as a regex (not checked, but maybe something like '/data:image(.){50}/'), then take the first part of the result array. Commented Dec 1, 2017 at 10:28
  • @Edwin Good point. I should have elaborated that regex is not essential to solve this problem. Commented Dec 1, 2017 at 10:30
  • @ProEvilz Because I want to display it as a string, not as an image. I'm already aware that truncating the URI will render it useless. Commented Dec 1, 2017 at 11:00
  • How do you propose to show an image plus its src? Commented Dec 1, 2017 at 11:01

2 Answers


Problem: PHP string parse

  • Thanks for clarifying your question in the comments. What you seem to want is a general-purpose HTML parser that can make special-case modifications to the HTML markup.
  • Generally speaking, it is not advisable to use regex to parse HTML.
  • If you want a general-purpose tool (and not a quick-and-dirty approach), SO already has a question about Modifying html attributes with PHP that may be closer to what you want.
  • If all you want is a quick-and-dirty approach that removes long base64-encoded data from the src attribute of img tags, you can tokenize the raw HTML string and then run regex replacements on the tokens. That approach becomes painful once you want other modifications, and you may end up re-inventing the wheel when a real HTML parser would have done the job from the start.
  • Nevertheless, the approach below does just that: it tokenizes the string, performs the replacements, and returns the entire modified string.

Solution using preg_replace (quick-and-dirty)

<?php
$demostring = ' <p> Quis vel accusantium libero. Suscipit officiis culpa <img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"> libero quia ad. </p> ';

// Keep "data:image" plus the next $chars characters of a token; drop the rest.
function ctf0_truncate($vinput, $chars = 50)
{
    return preg_replace('/(data:image.{' . (int) $chars . '})(.*)/', '$1', $vinput);
}

// Split on double quotes so each attribute value becomes its own token,
// truncate every token, then glue the string back together.
function ctf0_parse($text, $chars = 50)
{
    if (strpos($text, 'data:image') === false) {
        return $text; // nothing to truncate
    }
    $tokens = explode('"', $text);
    $tokens = array_map(function ($token) use ($chars) {
        return ctf0_truncate($token, $chars);
    }, $tokens);
    return implode('"', $tokens);
}

$myresult = ctf0_parse($demostring);
print($myresult);

Output result

<p> Quis vel accusantium libero. Suscipit officiis culpa <img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALA"> libero quia ad. </p> 

Notes

  • The above solution omits one requested element of the question: adding the '...' ellipsis after the truncated part. For that part, please see the other answers on SO that cover it.
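A minimal sketch of how the missing ellipsis could be added in the same regex style. The function name truncate_data_uris and the $max parameter are hypothetical, not part of the answer. It assumes the data URI sits inside a double-quoted attribute, and it counts "data:image" as part of the 50-character limit, which is one possible reading of the question.

```php
<?php
// Hypothetical helper (not from the original answer): truncate every
// data:image URI found inside a double-quoted value to $max characters
// in total and mark the cut with "...".
function truncate_data_uris(string $html, int $max = 50): string
{
    // Number of characters to keep after the literal "data:image".
    $keep = max(0, $max - strlen('data:image'));

    // [^"] stops the match at the closing quote, so the rest of the HTML
    // is left alone. [^"]+ requires at least one surplus character, so
    // URIs that already fit within $max are not changed.
    $pattern = '/(data:image[^"]{' . $keep . '})[^"]+/';

    return preg_replace($pattern, '$1...', $html);
}

$demo = '<p> culpa <img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"> libero quia ad. </p>';
echo truncate_data_uris($demo), "\n";
```

Single-quoted or unquoted attribute values are not handled by this pattern; they would need their own delimiter in the character class.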

5 Comments

Because I'm building a diff tool that renders the new vs. old string, displaying the whole data: URI won't make any difference other than delaying the diff rendering for the end user.
If you know another/better solution without using regex, please add it.
@ctf0 If my hunch is correct, and you are going to want your application to do more modifications similar to this one, and you do not want to re-invent the wheel with a bunch of custom-made regexes, then any full-fledged HTML parser should fit the bill. See, e.g., this link.
Thanks, I will give it a try. By the way, regarding the current example 3v4l.org/cvifQ, is there a way to add '...' in place of the removed chars?
@ctf0 I will update my answer to include that as well.
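For the parser-based route recommended in this answer and its comments, a rough sketch using PHP's bundled DOMDocument might look like the following. The function name truncate_data_uris_dom, the wrapper div, and the $max parameter are assumptions made for illustration; the answer itself does not name a specific parser.

```php
<?php
// Hypothetical helper: shorten data:image URIs in <img src="..."> via DOM.
function truncate_data_uris_dom(string $html, int $max = 50): string
{
    $doc = new DOMDocument();

    // Collect parser warnings quietly instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so it has a single root element, and
    // ask libxml not to add <html>/<body> or a doctype around it.
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        // Only touch data:image URIs that exceed the limit.
        if (strpos($src, 'data:image') === 0 && strlen($src) > $max) {
            $img->setAttribute('src', substr($src, 0, $max) . '...');
        }
    }

    // Serialize the children of the wrapper, leaving the wrapper itself out.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$demo = ' <p> culpa <img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"> libero quia ad. </p> ';
echo truncate_data_uris_dom($demo), "\n";
```

The serialized markup may differ from the input in whitespace or entity encoding, and non-ASCII input needs an explicit charset hint for loadHTML, so for a diff tool it is worth checking that the round trip does not introduce spurious changes.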

You can do that in different ways, with preg_match(_all), preg_split, etc.

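With preg_replace it works like this (run it to see):

<?php
$text = 'data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7';
// The lookbehind anchors the match right after "data:image" plus 50 characters;
// everything from that point on is removed.
$result = preg_replace('/(?<=data:image.{50}).*/', '', $text);
echo $result;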

3 Comments

Actually we can use any regex function; I'm just used to preg_replace. Still, thanks.
If you have other text around the URI, you will have to adjust the regex to your use case (=> 3v4l.org/LXdQf).
Awesome, thanks.
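To illustrate the adjustment mentioned in the comment above: on a full HTML string, the `.*` in this answer's pattern also swallows everything after the URI on the same line. One possible fix, sketched here under the assumption that the URI is inside a double-quoted attribute, is to replace `.` with `[^"]` so the match stops at the closing quote. The function name truncate_after_lookbehind is hypothetical, and this is not necessarily what the linked 3v4l snippet does.

```php
<?php
// Hypothetical variant of the lookbehind approach: keep "data:image" plus
// 50 characters, replace the remainder of the quoted value with "...".
function truncate_after_lookbehind(string $html): string
{
    // The lookbehind has a fixed length (10 + 50 characters), which PCRE
    // requires. [^"]+ consumes only up to the closing double quote.
    return preg_replace('/(?<=data:image[^"]{50})[^"]+/', '...', $html);
}

$text = '<p> culpa <img src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"> libero quia ad. </p>';
echo truncate_after_lookbehind($text), "\n";
```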
