
I'm trying to truncate a long string to a specific number of characters and interpolate another user-defined string in the middle of it (more or less) to show that the string has been truncated. At the same time, I'm trying to keep words from being broken in half. E.g.:

The quick brown fox jumped over the lazy dog

If asked (via a function parameter) to truncate this string to 20 characters, the resulting string should be something like:

The quick brown ... the lazy dog

The closest implementation I came up with was:

function truncate( $string, $length, $append = NULL ) {
    if( strlen( $string ) <= $length ) return $string;

    $append = ( strlen( $append ) ? sprintf( ' %s ', $append ) : ' ... ' );

    $start = round( $length / 2 );
    $start = strlen( substr( $string, 0, ( strpos( substr( $string, $start ), ' ' ) + $start ) ) );

    $end = ( $start - strlen( $append ) );
    $end = strlen( substr( $string, 0, strrpos( substr( $string, $start + strlen( $append ) - 1 ), ' ' ) ) );

    return substr( $string, 0, $start ) . $append . substr( $string, ( strlen( $string ) - $end ) );
}

Not only does this not work smoothly with strings of different lengths, it also doesn't truncate to the defined size.

For some strings I get duplicated blank characters (because of wrong math about the blank spaces that sprintf() adds around $append), sometimes one letter is removed from the word closest to the interpolated string, and sometimes a word gets broken in half when it shouldn't.

The above string, for example, if used like:

truncate( $str, 20 ); 

Results in:

The quick brown ... ped over the lazy dog

  • Where does the length 20 fit? Your sample string output is actually 32, so at 20 would it really be "The quick...lazy dog" ? Commented Dec 18, 2014 at 16:39

1 Answer


To avoid the mid-word truncation, I look first to wordwrap() since it already has that ability by default.

So the approach I would take instead is to use wordwrap() to split the string into segments around half your total desired length, minus the length of the separator string.

Then combine the first line from wordwrap(), the separator, and the last line. (Use explode() to split the wordwrap() output into lines).
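To make the intermediate step concrete, here is a minimal sketch of what wordwrap() and explode() produce for the sample sentence. The width of 13 is an assumption chosen for illustration: roughly half of a 30-character target after allowing for a three-character separator.

```php
<?php
// Sample sentence from the question.
$string = "The quick brown fox jumped over the lazy dog";

// wordwrap() breaks only at spaces by default, so no word is split.
// 13 is an illustrative width, not a value taken from the question.
$segments = explode("\n", wordwrap($string, 13));

print_r($segments);
// Lines produced: "The quick", "brown fox", "jumped over", "the lazy dog"
```

The first and last elements of this array are the two pieces that get joined around the separator.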

// 3 params: input $string, $total_length desired, $separator to use
function truncate($string, $total_length, $separator) {
    // The wordwrap length is half the total minus the separator's length
    // trim() is used to prevent surrounding space on $separator affecting the length
    // The (int) cast is needed because wordwrap() expects an integer width
    $len = (int) (($total_length - strlen(trim($separator))) / 2);

    // Separate the output from wordwrap() into an array of lines
    $segments = explode("\n", wordwrap($string, $len));

    // Return the first, separator, last
    return reset($segments) . $separator . end($segments);
}

Try it out: http://codepad.viper-7.com/ai6mAK

$s1 = "The quick brown fox jumped over the lazy dog";
$s2 = "Lorem ipsum dolor sit amet, nam id laudem aliquid. Option utroque interpretaris eu sea, pro ea illud alterum, sed consulatu conclusionemque ei. In alii diceret est. Alia oratio ei duo.";
$s3 = "This is some other long string that ought to get truncated and leave some stuff on the end of it.";

// Fox...
echo truncate($s1, 30, "...");
// Lorem ipsum...
echo truncate($s2, 30, "...");
// Other one
echo truncate($s3, 40, "...");

Outputs:

The quick...the lazy dog
Lorem ipsum...ei duo.
This is some...on the end of it.

Notice in this output that the last bit, "ei duo.", is a little shorter. That is because the final line wordwrap() returned was shorter than the wrap width. If that matters to you, it could be worked around by inspecting the strlen() of the last element of the $segments array and, if it is less than some threshold (say $len / 2), splitting the array element before it into words with explode() and prepending another word from that array.

Here's an improved version addressing that issue by backtracking to the second-last line from wordwrap() and popping words off it until the ending is at least half the length of $total_length. It is a bit more complicated, but has a more satisfying result. http://codepad.viper-7.com/mDmlL0

function truncate($string, $total_length, $separator) {
    // The wordwrap length is half the total, minus the separator's length
    // The cast wraps the whole division so wordwrap() receives an integer width
    $len = (int) (($total_length - strlen($separator)) / 2);

    // Separate the output from wordwrap() into an array of lines
    $segments = explode("\n", wordwrap($string, $len));

    // Start with the last line as the ending
    $end = end($segments);

    // Add words from the second-last line until the end is more than
    // half as long as $total_length
    if (strlen($end) <= $total_length / 2 && count($segments) > 2) {
        $prev = explode(' ', prev($segments));
        // Stop once the second-last line has no words left to give
        while ($prev && strlen($end) <= $total_length / 2) {
            $end = array_pop($prev) . ' ' . $end;
        }
    }

    // Return the first, separator, last
    return reset($segments) . $separator . $end;
}

// Produces:
// The quick...over the lazy dog
// Lorem ipsum...Alia oratio ei duo.
// This is some other...stuff on the end of it.
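Since the question is about hitting a defined size, it may help to measure what this approach actually returns. The sketch below copies the improved function under a hypothetical name, truncate_middle(), with the width cast to an integer and a guard against running out of words (both are assumptions of this sketch). It then prints the requested and actual lengths for two of the sample strings.

```php
<?php
// Hypothetical name; the logic follows the improved version above.
function truncate_middle($string, $total_length, $separator) {
    // Integer wrap width: half the total, minus the separator's length
    $len = (int) (($total_length - strlen($separator)) / 2);
    $segments = explode("\n", wordwrap($string, $len));
    $end = end($segments);

    // Borrow words from the second-last line while the ending is short
    if (strlen($end) <= $total_length / 2 && count($segments) > 2) {
        $prev = explode(' ', prev($segments));
        while ($prev && strlen($end) <= $total_length / 2) {
            $end = array_pop($prev) . ' ' . $end;
        }
    }
    return reset($segments) . $separator . $end;
}

$s1 = "The quick brown fox jumped over the lazy dog";
$s3 = "This is some other long string that ought to get truncated and leave some stuff on the end of it.";

foreach ([[$s1, 30], [$s3, 40]] as [$input, $limit]) {
    $result = truncate_middle($input, $limit, "...");
    printf("%d requested, %d returned: %s\n", $limit, strlen($result), $result);
}
// 30 requested, 29 returned: The quick...over the lazy dog
// 40 requested, 44 returned: This is some other...stuff on the end of it.
```

So $total_length behaves as a target rather than a hard limit: because the ending is padded with whole words, the result can overshoot it.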

5 Comments

Man... While I was waiting for the answer I tried again and I was so close to your approach. I was missing just the right logic for $len. Personally, right now, I don't need the string to have the length as close as possible to the defined one so, if you don't mind, I would like to see a more extended approach to accomplish that. Also, one thing I noticed is that the output gets shorter if $separator has boundary spaces.
@BrunoAugusto Do you mean you would like to see the last paragraph fleshed out better to get closer to the intended length? To fix the issue with spaces around the separator, use strlen(trim($separator)) instead of strlen($separator)
That's what I was looking for. Just out of curiosity, and I hope I'm not abusing you, but would it be possible to have a minimum number of words on each end of the resulting string, regardless of the length defined, for cases in which the separator has more characters than the substrings added around it?
@BrunoAugusto Okay, this version kind of manages a minimum number of words. It would require some finesse to get it really perfect, but I don't have time to work on it more now. I'm not going to edit this version in above, because it is in need of refinement. But hopefully it is sufficient to set you on your way. codepad.viper-7.com/WEXMlf
No problems, you already helped me a lot. I can see that such a little thing would be much more costly to the program.