1

I'm trying to use this online tool to make a regular expression that lets me insert a series of line breaks in a list, but I can't find a way to do it.

I'm not a programmer, but I understand the big picture of regular expressions. The problem is that this expression is too complex for me.

I have a list like this:

line1
line2
keyword
line3
line4
line6
line7
keyword
line8
line9
keyword
line10

And I want to insert a line break before each line that precedes a "keyword" line, so that the list ends up looking something like this:

line1

line2
keyword
line3
line4
line6

line7
keyword
line8

line9
keyword
line10

I'm using the "Find a Pattern Using a RegExp" feature on this other tool to locate the start of the line, but none of my attempts have worked so far.

I know that with the expression /\bKeyword/ I can select the target word "keyword", and I think that with /^/ I can go to the start of the line, but I haven't been able to write an expression that works in the tool and inserts the line breaks.

Any help or clue on this is welcome.

1
  • Thanks, everybody, for your help! The answer from @bitinerant works great with the tool I'm using. The other answers were very thorough and helped me better understand how regular expressions are structured. Thank you so much again. Commented Sep 18, 2019 at 21:59

3 Answers

1

Here is the solution.

Find:

/([^\n]+(\n))(keyword\n)/g 

Replace with:

$2$1$3 

This solution works with the tool the OP is using. (The \n sequence does not work as expected in the 'replace with' box, so I had to capture that in the input.)
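For anyone who wants to check this outside the online tool, here is a minimal JavaScript sketch of the same find/replace. The variable names are illustrative only, and the sample list is assumed to have one item per line with a trailing newline.

```javascript
// Sample list, one item per line (assumed layout of the OP's data).
const input = [
  'line1', 'line2', 'keyword', 'line3', 'line4', 'line6',
  'line7', 'keyword', 'line8', 'line9', 'keyword', 'line10',
].join('\n') + '\n';

// $1 = the line before "keyword" (including its newline),
// $2 = just that newline, $3 = the "keyword" line itself.
const pattern = /([^\n]+(\n))(keyword\n)/g;

// Emitting $2 first reuses the captured newline, so no "\n" escape
// has to be typed into the replacement text.
const output = input.replace(pattern, '$2$1$3');

console.log(output);
```

Note that this pattern expects a newline right after "keyword", so a "keyword" on the very last line with no trailing newline would be left alone.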

0

You'll need a regexp engine that can match across multiple lines (not sed). You can use this regular expression: (\n?)(.*)\nkeyword (replace \n with \r\n if you're dealing with Windows line endings).

Explanation:

  • \n means a line feed character
  • .* means match a sequence of 0 or more (*) characters that are not line breaks (.).
  • Parentheses () "capture" the matched string. Each set of parentheses produces a captured group numbered according to its position: $1, $2, $3 (both dollar and backslash notations are in use).

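The substitution expression will be \n$1$2\nkeyword, which means "a line feed, followed by the first captured string (the optional line feed) and the second captured string (the line's contents), then a line feed and "keyword"". When the first group has captured a line feed, this gives two line feeds in a row before the line.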
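As a rough check, the same pattern and substitution can be tried in JavaScript. This is only a sketch: the g flag and the variable names are my additions, and the sample data is assumed to use plain \n line endings.

```javascript
// Sample list with "\n" line endings (assumed; use \r\n in the pattern
// and replacement for Windows line endings).
const input = [
  'line1', 'line2', 'keyword', 'line3', 'line4', 'line6',
  'line7', 'keyword', 'line8', 'line9', 'keyword', 'line10',
].join('\n');

// $1 = optional newline before the line, $2 = the line's contents.
const pattern = /(\n?)(.*)\nkeyword/g;

// In JS source the "\n" in the replacement string is a real newline,
// because the JS parser resolves the escape before .replace() sees it.
const output = input.replace(pattern, '\n$1$2\nkeyword');

console.log(output);
```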

Working fiddle here. (The tool you use.)

2 Comments

Nothing in your regex is dependent on "multiline" mode.
@KennethK: This regexp also won't match if the keyword occurs on the second line of the input. To make that work, you do need ^ and multiline mode (or else some kluge like (^|\n)).
0

So basically you're trying to match a line followed by another line that contains a certain keyword, and insert a newline before it? If so, this should work:

/^.*\n.*keyword/mg 

Here:

  1. In multiline mode (enabled with the m flag) the caret (^) matches the beginning of a line.

  2. .* matches zero or more non-newline characters, i.e. the contents of a single line.

  3. \n matches a newline.

  4. .*keyword again matches any number of non-newline characters followed by the literal string keyword.

You can use this regexp in a search-and-replace operation to insert the newlines like this:

let string = `line1
line2
keyword
line3
line4
line6
line7
keyword
line8
line9
keyword
line10`;

string = string.replace( /^.*\n.*keyword/gm, '\n$&' );

console.log( string );

Note that the ^ at the beginning isn't strictly needed, since most regexp engines (including any PCRE-compatible ones) will try to start each match as early as possible, and the earliest place .* can start matching something is at the beginning of a line. But having it there makes the intent clearer, and may also speed up the matching by telling the engine that there's no point in even trying to start a match in the middle of a line.

Also note that the regexp above will not match anything after keyword on the second line. For this particular replacement that doesn't matter, but if you do want to match the rest of the line, just add another .* after keyword in the regexp.

Conversely, if you know that the keyword always occurs at the beginning of the line (or if you don't actually want to match lines where it doesn't), you can remove the .* before keyword from the regexp. Or, if the keyword must occur alone on the line, you can remove the .* before keyword and add an end-of-line anchor ($) after it.
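To make the difference between these variants concrete, here is a small sketch that runs all three on the same made-up input. The sample text and variable names are purely illustrative.

```javascript
// Made-up sample: "keyword" appears at the start of a line with trailing
// text, in the middle of a line, and alone on a line.
const text = 'intro\na\nkeyword here\nb\nsee keyword\nc\nkeyword';

// Original pattern: keyword may appear anywhere on the second line.
const anywhere = text.replace(/^.*\n.*keyword/gm, '\n$&');

// Keyword must start the second line: drop the .* before it.
const atLineStart = text.replace(/^.*\nkeyword/gm, '\n$&');

// Keyword must be alone on the second line: also add the $ anchor.
const aloneOnLine = text.replace(/^.*\nkeyword$/gm, '\n$&');

console.log(JSON.stringify(anywhere));
console.log(JSON.stringify(atLineStart));
console.log(JSON.stringify(aloneOnLine));
```

The first variant inserts a blank line before "a", "b" and "c"; the second only before "a" and "c"; the third only before "c".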


P.S. As bitinerant notes, the online search-and-replace tool you're using doesn't support escape sequences like \n in the replacement. To work around that limitation, you can use the regexp /^.*(\n).*keyword/mg and the replacement string $1$&. Here, the parenthesized subexpression (\n) just matches the newline between the two lines, so that we can use $1 in the replacement to insert an extra newline without having to somehow enter a literal newline into the replacement field.
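The sketch below illustrates the workaround. It assumes the page passes the text of the replacement field to .replace() unchanged, which is a guess about the tool rather than something verified here; the variable names are illustrative.

```javascript
const input = 'line1\nline2\nkeyword\nline3';

// What a text field would hand over when the user types \n: a backslash
// followed by the letter n, not a newline character (assumed behaviour).
const typedReplacement = '\\n$&';
const broken = input.replace(/^.*\n.*keyword/mg, typedReplacement);

// Workaround: capture the newline in the pattern and reuse it via $1.
const fixed = input.replace(/^.*(\n).*keyword/mg, '$1$&');

console.log(JSON.stringify(broken)); // contains a literal backslash + n
console.log(JSON.stringify(fixed));  // contains a real extra newline
```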

1 Comment

@Amessihel: The lack of \n etc. seems to be a limitation of that specific web page the OP is using. (Or, rather, it's probably caused by the page feeding the user input from the text input field directly into .replace() without parsing it for backslash escape sequences first. In JS code, it's the JS parser that handles those, not the regexp engine.)
