jubilatious1

Using Raku (formerly known as Perl_6)

~$ raku -e 'my @a; for slurp() { @a = .comb(/^^ Chapter .*? <?before \nChapter | $ > /) }; @a.grep(/quench/).put;' file 

OR (more compactly):

~$ raku -e 'slurp.comb(/^^ Chapter .*? <?before \nChapter | $ > /).grep(/quench/).put;' file 

I'm certain someone will post a Perl answer, but here's an answer written in Raku (a.k.a. Perl6). Raku provides high-level, built-in support for Unicode.

Briefly, the file is slurped in all at once and combed through to locate matching records (Chapters). Think of comb as the global converse of split: pattern(s) are requested and non-destructively selected out. An advantage of combing through with the ^^Chapter .*? <?before \nChapter | $> pattern is that you can still recover individual Chapters that don't have an associated Reference line (important if you're still editing text).

Regex tokens:

  • ^^ start-of-line,
  • Chapter text,
  • .*? zero-or-more any-character (frugally),
  • <?before … > positive lookahead: either \nChapter text, or (after the | alternation) $ end-of-file.
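
To make the comb/split comparison concrete, here is a minimal sketch on a made-up two-chapter string. The sample text and the names $text and @chapters are my own assumptions for illustration, not the OP's file:

```raku
# Hypothetical two-chapter sample, not the OP's file.
my $text = q:to/END/;
Chapter: 1 One
here are some fruits to quench your thirst.
Reference: Chapter 1
Chapter: 2 Two
this chapter has no Reference line
END

# split names the separators to throw away; comb names the pieces to keep.
say 'a,b,c'.split(',').elems;         # 3
say 'a,b,c'.comb(/ <-[,]>+ /).elems;  # 3

# Same regex as in the one-liner: one string per Chapter record.
my @chapters = $text.comb(/^^ Chapter .*? <?before \nChapter | $ > /);
say @chapters.elems;                  # 2
say @chapters.grep(/quench/).elems;   # 1
```

The second record has no Reference line but is still returned as its own element, because the lookahead only needs the next Chapter line or the end of the input to stop.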

Sample Output:

Chapter: 1 One: Birds and Trees Birds are beautiful and trees are amazing and they are dependent on each other. Birds most of the time choose to make their nests on trees since trees provide more stability. One day the bird sat on a tree and said; Bird: Oh my I'm so tired from all the flying, I should take a rest Tree: Mr Bird, you seem tired, perhaps you should take some rest, and here are some fruits to quench your thirst. Bird: Oh thank you very much! Reference: Chapter 1: birds and trees 

In the final statement, grep is used to return only the matching records (Chapters). The sample input is the same as that provided by the OP. Add calls to trim, trim-leading or trim-trailing in the final statement to remove surrounding whitespace, as desired.
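
As a rough sketch of the trimming step: the two record strings below are invented for illustration, and mapping the trim call over each record is just one way to apply it, not necessarily the only one:

```raku
# Invented records with stray whitespace, standing in for comb's output.
my @records = "Chapter: 1 quench\n", "  Chapter: 2 quench  \n";

my @trimmed  = @records.map(*.trim);           # both ends
my @leading  = @records.map(*.trim-leading);   # left side only
my @trailing = @records.map(*.trim-trailing);  # right side only

say @trimmed[1].raku;    # "Chapter: 2 quench"
say @trailing[1].raku;   # "  Chapter: 2 quench"
```

In the one-liner, this would presumably sit between grep and put, e.g. .grep(/quench/).map(*.trim).put.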

https://raku.org
