5
$\begingroup$

In the tutorial Working with String Patterns, the Wolfram documentation gives a simple Grep function:

Grep[file_, patt_] := 
 With[{data = Import[file, "Lines"]}, 
  Pick[Transpose[{Range[Length[data]], data}], 
   StringFreeQ[data, patt], False]]

However, the actual grep command is far more sophisticated than this. For example, running grep -nr -C 2 <pattern> searches for <pattern> recursively through a directory and shows line numbers plus 2 lines of context around each match. Wolfram Language should in principle be able to do this (and perhaps even better, for example by using datasets?).

Concrete Question: How can one use Wolfram to create a grep function that at least reproduces grep -nr -C N <pattern> functionality? (If it simply wraps the actual grep command that's fine).

$\endgroup$
2
  • 2
    $\begingroup$ I think the only way here is to call grep from WL. Rewriting all the functionality in WL would take a very long time. $\endgroup$ Commented Aug 23, 2019 at 22:17
  • 1
    $\begingroup$ Does every Mathematica system have access to grep (e.g. Windows)? $\endgroup$ Commented Aug 24, 2019 at 10:18

2 Answers

12
$\begingroup$

Mathematica allows text searching using regular expressions (based on the PCRE library). It would take some work to re-implement the whole grep functionality within Mathematica, but for your concrete example

grep -nr -C 2 <pattern> 

it is as easy as follows:

ClearAll[Grep]
Grep[files_List, patt_, c_Integer: 0, style : {__} : {Red, Bold}] := 
  Monitor[
   Do[Grep[files[[i]], patt, c, style], {i, Length[files]}], 
   ProgressIndicator[i, {1, Length[files]}]];
Grep::noopen = "Can't open \"``\".";
Grep[file_, patt_, c_Integer: 0, style : {__} : {Red, Bold}] := 
  Module[{lines, pos}, 
   Quiet[
    Check[lines = ReadList[file, "String"], 
     Return[Message[Grep::noopen, file], Module]], 
    {ReadList::noopen, ReadList::stream}];
   pos = Flatten@Position[StringContainsQ[lines, patt], True];
   If[pos =!= {}, 
    Echo@Grid[
      Prepend[
       {#, Column@StringReplace[
            lines[[Span[Max[# - c, 1], UpTo[# + c]]]], 
            str : patt :> "\!\(\*StyleBox[\"" <> str <> "\"," <> 
              StringRiffle[ToString /@ style, ", "] <> "]\)"]} & /@ pos, 
       {file, SpanFromLeft}], 
      Dividers -> All, Alignment -> Left]];];

where

  • file or files is a file name/path or a list of them

  • patt is a literal string, StringExpression or RegularExpression pattern to search for

  • c is the number of additional lines of leading and trailing context to show around each match

  • style is a List of styling directives to be applied to the matching text (here I use the great solution by halirutan from this answer); if you don't want to apply a style, pass {{}} as the value of this argument
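The context window controlled by c is taken with Span, Max and UpTo, which clip it at the start and end of the file. A minimal sketch of just that step, where lines and window are illustrative names that do not appear in the answer:

```wolfram
(* Stand-in for the ten lines of a file; "lines" and "window" are hypothetical names *)
lines = CharacterRange["a", "j"];

(* Same Span expression as in Grep: clip at line 1 and at the last line *)
window[pos_Integer, c_Integer] := lines[[Span[Max[pos - c, 1], UpTo[pos + c]]]]

window[5, 1]  (* {"d", "e", "f"} *)
window[2, 3]  (* {"a", "b", "c", "d", "e"}: clipped at the first line *)
window[9, 3]  (* {"f", "g", "h", "i", "j"}: clipped at the last line *)
```

Note that, unlike grep -C, overlapping windows of neighbouring matches are not merged; each match gets its own row in the Grid.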

To list every file in a directory and in all of its subdirectories at any depth, use FileNames and filter out the directories: Select[FileNames[All, dir, Infinity], Not@*DirectoryQ]. A very enlightening discussion of how to use it to obtain only specific file paths can be found here.
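As a rough illustration of narrowing that listing, in the spirit of grep's --include, FileNames also accepts a name pattern or a list of patterns as its first argument. The directory and the extensions below are assumptions chosen only for the example:

```wolfram
(* Assumption: search below the current working directory *)
dir = Directory[];

(* All files at any depth, directories removed (as in the answer) *)
allFiles = Select[FileNames[All, dir, Infinity], Not@*DirectoryQ];

(* Only files whose names match one of several patterns *)
someFiles = Select[FileNames[{"*.txt", "*.m"}, dir, Infinity], Not@*DirectoryQ];
```

Either list can be passed directly as the first argument of Grep.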

Examples

Find lines containing the word "Welfare" and display each with 1 line of leading and trailing context:

Grep[FindFile@"ExampleData/USConstitution.txt", WordBoundary ~~ "Welfare" ~~ WordBoundary, 1] 

screenshot1

Search for the word "eye" in all files in a directory and all of its subdirectories:

dir = FileNameJoin[{$InstallationDirectory, "Documentation/English/System/ExamplePages"}];
files = Select[FileNames[All, dir, Infinity], Not@*DirectoryQ];
Grep[files, WordBoundary ~~ "eye" ~~ WordBoundary]

(* during evaluation it displays ProgressIndicator *)

screenshot2
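Grep above only displays its results. If the matches are needed as data (the question mentions datasets), a variant along the following lines could return them instead. This is a sketch, not part of the original answer: grepData is a hypothetical name, and it omits the error handling and styling that Grep has.

```wolfram
(* Hypothetical helper: return matches as a list of associations instead of printing *)
grepData[file_String, patt_, c_Integer: 0] := 
 Module[{lines, pos},
  lines = ReadList[file, String];  (* one string per line *)
  pos = Flatten@Position[StringContainsQ[lines, patt], True];
  (* one association per matching line, with its clipped context window *)
  <|"File" -> file, "Line" -> #, 
     "Context" -> lines[[Span[Max[# - c, 1], UpTo[# + c]]]]|> & /@ pos]

(* Collect the matches from several files *)
grepData[files_List, patt_, c_Integer: 0] := 
 Join @@ (grepData[#, patt, c] & /@ files)
```

Wrapping the result in Dataset, as in Dataset[grepData[files, "eye", 2]], gives a browsable table of file, line number and context.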

$\endgroup$
2
  • $\begingroup$ This is great. Is there a way to highlight the searched word throughout the Grid (something that grep does in most terminals)? $\endgroup$ Commented Sep 5, 2019 at 16:25
  • $\begingroup$ @George Please see the updated answer. $\endgroup$ Commented Sep 5, 2019 at 17:01
7
$\begingroup$

Update

As recommended in the comments by @b3m2a1, you can also use RunProcess as a simpler way to execute grep. Supply the command as a list (the executable followed by its arguments) and set ProcessDirectory, so that no SetDirectory call is needed. Here the whole grep command line is handed to bash -c so that the shell expands the * glob. To search recursively for NotebookDirectory in notebook files, enter the following:

cmd = "grep -RH \"NotebookDirectory\" --include=\"Int*.nb\" *";
RunProcess[{"bash", "-c", cmd}, "StandardOutput", 
 ProcessDirectory -> NotebookDirectory[]]
(* "Absorption/BakedSlider/InterphaseMassTransfer_slider.nb: \
RowBox[{\"NotebookDirectory\", \"[\", \"]\"}], \"]\"}], \";\"}],
Absorption/BakedSlider/InterphaseMassTransfer_slider.nb:" *)
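For the exact invocation in the question, grep -nr -C N <pattern>, the same approach might look like the sketch below. It assumes a grep executable is on the PATH seen by the kernel (not a given on Windows); systemGrep is a hypothetical name. Because no shell is involved, the arguments are passed as separate list elements and need no escaping.

```wolfram
(* Hypothetical wrapper around the system grep; assumes grep is on the PATH *)
systemGrep[patt_String, dir_String, n_Integer: 2] := 
 Module[{res},
  res = RunProcess[
    {"grep", "-nr", "-C", ToString[n], "-e", patt, "."}, 
    ProcessDirectory -> dir];
  (* grep exits with 0 on a match, 1 on no match, and >1 on an error *)
  If[! AssociationQ[res] || res["ExitCode"] > 1, Return[$Failed, Module]];
  (* one string per output line, as grep would print it *)
  StringSplit[res["StandardOutput"], "\n"]]
```

A call such as systemGrep["NotebookDirectory", NotebookDirectory[], 2] should then return the grep output as a list of lines, or an empty list when nothing matches.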

Original Answer

Here is an example that calls the system grep (on Windows, I use Cygwin and put bash.exe on my PATH). Remember to escape special characters. The following does a recursive directory search for "NotebookDirectory", restricted to Mathematica notebooks matching the pattern "Int*.nb".

SetDirectory[NotebookDirectory[]];
file = CreateFile[];
Run["grep -RH \"NotebookDirectory\" --include=\"Int*.nb\" * >>" <> file];
FilePrint[file]
DeleteFile[file];
(*
Absorption/BakedSlider/InterphaseMassTransfer_slider.nb: RowBox[{"NotebookDirectory", "[", "]"}], "]"}], ";"}],
Absorption/BakedSlider/InterphaseMassTransfer_slider.nb: RowBox[{"NotebookDirectory", "[", "]"}], "]"}], ";"}],
Absorption/BakedSlider/InterphaseMassTransfer_sliderb.nb: RowBox[{"NotebookDirectory", "[", "]"}], "]"}], ";"}],
Absorption/BakedSlider/InterphaseMassTransfer_sliderb.nb:
*)
$\endgroup$
5
  • 1
    $\begingroup$ You can call RunProcess and have it adjust the path automatically so you don't need to put it on there. That'll also return the actual string. This is how I implemented my hook into Grep. $\endgroup$ Commented Aug 23, 2019 at 23:27
  • $\begingroup$ @b3m2a1 Thanks for the tip. Do you mind if I update my answer with your suggestion? $\endgroup$ Commented Aug 24, 2019 at 0:25
  • $\begingroup$ I mind not in the slightest $\endgroup$ Commented Aug 24, 2019 at 0:25
  • 2
    $\begingroup$ Use ProcessDirectory. You don’t need to set the directory at all. Also can use ProcessEnvironment to set the path. $\endgroup$ Commented Sep 7, 2019 at 20:20
  • $\begingroup$ @b3m2a1 That is a much better way to do it. I will edit the post. $\endgroup$ Commented Sep 8, 2019 at 17:52
