
For every line of one file, I need to check whether that line, a string containing regular expressions, matches anything in another file.

The problem is that the files are big: the first is 24 MB and the second is 115 MB. I first tried passing $(cat file1) as the first argument of grep, but it complained about the size. I am now trying xargs grep, but I get the same error.

A simple string search works:

find . -name records.txt | xargs grep "999987^00086"
999987^00086^14743^00061^4

But if I try to pass the whole file as the argument with cat, it fails:

find . -name records.txt | xargs grep "$(records_tofix.txt)"
-bash: /usr/bin/xargs Argument list too long on grep
  • I would expect your code to print bash: records_tofix.txt: command not found instead. Commented Nov 6, 2019 at 23:55
  • You'll have a much, much more efficient time of this if you can sort your files and do a single merge operation for set comparisons -- far less memory usage and time that way (after the sort is done, granted, but the sort only needs to be done once per file). See comm as the canonical UNIX tool for set arithmetic (unions, joins, and differences) on sorted input streams. Commented Nov 6, 2019 at 23:56
  • Also, note that xargs should only be used with -0 or -d $'\n' arguments (the latter is a GNUism, but it's a necessary GNUism if you want files with one line per record to be unambiguously and correctly parsed). Commented Nov 6, 2019 at 23:57
  • ...without one of those arguments, foo bar on one line will be treated as two separate records, foo and bar; backslashes, quotes, &c. also get special (shell-like but not-quite-shell-compatible) treatment. Commented Nov 6, 2019 at 23:58
  • Anyhow, find . -name records.txt -exec grep -f records_tofix.txt -- {} + is your friend; no reason to use xargs at all. Commented Nov 6, 2019 at 23:59
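
The xargs parsing behaviour described in the comments above can be seen with a small sketch. The input line is a made-up example, and echo merely stands in for whatever command would receive the arguments:

```shell
# A single input record that contains a space (illustrative value only).
line='foo bar'

# Default xargs parsing splits on whitespace, so echo runs twice:
# once with "foo" and once with "bar".
default_split=$(printf '%s\n' "$line" | xargs -n 1 echo)

# With NUL-delimited input and -0, the record stays in one piece,
# so echo runs once with "foo bar".
nul_split=$(printf '%s\0' "$line" | xargs -0 -n 1 echo)

printf 'default:\n%s\n' "$default_split"
printf 'with -0:\n%s\n' "$nul_split"
```

For the pipeline in the question, the equivalent would presumably be find . -name records.txt -print0 | xargs -0 grep -f records_tofix.txt --, although the -exec form avoids xargs entirely.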

1 Answer

Use the -f option:

grep -f records_tofix.txt records.txt

The file should contain the patterns each on its own line.
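
As a minimal sketch of that layout, the snippet below builds a tiny pattern file and a tiny data file in a temporary directory and searches one with the other. The file names come from the question; the contents are invented for illustration, apart from the first record and pattern, which are taken from the question.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# Hypothetical pattern file: one regular expression per line.
# A ^ that is not at the start of a basic regular expression is a literal.
printf '%s\n' '999987^00086' '5555[0-9]^00002' > "$workdir/records_tofix.txt"

# Hypothetical data file to be searched.
printf '%s\n' \
  '999987^00086^14743^00061^4' \
  '55551^00002^11111^00009^1' \
  '123456^00001^22222^00003^2' > "$workdir/records.txt"

# grep reads every pattern from the file and prints each line of
# records.txt that matches at least one of them.
matches=$(grep -f "$workdir/records_tofix.txt" "$workdir/records.txt")
printf '%s\n' "$matches"

rm -rf "$workdir"
```

Because the patterns are read from a file rather than passed on the command line, the argument length limit that triggered the original error does not come into play.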

find can execute commands directly, so there is no reason to call xargs. The + form of -exec doesn't run the command once per file; it packs as many file names as fit onto each command line, much as xargs does:

find . -name records.txt -exec grep -f records_tofix.txt -- {} + 
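
To try the command without touching real data, a sketch along these lines could be used. The directory layout (subdirectories a and b) and the file contents are assumptions made up for the example, apart from the record and pattern taken from the question.

```shell
# Throwaway directory tree with two records.txt files (hypothetical layout).
workdir=$(mktemp -d)
mkdir -p "$workdir/a" "$workdir/b"

# Pattern file with a single pattern taken from the question.
printf '%s\n' '999987^00086' > "$workdir/records_tofix.txt"

printf '%s\n' '999987^00086^14743^00061^4' '111111^00001^00002^00003^1' \
  > "$workdir/a/records.txt"
printf '%s\n' '222222^00001^00002^00003^2' '999987^00086^99999^00001^7' \
  > "$workdir/b/records.txt"

# With "+", find hands both files to one grep invocation, so grep
# prefixes each matching line with the name of the file it came from.
# sort only makes the order of the output predictable.
found=$(find "$workdir" -name records.txt \
  -exec grep -f "$workdir/records_tofix.txt" -- {} + | sort)
printf '%s\n' "$found"

rm -rf "$workdir"
```

The -- before {} marks the end of grep's options, so a file name that begins with a dash is not mistaken for an option.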