8

I have a text file that contains a date in the form dd/mm/yyyy (e.g. 20/12/2012).

I am trying to use grep to parse the date and show it in the terminal. It works until I hit a certain case:

These are my test cases:

  • grep -E "\d*" returns 20/12/2012
  • grep -E "\d*/" returns 20/12/2012
  • grep -E "\d*/\d*" returns 20/12/2012
  • grep -E "\d*/\d*/" returns nothing
  • grep -E "\d+" also returns nothing

Could someone explain to me why I get this unexpected behavior?

EDIT: I get the same behavior if I replace the " (weak quotes) with ' (strong quotes).

4 Answers

12

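The syntax you used (\d) is not part of POSIX extended regular expressions (ERE), which is what grep -E uses. This has nothing to do with Bash.

Use grep -P instead, which makes GNU grep use Perl-compatible regular expressions (PCRE). For example:

grep -P "\d+/\d+/\d+" input.txt
grep -P "\d{2}/\d{2}/\d{4}" input.txt # more restrictive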

Or, to stick with extended regex, use [0-9] in place of \d:

grep -E "[0-9]+/[0-9]+/[0-9]+" input.txt
grep -E "[0-9]{2}/[0-9]{2}/[0-9]{4}" input.txt # more restrictive
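
As a rough sketch of the difference between the loose and the more restrictive ERE pattern, the snippet below runs both against a throwaway file. The file contents (and the use of mktemp in place of input.txt) are assumptions made up for illustration, not part of the question.

```shell
# Hypothetical sample data; these three lines are invented for illustration.
sample=$(mktemp)
printf '%s\n' '20/12/2012' '1/2/99' 'no date here' > "$sample"

# Loose: one or more digits in each field, so 1/2/99 also matches.
loose=$(grep -E '[0-9]+/[0-9]+/[0-9]+' "$sample")

# Restrictive: exactly 2, 2 and 4 digits, so only 20/12/2012 matches.
strict=$(grep -E '[0-9]{2}/[0-9]{2}/[0-9]{4}' "$sample")

echo "loose:"
echo "$loose"
echo "strict:"
echo "$strict"

rm -f "$sample"
```

Note that neither pattern is anchored, so a longer digit run such as 120/12/20123 would still match; add anchors or -w if that matters for your data.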

1 Comment

-E did it for me :)
4

You could also use -P instead of -E, which lets grep use PCRE syntax:

grep -P "\d+/\d+" file

This works too.


2

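grep and egrep/grep -E don't recognize \d as a digit class. Your first three patterns only appear to work because the asterisk makes \d optional: \d* is allowed to match zero characters, so the line matches even though \d never matches a digit. The fourth pattern fails because, with \d* matching nothing, it effectively requires two adjacent slashes, and \d+ fails because it needs at least one real match.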

Use [0-9] or [[:digit:]].
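
To see why the asterisk hides the problem, here is a minimal sketch that avoids \d altogether. It uses the letter z (an arbitrary stand-in for "something that never occurs in the line") and counts matching lines with -c; the variable names are just for illustration.

```shell
line='20/12/2012'

# z* may match zero characters, so every line matches: count is 1.
star_count=$(printf '%s\n' "$line" | grep -Ec 'z*' || true)

# z+ needs at least one z, which the line lacks: count is 0.
plus_count=$(printf '%s\n' "$line" | grep -Ec 'z+' || true)

# [[:digit:]]+ needs at least one real digit, which the line has: count is 1.
digit_count=$(printf '%s\n' "$line" | grep -Ec '[[:digit:]]+' || true)

echo "z*: $star_count  z+: $plus_count  [[:digit:]]+: $digit_count"
```

The || true is only there because grep exits with status 1 when nothing matches, which would otherwise abort a script running under set -e.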

1 Comment

You got a +1 from me, because you explained to me what's wrong, but actually your alternatives don't work for me :(
2

To help troubleshoot cases like this, the -o flag can be helpful as it shows only the matched portion of the line. With your original expressions:

grep -Eo "\d*" returns nothing - a clue that \d isn't doing what you thought it was.

grep -Eo "\d*/" returns / (twice) - confirmation that \d isn't matching while the slashes are.

As noted by others, the -P flag solves the issue by recognizing "\d", but to clarify Explosion Pills' answer, you could also use -E as follows:

grep -Eo "[[:digit:]]*/[[:digit:]]*/" returns 20/12/

EDIT: Per a comment by @shawn-chin (thanks!), --color can be used similarly to highlight the portions of the line that are matched while still showing the entire line:

grep -E --color "[[:digit:]]*/[[:digit:]]*/" returns 20/12/2012, with the matched "20/12/" portion highlighted in color.
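
A small sketch of the -o approach, assuming a grep that supports -o (GNU and BSD grep both do, although the option is not in POSIX). The date is piped in directly rather than read from a file, and the variable names are made up for the example.

```shell
date_line='20/12/2012'

# Print only the matched text: two digit runs, each followed by a slash.
partial=$(printf '%s\n' "$date_line" | grep -Eo '[[:digit:]]*/[[:digit:]]*/')

# Extend the pattern to cover the year and require at least one digit per field.
full=$(printf '%s\n' "$date_line" | grep -Eo '[[:digit:]]+/[[:digit:]]+/[[:digit:]]+')

echo "partial match: $partial"
echo "full match:    $full"
```

Comparing what -o prints with what you expected the pattern to consume is usually the quickest way to spot a token that is silently matching nothing.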

2 Comments

Good hint about using -o. Alternatively, use --color to highlight the matching text among the returned output.
Thank you for that answer! It's a fantastic one. It's only a shame I get to accept only one answer.
