Remove specific number of digits using sed

Question

John, 1234567
Bob, 2839211
Alex, 2817821
Mary, 9371281

I am currently trying to retrieve the first column with the last 4 digits of the second column using sed, so the output should look like this:

John, 4567
Bob, 9211
Alex, 7821
Mary, 1281

This is my command: 's/\(.*,\)\(.*\)//'. I think it matches the first column up to the comma and the second column up to the end of the line, but I am unsure how to continue.

There should be a line break after each name, number pair, and the command should have an asterisk inside both pairs of parentheses. My apologies.
– NixyCron, Commented Feb 15, 2021 at 12:24
Is the file expected to have more columns? Or will the format always be <text>, <number>?
– KamilCuk, Commented Feb 15, 2021 at 12:26
@KamilCuk It works, thanks a lot. My apologies, I am still very new to this. The format is always text and then a number.
– NixyCron, Commented Feb 15, 2021 at 12:27
@Pyrous Please clarify whether you always have a space character after the , character.
– Sundeep, Commented Feb 15, 2021 at 13:05

Wiktor Stribiżew · Accepted Answer · 2021-02-15 12:28:22Z

You can use

sed 's/^\([^,]*\), *[0-9]*\([0-9]\{4\}\).*/\1, \2/' file

See the online demo.

Details

^ - start of string
\([^,]*\) - Group 1: any zero or more chars other than a comma
, * - a comma and zero or more spaces
[0-9]* - zero or more digits
\([0-9]\{4\}\) - Group 2: four digits
.* - the rest of the line
\1, \2 - the replacement: Group 1, a comma, a space, and the Group 2 value.
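The breakdown above can be checked with a minimal sketch that feeds the question's sample lines to the command through printf instead of a file. The variable names `input`, `result`, and `short`, and the record `Ann, 123`, are assumptions added for illustration only.

```shell
# Sample input from the question, one record per line.
input=$(printf '%s\n' 'John, 1234567' 'Bob, 2839211' 'Alex, 2817821' 'Mary, 9371281')

# Group 1 keeps the name, group 2 keeps the last four digits.
result=$(printf '%s\n' "$input" | sed 's/^\([^,]*\), *[0-9]*\([0-9]\{4\}\).*/\1, \2/')
printf '%s\n' "$result"

# Hypothetical record with fewer than four digits: the pattern does not
# match, so sed prints the line unchanged.
short=$(printf '%s\n' 'Ann, 123' | sed 's/^\([^,]*\), *[0-9]*\([0-9]\{4\}\).*/\1, \2/')
printf '%s\n' "$short"
```

With the sample data this should print the four shortened records from the question, followed by `Ann, 123` unchanged.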

Shawn · Accepted Answer · 2021-02-15 12:28:35Z

Just capture the last four digits of each line and delete any preceding digits:

$ sed 's/[0-9]*\([0-9]\{4\}\)$/\1/' input.txt
John, 4567
Bob, 9211
Alex, 7821
Mary, 1281

If using a version of sed that supports POSIX Extended Regular Expressions, it can be cleaned up a bit to

sed -E 's/[0-9]*([0-9]{4})$/\1/' input.txt
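As a quick sanity check, the sketch below runs the basic and extended forms on the same two sample lines and compares the results. It assumes a sed that accepts -E (GNU and BSD sed both do); the variable names `input`, `bre`, and `ere` are illustrative only.

```shell
# Two sample records from the question.
input=$(printf '%s\n' 'John, 1234567' 'Bob, 2839211')

# Basic regular expression form: groups and intervals need backslashes.
bre=$(printf '%s\n' "$input" | sed 's/[0-9]*\([0-9]\{4\}\)$/\1/')

# Extended form (-E): the same pattern without the backslashes.
ere=$(printf '%s\n' "$input" | sed -E 's/[0-9]*([0-9]{4})$/\1/')

# Both forms should agree on this input.
if [ "$bre" = "$ere" ]; then
  printf '%s\n' "$ere"
fi
```

Because the pattern is anchored with `$`, a line with trailing whitespace or a carriage return after the number would not match in either form.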

RavinderSingh13 · Accepted Answer · 2021-02-15 12:38:36Z

If you are OK with awk, you could try the following. Written and tested with the shown samples in GNU awk.

awk 'BEGIN{FS=OFS=", "} {$2=substr($2,length($2)-3)} 1' Input_file

Explanation: Adding detailed explanation for above.

awk '                           ##Starting awk program from here.
BEGIN{                          ##Starting BEGIN section of this program from here.
  FS=OFS=", "                   ##Setting FS and OFS to comma space here.
}
{
  $2=substr($2,length($2)-3)    ##Getting last 4 characters of 2nd field here.
}
1                               ##Printing current edited/non-edited line.
' Input_file                    ##Mentioning Input_file name here.

2nd solution: If your 2nd column can contain a mix of digits and non-digit characters, the following may help.

awk 'BEGIN{FS=OFS=", "} {gsub(/[^0-9]+/,"",$2);$2=substr($2,length($2)-3)} 1' Input_file

Explanation: Adding detailed explanation for above.

awk '                           ##Starting awk program from here.
BEGIN{                          ##Starting BEGIN section of this program from here.
  FS=OFS=", "                   ##Setting FS and OFS to comma space here.
}
{
  gsub(/[^0-9]+/,"",$2)         ##Globally substituting everything apart from digits with NULL in 2nd field.
  $2=substr($2,length($2)-3)    ##Getting last 4 digits now in 2nd field here.
}
1                               ##Printing current edited/non-edited line.
' Input_file                    ##Mentioning Input_file name here.
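To see how the two awk commands differ, here is a small sketch run on a single record whose second field mixes digits and dashes. The record `John, 12-345-67` and the variable names `mixed`, `first`, and `second` are made up for illustration and do not come from the question.

```shell
# Hypothetical record whose second field mixes digits and dashes.
mixed='John, 12-345-67'

# First command: keeps the last four characters, whatever they are.
first=$(printf '%s\n' "$mixed" | awk 'BEGIN{FS=OFS=", "} {$2=substr($2,length($2)-3)} 1')

# Second command: strips non-digits first, then keeps the last four digits.
second=$(printf '%s\n' "$mixed" | awk 'BEGIN{FS=OFS=", "} {gsub(/[^0-9]+/,"",$2);$2=substr($2,length($2)-3)} 1')

printf '%s\n' "$first"    # John, 5-67
printf '%s\n' "$second"   # John, 4567
```

On purely numeric fields such as those in the question, both commands should give the same result; they only diverge when the field contains non-digit characters.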

fpmurphy · Accepted Answer · 2021-02-15 12:52:44Z

Similar to KamilCuk's answer, except that it uses a POSIX character class and anchors the digits to be removed to the preceding comma and space:

sed 's/, [[:digit:]]\{3\}/, /'

KamilCuk · Accepted Answer · 2021-02-15 12:35:11Z


If the file format is always <text only alphanumeric characters>, <number exactly 7 digits>, you can simply remove the first 3 digits:

sed 's/[0-9][0-9][0-9]//'

edited Feb 15, 2021 at 12:35

answered Feb 15, 2021 at 12:29


1 Comment

fpmurphy Over a year ago

This regex can be shortened to sed 's/[0-9]\{3\}//'
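The fixed-count approaches depend on the input format, which a short sketch can make concrete. It compares the unanchored pattern with the comma-anchored one from fpmurphy's answer. The records `Agent007, 1234567` and `Eve, 12345678` and the variable names are hypothetical, chosen only to probe the stated assumptions.

```shell
# Hypothetical record whose name contains three consecutive digits.
rec='Agent007, 1234567'

# Unanchored: removes the first run of three digits, here inside the name.
unanchored=$(printf '%s\n' "$rec" | sed 's/[0-9]\{3\}//')

# Anchored to the comma and space: removes the first three digits of the number.
anchored=$(printf '%s\n' "$rec" | sed 's/, [[:digit:]]\{3\}/, /')

# Hypothetical 8-digit number: removing three digits leaves five, not four.
eight=$(printf '%s\n' 'Eve, 12345678' | sed 's/, [[:digit:]]\{3\}/, /')

printf '%s\n' "$unanchored"   # Agent, 1234567
printf '%s\n' "$anchored"     # Agent007, 4567
printf '%s\n' "$eight"        # Eve, 45678
```

If the number length can vary, the patterns that capture the last four digits are likely the safer choice.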
