2

I've received some files that need to be loaded into the database. These files have user input, and there are instances where the quote character is used an odd number of times. I'd like to filter out these records.

I would like to grep lines that contain a specific character an odd or even number of times.

Sample input:

12345|what"s wrong|20121212 
2
  • Post a more extended input (valid and invalid lines). "An odd or even" means that they should all be displayed. Commented Jun 6, 2017 at 22:02
  • I want to know how to do both separately, not in one call. I'll need to create 2 files: one with the good records, and one with the bad records that can be manually corrected. Commented Jun 6, 2017 at 22:05

3 Answers

4

With awk:

awk -F \" 'NF % 2' < yourfile 

For an even number of times (an odd number of fields, where fields are "-separated).

awk -F \" 'NF % 2 == 0' < yourfile 

For an odd number of times.

Or to split the file into two files:

awk -F \" '{if (NF%2) print > "even.txt"; else print > "odd.txt"}' < yourfile 
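
The parity flip between quotes and fields is easy to get backwards, so here is a minimal sketch that prints the quote count per line. The three sample records are hypothetical (only the first comes from the question), and the variable names are placeholders. Note that an empty line has NF equal to 0, so it would be classed with the odd-quote lines by the NF test.

```shell
# Hypothetical sample data: lines holding 1, 2 and 0 double quotes.
sample=$(printf '%s\n' \
  '12345|what"s wrong|20121212' \
  '12346|say "hello"|20121213' \
  '12347|no quotes|20121214')

# With " as the field separator, a non-empty line holding n quotes
# splits into n + 1 fields, so NF - 1 is the quote count.
counts=$(printf '%s\n' "$sample" | awk -F '"' '{ print NF - 1 }')
printf '%s\n' "$counts"
```

An odd NF therefore means an even quote count, which is why the bare `NF % 2` test selects the well-formed lines.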

With grep, for even number:

grep -Ex '(([^"]*"){2})*[^"]*' 

For odd number, add the -v option.
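
To get the two separate files asked for in the comments, the same pattern can be run once with and once without -v. This is only a sketch: input.txt, good.txt and bad.txt are placeholder names, and the second and third sample records are made up for illustration.

```shell
# Placeholder file names in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' \
  '12345|what"s wrong|20121212' \
  '12346|say "hello"|20121213' \
  '12347|no quotes|20121214' > "$workdir/input.txt"

# Zero or more pairs of quotes, then a quote-free tail.
pattern='(([^"]*"){2})*[^"]*'

# Even number of quotes (including none): the whole line must match.
grep -Ex "$pattern" "$workdir/input.txt" > "$workdir/good.txt"
# Odd number of quotes: every line the pattern rejects.
grep -Evx "$pattern" "$workdir/input.txt" > "$workdir/bad.txt"
```

Every input line ends up in exactly one of the two files, since -v selects the complement of the first match.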

1
  • The last one doesn't work word by word, even without -x. I'm trying to print all the words that are not between double quotes, but this inverse doesn't work either: egrep -ov '"[ a-zA-Z0-9]+",' testFile Commented Apr 2, 2019 at 12:59
2

Alternative perl approach:

-- to output lines with odd number of " occurrences

perl -ne 'print if y/\"// % 2' yourfile 

-- to output lines with even number of " occurrences

perl -ne 'print if y/\"// % 2 == 0' yourfile 

  • y/// - Perl transliteration operator; it returns the number of characters it matched, which here is the number of quotes on the line
0
sed -ne '
   h;:a
   s/"//;T
   s/"//;ta
   g;p
' yourfile 

Working

  1. Store the original line in the hold space, since the process that follows is destructive.
  2. Set up a loop that removes one quote at a time. If the first removal fails, stop processing this line: an even number of quotes was present.
  3. If the second removal fails, the line had an odd number of quotes, so retrieve the original from the hold space and print it.
  4. Otherwise, loop back.
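
The steps above can be sketched as a commented script run on a few made-up lines. This assumes GNU sed, since the T command is a GNU extension, and the sample strings and the odd_lines variable are placeholders.

```shell
# Assumes GNU sed (T is a GNU extension). Sample lines hold 1, 2, 3 and 0 quotes.
odd_lines=$(printf '%s\n' \
  'a"b' \
  'a"b"c' \
  'a"b"c"d' \
  'abc' |
sed -n '
  # step 1: keep a pristine copy in the hold space
  h
  :a
  # step 2: first removal; if it fails the count was even, so end the cycle
  s/"//
  T
  # step 4: second removal; if it succeeds a full pair is gone, so loop
  s/"//
  ta
  # step 3: only one quote was left, so restore the original and print it
  g
  p
')
printf '%s\n' "$odd_lines"
```

Only the lines with one and three quotes should come out, because those are the ones where the second removal of some pair fails.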
