
I have a lot of CSV files that I am having trouble reading, because the delimiter is ',' and one of the fields is a list of comma-separated values in square brackets. As an example:

first,last,list
John,Doe,['foo','234','&3bar']
Johnny,Does,['foofo','abc234','d%9lk','other']

I would like to change the delimiter to '|' (or whatever else) to get:

first|last|list
John|Doe|['foo','234','&3bar']
Johnny|Does|['foofo','abc234','d%9lk','other']

How can I do this? I'm trying to use sed right now, but anything that works is fine.

2 Answers


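I don't know whether this is possible with sed or awk, but you can do it easily with Perl. The pattern first matches any bracketed list and discards that match with (*SKIP)(*F), so only the commas outside square brackets are replaced.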

$ perl -pe 's/\[.*?\](*SKIP)(*F)|,/|/g' file
first|last|list
John|Doe|['foo','234','&3bar']
Johnny|Does|['foofo','abc234','d%9lk','other']

To write the changes back to the file, add the -i (in-place) switch:

perl -i -pe 's/\[.*?\](*SKIP)(*F)|,/|/g' file 
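If Perl is not an option, the same idea can be sketched in Python. This is only a rough equivalent, not a drop-in replacement: Python's re module has no (*SKIP)(*F), so the sketch captures the bracketed part and puts it back unchanged instead. The function name swap_delimiter is made up for illustration, and like the Perl version it assumes the lists are not nested.

```python
import re

# Either a bracketed list (captured in group 1) or a bare comma.
PATTERN = re.compile(r"(\[.*?\])|,")


def swap_delimiter(line, new_delim="|"):
    """Replace commas outside [...] with new_delim (hypothetical helper)."""
    # If group 1 matched, we are inside brackets: return the text unchanged.
    # Otherwise the match is a lone comma, so swap in the new delimiter.
    return PATTERN.sub(lambda m: m.group(1) if m.group(1) else new_delim, line)


if __name__ == "__main__":
    sample = [
        "first,last,list",
        "John,Doe,['foo','234','&3bar']",
        "Johnny,Does,['foofo','abc234','d%9lk','other']",
    ]
    for row in sample:
        print(swap_delimiter(row))
```

The bracketed alternative is listed first so that it wins whenever a '[' is reached, which is what keeps the commas inside the list untouched.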

1 Comment

Thanks, that worked, but it also helped me realize there are even more issues with the format of my original files >:X

If there are always two values before the list, you can use the limit argument to split in Perl:

perl -pe '$_ = join "|", split /,/, $_, 3' list 

This splits on commas into at most 3 fields, so everything after the second comma (the whole list) stays in one piece, then joins the fields back together with a pipe. The -p switch means that each line of input is stored in $_, processed by the code, and then $_ is printed.

Output:

first|last|list
John|Doe|['foo','234','&3bar']
Johnny|Does|['foofo','abc234','d%9lk','other']
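The limited split translates almost directly to Python, in case that is easier to fit into an existing script. A minimal sketch, assuming (as above) that exactly two plain fields come before the list; to_pipes and leading_fields are names invented for this example. Note that Python's maxsplit counts splits rather than fields, so Perl's limit of 3 becomes maxsplit=2.

```python
def to_pipes(line, leading_fields=2):
    """Join the leading fields and the untouched remainder with '|'.

    Hypothetical helper: assumes `leading_fields` plain values come
    before the bracketed list on every line.
    """
    # maxsplit=leading_fields gives leading_fields + 1 pieces, so the
    # final piece (the list, commas included) is never split.
    return "|".join(line.split(",", leading_fields))


if __name__ == "__main__":
    for row in [
        "first,last,list",
        "John,Doe,['foo','234','&3bar']",
        "Johnny,Does,['foofo','abc234','d%9lk','other']",
    ]:
        print(to_pipes(row))
```

This is simpler than the regex approach, but it breaks as soon as a file has a different number of columns before the list.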
