Timeline for Only remove commas embedded within quotes in a comma delimited file
Current License: CC BY-SA 3.0
13 events
| when | what | | by | license | comment |
|---|---|---|---|---|---|
| Nov 22, 2012 at 10:44 | comment | added | Stéphane Chazelas | That solution doesn't work in cases like `echo '"", "asd"' \| perl -pe 's/"(.+?[^\\])"/($ret = $1) =~ (s#,##g); $ret/ge'` | |
| Sep 20, 2012 at 12:04 | comment | added | Thor | @CraigSanders: Agreed, that soon becomes too complicated. I did find a simpler two tier sed alternative which I added to my answer. | |
| Sep 20, 2012 at 11:28 | comment | added | cas | @Thor: nice. I started playing with /e but couldn't get it to replace multiple commas inside quotes or cope with multiple quote-delimited sections on a line. also, perl is easier to read and understand when sed scripts start getting too complicated. | |
| Sep 20, 2012 at 11:04 | comment | added | Thor | @CraigSanders: with GNU sed you sort of can with the execute flag and sub-shelling, e.g.: `sed -r 's/(.*)"([^"]+)"(.*)/(echo -n "\1"; echo -n "\2" \| tr -d ','; echo -n "\3")/e'` | |
| Sep 20, 2012 at 10:53 | comment | added | user1146332 | Thanks for your comment. It would be interesting to know whether the `[^"]*` approach or the explicit non-greedy approach consumes less CPU time. | |
| Sep 20, 2012 at 10:21 | comment | added | cas | +1. after trying a few things with sed, I checked sed's docs and confirmed that it can't apply a replace to just the matching portion of a line... so gave up and tried perl. Ended up with a very similar approach but this version uses `[^"]*` to make the match non-greedy (i.e. matches everything from one `"` to the next `"`): `perl -pe 's/"([^"]+)"/($match = $1) =~ (s:,::g);$match;/ge;'`. It does not acknowledge the outlandish idea that a quote might be escaped with a backslash :-) | |
| Sep 20, 2012 at 9:31 | comment | added | tojrobinson | Including the non \ in your capture group produces an equivalent result. +1 | |
| Sep 20, 2012 at 9:26 | comment | added | user1146332 | Thanks for your objection, I have corrected that. Nevertheless, I think we don't need a lookbehind assertion here, or do we? | |
| Sep 20, 2012 at 9:24 | history | edited | user1146332 | CC BY-SA 3.0 | edited body |
| Sep 20, 2012 at 9:13 | comment | added | tojrobinson | The `[^\\]` is going to have the undesired effect of matching the last character inside the quotes and removing it (non `\` character), i.e., you should not consume that character. Try `(?<!\\)` instead. | |
| Sep 20, 2012 at 9:08 | history | edited | user1146332 | CC BY-SA 3.0 | added 53 characters in body |
| Sep 20, 2012 at 9:01 | history | edited | user1146332 | CC BY-SA 3.0 | added 1 character in body |
| Sep 20, 2012 at 8:56 | history | answered | user1146332 | CC BY-SA 3.0 | |
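
The lookbehind idea raised in the comments can be sketched in Python rather than Perl or sed (a swapped-in language, chosen only for illustration). The function name `strip_quoted_commas` and the pattern name `QUOTED` are hypothetical and do not come from the answer. The sketch assumes a quote is escaped only by a single preceding backslash, and it keeps the surrounding quotes in the output, which is a choice of this sketch rather than something the thread prescribes.

```python
import re

# A quoted field runs from a '"' to the next '"' that is NOT preceded by a
# backslash. The lookbehind (?<!\\) checks the previous character without
# consuming it, so the last character inside the quotes is left intact.
# The lazy .*? may match nothing, so an empty field "" is matched on its own.
QUOTED = re.compile(r'"(.*?)(?<!\\)"')


def strip_quoted_commas(line: str) -> str:
    """Remove commas inside double-quoted fields; leave all other commas alone."""
    # m.group(0) is the whole quoted field including its quotes.
    return QUOTED.sub(lambda m: m.group(0).replace(",", ""), line)


if __name__ == "__main__":
    print(strip_quoted_commas('a,"b,c",d'))   # a,"bc",d
    print(strip_quoted_commas('"", "asd"'))   # "", "asd"
```

The second call is the case from the Nov 22 comment: because `.*?` is allowed to be empty, the empty field `""` is matched by itself and the separator comma after it survives. A quote preceded by an escaped backslash (`\\"`) is not handled by this sketch.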