13 events
when what by license comment
Nov 22, 2012 at 10:44 comment added Stéphane Chazelas That solution doesn't work in cases like echo '"", "asd"' | perl -pe 's/"(.+?[^\\])"/($ret = $1) =~ (s#,##g); $ret/ge'
Sep 20, 2012 at 12:04 comment added Thor @CraigSanders: Agreed, that soon becomes too complicated. I did find a simpler two tier sed alternative which I added to my answer.
Sep 20, 2012 at 11:28 comment added cas @Thor: Nice. I started playing with /e but couldn't get it to replace multiple commas inside quotes or cope with multiple quote-delimited sections on a line. Also, perl is easier to read and understand when sed scripts start getting too complicated.
Sep 20, 2012 at 11:04 comment added Thor @CraigSanders: with GNU sed you sort of can with the execute flag and sub-shelling, e.g.: sed -r 's/(.*)"([^"]+)"(.*)/(echo -n "\1"; echo -n "\2" | tr -d ','; echo -n "\3")/e'
Sep 20, 2012 at 10:53 comment added user1146332 Thanks for your comment. It would be interesting to know whether the [^"]* approach or the explicit non-greedy approach consumes less CPU time.
Sep 20, 2012 at 10:21 comment added cas +1. After trying a few things with sed, I checked sed's docs and confirmed that it can't apply a replacement to just the matching portion of a line, so I gave up and tried perl. I ended up with a very similar approach, but this version uses the negated character class [^"]+ instead of a non-greedy quantifier, so each match runs from one " to the next ": perl -pe 's/"([^"]+)"/($match = $1) =~ (s:,::g);$match;/ge;'. It does not acknowledge the outlandish idea that a quote might be escaped with a backslash :-)
Sep 20, 2012 at 9:31 comment added tojrobinson Including the non \ in your capture group produces an equivalent result. +1
Sep 20, 2012 at 9:26 comment added user1146332 Thanks for your objection, I have corrected that. Nevertheless, I think we don't need a lookbehind assertion here, or do we?
Sep 20, 2012 at 9:24 history edited user1146332 CC BY-SA 3.0
edited body
Sep 20, 2012 at 9:13 comment added tojrobinson The [^\\] is going to have the undesired effect of matching the last character inside the quotes and removing it (non \ character), i.e., you should not consume that character. Try (?<!\\) instead.
Sep 20, 2012 at 9:08 history edited user1146332 CC BY-SA 3.0
added 53 characters in body
Sep 20, 2012 at 9:01 history edited user1146332 CC BY-SA 3.0
added 1 characters in body
Sep 20, 2012 at 8:56 history answered user1146332 CC BY-SA 3.0