Timeline for best / any way to count escapes in glibc regex(7) / GNU sed / grep / egrep Regular Expressions?
Current License: CC BY-SA 4.0
33 events
| when | what | by | license | comment | |
|---|---|---|---|---|---|
| Jan 3, 2024 at 1:26 | comment | added | Ed Morton | @JVD I know strings can contain escapes, I just don't know what you intended doubling them to mean as described in my comment. I'm not confused by any behavior ksh or zsh might have, just wanted to know what you intended your bash code to do. I see echo $s in your comment instead of echo "$s" - is that the intent of the double escapes, so you can remove the quotes that should be around "$s" and have one escape left after bash consumes one of them? | |
| Jan 2, 2024 at 21:08 | comment | added | JVD | @EdMorton : In bash shell, Double-Quoted strings can contain Escapes : $ declare s="'a quot\\'d string' 42"; $ declare -p s declare -- s="'a quot\\'d string' 42" $ echo $s 'a quot\'d string' 42 I think this is a source of the confusion, since zsh / ksh would have different behaviour. | |
| Jan 2, 2024 at 20:42 | history | edited | JVD | CC BY-SA 4.0 | added 3654 characters in body |
| Jan 2, 2024 at 20:36 | history | edited | JVD | CC BY-SA 4.0 | added 3654 characters in body |
| Jan 2, 2024 at 20:30 | history | edited | JVD | CC BY-SA 4.0 | added 3654 characters in body |
| Jan 2, 2024 at 20:23 | history | edited | JVD | CC BY-SA 4.0 | added 3654 characters in body |
| Dec 25, 2023 at 15:25 | comment | added | Ed Morton | In echo "'a quot\\'d string' 42" - is \\' intended to be an escaped ' or an escaped \ followed by a '? If the former, why do you have 2 \s before the '? If the latter, doesn't that make it an invalid string as it has an odd number of unescaped 's? | |
| Dec 25, 2023 at 15:16 | comment | added | Ed Morton | Please edit your question to provide concise, testable sample input and expected output that demonstrates all your needs so we can best help you. Focus on WHAT you want to do rather than HOW you're trying to do it. | |
| Dec 25, 2023 at 3:02 | answer | added | JVD | timeline score: -3 | |
| Dec 23, 2023 at 22:24 | history | became hot network question | |||
| Dec 23, 2023 at 17:14 | answer | added | Stéphane Chazelas | timeline score: 4 | |
| Dec 23, 2023 at 16:23 | comment | added | JVD | One could imagine, after processing the RE character sequence ']{X,}', that the compiler / matcher knows how many elements have matched the whole '[$class]{$at_least,$at_most}' group ; so, why not let programmers, with a new syntax, refer and react to this known number somehow ? Like '\#<expr>' to perform a test on that number ? And then being able to conditionally include $A if test succeeds, $B if it fails, in RE ? Yes, maybe old-style Back-References can be used - but I'd like to allow any program using the glibc POSIX regexp library to handle quoted strings OK. | |
| Dec 23, 2023 at 16:16 | comment | added | JVD | @StéphaneChazelas : yes, I know, there was not enough room in previous comment to say that of course in that context I am using an "IFS=$'\x09'" setting, so bash will split $my_arr only on "\t" / tabs, but still where should my sed script insert the tabs ? If it cannot determine in the expression exactly how long is the preceding sequence of escapes, or if this is an ODD or EVEN number . | |
| Dec 23, 2023 at 16:09 | comment | added | Stéphane Chazelas | Doing arr=( $str ) or arr=( $(cmd) ) does split+glob on the contents of $str (or the output of cmd), quotes are not involved in that. | |
| Dec 23, 2023 at 15:54 | comment | added | JVD | @StéphaneChazelas : I am trying to find the easiest / simplest / most elegant way of getting Shell scripts to parse strings into tuples that contain singly or doubly quoted strings, is all - consider : $ declare my_mbrs="'a quot\'d string' 42"; $ declare -a my_arr=($my_mbrs); $ declare -p my_arr declare -a my_arr=([0]="'a" [1]="quot\\'d" [2]="string'" [3]="42") I am trying to develop the right sed expression so I can say : declare -a my_arr=($(echo "$my_mbrs"| sed -n "$X")) and I will get the correct (my_arr[0]=="a quot'd string" my_arr[1]==42 | |
| Dec 23, 2023 at 15:46 | comment | added | Stéphane Chazelas | FWIW, POSIX EREs have no back-references. POSIX BREs do. GNU BRE and ERE do except in GNU awk. | |
| Dec 23, 2023 at 15:39 | comment | added | Stéphane Chazelas | I can't make much sense of what you're asking. Maybe you can start by clarifying what problem you're trying to solve. | |
| Dec 23, 2023 at 15:34 | history | edited | JVD | CC BY-SA 4.0 | added 258 characters in body |
| Dec 23, 2023 at 15:32 | comment | added | JVD | @rowboat: the end-of-line escaping by '\' provided by grep / sed / bash will suffice - this should be done before text gets pattern-matched . | |
| Dec 23, 2023 at 15:29 | history | edited | JVD | CC BY-SA 4.0 | added 258 characters in body |
| Dec 23, 2023 at 15:26 | comment | added | JVD | @rowboat: yes, in previous investigations of this, I did not have "\\'" in ${q} prefix & suffix, and it worked better, but as I mentioned in text, in order to enclose in any single quoted string you need backslash, and if in double quotes a single backslash is replaced by the char it escapes. Will amend RE accordingly in Example # 2. | |
| Dec 23, 2023 at 15:20 | history | edited | JVD | CC BY-SA 4.0 | added 248 characters in body |
| Dec 23, 2023 at 15:17 | comment | added | rowboat | BTW [\'] matches on backslash or single quotation mark, so your expression matches \foo', 'foo\, \foo\ and 'foo' | |
| Dec 23, 2023 at 15:13 | comment | added | terdon♦ | Also, you mention "POSIX RE" (which presumably means Basic Regular Expressions, BRE) but also mention GNU grep which supports Extended Regular Expressions (ERE) but also Perl Compatible Regular Expressions (PCRE). Please clarify which regex language you are actually interested in. I am pretty sure you know significantly more about this than I do, but I still think you could clarify a bit and it isn't just my own ignorance that is confusing me. | |
| Dec 23, 2023 at 15:12 | comment | added | JVD | @terdon : OK , thanks, will do. | |
| Dec 23, 2023 at 15:11 | comment | added | terdon♦ | Please edit your question and add some example inputs and what you are expecting as output. Show us what kind of things you are trying to match. It is extremely hard to parse that regex without such examples. | |
| Dec 23, 2023 at 15:10 | comment | added | JVD | I think I'm going to have to develop & submit some sort of patch to glibc Regexps to be able to have, as a GNU extension, some kind of Previous { .. } Group Number of Elements Referral, ('}\#') and Test / Arithmetic, and Conditional Expressions inside REs - I could really use such a thing right now . | |
| Dec 23, 2023 at 15:05 | history | edited | JVD | CC BY-SA 4.0 | deleted 1 character in body |
| Dec 23, 2023 at 14:59 | history | edited | JVD | CC BY-SA 4.0 | added 7 characters in body |
| Dec 23, 2023 at 14:53 | history | edited | JVD | CC BY-SA 4.0 | added 12 characters in body |
| Dec 23, 2023 at 14:30 | history | edited | JVD | CC BY-SA 4.0 | added 182 characters in body |
| Dec 23, 2023 at 14:23 | review | First questions | | | |
| Dec 23, 2023 at 17:56 | | | | | |
| Dec 23, 2023 at 14:23 | history | asked | JVD | CC BY-SA 4.0 | |
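
The split+glob behaviour described in the comments above, and the question of whether a run of backslashes before a quote is odd or even, can be illustrated with a small sketch. It is not taken from the question or from either answer. The sample value comes from the comment thread; the `tokenize` helper and its ERE are hypothetical, and the `-o` option assumes GNU grep (or another grep that supports it). The pattern never counts backslashes: it consumes each backslash together with the character after it, so a quote is only treated as closing when an even number of backslashes precedes it.

```sh
#!/bin/sh
# Sample value from the comments; inside double quotes, \\ yields one backslash.
my_mbrs="'a quot\\'d string' 42"

# 1. Unquoted expansion: the shell splits on IFS and ignores any quote
#    characters stored inside the value.
set -f              # turn off globbing so that only the splitting is shown
set -- $my_mbrs     # intentionally unquoted
count=$#
set +f
printf 'unquoted expansion gives %s words\n' "$count"

# 2. Hypothetical helper: print one token per line. Inside single quotes a
#    backslash and the character following it are consumed as one unit.
tokenize() {
    printf '%s\n' "$1" | grep -oE "'([^'\\\\]|\\\\.)*'|[^[:space:]]+"
}
tokenize "$my_mbrs"
```

With the sample value, the first step should report 4 words, matching the `declare -p my_arr` output quoted in the comments, while `tokenize` should print two lines: `'a quot\'d string'` and `42`. The tokens still carry their quotes and backslashes; removing them would be a separate step that this sketch does not attempt.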