Consider the following string:
stryng = "{\"Jordi\", \"Jordy\", \"Jorde\", \"Jordie\", \"Jordee\", \"Joardi\", \"Joardy\", \"Joarde\", \"Joardie\", \"Joardee\"}"
That string represents a mathematical set of strings:
$\begin{Bmatrix} \mathtt{Jordi}, \mathtt{Jordy}, \mathtt{Jorde}, \mathtt{Jordie}, \mathtt{Jordee}, \mathtt{Joardi}, \mathtt{Joardy}, \mathtt{Joarde}, \mathtt{Joardie}, \mathtt{Joardee}\end{Bmatrix}$
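As a minimal sketch, the textual representation can be turned into an actual Python set by pulling out every double-quoted item. The helper name `parse_string_set` is hypothetical, and the sketch assumes that no item contains an embedded or escaped double quote.

```python
import re

# The same string as above, written with single quotes to avoid backslashes.
stryng = '{"Jordi", "Jordy", "Jorde", "Jordie", "Jordee", "Joardi", "Joardy", "Joarde", "Joardie", "Joardee"}'


def parse_string_set(text):
    """Return the set of double-quoted items found in `text`.

    Hypothetical helper: assumes items never contain a double quote.
    """
    return set(re.findall(r'"([^"]*)"', text))


names = parse_string_set(stryng)
```

With the set in hand, any candidate regex can be tested against every member with `re.fullmatch`.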
If you have a string representation for a set of strings, such as {"i", "y", "ee"}, how do you convert that set of strings into a regular expression without the logical-disjunction vertical bar |?
I was hoping for something like a character class, except that its members would be strings rather than single characters, so that some members could be longer than one letter.
For example, something close to a regular expression might be:
Joa?r{"i", "y", "ee"}
That fake regex is intended to represent the set of all strings σ such that:
- The leftmost part of σ is `J`.
- The part of σ second from the left is `o`.
- The part of σ third from the left is zero or one instance of the letter `a`.
- The fourth part of σ is the letter `r`.
- The fifth part of σ is exactly one of the strings `i`, or `y`, or `ee`.
The regular expression [iye]e? comes close, but it matches exactly the following set:
{"i", "y", "e", "ie", "ye", "ee"}
We want to forbid the string ye.
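The mismatch can be confirmed by brute force. The sketch below enumerates every string of length one or two over the letters `i`, `y`, `e` and compares what `[iye]e?` accepts with the endings that actually occur in the name set. The variable names are illustrative only.

```python
import re
from itertools import product

pattern = re.compile(r"[iye]e?")

# Endings that occur after "Jord"/"Joard" in the set of names.
wanted = {"i", "y", "e", "ie", "ee"}

# Every string of length 1 or 2 over the alphabet {i, y, e}.
candidates = {"".join(p) for n in (1, 2) for p in product("iye", repeat=n)}

# fullmatch requires the whole string to match, not just a prefix.
matched = {s for s in candidates if pattern.fullmatch(s)}

extra = matched - wanted    # accepted by the regex but not wanted
missing = wanted - matched  # wanted but rejected by the regex
```

Here `extra` contains only the unwanted `ye`, and `missing` is empty.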
In the regex flavor used by Python's re library, can we write a set of strings directly, or do we have to write something else?
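For comparison, the closest built-in construct in `re` is a group containing an alternation, which does use `|`. The sketch below generates such a group from a set so that the bar never has to be typed by hand. The function name `set_to_alternation` is hypothetical, and this illustrates the standard construct rather than a bar-free syntax.

```python
import re


def set_to_alternation(strings):
    """Build a non-capturing group that matches exactly the given strings.

    Hypothetical helper. Longer strings are listed first so that a
    non-anchored search prefers the longest alternative.
    """
    parts = sorted(strings, key=lambda s: (-len(s), s))
    # re.escape protects any regex metacharacters inside the members.
    return "(?:" + "|".join(re.escape(s) for s in parts) + ")"


suffixes = {"i", "y", "ee"}
pattern = "Joa?r" + set_to_alternation(suffixes)
```

The resulting pattern behaves like the fake regex above: it accepts `Jori`, `Joary`, and `Joree`, and rejects `Jorye`.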
`|` is part of the syntax of regular expressions. It's not clear what is allowed. If we take the standard syntax for regexps and remove `|`, all that is left is grouping (`(...)`) and the Kleene star (`*`), which doesn't seem very expressive, but there are multiple different definitions of regexp syntax. Can you provide a self-contained definition of what you are looking for, and say which definition of regular expression you are using and what syntax is allowed?

[...] `|`, and for using a regexp for this. I wonder if this might be an XY problem, but it's hard to know.

If `Jordye` was in the set too, then `Joa?rd[iye]e?` would work, where `[iye]` means "one character which can be `i`, `y` or `e`", and `a?` means "zero or one `a`". But without `Jordye` in the set I'm afraid you have to be content with `Joa?rd(i|y|e|ie|ee)` or `Joa?rd([iye]|[ie]e)`.

[...] `|`, and for using a regexp for this. The reason is that, in general, I want to convert sets containing thousands of strings into regular expressions that are short in length. How do we find a reasonably short regex $r$ such that $\forall \sigma \in A$, $\sigma$ matches $r$, and additionally $\forall \sigma^{\prime} \in A^{-1}$, $\sigma^{\prime}$ does not match $r$? Note that $A$ is a set of strings and $\sigma$ is a string in $A$.
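The requirement stated in the last comment (every string in $A$ matches, strings outside $A$ do not) can be checked mechanically for any candidate. The sketch below tests the two alternation-based candidates from the comments, written with `?`, which is the zero-or-one quantifier in Python's `re`. The helper `matches_exactly` and the list of near misses are assumptions made for illustration; a handful of near misses is not a full check against everything outside $A$.

```python
import re

names = {
    "Jordi", "Jordy", "Jorde", "Jordie", "Jordee",
    "Joardi", "Joardy", "Joarde", "Joardie", "Joardee",
}

# Hypothetical near misses standing in for strings outside the set.
near_misses = {"Jordye", "Joardye", "Jord", "Joaardi", "Jordii"}


def matches_exactly(pattern, positives, negatives):
    """True if `pattern` fully matches every positive and no negative."""
    rx = re.compile(pattern)
    accepts_all = all(rx.fullmatch(s) for s in positives)
    rejects_all = not any(rx.fullmatch(s) for s in negatives)
    return accepts_all and rejects_all


candidates = [r"Joa?rd(i|y|e|ie|ee)", r"Joa?rd([iye]|[ie]e)"]
results = {p: matches_exactly(p, names, near_misses) for p in candidates}

# The shorter pattern without alternation also accepts "Jordye".
too_loose = matches_exactly(r"Joa?rd[iye]e?", names, near_misses)
```

Both candidates pass this check, while the shorter `Joa?rd[iye]e?` fails it because it accepts `Jordye`.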