Can't use case conversion with the "regex" and "re" modules in Python [duplicate]

Question

import regex
st = """
<!-- Start of page -->
<HTML>
<!-- Start of head -->
<HEAD>
<TITLE>My Title</TITLE> <!-- Page title -->
</HEAD>
<!-- Body -->
<BODY>
"""
pat = regex.compile(r"<!-{2,}(.*?)-{2,}>")
st2 = pat.sub(r'\U\1\E', st)
print(st2)

In the code above, I am trying to do case conversion with the "regex" module (I also tried the "re" module). I want to convert all text inside HTML comments to upper case, for example <!-- Start of page --> to <!-- START OF PAGE -->. But this syntactically correct code, which I expected to work, gives me this error:

Traceback (most recent call last):
  File "C:/Users/m.m/PycharmProjects/untitled9/source.py", line 13, in <module>
    st2 = pat.sub(r'\U\1\E', st)
  File "C:\Users\m.m\.virtualenvs\untitled5\lib\site-packages\regex\regex.py", line 676, in _compile_replacement_helper
    is_group, items = _compile_replacement(source, pattern, is_unicode)
  File "C:\Users\m.m\.virtualenvs\untitled5\lib\site-packages\regex\_regex_core.py", line 1696, in _compile_replacement
    return False, [parse_repl_hex_escape(source, HEX_ESCAPES[ch], ch)]
  File "C:\Users\m.m\.virtualenvs\untitled5\lib\site-packages\regex\_regex_core.py", line 1764, in parse_repl_hex_escape
    source.string, source.pos)
regex._regex_core.error: incomplete escape \U at position 3

It seems that it does not know the purpose of \U and \L, and it raises an "incomplete escape" error.

I am currently using Python 3.7. I have also tried this with the "re" module, but it does not work either.

I want to know what the problem is. I have seen many books that do case conversion with regex, so why this does not work has been an enigma to me.

Is the problem in the syntax, or does it come from Python's regex implementation itself not supporting this kind of case conversion?

In this question, I am trying to convert the text using \E etc., or formally "using case conversion in regex".

Are there no HTML parsers/libraries that allow you to modify comments? Using RegEx for this seems horrible.
– AMC, Commented Dec 1, 2019 at 7:30
Also, you state "it does not know what is the purpose of \U and \L", yet your code contains \U and \E. Which is it?
– AMC, Commented Dec 1, 2019 at 7:34
@MohVahedi I’m asking whether you are using \U and \L or \U and \E. I’m not certain that the error is caused by the re module’s inability to handle certain valid expressions.
– AMC, Commented Dec 1, 2019 at 7:39
I’m having trouble finding information on \E, do you know of any good resources for either one?
– AMC, Commented Dec 1, 2019 at 7:40
@MohVahedi I would be interested in seeing that book. In any case, here is a similar question right here on Stack Overflow: stackoverflow.com/q/28588603/11301900.
– AMC, Commented Dec 1, 2019 at 7:42

score 2 · Accepted Answer · 2019-12-01 20:21:46Z

import re  # st is the string defined in the question
print(re.sub(r"<!-{2,}(.*?)-{2,}>", lambda x: "<!--" + x.group(1).upper() + "-->", st))

This uses the re module. The second argument of sub can be either a string or a callable. If it is a callable, it is called once for every match with the match object as its argument, and the string it returns is used as the replacement, so you can apply ordinary string operations such as upper() to the matched text.

This gives

<!-- START OF PAGE -->
<HTML>
<!-- START OF HEAD -->
<HEAD>
<TITLE>My Title</TITLE> <!-- PAGE TITLE -->
</HEAD>
<!-- BODY -->
<BODY>
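
The callable form described above can also be written as a named function, which may be easier to read and extend than a lambda. What follows is only a minimal sketch: the names `html`, `COMMENT`, and `upper_comment` are made up for illustration, and the pattern is a variation on the one in the question that captures the dashes separately so that each comment keeps its original delimiters.

```python
import re

# Sample input modeled on the HTML snippet from the question (hypothetical name).
html = """
<!-- Start of page -->
<HTML>
<TITLE>My Title</TITLE> <!-- Page title -->
"""

# Capture the opening dashes, the comment text, and the closing dashes separately.
COMMENT = re.compile(r"<!(-{2,})(.*?)(-{2,})>")


def upper_comment(match):
    """Return the matched comment with its text upper-cased and its dashes unchanged."""
    opening, text, closing = match.groups()
    return "<!" + opening + text.upper() + closing + ">"


# sub() calls upper_comment once per match and inserts whatever it returns.
result = COMMENT.sub(upper_comment, html)
print(result)
```

Text outside the comments, such as the TITLE element, is left as it was because the function only ever sees the matched comments.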

I am sorry, but case conversion the way I stated it is not supported in Python and it raises an error, according to this resource, which is indeed a reliable resource.
@TheNamesAlc Don’t forget to actually edit your answer once the realization has hit ;)
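
On the point raised in the comments that \U...\E case conversion is not supported: as far as I know, the standard re module on Python 3.7 and later rejects an unknown letter escape such as \U in a replacement string, while the traceback in the question shows the regex module trying to read \U as a hex escape. The sketch below only exercises re, and the names `pattern`, `text`, `template_error`, and `converted` are illustrative.

```python
import re

pattern = re.compile(r"<!-{2,}(.*?)-{2,}>")
text = "<!-- Body -->"

# The Perl-style template from the question is rejected when the
# replacement string is parsed (assumes Python 3.7 or later).
try:
    pattern.sub(r"\U\1\E", text)
    template_error = None
except re.error as exc:
    template_error = exc

print("replacement template rejected:", template_error)

# The callable form does the case conversion in Python code instead.
converted = pattern.sub(lambda m: m.group(0).upper(), text)
print(converted)
```

Here the whole match is upper-cased, which is harmless because the comment delimiters contain no letters.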
