Making letters uppercase using re.sub in python?

Question

In many programming languages, the following

find foo([a-z]+)bar and replace with GOO\U\1GAR

will result in the entire match being made uppercase. I can't seem to find the equivalent in python; does it exist?

Andrew Clark · Accepted Answer · 2012-01-19 23:02:06Z

You can pass a function to re.sub() as the replacement, which lets you do this. Here is an example:

    def upper_repl(match):
        return 'GOO' + match.group(1).upper() + 'GAR'

And an example of using it:

    >>> import re
    >>> re.sub(r'foo([a-z]+)bar', upper_repl, 'foobazbar')
    'GOOBAZGAR'

I suppose that will have to do although it is awfully complicated, considering.

wim · Accepted Answer · 2022-10-25 19:12:32Z

Unfortunately this \U\1 syntax could never work in Python because \U in a string literal indicates the beginning of a 32-bit hex escape sequence. For example, "\U0001f4a9" == "💩".
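To make the clash concrete, here is a small sketch. It assumes Python 3.7 or later, where an unknown letter escape in a replacement template is an error; template_error is a hypothetical helper name used only for this illustration.

```python
import re

# In a normal (non-raw) string literal, \U must be followed by exactly 8 hex digits.
poo = "\U0001f4a9"
print(len(poo), poo == "\N{PILE OF POO}")


def template_error(template):
    """Return the re.error message for a rejected template, or None if it is accepted."""
    try:
        re.sub(r"foo([a-z]+)bar", template, "foobazbar")
    except re.error as exc:
        return str(exc)
    return None


# A raw string gets \U past the Python parser, but re then rejects it as a bad escape.
print(template_error(r"GOO\U\1GAR"))
# The plain backreference template is accepted.
print(template_error(r"GOO\1GAR"))
```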

However, there are easy alternatives to Perl's case conversion escapes available by using a replacement function. In re.sub(pattern, repl, string, count=0, flags=0) the replacement repl is usually a string, but it can also be a callable. If it is a callable, it is passed the Match object and must return the replacement string to be used.

So, for the example given in the question, this is possible:

>>> string = "fooquuxbar"
>>> pattern = "foo([a-z]+)bar"
>>> re.sub(pattern, lambda m: f"GOO{m.group(1).upper()}GAR", string)
'GOOQUUXGAR'

Here is a table of other string methods which might be useful for similar case conversions.

Modifier	Description	Example	Python callable to use
\U	Uppercase	foo BAR --> FOO BAR	`str.upper`
\L	Lowercase	foo BAR --> foo bar	`str.lower` or `str.casefold`
\I	Initial capital	foo BAR --> Foo Bar	`str.title`
\F	First capital	foo BAR --> Foo bar	`str.capitalize`
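As a rough sketch of how the callables in the table plug into re.sub, the snippet below applies each one to the whole match. convert_case is a hypothetical helper name, not part of the re module, and the pattern r".+" is only chosen so that the full sample string is matched once.

```python
import re


def convert_case(text, pattern, func):
    """Apply func to the full text of every match of pattern."""
    return re.sub(pattern, lambda m: func(m.group(0)), text)


sample = "foo BAR"
print(convert_case(sample, r".+", str.upper))       # FOO BAR
print(convert_case(sample, r".+", str.lower))       # foo bar
print(convert_case(sample, r".+", str.title))       # Foo Bar
print(convert_case(sample, r".+", str.capitalize))  # Foo bar
```

Passing the unbound method (str.upper rather than a lambda) works because each callable takes the matched text as its only argument.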

wim · Accepted Answer · 2022-10-25 19:13:40Z

If you already have a replacement string (template), you may not be keen on swapping it out with the verbosity of m.group(1)+...+m.group(2)+...+m.group(3)... Sometimes it's nice to have a tidy little string.

You can use the Match object's expand() method to evaluate a template for the match in the same manner as sub(), allowing you to retain as much of your original template as possible. You can then call upper() on the relevant pieces.

re.sub(r'foo([a-z]+)bar', lambda m: 'GOO' + m.expand(r'\1GAR').upper(), 'foobazbar')

While this would not be particularly useful in the example above, and while it does not aid with complex circumstances, it may be more convenient for longer expressions with a greater number of captured groups, such as a MAC address censoring regex, where you just want to ensure the full replacement is capitalized or not.
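For the MAC address case, something along these lines might do. This is only a sketch: the pattern, the template and the name censor_mac are made up for illustration, and real-world MAC formats (dashes, no separators) are not handled.

```python
import re

# Hypothetical pattern: six colon-separated hex pairs, capturing only the first three.
MAC = re.compile(
    r"([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{2}):[0-9a-f]{2}:[0-9a-f]{2}:[0-9a-f]{2}",
    re.IGNORECASE,
)

# The template stays a tidy raw string; the last three pairs are masked.
TEMPLATE = r"\1:\2:\3:xx:xx:xx"


def censor_mac(text):
    """Mask the device part of each MAC address and uppercase the whole replacement."""
    return MAC.sub(lambda m: m.expand(TEMPLATE).upper(), text)


print(censor_mac("eth0 aa:bb:cc:dd:ee:ff up"))  # eth0 AA:BB:CC:XX:XX:XX up
```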

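Just wanted to add that the template passed to expand() should be a raw string: m.expand(r'\1') works, while in m.expand('\1') the '\1' is an octal escape in the string literal, so expand() receives the single character chr(1) (ASCII 001) instead of a backreference. This is how Python string literals behave in general, not something specific to one interpreter version.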
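A quick check of the raw-string point, reusing the pattern and input from the question:

```python
import re

m = re.match(r"foo([a-z]+)bar", "foobazbar")

# Raw string: expand() sees a backslash followed by 1 and substitutes group 1.
print(m.expand(r"\1"))         # baz

# Non-raw string: the literal is already the single character chr(1),
# so there is no backreference left for expand() to substitute.
print("\1" == chr(1))          # True
print(m.expand("\1") == chr(1))  # True
```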

highBandWidth · Accepted Answer · 2012-01-19 22:57:55Z

You could use some variation of this:

import re

s = 'foohellobar'

def replfunc(m):
    return m.groups()[0] + m.groups()[1].upper() + m.groups()[2]

re.sub('(foo)([a-z]+)(bar)', replfunc, s)

gives the output:

'fooHELLObar'
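If the surrounding text should stay untouched, a different technique (not the one in this answer) is to keep foo and bar out of the match with lookarounds, so the function only has to uppercase the match itself. A minimal sketch, with upper_between as a hypothetical name:

```python
import re


def upper_between(text):
    """Uppercase the lowercase letters sitting between 'foo' and 'bar'."""
    # (?<=foo) and (?=bar) assert the context without consuming it.
    return re.sub(r"(?<=foo)[a-z]+(?=bar)", lambda m: m.group(0).upper(), text)


print(upper_between("foohellobar"))  # fooHELLObar
```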

Michael Delgado · Accepted Answer · 2015-05-09 21:47:43Z

For those coming across this on Google...

You can also use re.sub to match repeating patterns. For example, you can convert a string with spaces to camelCase:

import re

def to_camelcase(string):
    string = string[0].lower() + string[1:]  # lowercase the first character
    return re.sub(
        r'[\s]+(?P<first>[a-z])',            # whitespace followed by a lowercase letter
        lambda m: m.group('first').upper(),  # uppercase that letter, dropping the whitespace
        string)

to_camelcase('String to convert')  # --> stringToConvert
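One thing to watch with the function above is that string[0] raises IndexError on an empty string. A possible variant that tolerates empty input and, as an extra assumption beyond the original answer, also treats underscores and hyphens as separators (to_camelcase_safe is a hypothetical name):

```python
import re


def to_camelcase_safe(text):
    """camelCase a phrase; empty input is returned unchanged."""
    # Slicing with [:1] avoids the IndexError that text[0] raises on "".
    text = text[:1].lower() + text[1:]
    # Drop each run of separators and uppercase the letter that follows it.
    return re.sub(r"[\s_-]+([a-zA-Z])", lambda m: m.group(1).upper(), text)


print(to_camelcase_safe("String to convert"))  # stringToConvert
print(to_camelcase_safe("snake_case-name"))    # snakeCaseName
```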
