Case insensitive replace

Question

What's the easiest way to do a case-insensitive string replacement in Python?

Brian Moeskau · Accepted Answer · 2020-11-25 19:03:24Z

294

The string type doesn't support this. You're probably best off using the regular expression sub method with the re.IGNORECASE option.

    >>> import re
    >>> insensitive_hippo = re.compile(re.escape('hippo'), re.IGNORECASE)
    >>> insensitive_hippo.sub('giraffe', 'I want a hIPpo for my birthday')
    'I want a giraffe for my birthday'

edited Nov 25, 2020 at 19:03

Brian Moeskau

answered May 28, 2009 at 3:39

Blair Conrad

6 Comments

Chiara Coetzee Over a year ago

If you're only doing a single replace, or want to save lines of code, it's more efficient to use a single substitution with re.sub and the (?i) flag: re.sub('(?i)' + re.escape('hippo'), 'giraffe', 'I want a hIPpo for my birthday')

Eleno Over a year ago

Why re.escape for a string of letters only? Thanks.

Blair Conrad Over a year ago

@Eleno, it's not needed for 'hippo', but it would be useful if the to-replace value were passed into a function, so it's really more of a good example than anything else.

Michael Scheper Jul 23 at 15:04

Well, I think it's a great example, because I use 'hippo' as a metasyntactic variable sometimes, too. 🦛 But I thought the Dandelion Wine song was I want a hippopotamus for Christmas, not 'my birthday'? 🙂

Mark Amery Over a year ago

Besides having to re.escape your needle, there's another trap here which this answer fails to avoid, noted in stackoverflow.com/a/15831118/1709587: since re.sub processes escape sequences, as noted in docs.python.org/library/re.html#re.sub, you need to either escape all backslashes in your replacement string or use a lambda.

stenci Over a year ago

This doesn't work for replacing r'A\BC' with r'D\EF' in r'xxxA\BCxxxA\BCxxx'. The correct answer is below, the one from johv.
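To make the replacement-string trap raised in these comments concrete, here is a minimal sketch. The sample text and the variable names `naive`, `escaped`, and `literal` are illustrative assumptions, not part of the answer.

```python
import re

text = "Open the HIPPO folder"
replacement = r"C:\temp"  # a backslash followed by 't'

# Passing the string directly: re.sub interprets \t as a tab character.
naive = re.sub("hippo", replacement, text, flags=re.IGNORECASE)

# Option 1: double every backslash in the replacement string first.
escaped = re.sub("hippo", replacement.replace("\\", "\\\\"), text, flags=re.IGNORECASE)

# Option 2: pass a function; its return value is inserted without escape processing.
literal = re.sub("hippo", lambda match: replacement, text, flags=re.IGNORECASE)

print(repr(naive))    # contains a real tab
print(repr(escaped))  # keeps the backslash
print(repr(literal))  # keeps the backslash
```

Either fix works; the function form avoids having to remember which characters are special in a replacement string.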

Unknown · Accepted Answer · 2009-05-28 03:41:04Z

117

    import re
    pattern = re.compile("hello", re.IGNORECASE)
    pattern.sub("bye", "hello HeLLo HELLO")
    # 'bye bye bye'

answered May 28, 2009 at 3:41

Unknown

3 Comments

Louis Yang Over a year ago

Or one-liner: re.sub('hello', 'bye', 'hello HeLLo HELLO', flags=re.IGNORECASE)

fuenfundachtzig Over a year ago

Note that re.sub only supports this flag since Python 2.7.

urek mazino Over a year ago

Sometimes it returns random strings.

Bill the Lizard · Accepted Answer · 2018-07-02 20:35:46Z

In a single line:

    import re
    re.sub("(?i)hello", "bye", "hello HeLLo HELLO")  # 'bye bye bye'
    re.sub(r"(?i)he\.llo", "bye", "he.llo He.LLo HE.LLO")  # 'bye bye bye'

Or, use the optional "flags" argument:

    import re
    re.sub("hello", "bye", "hello HeLLo HELLO", flags=re.I)  # 'bye bye bye'
    re.sub(r"he\.llo", "bye", "he.llo He.LLo HE.LLO", flags=re.I)  # 'bye bye bye'
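The second call in each snippet escapes the dot by hand. As a small sketch of why that matters, and of letting re.escape do it instead, consider the following; the sample text containing "HeXLLo" is an assumption made up for illustration.

```python
import re

needle = "he.llo"
text = "he.llo HeXLLo HE.LLO"

# Unescaped, '.' matches any character, so "HeXLLo" is replaced too.
loose = re.sub(needle, "bye", text, flags=re.IGNORECASE)

# re.escape turns the needle into a pattern that matches it literally.
strict = re.sub(re.escape(needle), "bye", text, flags=re.IGNORECASE)

print(loose)   # bye bye bye
print(strict)  # bye HeXLLo bye
```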

Neeraj · Accepted Answer · 2017-06-03 12:39:11Z

25

Continuing on bFloch's answer, this function will change not one, but all occurrences of old with new - in a case insensitive fashion.

    def ireplace(old, new, text):
        idx = 0
        while idx < len(text):
            index_l = text.lower().find(old.lower(), idx)
            if index_l == -1:
                return text
            text = text[:index_l] + new + text[index_l + len(old):]
            idx = index_l + len(new)
        return text

edited Jun 3, 2017 at 12:39

Neeraj

answered Jan 23, 2011 at 11:46

rsmoorthy

4 Comments

fyngyrz Over a year ago

Very well done. Much better than regex; it handles all kinds of characters, whereas the regex is very fussy about anything non-alphanumeric. Preferred answer IMHO.

Mad Physicist Over a year ago

All you have to do is escape the regex: the accepted answer is much shorter and easier to read than this.

ideasman42 Over a year ago

Escaping only fixes the matching side; backslashes in the replacement string can still mess things up.

Eugene Over a year ago

Possibly the fastest method for a case-insensitive replace, tested against both using an arrayed string and using regex.

Mark Amery · Accepted Answer · 2016-06-26 16:20:58Z

As Blair Conrad says, string.replace doesn't support this.

Use re.sub, but remember to escape the string being replaced first (with re.escape). Note that re.sub has no flags option in Python 2.6, so there you'll have to use the embedded modifier '(?i)' (or a compiled RE object; see Blair Conrad's answer). Another pitfall is that sub processes backslash escapes in the replacement text when a string is given. To avoid this, pass in a lambda instead.

Here's a function:

    import re

    def ireplace(old, repl, text):
        return re.sub('(?i)' + re.escape(old), lambda m: repl, text)

    >>> ireplace('hippo?', 'giraffe!?', 'You want a hiPPO?')
    'You want a giraffe!?'
    >>> ireplace(r'[binfolder]', r'C:\Temp\bin', r'[BinFolder]\test.exe')
    'C:\\Temp\\bin\\test.exe'

bFloch · Accepted Answer · 2011-01-21 14:09:54Z

7

This doesn't require regular expressions:

    def ireplace(old, new, text):
        """
        Replace case insensitive
        Raises ValueError if string not found
        """
        index_l = text.lower().index(old.lower())
        return text[:index_l] + new + text[index_l + len(old):]

answered Jan 21, 2011 at 14:09

bFloch

3 Comments

rsmoorthy Over a year ago

Good one, however this does not change all occurrences of old with new, but only the first occurrence.

Johannes Bittner Over a year ago

It's less readable than the regex version. No need to reinvent the wheel here.

Chiara Coetzee Over a year ago

It would be interesting to do a performance comparison between this and the upvoted versions, it might be faster, which matters for some applications. Or it might be slower because it does more work in interpreted Python.
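One way to run the comparison suggested in the last comment is a small timeit harness. This is only a sketch under stated assumptions: the function names, the sample text, and the call count are made up for illustration, and the string version returns the text unchanged instead of raising ValueError so that both functions behave the same. No timings are quoted here because they depend on the machine and the input.

```python
import re
import timeit


def ireplace_first_str(old, new, text):
    """Replace the first occurrence only, using str methods."""
    index = text.lower().find(old.lower())
    if index == -1:
        return text  # assumption: return unchanged instead of raising
    return text[:index] + new + text[index + len(old):]


def ireplace_first_re(old, new, text):
    """Replace the first occurrence only, using re.sub with count=1."""
    return re.sub(re.escape(old), lambda match: new, text,
                  count=1, flags=re.IGNORECASE)


sample = "I want a hIPpo for my birthday. " * 100

for func in (ireplace_first_str, ireplace_first_re):
    seconds = timeit.timeit(lambda: func("hippo", "giraffe", sample), number=10_000)
    print(f"{func.__name__}: {seconds:.3f} s for 10,000 calls")
```

Results will likely shift with the length of the text and how early the first match appears, so it is worth timing with data that resembles the real workload.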

Peter · Accepted Answer · 2023-02-13 17:29:21Z

An interesting observation about syntax details and options:

    # Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)] on win32
    >>> import re
    >>> old = "TREEROOT treeroot TREerOot"
    >>> re.sub(r'(?i)treeroot', 'grassroot', old)
    'grassroot grassroot grassroot'
    >>> re.sub(r'treeroot', 'grassroot', old)
    'TREEROOT grassroot TREerOot'
    >>> re.sub(r'treeroot', 'grassroot', old, flags=re.I)
    'grassroot grassroot grassroot'
    >>> re.sub(r'treeroot', 'grassroot', old, re.I)
    'TREEROOT grassroot TREerOot'

Using the (?i) prefix in the match expression or adding flags=re.I as a fourth argument will result in a case-insensitive match - however using just re.I as the fourth argument does not result in case-insensitive match.

For comparison:

    >>> re.findall(r'treeroot', old, re.I)
    ['TREEROOT', 'treeroot', 'TREerOot']
    >>> re.findall(r'treeroot', old)
    ['treeroot']

From the re.sub docs, it takes 5 parameters: re.sub(pattern, repl, string, count=0, flags=0). That is why flags=re.I works, while passing re.I as the fourth positional argument fails: it lands in the count position.
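A short sketch of what the positional call actually does: re.I has the integer value 2, so in the fourth position it is read as count=2. The longer sample string below is an assumption, chosen so that the replacement limit becomes visible.

```python
import re

old = "TREEROOT treeroot TREerOot treeroot treeroot"

print(int(re.I))  # 2

# A fourth positional argument is `count`, so re.sub(pattern, repl, old, re.I)
# behaves like the explicit call below: case-sensitive, at most 2 replacements.
as_count = re.sub(r"treeroot", "grassroot", old, count=int(re.I))

# Passing the flag by keyword gives the case-insensitive match.
as_flags = re.sub(r"treeroot", "grassroot", old, flags=re.I)

print(as_count)  # TREEROOT grassroot TREerOot grassroot treeroot
print(as_flags)  # grassroot grassroot grassroot grassroot grassroot
```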

Mitchell van Zuylen · Accepted Answer · 2024-03-19 13:47:53Z

This function uses both the str.replace() and re.findall() functions. It will replace all occurrences of pattern in string with repl in a case-insensitive way.

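    import re

    def replace_all(pattern, repl, string) -> str:
        # Collect every case variant of the match that actually occurs in the text
        occurrences = re.findall(pattern, string, re.IGNORECASE)
        for occurrence in occurrences:
            # Replace each variant literally with str.replace
            string = string.replace(occurrence, repl)
        return string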

Nimantha · Accepted Answer · 2021-11-15 03:51:10Z

I was having \t in my replacement text converted to a tab character, and found that this is because re.sub processes backslash escapes in the replacement string.

To prevent that I wrote the following:

    import re

    def ireplace(findtxt, replacetxt, data):
        # Replace case insensitive.
        return replacetxt.join(re.compile(findtxt, flags=re.I).split(data))

Also, if you do want escape sequences in your find or replace strings to take effect (as happens in the other answers here, where backslash sequences with a special meaning get converted), just decode your find and/or replace string first. In Python 3, str has no .decode method, so you might have to do something like .encode().decode("unicode_escape") instead.

    findtxt = findtxt.decode('string_escape')        # Python 2
    replacetxt = replacetxt.decode('string_escape')  # Python 2
    data = ireplace(findtxt, replacetxt, data)

Tested in Python 2.7.8
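For Python 3, a hedged sketch of the same split-and-join idea might look like this. The name `ireplace_split` and the use of re.escape are assumptions added here; the answer's own function passes findtxt to re.compile as a regular expression.

```python
import re


def ireplace_split(findtxt, replacetxt, data):
    # re.escape keeps findtxt literal and guarantees the pattern has no
    # capture groups (re.split would otherwise return captured text too).
    pieces = re.split(re.escape(findtxt), data, flags=re.IGNORECASE)
    # str.join inserts replacetxt as-is, so backslashes are left alone.
    return replacetxt.join(pieces)


print(ireplace_split("hippo", r"C:\temp", "a HIPPO and a (hippo)"))
# a C:\temp and a (C:\temp)
```

Because the replacement never passes through re.sub, no backslash handling is needed on that side.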

Nimantha · Accepted Answer · 2021-11-15 03:51:47Z

1

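    i = 'I want a hIPpo for my birthday'
    key = 'hippo'
    swp = 'giraffe'
    # Split a lowercased copy on the key to find where the matches fall
    o = i.lower().split(key)
    c = 0  # index of the current piece
    p = 0  # start of the current piece in the original string
    for w in o:
        # Restore the original casing of this piece
        o[c] = i[p:p + len(w)]
        p = p + len(key + w)
        c += 1
    print(swp.join(o))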

edited Nov 15, 2021 at 3:51

Nimantha

answered Feb 16, 2012 at 13:59

anddan

3 Comments

isaaclw Over a year ago

For learning: generally when you do a search and replace on a string, it's better to not have to turn it into an array first. That's why the first answer is probably the best. While it's using an external module, it's treating the string as one whole string. It's also a bit clearer what's happening in the process.

Todd Over a year ago

For learning: it's very difficult for a developer with no context to read this code and decipher what it's doing :)

LazerDance Over a year ago

Any code that has counter++ is bad in general.

Chadee Fouad · Accepted Answer · 2023-09-19 03:37:59Z

1 line simple solution without imports :-)

    words = 'GREETINGS from EGYPT. GreeTings from Cairo'
    replace_what, replace_with = 'Greetings', 'Hello'
    result = ' '.join([replace_with if word.lower() == replace_what.lower() else word for word in words.split(' ')])
    print(result)

The result is:

Hello from EGYPT. Hello from Cairo
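Note that this approach compares whole space-separated words, so a word with punctuation attached is not matched. The sketch below shows the difference on a modified sample (the added comma is an assumption) and, for contrast, a regex variant with \b word boundaries, which does need an import.

```python
import re

words = "GREETINGS, from EGYPT. GreeTings from Cairo"
replace_what, replace_with = "Greetings", "Hello"

# Word-by-word comparison: "GREETINGS," (with the comma) is not equal to the
# target word, so it is left alone.
by_split = " ".join(
    replace_with if word.lower() == replace_what.lower() else word
    for word in words.split(" ")
)

# Regex with \b word boundaries: punctuation next to the word does not matter.
by_regex = re.sub(
    r"\b" + re.escape(replace_what) + r"\b",
    lambda match: replace_with,
    words,
    flags=re.IGNORECASE,
)

print(by_split)  # GREETINGS, from EGYPT. Hello from Cairo
print(by_regex)  # Hello, from EGYPT. Hello from Cairo
```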
