I have code that reads an HTML file and modifies some text using Beautiful Soup. It works fine, but when I look at the output, this part of my HTML file has been changed automatically:
Original: <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
Changed automatically: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
I don't want any of the file contents to change automatically. Can someone help me with this?
Here is my code:
    import re
    import sys
    from BeautifulSoup import BeautifulSoup

    # Read the HTML file named on the command line
    f = open(sys.argv[1], "r")
    data = f.read()
    f.close()
    soup = BeautifulSoup(data)

    # Replace every comma found in the text nodes
    comma = re.compile(',')
    for t in soup.findAll(text=comma):
        t.replaceWith(t.replace(',', '&sbquo'))

    print soup
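
For comparison, here is a minimal sketch of the same edit written against the newer `bs4` package under Python 3. That is an assumption, since the code above uses the older `BeautifulSoup` module. The helper name `replace_commas` and its `encoding` parameter are hypothetical. As far as I understand, `bs4` rewrites the charset in the `<meta>` tag to match whatever encoding the document is serialized with, so encoding the output as `iso-8859-1` rather than the default `utf-8` should leave that declaration as it was. The sketch inserts the actual U+201A character instead of the literal `&sbquo` text.

```python
import re

from bs4 import BeautifulSoup  # assumption: bs4, not the BeautifulSoup 3 module from the question


def replace_commas(raw_bytes, encoding="iso-8859-1"):
    """Replace commas in text nodes and serialize in the original encoding.

    `replace_commas` and `encoding` are hypothetical names for this sketch.
    """
    # Parse the raw bytes, telling the parser which encoding they are in.
    soup = BeautifulSoup(raw_bytes, "html.parser", from_encoding=encoding)

    # find_all returns a list, so replacing nodes while looping is safe.
    for text_node in soup.find_all(string=re.compile(",")):
        # U+201A is the single low-9 quotation mark that &sbquo; refers to.
        text_node.replace_with(text_node.replace(",", "\u201a"))

    # Serialize with the same encoding the file declared, not the utf-8 default.
    return soup.encode(encoding)
```

Calling it with the file's bytes (opened in `"rb"` mode) returns bytes that can be written back out unchanged. Characters that cannot be represented in the target encoding, such as U+201A in `iso-8859-1`, are expected to come out as numeric character references.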