Linked Questions

-1 votes
3 answers
4k views

def pre_process(t): """ (str) -> str returns a copy of the string with all punctuation removed, and all letters set to lowercase. The only characters in the output will be lowercase letters,...
Eric Choi
-4 votes
1 answer
2k views

I have a string which contains both English and Arabic, and now I need to remove special characters. I know there exists a regex solution: re.sub('[^A-Za-z0-9]+', '', mystring) but this regex is also ...
Mohit Rajpoot
-2 votes
1 answer
236 views

I want to remove ALL symbols from my string efficiently. x = hello!! r = dict.fromkeys(map(ord, '\n ' + string.punctuation)) x.translate(r) I expected this to remove all symbols instead of only a ...
CAKEe
  • 105
0 votes
1 answer
126 views

Here is my code: with open(path) as file: lines = file.readlines() print lines[0:5] However I get many extra characters, for example: ['cat2\xc2\xa0\xc2\xa0 2\xc2\xa0 0', 'cat1\xc2\xa00.5\...
user8524583
-1 votes
1 answer
145 views

I just scraped text data from a website and that data contains numbers, special characters and punctuation. After splitting the data, I tried to keep plain text, but I'm getting spaces, numbers, ...
Jainmiah
  • 477
113 votes
7 answers
34k views

Joining a list: >>> ''.join([ str(_) for _ in xrange(10) ]) '0123456789' join must take an iterable. Apparently, join's argument is [ str(_) for _ in xrange(10) ], and it's a list ...
Alcott
  • 18.7k
97 votes
8 answers
82k views

From the python documentation on regex, regarding the '\' character: The solution is to use Python’s raw string notation for regular expression patterns; backslashes are not handled in any special ...
temporary_user_name
37 votes
6 answers
33k views

I currently use re.findall to find and isolate words after the '#' character for hash tags in a string: hashtags = re.findall(r'#([A-Za-z0-9_]+)', str1) It searches str1 and finds all the hashtags. ...
deadlock
  • 7,380
16 votes
4 answers
18k views

In Python, I wrote this: bvar=mht.get_value() temp=self.treemodel.insert(iter,0,(mht,False,*bvar)) I'm trying to expand bvar to the function call as arguments. But then it returns: File "./...
asdacap
  • 848
5 votes
4 answers
41k views

from glob import glob pattern = "D:\\report\\shakeall\\*.txt" filelist = glob(pattern) def countwords(fp): with open(fp) as fh: return len(fh.read().split()) print "There are" ,sum(map(...
rocksland
  • 163
2 votes
2 answers
9k views

Want to parse through text and return only letters, digits, forward and back slashes and replace all else with ''. Is it possible to use just one regex pattern as opposed to several which then calls ...
lukik
  • 4,090
3 votes
2 answers
16k views

First off, I'm still learning about regular expressions. I have googled this, but it still doesn't work. How do I remove all characters except letters and numbers in a variable with sed? For example ...
Lin
  • 1,931
0 votes
3 answers
20k views

I have a list of strings and I have to remove all special characters (, - ' " .). My code is: import glob import re files = [] for text in glob.glob("*.txt.txt"): with open(text) as f: fileRead =...
AST
  • 137
3 votes
3 answers
10k views

Currently cleaning data from a CSV file. Successfully made everything lowercase, removed stopwords and punctuation, etc. But I need to remove special characters. For example, the CSV file contains things ...
plshelpme_
2 votes
2 answers
10k views

I searched for image recognition using python. It seems there is no tutorial about Extracting Numbers from colored background so I followed THIS TUTORIAL import cv2 import matplotlib.pyplot as plt ...
wert trew
