Linked Questions
33 questions linked to/from Remove all special characters, punctuation and spaces from string
-1 votes
3 answers
4k views
how to take away all the spaces and punctuation in a string? (python) [duplicate]
def pre_process(t): """ (str) -> str returns a copy of the string with all punctuation removed, and all letters set to lowercase. The only characters in the output will be lowercase letters,...
-4 votes
1 answer
2k views
How to remove special character from string in python if string contains script other than english [duplicate]
I have a string which contains both English and Arabic and now I need to remove special characters. I know there exists a regex solution: re.sub('[^A-Za-z0-9]+', '', mystring) but this regex is also ...
-2 votes
1 answer
236 views
Is there a function that will remove any symbols (%, #, &, etc.) from a string? [duplicate]
I want to remove ALL symbols from my string efficiently. x = hello!! r = dict.fromkeys(map(ord, '\n ' + string.punctuation)) x.translate(r) I expected this to remove all symbols instead of only a ...
0 votes
1 answer
126 views
Reading from file each line to list in Python [duplicate]
Here is my code: with open(path) as file: lines = file.readlines() print lines[0:5] However I get many extra characters, for example: ['cat2\xc2\xa0\xc2\xa0 2\xc2\xa0 0', 'cat1\xc2\xa00.5\...
-1 votes
1 answer
145 views
Other than text, how to remove numbers, punctuation, white spaces and special characters from text? [duplicate]
I just scraped text data from a website and that data contains numbers, special characters and punctuation. After splitting the data, I tried to keep only plain text, but I'm getting spaces, numbers, ...
113 votes
7 answers
34k views
List comprehension without [ ] in Python [duplicate]
Joining a list: >>> ''.join([ str(_) for _ in xrange(10) ]) '0123456789' join must take an iterable. Apparently, join's argument is [ str(_) for _ in xrange(10) ], and it's a list ...
97 votes
8 answers
82k views
What exactly is a "raw string regex" and how can you use it?
From the python documentation on regex, regarding the '\' character: The solution is to use Python’s raw string notation for regular expression patterns; backslashes are not handled in any special ...
37 votes
6 answers
33k views
How to account for accent characters for regex in Python?
I currently use re.findall to find and isolate words after the '#' character for hash tags in a string: hashtags = re.findall(r'#([A-Za-z0-9_]+)', str1) It searches str1 and finds all the hashtags. ...
16 votes
4 answers
18k views
Why does unpacking a tuple cause a syntax error?
In Python, I wrote this: bvar=mht.get_value() temp=self.treemodel.insert(iter,0,(mht,False,*bvar)) I'm trying to expand bvar to the function call as arguments. But then it returns: File "./...
5 votes
4 answers
41k views
How to remove special characters from txt files using Python
from glob import glob pattern = "D:\\report\\shakeall\\*.txt" filelist = glob(pattern) def countwords(fp): with open(fp) as fh: return len(fh.read().split()) print "There are" ,sum(map(...
2 votes
2 answers
9k views
How to replace all characters except letters, numbers, forward and back slashes
Want to parse through text and return only letters, digits, forward and back slashes and replace all else with ''. Is it possible to use just one regex pattern as opposed to several which then calls ...
3 votes
2 answers
16k views
Bash: How to use sed to remove all characters except letters and numbers?
First off, I'm still learning about regular expressions. I have googled this, but it still doesn't work. How do I remove all characters except letters and numbers in a variable with sed? For example ...
0 votes
3 answers
20k views
strip or remove all special characters from list of strings in python
I have a list of strings and I have to remove all special characters (, - ' " .). My code is: import glob import re files = [] for text in glob.glob("*.txt.txt"): with open(text) as f: fileRead =...
3 votes
3 answers
10k views
How to remove special characters from csv using pandas
Currently cleaning data from a CSV file. Successfully made everything lowercase, removed stopwords and punctuation etc. But need to remove special characters. For example, the CSV file contains things ...
2 votes
2 answers
10k views
Python number recognition (on colored screen)
I searched for image recognition using python. It seems there is no tutorial about Extracting Numbers from colored background so I followed THIS TUTORIAL import cv2 import matplotlib.pyplot as plt ...