Python 'utf8' codec can't decode byte 0xc3 in position 72: invalid continuation byte [duplicate]

Question

I am crawling a particular URL from google.com, but I get this error:

'utf8' codec can't decode byte 0xc3 in position 72: invalid continuation byte

Code:

import re
import os
import MySQLdb
import codecs
import requests
import base64
import random
import gzip
import time
from multiprocessing.pool import Pool
import datetime
import time
import sys

reload(sys)
sys.setdefaultencoding('utf-8')


def proxy_mesh():
    while True:
        try:
            data = requests.get('google.com')
            print data.text.encode('utf-8')
        except Exception, e:
            print e
            print "Trying again"
            time.sleep(3)


proxy_mesh()

What is the fix, and how can I overcome this error?

In other words, you're trying to decode using UTF-8 while the encoding was done differently.
– Leb, Commented Mar 23, 2016 at 1:33
Can you give the traceback? This could be occurring implicitly in several places.
– ShadowRanger, Commented Mar 23, 2016 at 1:37
@Mounarajan as suggested in the link I provided, you need to use a different encoding. Can't tell you which one without more information.
– Leb, Commented Mar 23, 2016 at 1:41
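
The mismatch described in the comments can be reproduced without any network access. The following is a minimal sketch in Python 3 (the question's code is Python 2), assuming the bytes were actually encoded as Latin-1. The sample string and the names `raw`, `error`, and `text` are made up for illustration and do not come from the question.

```python
# Build 72 ASCII bytes followed by "Ã" (byte 0xC3 in Latin-1) and a plain "a".
raw = ("x" * 72 + "Ãa").encode("latin-1")

error = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # In UTF-8, 0xC3 starts a two-byte sequence, but the next byte (0x61, "a")
    # is not a valid continuation byte, so decoding fails at offset 72.
    error = exc
    print(exc)

# Decoding with the encoding that was actually used succeeds.
text = raw.decode("latin-1")
print(text[-2:])
```

The exception's `start` attribute gives the offset of the offending byte, which is the "position" reported in the message.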

Mark Tolonen · Accepted Answer · 2016-08-17 08:25:19Z

Keep it simple and it works. The requests module has already decoded the response body: data.text is a Unicode string, so there is no need to encode or decode it again.

import requests

data = requests.get('https://www.whoisxmlapi.com/whoisserver/WhoisService?domainName=http://N%E2%94%[email protected]&outputFormat=json')
print data.text

Since it is a JSON response, you may also want to process it:

import json

print json.loads(data.text)
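
If you ever need to start from the raw bytes (`data.content` in requests) instead of `data.text`, the idea is the same: decode exactly once with the correct encoding, then parse. Here is a minimal Python 3 sketch of that idea. The helper `parse_json_body` and the sample payload are hypothetical and are not part of requests.

```python
import json


def parse_json_body(raw, encoding="utf-8"):
    """Decode raw response bytes once, then parse the text as JSON."""
    text = raw.decode(encoding)  # bytes -> str, done a single time
    return json.loads(text)


# Hypothetical payload standing in for a response body.
payload = '{"name": "Ã"}'.encode("utf-8")
record = parse_json_body(payload)
print(record["name"])
```

With requests itself, `data.json()` performs the equivalent parse on the response.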
