Hey, I am having a major issue with encoding in Python. I am not too familiar with Python and have been stuck on this bug for weeks. I feel like I've tried every possible thing, but I can't seem to get it.
I am reading in files to work with, and I get the following error on some files that contain Chinese characters.
    'ascii' codec can't encode characters in position 10314-10316: ordinal not in range(128)

    Traceback (most recent call last):
      File "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 112, in get_response
        response = wrapped_callback(request, *callback_args, **callback_kwargs)
      File "/usr/lib/python2.7/site-packages/cc_counter-0.65-py2.7.egg/cc_counter/views.py", line 154, in reviewrequest_recent_cc
        prev_reviewrequest_ccdata = _reviewrequest_recent_cc(request, review_request_id, False, revision_offset=1)
      File "/usr/lib/python2.7/site-packages/cc_counter-0.65-py2.7.egg/cc_counter/views.py", line 140, in _reviewrequest_recent_cc
        filename, comparison_data = _download_comparison_data(request, review_request_id, revision, filediff_id, modified)
      File "/usr/lib/python2.7/site-packages/cc_counter-0.65-py2.7.egg/cc_counter/views.py", line 89, in _download_comparison_data
        revision, filediff_id, local_site, modified)
      File "/usr/lib/python2.7/site-packages/cc_counter-0.65-py2.7.egg/cc_counter/views.py", line 68, in _download_analysis
        temp_file.write(working_file)
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 10314-10316: ordinal not in range(128)

My code in this area looks like this:

    working_file = get_original_file(filediff, request, encoding_list)
    if modified:
        working_file = get_patched_file(working_file, filediff, request)
    working_file = convert_to_unicode(working_file, encoding_list)[1]
    logging.debug("Encoding List: %s", encoding_list)
    logging.debug("Source File: " + filediff.source_file)
    temp_file_name = "cctempfile_" + filediff.source_file.replace("/","_")
    logging.debug("temp_file_name: " + temp_file_name)
    source_file = os.path.join(HOMEFOLDER, temp_file_name)
    logging.debug("File contents" + working_file)
    #temp_file = codecs.open(source_file, encoding='utf-8')
    #temp_file.write(working_file.encode('utf-8'))
    temp_file = open(source_file, 'w')
    temp_file.write(working_file)
    temp_file.close()

Notice the commented out lines. Working file is never empty. The encoding from the logged "encoding list" is

    Encoding List: [u'iso-8859-15']

Any help would be greatly appreciated. I have to take a break from this after 8 straight hours of debugging today, on top of the previous two weeks.
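
For reference, here is a minimal standalone sketch of what the write step is trying to achieve, separate from the Django view. It assumes `working_file` is already a unicode string at the point of the write (which is what `convert_to_unicode` suggests, but that is an assumption). In Python 2, calling `write()` on a file from the built-in `open()` with a unicode string makes Python encode it implicitly with the ascii codec, which matches the error in the traceback. The sketch opens the file through `io.open` with an explicit encoding instead. The helper names `write_text_utf8` and `read_text_utf8`, the sample string, and the temporary path are all made up for illustration and are not part of cc_counter.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def write_text_utf8(path, text):
    """Write a unicode string to `path`, encoding it explicitly as UTF-8.

    Hypothetical helper: io.open does the encoding, so the caller passes
    unicode text and never calls .encode() itself.
    """
    with io.open(path, 'w', encoding='utf-8') as handle:
        handle.write(text)


def read_text_utf8(path):
    """Read the file back as a unicode string (hypothetical helper)."""
    with io.open(path, 'r', encoding='utf-8') as handle:
        return handle.read()


# Made-up stand-in for working_file: unicode text with Chinese characters.
sample = u"header \u4e2d\u6587\u5b57 footer"

# Made-up stand-in for os.path.join(HOMEFOLDER, temp_file_name).
target = os.path.join(tempfile.mkdtemp(), "cctempfile_demo")

write_text_utf8(target, sample)
round_trip = read_text_utf8(target)
```

If `working_file` turned out to be a byte string rather than unicode at that point, this sketch would not apply as written, since `io.open` in text mode expects unicode input.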