8

I am trying to convert my output to JSON, but I get an error. Without the json.dump call I get the data in Base64 form, but as soon as I use json.dump the error below is raised.

Code:

import json
import base64

with open(r"C:/Users/Documents/pdf2txt/outputImage.jpg","rb") as img:
    image = base64.b64encode(img.read())

data['ProcessedImage'] = image
print(json.dump(data)

Output:

TypeError: Object of type 'bytes' is not JSON serializable 

When using:

print(json.dumps(dict(data))) 

It also shows the same error.

2
  • Please post valid code that produces the behavior described. The code shown fails to parse or run for at least two separate reasons. Without valid code that reproduces the issue, one plausible hypothesis is that the actual problematic code differs from the code shown. Commented Oct 17, 2020 at 17:37
  • You're getting the same error because it has nothing to do with what you are using in the print function call — it's from the image = base64.b64encode(img.read()) line. Commented Apr 20, 2021 at 14:24

3 Answers

7

You have to use the bytes.decode() method.

You are trying to serialize an object of type bytes to JSON. JSON has no bytes type, so you have to convert the bytes to a string first.

Also, you should use json.dumps() instead of json.dump(), because you don't want to write to a file.

In your example:

import json
import base64

data = {}  # assumed empty here; the question does not show how data is created

with open(r"C:/Users/Documents/pdf2txt/outputImage.jpg", "rb") as img:
    image = base64.b64encode(img.read())

data['ProcessedImage'] = image.decode()  # not just image
print(json.dumps(data))
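To try the fix without the image file, here is a minimal sketch that reproduces the error and then applies the decode step. The `raw` bytes are a made-up placeholder standing in for the JPEG contents, and `failed` and `payload` are names introduced only for this illustration.

```python
import base64
import json

# Placeholder bytes standing in for the image file contents (assumption).
raw = b"\xff\xd8\xff\xe0 fake jpeg bytes"

encoded = base64.b64encode(raw)  # b64encode returns bytes, not str
data = {}

# Storing the bytes object directly reproduces the error from the question.
data["ProcessedImage"] = encoded
try:
    json.dumps(data)
    failed = False
except TypeError:
    failed = True

# Decoding to str first makes the value serializable.
data["ProcessedImage"] = encoded.decode()
payload = json.dumps(data)

print(failed)
print(payload)
```

The only difference between the failing and the working call is the type of the stored value: bytes in the first case, str in the second.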

5

First of all, I think you should use json.dumps(): you're calling json.dump() with incorrect arguments (it also expects a file object to write to), and it doesn't return anything to print.

Secondly, as the error message indicates, json.dumps() can't serialize objects of type bytes; it expects strings. To do this properly you need to decode the binary data into a Python string with some encoding. To preserve the data, you should use the latin1 encoding, because arbitrary binary strings are valid latin1, which can always be decoded to Unicode and then encoded back to the original bytes again (as pointed out in this answer by Sven Marnach).

Here's your code showing how to do that (plus corrections for the other not-directly-related problems it had):

import json
import base64

image_path = "C:/Users/Documents/pdf2txt/outputImage.jpg"

data = {}
with open(image_path, "rb") as img:
    image = base64.b64encode(img.read()).decode('latin1')

data['ProcessedImage'] = image
print(json.dumps(data))
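Getting the original bytes back on the receiving side is the reverse of these steps. The following is a rough sketch, assuming some placeholder bytes in place of the real image; `raw`, `payload` and `restored` are illustrative names. Base64 output contains only ASCII characters, so latin1 round-trips it without loss.

```python
import base64
import json

# Placeholder bytes standing in for the image file contents (assumption).
raw = b"\x00\x01\xfe\xff example image bytes"

# Sending side: bytes -> Base64 bytes -> str -> JSON text.
text = base64.b64encode(raw).decode("latin1")
payload = json.dumps({"ProcessedImage": text})

# Receiving side: JSON text -> str -> Base64 bytes -> original bytes.
loaded = json.loads(payload)
restored = base64.b64decode(loaded["ProcessedImage"].encode("latin1"))

print(restored == raw)
```

To recreate the image file, `restored` could then be written to a file opened in "wb" mode.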

3 Comments

I'm still having the same issue: the printed output is Base64-encoded. As I am new to this, can you please show me how to decode the Base64 to get the real data? Thanks in advance.
Not sure exactly what you're asking. You can undo the decoding of the value and get the bytes back with image.encode('latin1').
Why use base64 here?

3

image (or anything returned by base64.b64encode) is a bytes object, not a string. JSON cannot deal with binary data. You must decode the image data if you want to serialize it:

data['ProcessedImage'] = image.decode() 
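If bytes values turn up in several places, one alternative to decoding each value by hand is the `default` hook of json.dumps(), which is called for any object the encoder cannot serialize. This is a different technique from the one-line fix above, shown only as a sketch; `encode_bytes` is a hypothetical helper name and the sample bytes are a placeholder.

```python
import base64
import json


def encode_bytes(obj):
    """Fallback for json.dumps: Base64-encode bytes, reject anything else."""
    if isinstance(obj, (bytes, bytearray)):
        return base64.b64encode(obj).decode("ascii")
    raise TypeError(
        f"Object of type {type(obj).__name__} is not JSON serializable"
    )


# Placeholder bytes standing in for the raw image contents (assumption).
data = {"ProcessedImage": b"\x89PNG placeholder bytes"}

# The hook is only invoked for values json cannot handle on its own.
payload = json.dumps(data, default=encode_bytes)
print(payload)
```

With this approach the dictionary holds the raw bytes and the Base64 step happens during serialization, so the reader of the JSON still has to call base64.b64decode() to recover them.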

1 Comment

How do I get the real data from the Base64-encoded value? How can I decode it to get the actual value?
