
I want to serialize a trained scikit-learn pipeline object so I can reload it later for predictions. From what I've seen, pickle and joblib.dump are the two common methods for this, with joblib being the preferred approach.

In my case I want to store the serialized Python object in a database, later load it from there, deserialize it, and use it for predictions. Is it possible to serialize the object without any file system access?

2 Answers


Yes, with the pickle library you can get the serialized version of an object in memory by using pickle.dumps instead of pickle.dump:

serialized_object = pickle.dumps(obj)  # obj is your fitted pipeline

This returns a bytes object, which you should then be able to store in your database, either directly (in a binary column) or after converting it to base64.
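A minimal round trip might look like this (using a plain dict to stand in for a fitted pipeline; the base64 step is optional and only needed if your database column stores text rather than raw bytes):

```python
import base64
import pickle

model = {"coef": [0.1, 0.2]}  # stands in for a fitted pipeline

# Serialize entirely in memory, no file system involved.
serialized = pickle.dumps(model)

# Optionally base64-encode for storage in a text column.
encoded = base64.b64encode(serialized)

# To load back: decode (if encoded) and unpickle.
restored = pickle.loads(base64.b64decode(encoded))
assert restored == model
```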


2 Comments

thanks, does joblib have the same compatibility problems as pickle when loading a dump in a different Python version?
@HansHupe I don't really know, sorry

You can do this:

import joblib
from io import BytesIO
import base64

with BytesIO() as tmp_bytes:
    joblib.dump({"test": "test"}, tmp_bytes)
    bytes_obj = tmp_bytes.getvalue()
    base64_obj = base64.b64encode(bytes_obj)

Then bytes_obj is a bytes object and base64_obj is its base64 version. Pick whichever suits your database column type.
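To load the object back (answering the question in the comments below), the steps are simply reversed: base64-decode if needed, wrap the bytes in a BytesIO buffer, and call joblib.load. A sketch:

```python
import base64
import joblib
from io import BytesIO

obj = {"test": "test"}  # stands in for a fitted pipeline

# Dump to an in-memory buffer and base64-encode, as above.
with BytesIO() as tmp_bytes:
    joblib.dump(obj, tmp_bytes)
    base64_obj = base64.b64encode(tmp_bytes.getvalue())

# Loading reverses the steps: decode, wrap in BytesIO, joblib.load.
with BytesIO(base64.b64decode(base64_obj)) as tmp_bytes:
    restored = joblib.load(tmp_bytes)

assert restored == obj
```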

6 Comments

Nice, interesting, any advantages over pickle.dumps?
joblib is faster at storing NumPy arrays
Regarding Python version issues, is it the same as pickle?
btw: how can I load it back from the dump?
They have the same problem across different Python versions.
