
I created some data and stored it several times like this:

with open('filename', 'a') as f: pickle.dump(data, f) 

Every time, the size of the file increased, but when I open the file

with open('filename', 'rb') as f: x = pickle.load(f) 

I can see only the data from the last time. How can I read the file correctly?

  • You are appending objects to your file. When you unpickle, you read only the first entry. Are you sure you need all those entries? If not, change to open('filename', 'wb'). Commented Jan 28, 2016 at 17:26
  • Yes, I need all entries. The size of files shows that it contains all of it. Commented Jan 28, 2016 at 17:29
  • Then @jsbueno is right in his answer. Commented Jan 28, 2016 at 17:33
  • See also: How can I use pickle to save a dict? Commented Aug 4, 2017 at 14:29
  • I built something to view pickle files directly in your browser: pickleviewer.com Commented Apr 4, 2020 at 17:38

5 Answers


Pickle serializes a single object at a time, and reads back a single object - the pickled data is recorded in sequence on the file.

If you simply do pickle.load you should be reading the first object serialized into the file (not the last one as you've written).

After unserializing the first object, the file pointer is at the beginning of the next object. If you simply call pickle.load again, it will read that next object. Repeat that until the end of the file.

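import pickle

objects = []
with open("myfile", "rb") as openfile:
    while True:
        try:
            # each call reads the next pickled object in the file
            objects.append(pickle.load(openfile))
        except EOFError:
            # raised once no more objects are left
            break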
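To see the file pointer advance from one object to the next, here is a minimal sketch. It uses an in-memory io.BytesIO buffer in place of a real file, and the sample dictionaries are made up for illustration; neither comes from the question.

```python
import io
import pickle

# In-memory stand-in for a file opened in binary mode (assumption for illustration).
buffer = io.BytesIO()

# Each dump writes one complete pickle right after the previous one.
pickle.dump({"run": 1}, buffer)
pickle.dump({"run": 2}, buffer)

buffer.seek(0)  # rewind, as if the file had been reopened with "rb"

first = pickle.load(buffer)   # reads only the first object
second = pickle.load(buffer)  # continues from where the first load stopped

try:
    pickle.load(buffer)       # nothing is left to read
    reached_end = False
except EOFError:
    reached_end = True

print(first, second, reached_end)
```

The third load is what the try/except in the loop above guards against: once the stream is exhausted, pickle.load raises EOFError.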

There is a read_pickle function as part of pandas 0.22+

import pandas as pd

obj = pd.read_pickle(r'filepath')

1 Comment

Are there any performance or compatibility differences between pd.read_pickle and pickle.load?

The following is an example of how you might write and read a pickle file. Note that if you keep appending pickle data to the file, you will need to continue reading from the file until you find what you want or an exception is generated by reaching the end of the file. That is what the last function does.

import os
import pickle

PICKLE_FILE = 'pickle.dat'


def main():
    # append data to the pickle file
    add_to_pickle(PICKLE_FILE, 123)
    add_to_pickle(PICKLE_FILE, 'Hello')
    add_to_pickle(PICKLE_FILE, None)
    add_to_pickle(PICKLE_FILE, b'World')
    add_to_pickle(PICKLE_FILE, 456.789)
    # load & show all stored objects
    for item in read_from_pickle(PICKLE_FILE):
        print(repr(item))
    os.remove(PICKLE_FILE)


def add_to_pickle(path, item):
    with open(path, 'ab') as file:
        pickle.dump(item, file, pickle.HIGHEST_PROTOCOL)


def read_from_pickle(path):
    with open(path, 'rb') as file:
        try:
            while True:
                yield pickle.load(file)
        except EOFError:
            pass


if __name__ == '__main__':
    main()
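Because the reader is a generator, you can also stop as soon as you find what you want instead of loading every object. The following is a rough sketch of that idea; the helper name iter_pickles, the temporary file, and the sample values are assumptions for illustration only.

```python
import os
import pickle
import tempfile
from itertools import islice


def iter_pickles(path):
    """Yield each pickled object in the file, one at a time."""
    with open(path, "rb") as file:
        while True:
            try:
                yield pickle.load(file)
            except EOFError:
                return


# Hypothetical sample data appended to a temporary file.
fd, path = tempfile.mkstemp(suffix=".dat")
os.close(fd)
for item in (123, "Hello", None, b"World", 456.789):
    with open(path, "ab") as file:
        pickle.dump(item, file)

# Stop at the first string; the objects after it are never unpickled.
reader = iter_pickles(path)
first_str = next((obj for obj in reader if isinstance(obj, str)), None)
reader.close()  # closes the file held open by the suspended generator

# Or take only the first two objects.
reader = iter_pickles(path)
first_two = list(islice(reader, 2))
reader.close()

os.remove(path)
print(first_str, first_two)
```

Closing the generator explicitly releases the file handle right away rather than waiting for garbage collection.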


I developed a software tool that opens (most) Pickle files directly in your browser (nothing is transferred so it's 100% private):

https://pickleviewer.com/ (formerly)

Now it's hosted here: https://fire-6dcaa-273213.web.app/

Edit: Available here if you want to host it somewhere: https://github.com/ch-hristov/Pickle-viewer

Feel free to host this somewhere.

4 Comments

Is this still active? I tried to load it, but no dice...
@autonopy hey, sorry it's down. I updated the answer, feel free to host this somewhere. Should be free to go. It's also available here: fire-6dcaa-273213.web.app
My output: "100% private, nothing is transferred. You can convert the file to JSON after opening it. This software only opens files created with Python 3.3 or less. Contribute to development by supporting us on Patreon. Sorry, we couldn't open your file. :( The following error occurred: Unhandled pickle protocol version: 4"
@questionto42 newer protocols aren't supported (only <= 3). You can check out the github repo for more info
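If you hit a protocol limit like the one in these comments, you can check which protocol a pickle uses, or write one with an older protocol. A minimal sketch using only the standard library follows; the helper pickle_protocol and the sample object are hypothetical, and the header check only covers protocol 2 and later, which start with a PROTO opcode.

```python
import pickle

data = {"answer": 42}  # hypothetical sample object

# Choose the protocol explicitly when writing.
old_style = pickle.dumps(data, protocol=3)
new_style = pickle.dumps(data, protocol=4)


def pickle_protocol(blob):
    """Return the protocol number from the header, or None for protocols 0 and 1."""
    # Protocol 2+ pickles begin with byte 0x80 followed by the protocol number.
    if len(blob) >= 2 and blob[0] == 0x80:
        return blob[1]
    return None


print(pickle_protocol(old_style), pickle_protocol(new_style))
```

With a file, the same protocol argument can be passed to pickle.dump, and the first two bytes of the file can be inspected the same way.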

You can also use joblib to read pickle files. It is especially useful if you are reading pickled scikit-learn models or numpy ndarray objects (joblib comes with scikit-learn and is specifically designed to handle numpy ndarrays).

import joblib

x = joblib.load("my_file.pkl")

Then again, both joblib and pandas use the pickle.load from the standard library, so in reality, both are almost the same as:

with open("my_file.pkl", "rb") as f: x = pickle.load(f) 

It's just that file handling and some backward compatibility considerations are handled under the hood in pandas and joblib.

In particular, for the OP's specific case, a single call to either one will not work, because it reads only one object. You must use the same try-except loop to read all objects, e.g.:

import pandas as pd

objects = []
with open("myfile", "rb") as openfile:
    while True:
        try:
            # each call reads the next pickled object from the open file
            objects.append(pd.read_pickle(openfile))
        except EOFError:
            break
