I have a CSV file called file.csv in the following format:

"hash";"timeStamp";"protocol";"type";"subTransactions";"gas";"contract"
"0x48e306ab695e5ertretreter269baf6342325e8d952b2305875";"1619023543";"";"";"[{"type":"outgoing","symbol":"MATIC","amount":896.0375,"address":"0x7d1afa7erw324abc0cfc608aacfebb0"}]";"0.019060399";"0x401f6c983ea2343f84d70b31c151321188b"
"0xd22b9622510a94b926456546aeb7ea1880fcf7c8fd9902c8b9c3771beb";"1619023794";"";"";"[{"type":"incoming","symbol":"MATIC","amount":296.0375,"address":"0x7d1afa7b718fb89werwefc608aacfebb0"}]";"0.00913276";"0xe93381fb4c4f14bwrer305d799241a"

I parse the CSV with the following code:

import pandas as pd

df = pd.read_csv("file.csv", sep=";")

This results in the following:

hash                                               timeStamp   protocol  type  subTransactions                                    gas       contract
0x48e306ab695e5ertretreter269baf6342325e8d952b...  1619023543  NaN       NaN   [{type":"outgoing","symbol":"MATIC","amount":8...  0.019060  0x401f6c983ea2343f84d70b31c151321188b
0xd22b9622510a94b926456546aeb7ea1880fcf7c8fd99...  1619023794  NaN       NaN   [{type":"incoming","symbol":"MATIC","amount":2...  0.009133  0xe93381fb4c4f14bwrer305d799241a

The problem is with the values in the subTransactions column. Instead of [{type":"outgoing","symbo... they should read [{"type":"outgoing","symbo..., i.e. the double quote in front of type is missing. I tried to fix this with replace, but it did not work.

1 Answer

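Your data is not quite clean: the subTransactions field contains unescaped double quotes inside a quoted field, so the parser drops the first inner quote and leaves a stray quote at the end of the value. In this particular case you can use a workaround like this: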

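import json

import pandas as pd


def _conv(s):
    # Restore the double quote that the parser dropped in front of "type".
    s = s.replace('[{type":', '[{"type":')
    # Drop the stray trailing quote, then decode the JSON text.
    return json.loads(s[:-1])


df = pd.read_csv("file.csv", sep=";", converters={'subTransactions': _conv})
print(df)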

NB: this is not a generic solution!
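
If you need something less tied to the exact text '[{type":', one possible alternative is to switch off quote handling and strip the enclosing quotes yourself. The following is only a sketch using the standard csv module. The helper names strip_outer_quotes and read_rows and the shortened sample data are made up for illustration, and it assumes that no field contains a ";" of its own.

```python
import csv
import io
import json


def strip_outer_quotes(field):
    """Remove one pair of enclosing double quotes, if present."""
    if len(field) >= 2 and field[0] == '"' and field[-1] == '"':
        return field[1:-1]
    return field


def read_rows(handle):
    """Parse ';'-separated text while treating quotes as ordinary characters."""
    # QUOTE_NONE keeps every double quote in the field text, so nothing is dropped.
    reader = csv.reader(handle, delimiter=";", quoting=csv.QUOTE_NONE)
    header = [strip_outer_quotes(name) for name in next(reader)]
    rows = []
    for record in reader:
        values = (strip_outer_quotes(value) for value in record)
        row = dict(zip(header, values))
        # The JSON text is intact after stripping, so it can be decoded directly.
        row["subTransactions"] = json.loads(row["subTransactions"])
        rows.append(row)
    return rows


# Shortened, made-up sample in the same shape as the file in the question.
sample = (
    '"hash";"timeStamp";"subTransactions";"gas"\n'
    '"0xabc";"1619023543";'
    '"[{"type":"outgoing","symbol":"MATIC","amount":896.0375}]";'
    '"0.019060399"\n'
)
rows = read_rows(io.StringIO(sample))
print(rows[0]["subTransactions"][0]["type"])
```

For the real file you would pass an open file handle instead of io.StringIO, and the resulting list of dicts can be handed to pd.DataFrame(rows). All other values stay strings here, so numeric columns such as gas would still need converting.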
