I'm wondering how to have the keys be the column headers (Result, IP, time)?

Data in CSV looks like this:

    "Result":"Success","IP":"0.0.0.0","time":"2018-08-20T12:00:00.000Z"
    "Result":"Failure","IP":"1.1.1.1","time":"2018-08-20T12:01:00.000Z"

I'd like to format it like this:

    Result   IP       time
    Success  0.0.0.0  2018-08-20T12:00:00.000Z
    Failure  1.1.1.1  2018-08-20T12:01:00.000Z

My code so far:

    import pandas as pd

    file = pd.read_csv("path_to.csv", sep='\n', names = ["Result","IP","time"])
    df = pd.DataFrame(file)
    print(df.head(1))

2 Answers

You need:

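    import csv

    import pandas as pd

    # The file has no header row; each "key":"value" pair is read as one column
    file = pd.read_csv("foo.csv", sep=',', header=None, quoting=csv.QUOTE_ALL,
                       names=["Result", "IP", "time"])
    df = pd.DataFrame(file)
    # Drop the key before the first ":" and the quotes around the value.
    # The remaining pieces are joined without ":", so colons inside time are lost.
    df = df.applymap(lambda x: (''.join(str(x).split(":")[1:]))[1:-1])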

Output:

        Result       IP                    time
    0  Success  0.0.0.0  2018-08-20T120000.000Z
    1  Failure  1.1.1.1  2018-08-20T120100.000Z
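
The time values in the output above have lost their colons (120000 rather than 12:00:00), because every piece after the key is joined back together without a separator. The following is a minimal sketch, not part of the original answer, of splitting on the first colon only. It assumes each cell arrives in the form key:"value"; the helper name strip_key and the sample cells are hypothetical, chosen to mirror the data in the question.

```python
def strip_key(cell):
    """Return the value part of a cell shaped like  Result:"Success"."""
    # maxsplit=1 splits on the first colon only, so colons in the value survive
    _, value = str(cell).split(":", 1)
    # Remove the double quotes wrapped around the value
    return value.strip('"')


# Hypothetical sample cells, assumed to match what the parser hands back
cells = ['Result:"Success"', 'IP:"0.0.0.0"', 'time:"2018-08-20T12:00:00.000Z"']
cleaned = [strip_key(c) for c in cells]
print(cleaned)
```

If the cells really have that shape, the same helper could be passed to df.applymap (or df.map in pandas 2.1 and later) in place of the lambda.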

1 Comment

Excellent. Now I just need to find out why the values are offset to the left one column over. e.g. 0.0.0.0 is showing under Result, and 2018-08-20T12:00:00.000Z is showing up under IP.

Here is a similar approach, also using str.split, but one that keeps the colons in your timestamps by splitting on a regular expression that is more selective than a bare : (it splits on : only when the colon is preceded by two or more letters):

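    import pandas as pd

    df = pd.read_csv('data.csv', header=None, sep=',', names=['Result', 'IP', 'Time'])
    # Split each cell on its key prefix (two or more letters plus ":"),
    # keep the part after it, then strip the surrounding quotes
    new_df = df.apply(lambda x: x.str.split('[A-Za-z]{2,}:').str[1].str.strip('\"'))

    >>> new_df
        Result       IP                      Time
    0  Success  0.0.0.0  2018-08-20T12:00:00.000Z
    1  Failure  1.1.1.1  2018-08-20T12:01:00.000Z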
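
To see why this pattern leaves the timestamp alone, it can help to run it on single cells with the standard re module. This is only an illustrative sketch: the sample cells and the names KEY_PREFIX and value_of are hypothetical, and the cells are assumed to have the form key:"value".

```python
import re

# Two or more letters followed by a colon, i.e. the key prefix of a cell
KEY_PREFIX = re.compile(r'[A-Za-z]{2,}:')


def value_of(cell):
    """Split off the key prefix and strip the quotes around the value."""
    parts = KEY_PREFIX.split(cell)
    return parts[1].strip('"')


# Hypothetical sample cells mirroring the data in the question
cells = ['Result:"Success"', 'IP:"0.0.0.0"', 'time:"2018-08-20T12:00:00.000Z"']
values = [value_of(c) for c in cells]
print(values)

# The colons in "T12:00:00" follow at most one letter and then digits,
# so the pattern does not match there
timestamp_pieces = KEY_PREFIX.split('2018-08-20T12:00:00.000Z')
print(timestamp_pieces)
```

The key prefixes Result:, IP: and time: each end in at least two letters, so they are split off, while the timestamp stays in one piece.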

1 Comment

Thanks! +1 on your answer as well (different variants on the same basic strategy)
