Pandas - Pivot, Stack, Unstack?

Question

I'm wondering how to have the keys be the column headers (Result, IP, time)?

Data in CSV looks like this:

    "Result":"Success","IP":"0.0.0.0","time":"2018-08-20T12:00:00.000Z"
    "Result":"Failure","IP":"1.1.1.1","time":"2018-08-20T12:01:00.000Z"

I'd like to format it like this:

    Result   IP       time
    Success  0.0.0.0  2018-08-20T12:00:00.000Z
    Failure  1.1.1.1  2018-08-20T12:01:00.000Z

My code so far:

    import pandas as pd

    file = pd.read_csv("path_to.csv", sep='\n', names = ["Result","IP","time"])
    df = pd.DataFrame(file)
    print(df.head(1))

harpan · Accepted Answer · 2018-07-31 20:27:13Z

You need:

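    import csv

    import pandas as pd

    file = pd.read_csv("foo.csv", sep=',', header=None, quoting=csv.QUOTE_ALL,
                       names=["Result", "IP", "time"])
    df = pd.DataFrame(file)
    # Drop the key before the first ':' and strip the surrounding quotes.
    # Joining with '' also removes the colons inside the time value.
    # (On pandas 2.1+, DataFrame.map replaces the deprecated applymap.)
    df = df.applymap(lambda x: (''.join(str(x).split(":")[1:]))[1:-1])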

Output:

        Result       IP                    time
    0  Success  0.0.0.0  2018-08-20T120000.000Z
    1  Failure  1.1.1.1  2018-08-20T120100.000Z
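
The output above loses the colons inside the timestamp, because each cell is split on every ":" and the pieces are rejoined with an empty string. A minimal sketch of one way to keep them, assuming each parsed cell still looks like `time:"2018-08-20T12:00:00.000Z"`, is to split on the first colon only. The frame `raw` and the helper `strip_key` are hypothetical names used for illustration; the data is built in memory rather than read from the file.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by read_csv:
# every cell still carries its key, e.g. 'IP:"0.0.0.0"'.
raw = pd.DataFrame(
    {
        "Result": ['Result:"Success"', 'Result:"Failure"'],
        "IP": ['IP:"0.0.0.0"', 'IP:"1.1.1.1"'],
        "time": [
            'time:"2018-08-20T12:00:00.000Z"',
            'time:"2018-08-20T12:01:00.000Z"',
        ],
    }
)


def strip_key(cell: str) -> str:
    """Drop everything up to the first ':' and the surrounding quotes."""
    value = cell.split(":", 1)[1]  # maxsplit=1 leaves later colons untouched
    return value.strip('"')


# Series.map calls strip_key once per cell, column by column.
clean = raw.apply(lambda col: col.map(strip_key))
print(clean)
```

Because only the first colon is treated as the key separator, the time column keeps its `12:00:00` form.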

Excellent. Now I just need to find out why the values are shifted one column to the left, e.g. 0.0.0.0 is showing under Result, and 2018-08-20T12:00:00.000Z is showing up under IP.

sacuL · Accepted Answer · 2018-07-31 20:27:03Z

Here is a similar approach, also using str.split, but it keeps the colons in your timestamp by splitting on a regular expression that is more selective than a bare ":" (it splits on ":" only if it is preceded by 2 or more letters):

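    import pandas as pd

    df = pd.read_csv('data.csv', header=None, sep=',', names=['Result', 'IP', 'Time'])
    # Split each cell on its leading key (two or more letters followed by ':'),
    # keep the part after the key, then strip the surrounding double quotes.
    new_df = df.apply(lambda x: x.str.split('[A-Za-z]{2,}:').str[1].str.strip('\"'))

    >>> new_df
        Result       IP                      Time
    0  Success  0.0.0.0  2018-08-20T12:00:00.000Z
    1  Failure  1.1.1.1  2018-08-20T12:01:00.000Z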
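
To see why the pattern leaves the timestamp alone, it may help to try it on plain strings with Python's `re` module. This is only an illustrative sketch: the list `cells` is a hypothetical sample shaped like one parsed row, and `KEY` is just a name for the same regular expression used above.

```python
import re

# Same pattern as in the answer: two or more letters directly before a colon.
KEY = re.compile(r"[A-Za-z]{2,}:")

# Hypothetical cells, shaped like the fields of one parsed row.
cells = ['Result:"Success"', 'IP:"0.0.0.0"', 'time:"2018-08-20T12:00:00.000Z"']

values = []
for cell in cells:
    pieces = KEY.split(cell)  # e.g. ['', '"Success"']
    values.append(pieces[1].strip('"'))

print(values)
# ['Success', '0.0.0.0', '2018-08-20T12:00:00.000Z']
```

The colons inside `12:00:00` are preceded by digits, and the `T` before `12` is a single letter, so only the key prefix (`Result:`, `IP:`, `time:`) matches.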

Thanks! +1 on your answer as well (different variants on the same basic strategy)
