Pandas pivoting/stacking/reshaping

Question

I'm trying to import data to a pandas DataFrame with columns being date string, label, value. My data looks like the following (just with 4 dates and 5 labels)

from numpy import random import numpy as np import pandas as pd # Creating the data dates = ("2015-01-01", "2015-01-02", "2015-01-03", "2015-01-04") values = [random.rand(5) for _ in range(4)] data = dict(zip(dates,values))

So, the data is a dictionary where the keys are dates, the keys a list of values where the index is the label.

Loading this data structure into a DataFrame

df1 = pd.DataFrame(data)

gives me the dates as columns, the label as index, and the value as the value.

An alternative loading would be

df2 = pd.DataFrame() df2.from_dict(data, orient='index')

where the dates are index, and columns are labels.

In either of both cases do I manage to do pivoting or stacking to my preferred view.

How should I approach the pivoting/stacking to get the view I want? Or should I change my data structure before loading it into a DataFrame? In particular I'd like to avoid of having to create all the rows of the table beforehand by using a bunch of calls to zip.

That gives me the same DataFrame as if I would have used from_dict with orient='index' option. — Ignacio Vergara Kausel
– Ignacio Vergara Kausel, Commented Aug 1, 2017 at 14:11

piRSquared · Accepted Answer · 2017-08-01 14:11:48Z

IIUC:

Option 1
pd.DataFrame.stack

pd.DataFrame(data).stack() \ .rename('value').rename_axis(['label', 'date']).reset_index() label date value 0 0 2015-01-01 0.345109 1 0 2015-01-02 0.815948 2 0 2015-01-03 0.758709 3 0 2015-01-04 0.461838 4 1 2015-01-01 0.584527 5 1 2015-01-02 0.823529 6 1 2015-01-03 0.714700 7 1 2015-01-04 0.160735 8 2 2015-01-01 0.779006 9 2 2015-01-02 0.721576 10 2 2015-01-03 0.246975 11 2 2015-01-04 0.270491 12 3 2015-01-01 0.465495 13 3 2015-01-02 0.622024 14 3 2015-01-03 0.227865 15 3 2015-01-04 0.638772 16 4 2015-01-01 0.266322 17 4 2015-01-02 0.575298 18 4 2015-01-03 0.335095 19 4 2015-01-04 0.761181

Option 2
comprehension

pd.DataFrame( [[i, d, v] for d, l in data.items() for i, v in enumerate(l)], columns=['label', 'date', 'value'] ) label date value 0 0 2015-01-01 0.345109 1 1 2015-01-01 0.584527 2 2 2015-01-01 0.779006 3 3 2015-01-01 0.465495 4 4 2015-01-01 0.266322 5 0 2015-01-02 0.815948 6 1 2015-01-02 0.823529 7 2 2015-01-02 0.721576 8 3 2015-01-02 0.622024 9 4 2015-01-02 0.575298 10 0 2015-01-03 0.758709 11 1 2015-01-03 0.714700 12 2 2015-01-03 0.246975 13 3 2015-01-03 0.227865 14 4 2015-01-03 0.335095 15 0 2015-01-04 0.461838 16 1 2015-01-04 0.160735 17 2 2015-01-04 0.270491 18 3 2015-01-04 0.638772 19 4 2015-01-04 0.761181

Collectives™ on Stack Overflow

Pandas pivoting/stacking/reshaping

1 Answer 1

Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Related