@@ -60,6 +60,8 @@ To select out everything for variable ``A`` we could do:
 
    df[df['variable'] == 'A']
 
+.. image:: _static/reshaping_pivot.png
+
 But suppose we wish to do time series operations with the variables. A better
 representation would be where the ``columns`` are the unique variables and an
 ``index`` of dates identifies individual observations. To reshape the data into
@@ -96,10 +98,12 @@ are homogeneously-typed.
 Reshaping by stacking and unstacking
 ------------------------------------
 
-Closely related to the :meth:`~DataFrame.pivot` method are the related
-:meth:`~DataFrame.stack` and :meth:`~DataFrame.unstack` methods available on
-``Series`` and ``DataFrame``. These methods are designed to work together with
-``MultiIndex`` objects (see the section on :ref:`hierarchical indexing
+.. image:: _static/reshaping_stack.png
+
+Closely related to the :meth:`~DataFrame.pivot` method are the related
+:meth:`~DataFrame.stack` and :meth:`~DataFrame.unstack` methods available on
+``Series`` and ``DataFrame``. These methods are designed to work together with
+``MultiIndex`` objects (see the section on :ref:`hierarchical indexing
 <advanced.hierarchical>`). Here are essentially what these methods do:
 
   - ``stack``: "pivot" a level of the (possibly hierarchical) column labels,
@@ -109,6 +113,8 @@ Closely related to the :meth:`~DataFrame.pivot` method are the related
     (possibly hierarchical) row index to the column axis, producing a reshaped
     ``DataFrame`` with a new inner-most level of column labels.
 
+.. image:: _static/reshaping_unstack.png
+
 The clearest way to explain is by example. Let's take a prior example data set
 from the hierarchical indexing section:
 
@@ -149,13 +155,18 @@ unstacks the **last level**:
 
 .. _reshaping.unstack_by_name:
 
+.. image:: _static/reshaping_unstack_1.png
+
 If the indexes have names, you can use the level names instead of specifying
 the level numbers:
 
 .. ipython:: python
 
    stacked.unstack('second')
 
+
+.. image:: _static/reshaping_unstack_0.png
+
 Notice that the ``stack`` and ``unstack`` methods implicitly sort the index
 levels involved. Hence a call to ``stack`` and then ``unstack``, or vice versa,
 will result in a **sorted** copy of the original ``DataFrame`` or ``Series``:
@@ -266,11 +277,13 @@ the right thing:
 Reshaping by Melt
 -----------------
 
+.. image:: _static/reshaping_melt.png
+
 The top-level :func:`~pandas.melt` function and the corresponding :meth:`DataFrame.melt`
-are useful to massage a ``DataFrame`` into a format where one or more columns
-are *identifier variables*, while all other columns, considered *measured
-variables*, are "unpivoted" to the row axis, leaving just two non-identifier
-columns, "variable" and "value". The names of those columns can be customized
+are useful to massage a ``DataFrame`` into a format where one or more columns
+are *identifier variables*, while all other columns, considered *measured
+variables*, are "unpivoted" to the row axis, leaving just two non-identifier
+columns, "variable" and "value". The names of those columns can be customized
 by supplying the ``var_name`` and ``value_name`` parameters.
 
 For instance,
@@ -285,7 +298,7 @@ For instance,
    cheese.melt(id_vars=['first', 'last'])
    cheese.melt(id_vars=['first', 'last'], var_name='quantity')
 
-Another way to transform is to use the :func:`~pandas.wide_to_long` panel data
+Another way to transform is to use the :func:`~pandas.wide_to_long` panel data
 convenience function. It is less flexible than :func:`~pandas.melt`, but more
 user-friendly.
 
@@ -332,8 +345,8 @@ While :meth:`~DataFrame.pivot` provides general purpose pivoting with various
 data types (strings, numerics, etc.), pandas also provides :func:`~pandas.pivot_table`
 for pivoting with aggregation of numeric data.
 
-The function :func:`~pandas.pivot_table` can be used to create spreadsheet-style
-pivot tables. See the :ref:`cookbook<cookbook.pivot>` for some advanced
+The function :func:`~pandas.pivot_table` can be used to create spreadsheet-style
+pivot tables. See the :ref:`cookbook<cookbook.pivot>` for some advanced
 strategies.
 
 It takes a number of arguments:
@@ -485,7 +498,7 @@ using the ``normalize`` argument:
    pd.crosstab(df.A, df.B, normalize='columns')
 
 ``crosstab`` can also be passed a third ``Series`` and an aggregation function
-(``aggfunc``) that will be applied to the values of the third ``Series`` within
+(``aggfunc``) that will be applied to the values of the third ``Series`` within
 each group defined by the first two ``Series``:
 
 .. ipython:: python
@@ -508,8 +521,8 @@ Finally, one can also add margins or normalize this output.
 Tiling
 ------
 
-The :func:`~pandas.cut` function computes groupings for the values of the input
-array and is often used to transform continuous variables to discrete or
+The :func:`~pandas.cut` function computes groupings for the values of the input
+array and is often used to transform continuous variables to discrete or
 categorical variables:
 
 .. ipython:: python
@@ -539,8 +552,8 @@ used to bin the passed data.::
 Computing indicator / dummy variables
 -------------------------------------
 
-To convert a categorical variable into a "dummy" or "indicator" ``DataFrame``,
-for example a column in a ``DataFrame`` (a ``Series``) which has ``k`` distinct
+To convert a categorical variable into a "dummy" or "indicator" ``DataFrame``,
+for example a column in a ``DataFrame`` (a ``Series``) which has ``k`` distinct
 values, can derive a ``DataFrame`` containing ``k`` columns of 1s and 0s using
 :func:`~pandas.get_dummies`:
 
@@ -577,7 +590,7 @@ This function is often used along with discretization functions like ``cut``:
 See also :func:`Series.str.get_dummies <pandas.Series.str.get_dummies>`.
 
 :func:`get_dummies` also accepts a ``DataFrame``. By default all categorical
-variables (categorical in the statistical sense, those with `object` or
+variables (categorical in the statistical sense, those with `object` or
 `categorical` dtype) are encoded as dummy variables.
 
 
@@ -587,7 +600,7 @@ variables (categorical in the statistical sense, those with `object` or
                       'C': [1, 2, 3]})
    pd.get_dummies(df)
 
-All non-object columns are included untouched in the output. You can control
+All non-object columns are included untouched in the output. You can control
 the columns that are encoded with the ``columns`` keyword.
 
 .. ipython:: python
@@ -640,7 +653,7 @@ When a column contains only one level, it will be omitted in the result.
 
    pd.get_dummies(df, drop_first=True)
 
-By default new columns will have ``np.uint8`` dtype.
+By default new columns will have ``np.uint8`` dtype.
 To choose another dtype, use the``dtype`` argument:
 
 .. ipython:: python
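
Reviewer note: the images added in this diff illustrate ``pivot``, ``stack``, ``unstack`` and ``melt``. As a rough sketch of what those pictures depict, the snippet below walks a small frame through each operation. The data and all variable names (``long_df``, ``wide``, ``stacked``, ``unstacked``, ``melted``) are made up for illustration and do not come from the patched documentation.

```python
import pandas as pd

# Hypothetical long-format data (an assumption, not from the patched docs).
long_df = pd.DataFrame({
    "date": ["2000-01-03", "2000-01-03", "2000-01-04", "2000-01-04"],
    "variable": ["A", "B", "A", "B"],
    "value": [1.0, 2.0, 3.0, 4.0],
})

# pivot: unique "variable" values become columns, "date" becomes the index.
wide = long_df.pivot(index="date", columns="variable", values="value")

# stack: move the column labels into a new inner-most index level.
stacked = wide.stack()

# unstack: move the inner-most index level back to the column axis.
unstacked = stacked.unstack()

# melt: unpivot the wide frame into identifier + variable/value columns.
melted = wide.reset_index().melt(id_vars=["date"])
```

With this complete, already-sorted data, ``unstacked`` should round-trip to the same table as ``wide``, and ``melted`` should have one row per (date, variable) pair.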