Closed
Labels: Visualization (plotting)
Description
The default log-scaled axes, activated by the logx, logy, and loglog options of the pandas plotting API, do the straightforward thing and take the log of 0 values. Plotting then proceeds with the resulting infinite logs, which makes the entire plot unusable, without warning, whenever the data contains 0s.
For example:

    import numpy as np
    import pandas as pd

    # Zipf draws shifted down by 1 so that the data contains zeros
    draws = pd.DataFrame({'freq': np.random.zipf(1.7, 1000) - 1})
    draws['rank'] = (-draws['freq']).rank()
    draws.plot(x='rank', y='freq', kind='scatter', loglog=True)

Matplotlib provides another scale, the symlog scale, that makes a small region near 0 linear to avoid these problems. For quick-and-dirty 'look at my data on a log axis' plotting, symlog is significantly more useful.
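To make the difference concrete, here is a small sketch comparing a plain base-10 log with Matplotlib's symlog transform on values that include 0. The `base`, `linthresh`, and `linscale` values are illustrative choices of mine, not settings taken from pandas, and the `np.log10` call only shows the underlying arithmetic rather than how Matplotlib handles non-positive values internally.

```python
import numpy as np
from matplotlib.scale import SymmetricalLogTransform

values = np.array([0.0, 1.0, 2.0, 100.0])

# Plain log: the zero maps to -inf, which cannot be placed on an axis.
with np.errstate(divide='ignore'):
    plain = np.log10(values)

# symlog: linear for |x| <= linthresh, logarithmic beyond it.
# base=10, linthresh=2, linscale=1 are illustrative values only.
symlog = SymmetricalLogTransform(10, 2, 1)
transformed = symlog.transform_non_affine(values)

print(plain)
print(transformed)
```

Inside the linear region the transformed values are proportional to the inputs, so 0 stays at 0 and every point remains finite.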
I can access it like this:

    draws = pd.DataFrame({'freq': np.random.zipf(1.7, 1000) - 1})
    draws['rank'] = (-draws['freq']).rank()
    p = draws.plot(x='rank', y='freq', kind='scatter', loglog=True)
    p.set_xscale('symlog')
    p.set_yscale('symlog')
    p

Either making the symlog scale the default log scale for plotting, or supporting a loglog='sym' option, would make it significantly easier to do quick data inspection with Pandas' convenience plotting.
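Until something like that exists, the workaround can be wrapped in a small helper. This is only a sketch: `plot_symlog` is a hypothetical name of mine, not part of pandas, and it relies solely on `DataFrame.plot` returning a Matplotlib Axes whose scales can be changed afterwards. The non-interactive `Agg` backend and the seeded generator are there just to keep the example reproducible in a script.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import numpy as np
import pandas as pd


def plot_symlog(df, x, y, **plot_kwargs):
    """Hypothetical helper: plot with pandas, then switch both axes to symlog."""
    ax = df.plot(x=x, y=y, **plot_kwargs)
    ax.set_xscale('symlog')
    ax.set_yscale('symlog')
    return ax


rng = np.random.default_rng(0)
draws = pd.DataFrame({'freq': rng.zipf(1.7, 1000) - 1})
draws['rank'] = (-draws['freq']).rank()

ax = plot_symlog(draws, 'rank', 'freq', kind='scatter')
print(ax.get_xscale(), ax.get_yscale())
```

The helper deliberately does not pass `loglog=True`, since the symlog scales replace the log scales anyway.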