
I have a DataFrame in pandas with a column containing ratios. The ratios need to be transformed to a log2 scale for plotting, but the ratio values are often 0, resulting in log2(0), which is recorded as -inf or a missing value in pandas. I want to visualize these points, since in my DataFrame a ratio value of 0 is meaningful. What is the best way to deal with this in pandas/numpy? When taking the log, is the preferred approach to add a tiny constant, like this?

    from numpy import log2

    # take log with tiny value added
    c = 0.0000001
    df[col].apply(lambda x: log2(c + x))

Or are there other ways? Thanks.

  • This is equivalent to just replacing your infs with a completely arbitrary number, log2(.0000001). Why don't you just remove the infs when you plot and leave the 0s when not plotting? Commented Feb 28, 2013 at 0:45
  • Assuming your ratios are positive, if you take c = 1.0 then log2(c + x) will map [0,inf) --> [0,inf). Commented Feb 28, 2013 at 0:47
  • @askewchan: because I want to plot the 0s too. They make up a substantial part of the data. Commented Feb 28, 2013 at 0:50
  • If you want to plot 0 on a log plot (where it would sit at $-\infty$), you could consider matplotlib's symlog scale, e.g. pyplot.yscale('symlog'). Commented Feb 28, 2013 at 1:00
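To make the difference between the two offsets discussed in the comments concrete, here is a minimal sketch. The DataFrame contents and the column name `ratio` are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column name "ratio" is an assumption.
df = pd.DataFrame({"ratio": [0.0, 0.5, 1.0, 3.0]})

# Tiny offset from the question: zeros land at log2(1e-7), roughly -23.25,
# far away from the rest of the data and dependent on the choice of c.
c = 1e-7
df["log2_tiny"] = np.log2(df["ratio"] + c)

# Offset of 1 from the comment: zeros map to exactly 0 and the
# transformed values stay non-negative.
df["log2_plus1"] = np.log2(df["ratio"] + 1.0)

print(df)
```

Note that the offset of 1 changes the meaning of the axis: ratios below 1, which would be negative on a plain log2 scale, end up between 0 and 1 instead. Whether that is acceptable depends on what the plot is meant to show.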

1 Answer


I guess you can take the log directly, then use numpy.inf (or numpy.isinf) to identify the values that came out infinite, i.e. the original zeros, and treat them separately when plotting.
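A rough sketch of what "treat them separately" could look like, assuming a hypothetical column named `ratio`. Placing the zeros one unit below the smallest finite value is only one possible convention, chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column name "ratio" is an assumption.
df = pd.DataFrame({"ratio": [0.0, 0.5, 1.0, 4.0]})

# log2(0) evaluates to -inf; suppress NumPy's divide-by-zero warning here.
with np.errstate(divide="ignore"):
    df["log2_ratio"] = np.log2(df["ratio"])

# Boolean mask of the rows whose log is infinite (the original zeros).
is_zero = np.isinf(df["log2_ratio"])
n_zero = int(is_zero.sum())

# Finite values can be plotted on the usual log2 axis.
finite = df.loc[~is_zero, "log2_ratio"]

# One option for showing the zeros: pin them to a floor just below the
# smallest finite value and label that position as "0" on the plot.
floor = finite.min() - 1
df["log2_for_plot"] = df["log2_ratio"].where(~is_zero, floor)
```

Keeping the mask around means the zeros can also be drawn in a different color or marker, so the pinned position is not mistaken for a real log2 value.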

Ref: github.com/pydata/pandas

