I am trying to bound every value in a dataframe between 0.01 and 0.99

I have successfully normalised the data between 0 and 1 using: .apply(lambda x: (x - x.min()) / (x.max() - x.min())) as follows:

    df = pd.DataFrame({'one': ['AAL', 'AAL', 'AAPL', 'AAPL'],
                       'two': [1, 1, 5, 5],
                       'three': [4, 4, 2, 2]})
    df[['two', 'three']].apply(lambda x: (x - x.min()) / (x.max() - x.min()))
    df

Now I want to bound all values between 0.01 and 0.99

This is what I have tried:

    def bound_x(x):
        if x == 1:
            return x - 0.01
        elif x < 0.99:
            return x + 0.01

    df[['two', 'three']].apply(bound_x)
    df

But I receive the following error:

ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', u'occurred at index two') 

2 Answers

There's an app, err clip method, for that:

    import pandas as pd

    df = pd.DataFrame({'one': ['AAL', 'AAL', 'AAPL', 'AAPL'],
                       'two': [1, 1, 5, 5],
                       'three': [4, 4, 2, 2]})
    df = df[['two', 'three']].apply(lambda x: (x - x.min()) / (x.max() - x.min()))
    df = df.clip(lower=0.01, upper=0.99)

yields

        two  three
    0  0.01   0.99
    1  0.01   0.99
    2  0.99   0.01
    3  0.99   0.01
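
Note that the snippet above reassigns df to the two numeric columns only, so column 'one' is dropped. If you want to keep it, a minimal sketch (the name cols is just a label chosen for this example) is to assign the clipped result back to the selected columns:

```python
import pandas as pd

df = pd.DataFrame({'one': ['AAL', 'AAL', 'AAPL', 'AAPL'],
                   'two': [1, 1, 5, 5],
                   'three': [4, 4, 2, 2]})

# Numeric columns to rescale (hypothetical name, not from the question).
cols = ['two', 'three']

# Min-max normalise each selected column to [0, 1] ...
scaled = df[cols].apply(lambda x: (x - x.min()) / (x.max() - x.min()))

# ... then clip into [0.01, 0.99] and write the result back,
# leaving the non-numeric column 'one' untouched.
df[cols] = scaled.clip(lower=0.01, upper=0.99)
```

This keeps the original column order; only 'two' and 'three' change.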

The problem with

df[['two', 'three']].apply(bound_x) 

is that bound_x gets passed a Series like df['two'] and then if x == 1 requires x == 1 be evaluated in a boolean context. x == 1 is a boolean Series like

    In [44]: df['two'] == 1
    Out[44]:
    0    False
    1    False
    2     True
    3     True
    Name: two, dtype: bool

Python tries to reduce this Series to a single boolean value, True or False. Pandas follows the NumPy convention of raising an error when you try to convert a Series (or array) to a bool.
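
To make that concrete, here is a small sketch of the failure and two ways around it. The helper clamp is a hypothetical scalar function written for this example (it is not the bound_x from the question); it is applied per element with Series.map, which is one way to use scalar logic if you do not want clip:

```python
import pandas as pd

# A normalised column like the one in the question.
s = pd.Series([0.0, 0.0, 1.0, 1.0], name='two')

mask = (s == 1)  # element-wise comparison: a boolean Series, not a single bool

try:
    if mask:     # asks for one truth value from four elements
        pass
    raised = False
except ValueError:
    raised = True  # "The truth value of a Series is ambiguous..."

# Explicit reductions state which question is being asked.
any_one = bool(mask.any())  # at least one element equals 1
all_one = bool(mask.all())  # every element equals 1


def clamp(x, lower=0.01, upper=0.99):
    """Hypothetical scalar helper: force one value into [lower, upper]."""
    return min(max(x, lower), upper)


# Series.map calls clamp once per element, so x is a scalar inside it.
bounded = s.map(clamp)
```

For a whole DataFrame, clip remains the shorter and faster route; the per-element version is mainly useful when the rule is more involved than a simple lower/upper bound.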

I had a similar problem where I wanted a customized normalization, because a regular percentile or z-score of the data was not adequate. Sometimes I knew what the feasible max and min of the population were, and therefore wanted to define them independently of my sample, or use a different midpoint, or whatever! So I built a custom function (with extra steps in the code here to make it as readable as possible):

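    # The original does not show its imports; mean and median are assumed
    # to come from the standard library here.
    from statistics import mean, median

    def NormData(s, low='min', center='mid', hi='max', insideout=False, shrinkfactor=0.):
        # Resolve the string options into numeric low / hi / center points.
        if low == 'min':
            low = min(s)
        elif low == 'abs':
            low = max(abs(min(s)), abs(max(s))) * -1.  # sign(min(s))
        if hi == 'max':
            hi = max(s)
        elif hi == 'abs':
            hi = max(abs(min(s)), abs(max(s))) * 1.  # sign(max(s))
        if center == 'mid':
            center = (max(s) + min(s)) / 2
        elif center == 'avg':
            center = mean(s)
        elif center == 'median':
            center = median(s)
        # Shift everything so that the center sits at 0.
        s2 = [x - center for x in s]
        hi = hi - center
        low = low - center
        center = 0.
        r = []
        for x in s2:
            if x < low:
                r.append(0.)
            elif x > hi:
                r.append(1.)
            else:
                if x >= center:
                    # Upper half maps linearly onto [0.5, 1].
                    r.append((x - center) / (hi - center) * 0.5 + 0.5)
                else:
                    # Lower half maps linearly onto [0, 0.5].
                    r.append((x - low) / (center - low) * 0.5 + 0.)
        if insideout == True:
            # Values near the center become high, values near the ends low.
            ir = [(1. - abs(z - 0.5) * 2.) for z in r]
            r = ir
        # Pull every value toward 0.5 by the shrink factor.
        rr = [x - (x - 0.5) * shrinkfactor for x in r]
        return rr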

This will take in a pandas Series, or even just a list, and normalize it to your specified low, center, and high points. There is also a shrink factor, which lets you pull the data away from 0 and 1 (I had to do this when combining colormaps in matplotlib: Single pcolormesh with more than one colormap using Matplotlib). You can likely see how the code works, but say you have the values [-5, 2, 10] in a sample and want to normalize based on a range of -7 to 7 (so anything above 7, like our "10", is effectively treated as a 7) with a midpoint of 1, and then shrink the result to fit a 256-value RGB colormap:

    # In[1]
    NormData([-5, 2, 10], low=-7, center=1, hi=7, shrinkfactor=2./256)
    # Out[1]
    [0.1279296875, 0.5826822916666667, 0.99609375]

It can also turn your data inside out... this may seem odd, but I found it useful for heatmapping. Say you want a darker color for values closer to 0 rather than hi/low. You could heatmap based on normalized data where insideout=True:

    # In[2]
    NormData([-5, 2, 10], low=-7, center=1, hi=7, insideout=True, shrinkfactor=2./256)
    # Out[2]
    [0.251953125, 0.8307291666666666, 0.00390625]

So now "2", which is closest to the center (defined as "1"), has the highest value.

Anyways, I thought my issue was very similar to yours and this function could be useful to you.
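
If low, center, and hi are given as numbers with low < center < hi, the piecewise mapping in NormData can also be sketched with NumPy's np.interp, which interpolates linearly between the given points and holds the end values outside them. The function name norm_interp is hypothetical, and this sketch leaves out the string options ('min', 'abs', 'mid', ...) and the insideout flag:

```python
import numpy as np


def norm_interp(values, low, center, hi, shrinkfactor=0.0):
    """Hypothetical vectorised variant of NormData for numeric low < center < hi."""
    x = np.asarray(values, dtype=float)
    # Piecewise-linear map: low -> 0, center -> 0.5, hi -> 1.
    # Outside [low, hi] np.interp returns the end values 0 and 1,
    # which reproduces the clamping branches of the loop version.
    r = np.interp(x, [low, center, hi], [0.0, 0.5, 1.0])
    # Pull every value toward 0.5 by the shrink factor.
    return r - (r - 0.5) * shrinkfactor


out = norm_interp([-5, 2, 10], low=-7, center=1, hi=7, shrinkfactor=2 / 256)
```

For the example input this should agree with Out[1] above up to floating-point rounding.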
