String indexing in one column in pandas dataframe using index value from other column

Question

In one column of my pandas DataFrame I have strings that need to be limited in length to a value that exists in another column of the same DataFrame.

I have tried creating a new column and using normal Python string slicing with the other column as the slice limit.

Here is a MWE of the code I'm trying to run:

import pandas as pd

data = [[5, 'LONSTRING'], [3, 'LONGERSTRING'], [7, 'LONGESTSTRINGEVER']]
df = pd.DataFrame(data, columns=['String Limit', 'String'])
df['Short String'] = df['String'][:df['String Limit']]
print(df)

I expected a new column with shorter strings:

   String Limit             String  Short String
0             5          LONSTRING         LONST
1             3       LONGERSTRING           LON
2             7  LONGESTSTRINGEVER       LONGEST

Instead I get a TypeError:

TypeError: cannot do slice indexing on <class 'pandas.core.indexes.range.RangeIndex'> with these indexers [0    5
1    3
2    7
Name: String Limit, dtype: int64] of <class 'pandas.core.series.Series'>

It seems that string indexing can't be done this way because df['String Limit'] is the whole Series and not just the one row value - but are there any alternative ways of doing this?

jezrael · Accepted Answer · 2019-08-09 08:13:40Z

The problem is that each string has to be sliced separately with its own limit, so use DataFrame.apply with axis=1 to loop over the rows:

df['Short String'] = df.apply(lambda x: x['String'][:x['String Limit']], axis=1)

Or use zip with a list comprehension:

df['Short String'] = [x[:y] for x, y in zip(df['String'], df['String Limit'])]

print(df)
   String Limit             String  Short String
0             5          LONSTRING         LONST
1             3       LONGERSTRING           LON
2             7  LONGESTSTRINGEVER       LONGEST
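
For anyone who wants to run both variants side by side, here is a minimal self-contained sketch built on the sample data from the question. The names via_apply and via_zip are only illustrative and are not part of pandas.

```python
import pandas as pd

data = [[5, 'LONSTRING'], [3, 'LONGERSTRING'], [7, 'LONGESTSTRINGEVER']]
df = pd.DataFrame(data, columns=['String Limit', 'String'])

# Row-wise apply: each row x is a Series, so x['String'] is a plain str
# and x['String Limit'] is a single integer usable as a slice bound.
via_apply = df.apply(lambda x: x['String'][:x['String Limit']], axis=1)

# zip pairs every string with the limit from the same row.
via_zip = [x[:y] for x, y in zip(df['String'], df['String Limit'])]

# Both approaches should produce the same shortened strings.
assert via_apply.tolist() == via_zip

df['Short String'] = via_zip
print(df)
```

Both variants slice one string at a time, which is what the original df['String'][:df['String Limit']] attempt did not do: that expression tried to slice the whole Series using another Series as the bound.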

Two solutions, oh my! Both work perfectly. I'll try the one that has the shortest runtime for my code. Thank you very much!
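
If runtime is the deciding factor, one possible way to compare the two is the standard-library timeit module. This is only a sketch: the frame size (the sample rows repeated 1000 times), the repeat count, and the helper names with_apply and with_zip are arbitrary assumptions, and the timings will depend on your machine and data.

```python
import timeit

import pandas as pd

data = [[5, 'LONSTRING'], [3, 'LONGERSTRING'], [7, 'LONGESTSTRINGEVER']]
# Repeat the sample rows so there is enough work to time (size is arbitrary).
df = pd.DataFrame(data * 1000, columns=['String Limit', 'String'])


def with_apply():
    # Row-wise apply, as in the first solution.
    return df.apply(lambda x: x['String'][:x['String Limit']], axis=1).tolist()


def with_zip():
    # List comprehension over zipped columns, as in the second solution.
    return [x[:y] for x, y in zip(df['String'], df['String Limit'])]


# Sanity check: both functions must return identical results before timing.
assert with_apply() == with_zip()

t_apply = timeit.timeit(with_apply, number=5)
t_zip = timeit.timeit(with_zip, number=5)
print(f"apply: {t_apply:.4f}s  zip: {t_zip:.4f}s")
```

Run it on data shaped like your real DataFrame before picking one, since the relative cost can change with the number of rows.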
