Using Stefan's help, I solved it like this.

    In [283]: frame1 = test[['score1']]
              frame2 = test[['score2']]
              frame2.rename(columns={'score2': 'score1'}, inplace=True)
              test = pandas.concat([frame1, frame2])
              test

    Out[283]:
      score1
    0      a
    1      b
    2      c
    3      d
    4      e
    0      b
    1      a
    2      k
    3      n
    4      c

Notice the duplicate index labels. The original indexes have been preserved, which is what I wanted. Now, let's get down to business: the group-by operation.

    In [283]: groups = test.groupby('score1')
              groups.get_group('a')  # Get group with key a

    Out[283]:
      score1
    0      a
    1      a

    In [283]: groups.get_group('b')  # Get group with key b

    Out[283]:
      score1
    1      b
    0      b

    In [283]: groups.get_group('c')  # Get group with key c

    Out[283]:
      score1
    2      c
    4      c

    In [283]: groups.get_group('k')  # Get group with key k

    Out[283]:
      score1
    2      k
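For anyone who wants to reproduce this, here is a minimal, self-contained sketch. The original `test` frame is not shown above, so the two input columns below are an assumption reconstructed from the printed output, and `stacked` is just a name I picked to avoid overwriting `test`. I also used a non-inplace `rename`, which should give the same result as the session above.

```python
import pandas as pd

# Assumed input: reconstructed from the concatenated output shown above.
test = pd.DataFrame({"score1": list("abcde"), "score2": list("baknc")})

frame1 = test[["score1"]]
# Non-inplace rename; avoids modifying a slice of `test` in place.
frame2 = test[["score2"]].rename(columns={"score2": "score1"})

# concat keeps each frame's own labels unless ignore_index=True is passed,
# so the labels 0..4 appear twice in the result.
stacked = pd.concat([frame1, frame2])

print(stacked.index.is_unique)  # the index now contains duplicate labels
print(stacked.loc[0])           # a label lookup returns every row labelled 0

# Group sizes, keyed by the values in score1.
sizes = stacked.groupby("score1").size()
print(sizes)
```

Label-based lookups such as `stacked.loc[0]` return both rows that carry the label `0`, which is why the group-by results above are interesting: each group still comes back with exactly the rows that belong to it.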

I'm baffled by how pandas retrieves the correct rows even though their index labels are duplicated. As I understand it, the group-by operation uses an inverted index data structure to store references (indexes) to rows. Any insights would be greatly appreciated. Anyone who answers this will have their answer accepted :)
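One thing I could check myself is what the group-by object exposes. The sketch below is not a description of pandas internals, only an inspection of two public attributes: `groups.indices`, which maps each key to integer row positions, and `groups.groups`, which maps each key to index labels. The input frame is again an assumption reconstructed from the output above, and `stacked` is an illustrative name.

```python
import pandas as pd

# Assumed input: reconstructed from the concatenated output shown above.
test = pd.DataFrame({"score1": list("abcde"), "score2": list("baknc")})
frame1 = test[["score1"]]
frame2 = test[["score2"]].rename(columns={"score2": "score1"})
stacked = pd.concat([frame1, frame2])  # duplicate labels 0..4

groups = stacked.groupby("score1")

# Integer positions of the rows in each group (0-based row offsets).
positions = {key: [int(p) for p in pos] for key, pos in groups.indices.items()}
# Index labels of the rows in each group (these can repeat across groups).
labels = {key: [int(l) for l in lab] for key, lab in groups.groups.items()}

print(positions["b"])  # rows 1 and 5 of the stacked frame
print(labels["b"])     # their labels, 1 and 0

# Selecting by position reproduces the labels that get_group('b') shows.
by_position = stacked.iloc[groups.indices["b"]]
print(list(by_position.index) == list(groups.get_group("b").index))
```

If the lookup goes by position rather than by label, duplicate labels would not matter, since every row has a unique position. That would be consistent with the output above, but I have not confirmed that this is how `get_group` is implemented.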

lostsoul29