How to preserve the original indexes in the new DataFrame

Question

    def answer_eight():
        templist = list()
        for county, region, p15, p14, ste, cty in zip(
            census_df.CTYNAME,
            census_df.REGION,
            census_df.POPESTIMATE2015,
            census_df.POPESTIMATE2014,
            census_df.STNAME,
            census_df.CTYNAME,
        ):
            # print(county)
            if region == 1 or region == 2:
                if county.startswith('Washington'):
                    if p15 > p14:
                        templist.append((ste, cty))
        labels = ['STNAME', 'CTYNAME']
        df = pd.DataFrame.from_records(templist, columns=labels)
        return df

             STNAME            CTYNAME
    0          Iowa  Washington County
    1     Minnesota  Washington County
    2  Pennsylvania  Washington County
    3  Rhode Island  Washington County
    4     Wisconsin  Washington County

Each of these CTYNAME rows has a different index in the original census_df. How can I carry those indexes over to the new DF so that the answer looks like this:

                STNAME            CTYNAME
    12            Iowa  Washington County
    222      Minnesota  Washington County
    400   Pennsylvania  Washington County
    2900  Rhode Island  Washington County
    2999     Wisconsin  Washington County

It seems you need df = pd.DataFrame.from_records(templist, columns=labels, index=census_df.index)
– jezrael, Commented May 5, 2017 at 7:04
That produces: ValueError: Shape of passed values is (2, 5), indices imply (2, 3193)
– feedthemachine, Commented May 5, 2017 at 7:06
I have seen this thread, but it doesn't work for me either: AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas
– feedthemachine, Commented May 5, 2017 at 7:12
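
The ValueError quoted in the comments is a length mismatch: the filtered list holds 5 records, while census_df.index holds 3193 labels. Here is a minimal sketch that reproduces the same kind of failure on a tiny stand-in for census_df. The rows and index labels below are made up for illustration, and the exact error text varies between pandas versions.

```python
import pandas as pd

# Hypothetical stand-in for census_df; the real frame has 3193 rows.
census_df = pd.DataFrame(
    {
        "STNAME": ["Iowa", "Iowa", "Minnesota", "Texas"],
        "CTYNAME": ["Adair County", "Washington County",
                    "Washington County", "Washington County"],
    },
    index=[10, 12, 222, 300],
)

# Two filtered records, as the loop in the question might collect them.
templist = [("Iowa", "Washington County"), ("Minnesota", "Washington County")]
labels = ["STNAME", "CTYNAME"]

# Passing the full index fails: 2 records cannot take 4 labels.
try:
    pd.DataFrame.from_records(templist, columns=labels, index=census_df.index)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

# Passing exactly one label per record is accepted.
df = pd.DataFrame.from_records(templist, columns=labels, index=[12, 222])

print(type(mismatch_error).__name__)
print(df)
```

So whatever is passed as index has to contain only the labels of the rows that survived the filter, in the same order as the records.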

piRSquared · Accepted Answer · 2017-05-05 07:32:20Z

I'd include the index with the other things you are zipping:

    def answer_eight():
        templist = list()
        index = list()  # original index labels of the rows we keep

        # Zip the index alongside the columns so each row carries its label.
        zipped = zip(
            census_df.CTYNAME,
            census_df.REGION,
            census_df.POPESTIMATE2015,
            census_df.POPESTIMATE2014,
            census_df.STNAME,
            census_df.CTYNAME,
            census_df.index
        )
        for county, region, p15, p14, ste, cty, idx in zipped:
            # print(county)
            if region == 1 or region == 2:
                if county.startswith('Washington'):
                    if p15 > p14:
                        templist.append((ste, cty))
                        index.append(idx)
        labels = ['STNAME', 'CTYNAME']
        # Positional arguments are (data, index, columns).
        df = pd.DataFrame(templist, index, labels)
        return df.rename_axis(census_df.index.name)
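
As an alternative to collecting index labels by hand, boolean-mask selection keeps the original labels automatically. This is a sketch of a different technique, not the code from the answer above. The census_df below is a small made-up stand-in (its rows, populations, and index labels are hypothetical), and it assumes CTYNAME holds strings so that .str.startswith is available.

```python
import pandas as pd

# Hypothetical stand-in for census_df with non-default index labels.
census_df = pd.DataFrame(
    {
        "REGION": [2, 2, 3, 1, 2],
        "STNAME": ["Iowa", "Minnesota", "Texas", "Rhode Island", "Ohio"],
        "CTYNAME": ["Washington County", "Washington County",
                    "Washington County", "Washington County",
                    "Adams County"],
        "POPESTIMATE2014": [100, 200, 300, 400, 500],
        "POPESTIMATE2015": [110, 190, 310, 410, 510],
    },
    index=[12, 222, 400, 2900, 2999],
)


def answer_eight(df):
    # Same three conditions as the loop, expressed as one boolean mask.
    mask = (
        df["REGION"].isin([1, 2])
        & df["CTYNAME"].str.startswith("Washington")
        & (df["POPESTIMATE2015"] > df["POPESTIMATE2014"])
    )
    # .loc with a mask returns the matching rows under their original labels.
    return df.loc[mask, ["STNAME", "CTYNAME"]]


result = answer_eight(census_df)
print(result)
```

Because the rows are selected rather than rebuilt from tuples, there is no second DataFrame construction step where the index could be lost.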

Max Power · Accepted Answer · 2017-05-05 07:07:17Z


Before you start filtering, you can assign the original index to a column with:

census_df['original index'] = census_df.index

Then just treat it like one of the other columns you're selecting from.


5 Comments

feedthemachine Over a year ago

Thanks, I added census_df['original index'] = census_df.index before filtering, but it doesn't change the outcome. If I add index = census_df['original index'] to the new DF creation, it returns: ValueError: Shape of passed values is (2, 5), indices imply (2, 3193)

Max Power Over a year ago

When did you get that error? When you changed templist.append((ste, cty)) to templist.append((ste, cty, original_index_column))?

feedthemachine Over a year ago

If I do this I get the right results, but how can I use the last value as an index? prntscr.com/f4bnq3

Max Power Over a year ago

your_data_frame_name.reset_index('original_index_column')

feedthemachine Over a year ago

print([k[2] for k in templist]) produces [896, 1419, 2345, 2355, 3163], but when I use it in the reset_index statement it returns the same results: df = pd.DataFrame.from_records(templist) followed by df.reset_index([k[2] for k in templist])
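
For the last question in this thread: the pandas method that turns an existing column into the index is set_index, whereas reset_index does the opposite and moves the index into a column. A minimal sketch follows, assuming each tuple in templist carries the original index label as its third element. The column name original_index and the pairing of states with the labels 896 and 1419 are made up for illustration.

```python
import pandas as pd

# Hypothetical filtered records: (state, county, original index label).
templist = [
    ("Iowa", "Washington County", 896),
    ("Minnesota", "Washington County", 1419),
]
labels = ["STNAME", "CTYNAME", "original_index"]

df = pd.DataFrame.from_records(templist, columns=labels)

# Promote the carried-along column to the index.
df = df.set_index("original_index")

# Optional: drop the index name so the output has no extra header row.
df = df.rename_axis(None)

print(df)
```

set_index returns a new DataFrame by default, so the result has to be assigned back (as above) for the change to stick.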
