
I would like to rework my code to use a for loop to get row counts for specific columns in Python. There are 15 columns in total, and I am looking for row counts for 4 specific ones at this time.

This is the current input:

    # get row count by affiliate, race, ethnicity and abortion type
    print('Column_1:', batch_df.groupby('Column_1').size().sum())
    print('Column_2:', batch_df.groupby('Column_2').size().sum())
    print('Column_3:', batch_df.groupby('Column_3').size().sum())
    print('Column_4:', batch_df.groupby('Column_4').size().sum())

The output (which is correct) is below:

    Column_1: 468676
    Column_2: 465755
    Column_3: 468400
    Column_4: 468676

Is there a way to rework the input so that it uses a for loop?

  • What have you tried so far? It's a simple for loop. If you don't know how to do this read about for loops. Commented Oct 6, 2021 at 16:32
  • Did you try to use df[[your columns]].count() ? Commented Oct 6, 2021 at 16:36

2 Answers


This should work if you want to specify the columns by name:

    for col in ['Column_1', 'Column_2', 'Column_3', 'Column_4']:
        print('{}:'.format(col), batch_df.groupby(col).size().sum())
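The loop above can be tried on a small frame to see what the numbers mean. This is a minimal sketch, assuming a made-up `batch_df` with only two columns (the real data is not shown in the question). Because `groupby` drops missing keys by default, `groupby(col).size().sum()` counts the non-null rows in that column, which would explain why the counts in the question differ between columns. `count()` gives the same number more directly.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for batch_df; the real data is not shown in the question.
batch_df = pd.DataFrame({
    'Column_1': ['a', 'b', 'a', 'c'],
    'Column_2': ['x', np.nan, 'y', 'x'],  # one missing value
})

cols = ['Column_1', 'Column_2']

# groupby() drops NaN keys by default, so size().sum() counts non-null rows.
groupby_counts = {col: int(batch_df.groupby(col).size().sum()) for col in cols}

# count() reports the non-null rows per column without grouping.
direct_counts = batch_df[cols].count().to_dict()

for col in cols:
    print('{}:'.format(col), groupby_counts[col])
```

If the goal is the total number of rows regardless of missing values, `len(batch_df)` would be the figure to use instead.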

There is no need to write out all the column names: df.columns returns the column names, and you can loop through them:

    for c in df.columns:
        print(c)

For example, with this DataFrame

    import pandas as pd

    df = pd.DataFrame({
        'Column_1': ['1', '2', '3', '4'],
        'Column_2': ['11', '12', '13', '14'],
        'Column_3': ['101', '102', '103', '104'],
    })

the loop will print

    Column_1
    Column_2
    Column_3

1 Comment

Could I get a little more context on this? Do I need to define the data frame to specifically call out these columns? For example, with your example I would want it to print Column_1: 4 (the number of rows). I also only want row counts for specific columns, not all 15.
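One way to read the comment above: the DataFrame does not need to be redefined, only the list being looped over. This is a minimal sketch that reuses the example frame from the answer. The `wanted` list is a hypothetical selection, and `count()` is used on the assumption that non-null rows are what should be counted.

```python
import pandas as pd

# Example frame from the answer above.
df = pd.DataFrame({
    'Column_1': ['1', '2', '3', '4'],
    'Column_2': ['11', '12', '13', '14'],
    'Column_3': ['101', '102', '103', '104'],
})

# Hypothetical selection: only the columns whose counts are needed.
wanted = ['Column_1', 'Column_3']

# Build one "name: count" line per selected column, then print each.
lines = ['{}: {}'.format(col, df[col].count()) for col in wanted]
for line in lines:
    print(line)
```

With the real data, `wanted` would hold the four column names of interest and the remaining columns would simply be left out of the list.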
