I have 2 functions that read a csv file and count the following as checks:

  1. number of rows in that csv
  2. number of rows that have a null value in the 'ID' column

I am trying to create a dataframe that looks like this

Checks    Summary                          Findings
Check #1  Number of records on file        function #1 results (Number of records on file: 10)
Check #2  Number of records missing an ID  function #2 results (Number of records missing an ID: 2)

function 1 looks like this:

    def function1():
        with open('data.csv') as file:
            record_number = len(list(file))
        print("Number of records on file:", record_number)

    function1()

and outputs "Number of records on file: 10"

function 2 looks like this:

    import pandas as pd

    def function2():
        df = pd.read_csv('data.csv', low_memory=False)
        missing_id = df["IDs"].isna().sum()
        print("Number of records missing an ID:", missing_id)

    function2()

and outputs "Number of records missing an ID: 2"

I first build a dictionary and then create the dataframe from it:

    table = {
        'Checks': ['Check #1', 'Check #2'],
        'Summary': ['Number of records on file', 'Number of records missing an ID'],
        'Findings': [function1, function2]
    }
    df = pd.DataFrame(table)
    df

However, this is what the dataframe looks like:

Checks    Summary                          Findings
Check #1  Number of records on file        <function function1 at 0x7efd2d76a730>
Check #2  Number of records missing an ID  <function function2 at 0x7efd25cd0b70>

Is there any way to make it so that my Findings column outputs the actual results as seen above?

3 Answers

2

You need to change your functions so that they return values instead of printing them, that is:

    def function1():
        with open('data.csv') as file:
            record_number = len(list(file))
        return record_number

and

    def function2():
        df = pd.read_csv('data.csv', low_memory=False)
        return df["IDs"].isna().sum()

and call these functions like so

    table = {
        'Checks': ['Check #1', 'Check #2'],
        'Summary': ['Number of records on file', 'Number of records missing an ID'],
        'Findings': [function1(), function2()]
    }
    df = pd.DataFrame(table)
    df
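
To try this end to end without a real file, here is a minimal sketch that runs both checks on a small in-memory CSV. The sample data, the `CSV_TEXT` name, and the helpers `count_rows` and `count_missing_ids` are made up for illustration; only the "IDs" column name and the table layout come from the question.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for data.csv:
# 4 data rows, one of which has no value in the IDs column.
CSV_TEXT = "IDs,name\n1,a\n,b\n3,c\n4,d\n"


def count_rows(csv_text):
    """Return the number of data rows (the header line is not counted)."""
    return len(pd.read_csv(io.StringIO(csv_text)))


def count_missing_ids(csv_text):
    """Return how many rows have a null value in the 'IDs' column."""
    df = pd.read_csv(io.StringIO(csv_text))
    return int(df["IDs"].isna().sum())


# Call the functions (note the parentheses) so the table stores their results.
table = {
    "Checks": ["Check #1", "Check #2"],
    "Summary": ["Number of records on file", "Number of records missing an ID"],
    "Findings": [count_rows(CSV_TEXT), count_missing_ids(CSV_TEXT)],
}
summary = pd.DataFrame(table)
print(summary)
```

One difference worth keeping in mind: if data.csv has a header row, `len(list(file))` counts that line too, while `len(df)` counts only the data rows.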

2

The reason is that you're putting the function objects themselves into the table, not the results of calling them:

function1 != function1()

So for your case you need:

    table = {
        'Checks': ['Check #1', 'Check #2'],
        'Summary': ['Number of records on file', 'Number of records missing an ID'],
        'Findings': [function1(), function2()]
    }
    df = pd.DataFrame(table)
    df

Edit: Oh damn and I also missed what the other user commented. You definitely need to return a value from your functions as well :)
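
Both points (call the function, and return instead of print) can be checked with a tiny sketch. The names `prints_only` and `returns_value` are hypothetical stand-ins for the question's functions, and the value 10 is just the count from the question's example output.

```python
def prints_only():
    # Shows the text on screen, but the function itself returns None.
    print("Number of records on file:", 10)


def returns_value():
    # Hands the value back to the caller so it can be stored in a table.
    return 10


reference = prints_only       # no parentheses: this is the function object
result_a = prints_only()      # called, but only printed, so result_a is None
result_b = returns_value()    # called and returned, so result_b is 10

print(reference is prints_only, result_a, result_b)
```

So a function that only prints would put None in the Findings column even when it is called with parentheses.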

0

To get the expected output, have both functions return f-strings, and call the functions (with parentheses) when building the DataFrame:

    import pandas as pd

    def function1():
        with open('data.csv') as file:
            record_number = len(list(file))
        return f"function #1 results (Number of records on file: {record_number})"

    def function2():
        df = pd.read_csv('data.csv', low_memory=False)
        missing_id = df["IDs"].isna().sum()
        return f"function #2 results (Number of records missing an ID: {missing_id})"

    table = {
        'Checks': ['Check #1', 'Check #2'],
        'Summary': ['Number of records on file', 'Number of records missing an ID'],
        'Findings': [function1(), function2()]
    }
    df = pd.DataFrame(table)

Solution with one function:

    import pandas as pd

    def function():
        with open('data.csv') as file:
            record_number = len(list(file))
        # Load the file into a DataFrame for the null check
        df = pd.read_csv('data.csv', low_memory=False)
        missing_id = df["IDs"].isna().sum()
        # Return one finding per check, in the same order as the table rows
        return [f"function #1 results (Number of records on file: {record_number})",
                f"function #2 results (Number of records missing an ID: {missing_id})"]

    table = {
        'Checks': ['Check #1', 'Check #2'],
        'Summary': ['Number of records on file', 'Number of records missing an ID'],
        'Findings': function()
    }
    df = pd.DataFrame(table)
