I have already cleaned my dataset, and I am trying to compute the correlation between some variables from my CSV file. These are all of my columns:

    Index(['ID', 'age', 'sex', 'city', 'province', 'country', 'latitude',
           'longitude', 'geo_resolution', 'date_onset_symptoms',
           'date_admission_hospital', 'date_confirmation', 'symptoms',
           'lives_in_Wuhan', 'travel_history_dates', 'travel_history_location',
           'reported_market_exposure', 'additional_information',
           'chronic_disease_binary', 'chronic_disease', 'source',
           'sequence_available', 'outcome', 'date_death_or_discharge',
           'notes_for_discussion', 'location', 'admin3', 'admin2', 'admin1',
           'country_new', 'admin_id', 'data_moderator_initials',
           'travel_history_binary'],
          dtype='object')

I computed the correlation with df.corr and got this output:

                            latitude  longitude  chronic_disease_binary  admin_id
    latitude                1.000000   0.180020               -0.027188  0.172725
    longitude               0.180020   1.000000                0.016198  0.242909
    chronic_disease_binary -0.027188   0.016198                1.000000 -0.007604
    admin_id                0.172725   0.242909               -0.007604  1.000000

To understand why not all columns appear in the result, I tried to compute the correlation between sex and latitude, and I got a casting error:

TypeError: unsupported operand type(s) for /: 'str' and 'int'

How can I compute the correlation between all variables?
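
One possible direction, sketched on a made-up toy frame rather than the real CSV: the df.corr output above only contains numeric columns, so text columns such as sex would presumably have to be converted to numbers first. The sketch below assumes one-hot encoding via pd.get_dummies is acceptable. The sample values and the choice of columns to encode are hypothetical.

```python
import pandas as pd

# Hypothetical toy data standing in for the real CSV; all values are made up.
df = pd.DataFrame({
    "sex": ["male", "female", "female", "male"],
    "country": ["China", "China", "Japan", "Japan"],
    "latitude": [30.6, 31.2, 35.7, 34.7],
    "chronic_disease_binary": [0, 1, 0, 0],
})

# One-hot encode the text columns so that every column is numeric.
# Each category becomes its own 0/1 column (e.g. sex_male, sex_female).
encoded = pd.get_dummies(df, columns=["sex", "country"], dtype=float)

# Pearson correlation across all (now numeric) columns.
corr = encoded.corr()
print(corr)
```

This is probably only sensible for columns with a handful of categories. Columns like city, the free-text fields, and the date columns would likely need a different treatment (or to be left out), since one-hot encoding them would produce a very wide and hard-to-read matrix.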