Code to reverse geocode 'latitude' and 'longitude' values to ZIP codes in the US; originally used to determine the ZIP codes of shooting incidents in NYC.
3 Answers
Example Output:
                 lat        lon zipcode
    0      40.896504 -73.859042   10470
    1      40.732804 -74.005666   10014
    2      40.674142 -73.936206   11213
    3      40.648025 -73.904011   11236
    4      40.764694 -73.914348   11103
    ...          ...        ...     ...
    20654  40.710989 -73.942949   11211
    20655  40.682398 -73.840079   11416
    20656  40.651014 -73.945707   11226
    20657  40.835990 -73.916276   10452
    20658  40.857771 -73.894606   10458

Load Dataset (not required):
    # Load the dataset used here
    import pandas as pd

    df_shooting = pd.read_csv('Shooting_NY.csv', sep=';', low_memory=False)

Code for reverse geo-coding:
    # Install the package first (in a shell, or prefixed with "!" in a notebook):
    # pip install uszipcode

    # Import packages
    import pandas as pd
    from uszipcode import SearchEngine

    # simple_zipcode=True is the argument used by the uszipcode release this
    # answer was written against; other releases may not accept it.
    search = SearchEngine(simple_zipcode=True)

    # Define zipcode search function
    def get_zipcode(lat, lon):
        result = search.by_coordinates(lat=lat, lng=lon, returns=1)
        # Guard against an empty result list (no ZIP code found nearby)
        return result[0].zipcode if result else None

    # Load columns from dataframe
    lat = df_shooting['Latitude']
    lon = df_shooting['Longitude']

    # Define latitude/longitude for function
    df = pd.DataFrame({'lat': lat, 'lon': lon})

    # Add new column with generated zip-code
    df['zipcode'] = df.apply(lambda x: get_zipcode(x.lat, x.lon), axis=1)

    # Print result
    print(df)

    # (Optional) save as csv
    # df.to_csv(r'zip_codes.csv')

Be aware of long run times (about 5-7 minutes for 20k rows). Still, this was the most efficient approach we found without resorting to the (paid) Google API.
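If many rows share the same coordinates, one way to cut the run time might be to look up each distinct location only once. The following is a minimal sketch of that idea, not part of the original answer: `zipcodes_with_cache`, `fake_lookup`, and the `precision` parameter are hypothetical names, and `fake_lookup` is a stand-in so the example runs without uszipcode. Rounding to 4 decimal places (roughly 11 m in latitude) is an assumption; whether this helps depends on how many repeated locations the data contains.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Any (lat, lon) -> ZIP code function fits here, e.g. get_zipcode from the answer.
Lookup = Callable[[float, float], Optional[str]]


def zipcodes_with_cache(
    coords: Iterable[Tuple[float, float]],
    lookup: Lookup,
    precision: int = 4,
) -> List[Optional[str]]:
    """Resolve each (lat, lon) pair, calling `lookup` once per distinct rounded pair."""
    cache: Dict[Tuple[float, float], Optional[str]] = {}
    zipcodes: List[Optional[str]] = []
    for lat, lon in coords:
        # Round so that near-identical coordinates share one cache entry (assumption)
        key = (round(lat, precision), round(lon, precision))
        if key not in cache:
            cache[key] = lookup(*key)
        zipcodes.append(cache[key])
    return zipcodes


# Hypothetical stand-in for the real lookup; it records every call it receives.
calls: List[Tuple[float, float]] = []


def fake_lookup(lat: float, lon: float) -> Optional[str]:
    calls.append((lat, lon))
    return "10470" if lat > 40.8 else "10014"


points = [
    (40.896504, -73.859042),
    (40.732804, -74.005666),
    (40.896504, -73.859042),  # repeated location, served from the cache
]
result = zipcodes_with_cache(points, fake_lookup)
print(result)      # ['10470', '10014', '10470']
print(len(calls))  # 2
```

With the dataframe from the answer, this could be applied as `df['zipcode'] = zipcodes_with_cache(zip(df.lat, df.lon), get_zipcode)`.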
Here is another solution (incl. commented code): https://medium.com/@moritz.kittler/ever-struggled-with-reverse-geo-coding-36fe948ad5a3
This is my code; I think it is a little simpler:
    # !pip install uszipcode

    # Import packages
    from uszipcode import SearchEngine

    search = SearchEngine(simple_zipcode=True)

    # Look up the zipcode for each row.
    # "lat" and "lon" are placeholders: use your dataframe's own column names.
    zipcodes = []
    for index, row in df.iterrows():
        result = search.by_coordinates(lat=row["lat"], lng=row["lon"], returns=1)
        # Collect one value per row; None if nothing was found
        zipcodes.append(result[0].zipcode if result else None)

    # Add zipcodes to the dataframe
    df["Zipcode"] = zipcodes

    # Save dataframe to csv file (specify path)
    df.to_csv("Resouces/df.csv", index=False)

    # You can also use itertuples(). It is considerably faster than iterrows().
    # Your for loop may change like the following:
    for row in df.itertuples(index=False):
        # Columns are now attributes; follow the remaining code explained above
        result = search.by_coordinates(lat=row.lat, lng=row.lon, returns=1)

1 Comment
zswqa
No need to create another answer. You can update the first answer and add the second one to it.