Code to reverse geocode latitude/longitude pairs to US ZIP codes; originally used to determine the ZIP codes of shooting incidents in NYC.

3 Answers

Example Output:

                 lat        lon zipcode
    0      40.896504 -73.859042   10470
    1      40.732804 -74.005666   10014
    2      40.674142 -73.936206   11213
    3      40.648025 -73.904011   11236
    4      40.764694 -73.914348   11103
    ...          ...        ...     ...
    20654  40.710989 -73.942949   11211
    20655  40.682398 -73.840079   11416
    20656  40.651014 -73.945707   11226
    20657  40.835990 -73.916276   10452
    20658  40.857771 -73.894606   10458

Load Dataset (not required):

    import pandas as pd

    # Load the dataset used in this example
    df_shooting = pd.read_csv('Shooting_NY.csv', sep=';', low_memory=False)

Code for reverse geo-coding:

    # Install the package first (run in a shell): pip install uszipcode

    # Import packages
    import numpy as np
    import pandas as pd
    from uszipcode import SearchEngine
    from uszipcode import Zipcode

    # simple_zipcode=True matches the uszipcode version used for this answer;
    # newer releases may not accept this argument, so check your installed version
    search = SearchEngine(simple_zipcode=True)

    # Define zipcode search function
    def get_zipcode(lat, lon):
        result = search.by_coordinates(lat=lat, lng=lon, returns=1)
        # by_coordinates returns a list, which is empty if no ZIP code is found nearby
        return result[0].zipcode if result else None

    # Load columns from dataframe
    lat = df_shooting['Latitude']
    lon = df_shooting['Longitude']

    # Define latitude/longitude for function
    df = pd.DataFrame({'lat': lat, 'lon': lon})

    # Add new column with generated zip-code
    df['zipcode'] = df.apply(lambda x: get_zipcode(x.lat, x.lon), axis=1)

    # Print result
    print(df)

    # (optional) Save as csv
    # df.to_csv(r'zip_codes.csv')

Be aware of long run times (about 5-7 minutes for 20k rows). Still, this is the most effective approach we found without resorting to the (paid) Google API.
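If many rows share the same (or nearly the same) coordinates, one way to cut the run time might be to cache lookups so each distinct point is only resolved once. The sketch below is an assumption-laden illustration, not part of the original answer: `make_cached_lookup` and `stub_lookup` are hypothetical names, and rounding to 4 decimals (roughly 11 m of latitude) is an arbitrary choice that trades a little precision for more cache hits.

```python
from functools import lru_cache


def make_cached_lookup(lookup, precision=4):
    """Wrap a (lat, lon) -> zipcode function so repeated coordinates are looked up once.

    Coordinates are rounded to `precision` decimals before the lookup,
    so near-identical points share a single cached result.
    """

    @lru_cache(maxsize=None)
    def _cached(lat_key, lon_key):
        return lookup(lat_key, lon_key)

    def cached_lookup(lat, lon):
        return _cached(round(lat, precision), round(lon, precision))

    return cached_lookup


# Demo with a stub standing in for the real (slow) lookup function
calls = []


def stub_lookup(lat, lon):
    calls.append((lat, lon))  # record every real lookup that happens
    return "10470"


cached = make_cached_lookup(stub_lookup)
points = [(40.896504, -73.859042), (40.896504, -73.859042), (40.896511, -73.859049)]
zipcodes = [cached(lat, lon) for lat, lon in points]
print(zipcodes, "lookups performed:", len(calls))
```

With the code above, this could be wired in as `cached = make_cached_lookup(get_zipcode)` followed by `df['zipcode'] = [cached(a, b) for a, b in zip(df['lat'], df['lon'])]`. How much time it saves depends entirely on how many coordinates repeat in the data.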


Here is another solution (incl. commented code): https://medium.com/@moritz.kittler/ever-struggled-with-reverse-geo-coding-36fe948ad5a3


This is my code; I think it is a little simpler:

    # !pip install uszipcode

    # Import packages
    from uszipcode import SearchEngine
    from uszipcode import Zipcode

    search = SearchEngine(simple_zipcode=True)

    # Look up the ZIP code for each row and collect the results
    zipcodes = []
    for index, row in df.iterrows():
        result = search.by_coordinates(lat=row[<df lat column number>], lng=row[<df lon column number>], returns=1)
        zipcodes.append(result[0].zipcode)

    # Add the zipcodes to the dataframe (one value per row)
    df["Zipcode"] = zipcodes

    # Save dataframe to csv file (specify path)
    df.to_csv("Resouces/df.csv", index=False)

    # You can also use itertuples(). It is much faster than iterrows().
    # Your for loop would change to the following:
    for row in df.itertuples(index=False):
        ...  # follow remaining code explained above
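To make the itertuples() variant concrete, here is a minimal sketch under a few assumptions that are not in the answer itself: the DataFrame columns are named `lat` and `lon`, and `lookup_zipcode` is a hypothetical stub standing in for `search.by_coordinates(...)` so the snippet runs without uszipcode installed.

```python
import pandas as pd


def lookup_zipcode(lat, lon):
    """Hypothetical stand-in for search.by_coordinates(lat=lat, lng=lon, returns=1)[0].zipcode."""
    return "10470" if lat > 40.8 else "10014"


# Assumed column names; adjust to match your own dataframe
df = pd.DataFrame({"lat": [40.896504, 40.732804], "lon": [-73.859042, -74.005666]})

# itertuples() yields namedtuples, so columns are read as attributes
zipcodes = [lookup_zipcode(row.lat, row.lon) for row in df.itertuples(index=False)]

# Assign the whole list at once so that each row keeps its own value
df["Zipcode"] = zipcodes
print(df)
```

Attribute access such as `row.lat` only works when the column names are valid Python identifiers; otherwise the tuple fields have to be read by position.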

1 Comment

No need to create another answer. You can edit the first answer and add the second one to it.
