Here's a list of each of the data sources we used. There are three sources of data: the Bay Area site inventory, the parcels dataset, and the SF permits dataset. This data is included in data/raw_data.
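Since the notebooks read from fixed paths under data/raw_data, a quick check that the files are in place can save some debugging. The sketch below is only an illustration: the paths are taken from the "Location" lines and download scripts in this section, while the names `EXPECTED_FILES` and `missing_raw_files` are hypothetical and not part of the repository.

```python
from pathlib import Path

# Relative paths taken from the "Location" lines and download scripts in this section.
EXPECTED_FILES = [
    "housing_sites/xn--Bay_Area_Housing_Opportunity_Sites_Inventory__20072023_-it38a.shp",
    "abag_building_permits/permits.shp",
    "permits.csv",
    "all_parcels/all_parcels.shp",
    "sf_permits.csv",
]


def missing_raw_files(raw_dir="data/raw_data", expected=EXPECTED_FILES):
    """Return the expected files that are not present under raw_dir."""
    root = Path(raw_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # Print one line per missing file; print nothing if everything is present.
    for rel in missing_raw_files():
        print(f"missing: {rel}")
```

Running it from the repository root prints one line per missing file, which indicates which download script still needs to be run.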
Source: https://opendata.mtc.ca.gov/datasets/da0765ab82ae475d985688e140f931bd_0
Location: data/raw_data/housing_sites/xn--Bay_Area_Housing_Opportunity_Sites_Inventory__20072023_-it38a.shp
Download script:
wget 'https://opendata.arcgis.com/datasets/da0765ab82ae475d985688e140f931bd_0.zip?outSR=%7B%22latestWkid%22%3A4326%2C%22wkid%22%3A4326%7D' -O housing_sites.zip
mkdir -p data/raw_data/housing_sites
unzip housing_sites.zip -d data/raw_data/housing_sites

Source:
- https://opendata.mtc.ca.gov/datasets/residential-building-permits-features (shapefiles)
- https://opendata.mtc.ca.gov/datasets/residential-building-permits-attributes (row attributes)
Location: data/raw_data/abag_building_permits/permits.shp
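The `permits.*` file names at this location come from the renaming step in the download script. For anyone working from a notebook rather than a shell, the same renaming could be sketched in Python roughly as follows. The helper name `rename_to_stem` is hypothetical. Like the shell loop, it keeps only the last extension and assumes each extension appears once in the directory (otherwise later files would overwrite earlier ones).

```python
from pathlib import Path


def rename_to_stem(directory, stem):
    """Rename every file in directory to <stem><last extension>.

    Returns the new file names. Subdirectories and files without an
    extension are left alone.
    """
    renamed = []
    # Materialize the listing first so renames do not disturb the iteration.
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or not path.suffix:
            continue
        target = path.with_name(stem + path.suffix)
        if target != path:
            path.rename(target)
        renamed.append(target.name)
    return renamed
```

A call such as `rename_to_stem("data/raw_data/abag_building_permits", "permits")` would then mirror the shell loop for this dataset.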
Download script:
mkdir -p data/raw_data/abag_building_permits
wget https://opendata.arcgis.com/datasets/92a2e55f00c94295adf9feac3d695f1e_0.csv -O data/raw_data/permits.csv
wget 'https://opendata.arcgis.com/datasets/8f95c18719d04416a259854334443f3a_0.zip?outSR=%7B%22latestWkid%22%3A4326%2C%22wkid%22%3A4326%7D' -O building_permits.zip
unzip building_permits.zip -d data/raw_data/abag_building_permits
# Rename the files to something meaningful
for f in data/raw_data/abag_building_permits/*
do
    mv "$f" "data/raw_data/abag_building_permits/permits.${f##*.}"
done

Download script:
wget 'https://data.sfgov.org/api/geospatial/acdm-wktn?method=export&format=Shapefile' -O all_parcels.zip
mkdir -p data/raw_data/all_parcels
unzip all_parcels.zip -d data/raw_data/all_parcels
# When you unzip the folder, the files will be called `geo_export_{some random string}.{dbf,prj,shp,shx}`.
# The random string will be different every time, so here we rename the files to `all_parcels.{dbf,prj,shp,shx}`
# so that the notebooks work regardless of what the downloaded files are called.
for f in data/raw_data/all_parcels/*
do
    mv "$f" "data/raw_data/all_parcels/all_parcels.${f##*.}"
done

Download script:
import pandas as pd
sf_permits = pd.read_csv('https://data.sfgov.org/api/views/p4e4-a5a7/rows.csv?accessType=DOWNLOAD')
sf_permits.to_csv('./data/raw_data/sf_permits.csv', index=False)

For reproducibility, these notebooks use the SF permits data as of 2/15/2021. To update this data and retrieve permits issued after 2/15/2021, run the code chunk above.
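Re-running the download overwrites `./data/raw_data/sf_permits.csv`, so it may be worth recording a checksum of the snapshot beforehand to make it easy to tell later which version of the data a result was produced from. The following is a minimal sketch using only the standard library. The helper name `sha256_of` is hypothetical and this is not something the notebooks currently do.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `sha256_of('./data/raw_data/sf_permits.csv')` could be noted down before and after an update. If the two digests differ, the file contents have changed.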