diff --git a/docs/source/_static/images/Vision_Zero_Network_Community_Map_February_2024.jpg b/docs/source/_static/images/Vision_Zero_Network_Community_Map_February_2024.jpg
new file mode 100644
index 0000000..da1926d
Binary files /dev/null and b/docs/source/_static/images/Vision_Zero_Network_Community_Map_February_2024.jpg differ
diff --git a/docs/source/explanations/datasets.rst b/docs/source/explanations/datasets.rst
index 3f8448f..b8bc9e3 100644
--- a/docs/source/explanations/datasets.rst
+++ b/docs/source/explanations/datasets.rst
@@ -3,18 +3,17 @@ Dataset Choices
 
 Since creating a synthesized dataset based on multiple datasets is a critical part of the data processing pipeline, we have to make some choices regarding the datasets that we use. This document serves to explain the significant yet non-obvious logic and choices regarding different types of datasets.
 
-
 Crosswalk Dataset
 -----------------
 
 The crosswalk dataset should contain polygons that represent the boundaries of the crosswalks. The dataset ideally should accurately reflect the real world, but there is an implicit understanding that the dataset may not be perfect since most cities do not have a comprehensive dataset of crosswalks. The dataset should be in a format that can be easily read by the software, such as GeoJSON. In the testing module, you'll find the tests for fetching the crosswalk dataset and mapping it for the city of Boston. We've chosen Boston for its relative ease of access to various datasets and physical proximity to the team at Olin College of Engineering.
 
-UMass Amherst has been developing a dataset of all crosswalks in Massachusetts using computer vision model (YOLOv8) and aerial imagery. The dataset is not perfect, but it is a good starting point for our project, and has the potential to be applicable for states that also do not have a thorough catalog of their crosswalk assets. The dataset can be viewed at the `following link `_.
+UMass Amherst has been developing a dataset of all crosswalks in Massachusetts using computer vision model (YOLOv8) and aerial imagery. The dataset is not perfect, but it is a good starting point for our project, and has the potential to be applicable for states that also do not have a thorough catalog of their crosswalk assets. The dataset can be viewed at the `UMass Crosswalk Dataset `_.
 
 Traffic Dataset
 ---------------
 
-Since our project is focused on pedestrian safety at nighttime on crosswalks, we need a dataset that contains information about the volume of traffic. MassDOT provides a convenient dataset that includes average annual daily traffic (AADT) counts for most roads in Massachusetts. The counts will be used to inform the risk of a pedestrian being hit by a car at a given crosswalk. The dataset can be viewed at the `following link `_.
+Since our project is focused on pedestrian safety at nighttime on crosswalks, we need a dataset that contains information about the volume of traffic. MassDOT provides a convenient dataset that includes average annual daily traffic (AADT) counts for most roads in Massachusetts. The counts will be used to inform the risk of a pedestrian being hit by a car at a given crosswalk. The dataset can be viewed at the `MassDOT Traffic Dataset `_.
 
 Population Density Dataset
 --------------------------
@@ -45,9 +44,22 @@ Density Calculation Methodology
 Streetlights Dataset
 ********************
 
-The main information that the streetlights dataset should contain is the location of the streetlights. Additional information such as the type of bulb, last-replacement year, and wattage, etc. are useful to have as well. After talking to Michael Donaghy, Superintendent of Street Lighting at the City of Boston Public Works Department, we learned that Boston has recently completed a full catalog of their streetlight assets in 2023. We acknowledge that many cities might not have this data available, in which case, `OpenStreetMap features `_ could be used to roughly estimate the streetlight locations. The Boston streetlight dataset can be viewed at the `following link `_.
+The main information that the streetlights dataset should contain is the location of the streetlights. Additional information such as the type of bulb, last-replacement year, and wattage, etc. are useful to have as well. After talking to Michael Donaghy, Superintendent of Street Lighting at the City of Boston Public Works Department, we learned that Boston has recently completed a full catalog of their streetlight assets in 2023. We acknowledge that many cities might not have this data available, in which case, `OpenStreetMap features `_ could be used to roughly estimate the streetlight locations. The Boston streetlight dataset can be viewed at the `Boston Streetlight Dataset `_.
 
 Income Dataset
 **************
 
 The income dataset is also sourced from the American Community Survey (ACS) 5-year estimates. The dataset includes median household income data for each census tract within a specified state and year. The data is used to analyze the relationship between income levels and pedestrian safety, as well as to identify areas with possible infrastructure inequity.
+
+Past Accidents Vision Zero Dataset
+**********************************
+
+We have chosen to use the pedestrian accidents dataset from the Vision Zero initiative in Boston. The dataset contains information about the location of accidents, the date and time of accident, and the severity of the accident (injury/fatality). The dataset is used to identify high-risk areas for pedestrian accidents and to inform the prioritization of crosswalks for safety improvements. The dataset can be viewed at the `Vision Zero Dataset `_.
+
+.. figure:: ../_static/images/Vision_Zero_Network_Community_Map_February_2024.jpg
+   :alt: Vision Zero Network Community Map
+   :width: 1000px
+
+   Vision Zero Network Community Map (February 2024)
+
+Vision Zero initiatives are a nationwide effort to eliminate traffic fatalities and severe injuries. Growing number of cities have contributed to this effort and collected data, which will help this project be applicable outside of Boston as well.
\ No newline at end of file
diff --git a/src/night_light/past_accidents/vision_zero.py b/src/night_light/past_accidents/vision_zero.py
new file mode 100644
index 0000000..65a3ea4
--- /dev/null
+++ b/src/night_light/past_accidents/vision_zero.py
@@ -0,0 +1,31 @@
+import geopandas as gpd
+import requests
+import pandas as pd
+
+BOSTON_VISION_ZERO_URL = "https://data.boston.gov/api/3/action/datastore_search_sql"
+BOSTON_VISION_ZERO_PED_CRASHES = '"e4bfe397-6bfc-49c5-9367-c879fac7401d"'
+BOSTON_VISION_ZERO_PED_FATALITIES = '"92f18923-d4ec-4c17-9405-4e0da63e1d6c"'
+
+
+def _boston_vision_zero_ped_accidents(resource_id: str) -> gpd.GeoDataFrame:
+    params = {"sql": "SELECT * from " + resource_id + " WHERE mode_type = 'ped'"}
+    response = requests.get(BOSTON_VISION_ZERO_URL, params=params)
+    response.raise_for_status()
+    data = response.json()["result"]["records"]
+    gdf = gpd.GeoDataFrame.from_records(data)
+    gdf.drop(["_full_text"], axis=1, inplace=True)
+    gdf.set_geometry(
+        gpd.points_from_xy(gdf["long"], gdf["lat"]), inplace=True, crs="EPSG:4326"
+    )
+    return gdf
+
+
+def boston_vision_zero_ped_accidents():
+    gdf_crashes = _boston_vision_zero_ped_accidents(BOSTON_VISION_ZERO_PED_CRASHES)
+    gdf_fatalities = _boston_vision_zero_ped_accidents(
+        BOSTON_VISION_ZERO_PED_FATALITIES
+    )
+    gdf_crashes["is_fatal"] = False
+    gdf_crashes.rename(columns={"dispatch_ts": "date_time"}, inplace=True)
+    gdf_fatalities["is_fatal"] = True
+    return pd.concat([gdf_crashes, gdf_fatalities], ignore_index=True)
diff --git a/src/night_light/socioeconomic/population.py b/src/night_light/socioeconomic/population.py
index b97367d..946647f 100644
--- a/src/night_light/socioeconomic/population.py
+++ b/src/night_light/socioeconomic/population.py
@@ -1,9 +1,6 @@
-from pygris import tracts
 from pygris.data import get_census
 from night_light.utils.fips import StateFIPS
 
-mass_tracts = tracts(state="MA", cb=True, cache=True, year=2021)
-
 
 def get_population(
     year: int = 2021, state: StateFIPS = StateFIPS.MASSACHUSETTS, **kwargs
diff --git a/tests/test_boston_vision_zero_mapping.py b/tests/test_boston_vision_zero_mapping.py
new file mode 100644
index 0000000..95952b0
--- /dev/null
+++ b/tests/test_boston_vision_zero_mapping.py
@@ -0,0 +1,43 @@
+import os
+import time
+
+import folium
+
+from night_light.past_accidents.vision_zero import boston_vision_zero_ped_accidents
+from night_light.utils import (
+    create_folium_map,
+    LAYER_STYLE_DICT,
+    LAYER_HIGHLIGHT_STYLE_DICT,
+    Tooltip,
+)
+from night_light.utils.mapping import open_html_file
+from tests.conftest import BOSTON_CENTER_COORD
+
+
+def test_boston_vision_zero_map():
+    """Test creating a map of the Boston Vision Zero accidents"""
+    boston_vision_zero_accidents = boston_vision_zero_ped_accidents()
+    map_filename = "test_boston_vision_zero.html"
+    accidents_layer = folium.GeoJson(
+        boston_vision_zero_accidents,
+        name="Vision Zero Accidents",
+        style_function=lambda x: LAYER_STYLE_DICT,
+        highlight_function=lambda x: LAYER_HIGHLIGHT_STYLE_DICT,
+        smooth_factor=2.0,
+        tooltip=Tooltip(
+            fields=["_id", "is_fatal"],
+            aliases=["Accident ID", "Fatality"],
+            max_width=800,
+        ),
+    )
+    create_folium_map(
+        layers=[accidents_layer],
+        zoom_start=12,
+        center=BOSTON_CENTER_COORD,
+        map_filename=map_filename,
+    )
+    assert os.path.exists(map_filename)
+    open_html_file(map_filename)
+    time.sleep(1)
+    os.remove(map_filename)
+    assert not os.path.exists(map_filename)
diff --git a/tests/test_boston_vision_zero_query.py b/tests/test_boston_vision_zero_query.py
new file mode 100644
index 0000000..2b5f09a
--- /dev/null
+++ b/tests/test_boston_vision_zero_query.py
@@ -0,0 +1,33 @@
+import os
+
+from night_light.past_accidents.vision_zero import boston_vision_zero_ped_accidents
+from night_light.utils import query_geojson
+
+
+def test_query_boston_vision_zero_ped_accidents():
+    gdf_accidents = boston_vision_zero_ped_accidents()
+    assert not gdf_accidents.empty
+    assert "mode_type" in gdf_accidents.columns
+    assert gdf_accidents.geometry.geom_type.unique() == "Point"
+    assert gdf_accidents.crs == "EPSG:4326"
+    assert gdf_accidents["mode_type"].unique() == "ped"
+    assert not gdf_accidents["lat"].isnull().any()
+    assert not gdf_accidents["long"].isnull().any()
+    assert not gdf_accidents["is_fatal"].isnull().any()
+
+
+def test_save_boston_vision_zero_ped_accidents():
+    """Test saving the Boston Vision Zero accidents data to a GeoJSON file"""
+    gdf_accidents = boston_vision_zero_ped_accidents()
+    geojson_filename = "test_boston_vision_zero.geojson"
+
+    query_geojson.save_geojson(gdf_accidents, geojson_filename)
+    saved_gdf = query_geojson.gpd.read_file(geojson_filename)
+
+    assert gdf_accidents.crs == saved_gdf.crs
+    assert set(gdf_accidents.columns) == set(saved_gdf.columns)
+    assert gdf_accidents.index.equals(saved_gdf.index)
+    assert gdf_accidents.shape == saved_gdf.shape
+
+    os.remove(geojson_filename)
+    assert not os.path.exists(geojson_filename)
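
Note on `vision_zero.py`: the two steps it performs (building the pedestrian-only SQL query, then flagging and merging the crash and fatality records) can be illustrated offline with the standard library alone. The sketch below is not part of the patch. The helper names `build_ped_query_url` and `merge_accident_records` are hypothetical, the sample rows in the usage lines are made up, and it assumes that fatality records already carry a `date_time` field, which the patch's rename of `dispatch_ts` suggests but does not confirm. The URL it builds should be close to what `requests.get(..., params=...)` sends, though the exact encoding may differ.

```python
from urllib.parse import urlencode

# Constants copied from the patch.
BOSTON_VISION_ZERO_URL = "https://data.boston.gov/api/3/action/datastore_search_sql"
BOSTON_VISION_ZERO_PED_CRASHES = '"e4bfe397-6bfc-49c5-9367-c879fac7401d"'


def build_ped_query_url(resource_id: str) -> str:
    """Hypothetical helper: return the GET URL for the pedestrian-only query."""
    # Same SQL string the patch passes as the "sql" parameter.
    sql = "SELECT * from " + resource_id + " WHERE mode_type = 'ped'"
    return BOSTON_VISION_ZERO_URL + "?" + urlencode({"sql": sql})


def merge_accident_records(crashes: list, fatalities: list) -> list:
    """Hypothetical helper: mirror the is_fatal flag, rename, and concat on plain dicts."""
    merged = []
    for record in crashes:
        row = dict(record)  # copy so the caller's data is not mutated
        row["is_fatal"] = False
        # The patch renames dispatch_ts to date_time for crash records only.
        if "dispatch_ts" in row:
            row["date_time"] = row.pop("dispatch_ts")
        merged.append(row)
    for record in fatalities:
        row = dict(record)
        row["is_fatal"] = True
        merged.append(row)
    return merged


# Usage with made-up sample rows (not real Vision Zero data).
url = build_ped_query_url(BOSTON_VISION_ZERO_PED_CRASHES)
rows = merge_accident_records(
    [{"_id": 1, "dispatch_ts": "2024-01-01 20:15:00"}],
    [{"_id": 2, "date_time": "2024-02-01 21:30:00"}],
)
```

Keeping the merge logic separate from the network call in this way would also let the tests exercise it without hitting the Boston API, which the tests in the patch currently do on every run.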