Merge pull request #10 from olincollege/SAN-38-vision-zero-accidents
SAN-38 vision zero accidents
cory0417 authored Nov 17, 2024
2 parents 3843235 + ab6341d commit dbb4d08
Showing 6 changed files with 123 additions and 7 deletions.
20 changes: 16 additions & 4 deletions docs/source/explanations/datasets.rst
@@ -3,18 +3,17 @@ Dataset Choices

Since creating a synthesized dataset based on multiple datasets is a critical part of the data processing pipeline, we have to make some choices regarding the datasets that we use. This document serves to explain the significant yet non-obvious logic and choices regarding different types of datasets.


Crosswalk Dataset
-----------------

The crosswalk dataset should contain polygons that represent the boundaries of the crosswalks. The dataset ideally should accurately reflect the real world, but there is an implicit understanding that the dataset may not be perfect since most cities do not have a comprehensive dataset of crosswalks. The dataset should be in a format that can be easily read by the software, such as GeoJSON. In the testing module, you'll find the tests for fetching the crosswalk dataset and mapping it for the city of Boston. We've chosen Boston for its relative ease of access to various datasets and physical proximity to the team at Olin College of Engineering.

UMass Amherst has been developing a dataset of all crosswalks in Massachusetts using computer vision model (YOLOv8) and aerial imagery. The dataset is not perfect, but it is a good starting point for our project, and has the potential to be applicable for states that also do not have a thorough catalog of their crosswalk assets. The dataset can be viewed at the `following link <https://www.arcgis.com/apps/mapviewer/index.html?url=https://gis.massdot.state.ma.us/arcgis/rest/services/Assets/Crosswalk_Poly/FeatureServer/0&source=sd>`_.
UMass Amherst has been developing a dataset of all crosswalks in Massachusetts using a computer vision model (YOLOv8) and aerial imagery. The dataset is not perfect, but it is a good starting point for our project, and has the potential to be applicable for states that also do not have a thorough catalog of their crosswalk assets. The dataset can be viewed at the `UMass Crosswalk Dataset <https://www.arcgis.com/apps/mapviewer/index.html?url=https://gis.massdot.state.ma.us/arcgis/rest/services/Assets/Crosswalk_Poly/FeatureServer/0&source=sd>`_.

Traffic Dataset
---------------

Since our project is focused on pedestrian safety at nighttime on crosswalks, we need a dataset that contains information about the volume of traffic. MassDOT provides a convenient dataset that includes average annual daily traffic (AADT) counts for most roads in Massachusetts. The counts will be used to inform the risk of a pedestrian being hit by a car at a given crosswalk. The dataset can be viewed at the `following link <https://www.arcgis.com/apps/mapviewer/index.html?url=https://gis.massdot.state.ma.us/arcgis/rest/services/Roads/VMT/FeatureServer/10&source=sd>`_.
Since our project is focused on pedestrian safety at nighttime on crosswalks, we need a dataset that contains information about the volume of traffic. MassDOT provides a convenient dataset that includes average annual daily traffic (AADT) counts for most roads in Massachusetts. The counts will be used to inform the risk of a pedestrian being hit by a car at a given crosswalk. The dataset can be viewed at the `MassDOT Traffic Dataset <https://www.arcgis.com/apps/mapviewer/index.html?url=https://gis.massdot.state.ma.us/arcgis/rest/services/Roads/VMT/FeatureServer/10&source=sd>`_.

Population Density Dataset
--------------------------
@@ -45,9 +44,22 @@ Density Calculation Methodology
Streetlights Dataset
********************

The main information that the streetlights dataset should contain is the location of the streetlights. Additional information such as the type of bulb, last-replacement year, and wattage, etc. are useful to have as well. After talking to Michael Donaghy, Superintendent of Street Lighting at the City of Boston Public Works Department, we learned that Boston has recently completed a full catalog of their streetlight assets in 2023. We acknowledge that many cities might not have this data available, in which case, `OpenStreetMap features <https://wiki.openstreetmap.org/wiki/Tag:highway%3Dstreet_lamp>`_ could be used to roughly estimate the streetlight locations. The Boston streetlight dataset can be viewed at the `following link <https://sdmaps.maps.arcgis.com/apps/dashboards/84e1553e754b424f9c544ab5079ed99f>`_.
The main information that the streetlights dataset should contain is the location of the streetlights. Additional information, such as the type of bulb, last-replacement year, and wattage, is useful to have as well. After talking to Michael Donaghy, Superintendent of Street Lighting at the City of Boston Public Works Department, we learned that Boston completed a full catalog of its streetlight assets in 2023. We acknowledge that many cities might not have this data available, in which case `OpenStreetMap features <https://wiki.openstreetmap.org/wiki/Tag:highway%3Dstreet_lamp>`_ could be used to roughly estimate the streetlight locations. The Boston streetlight dataset can be viewed at the `Boston Streetlight Dataset <https://sdmaps.maps.arcgis.com/apps/dashboards/84e1553e754b424f9c544ab5079ed99f>`_.

Income Dataset
**************

The income dataset is also sourced from the American Community Survey (ACS) 5-year estimates. The dataset includes median household income data for each census tract within a specified state and year. The data is used to analyze the relationship between income levels and pedestrian safety, as well as to identify areas with possible infrastructure inequity.

Past Accidents Vision Zero Dataset
**********************************

We have chosen to use the pedestrian accidents dataset from the Vision Zero initiative in Boston. The dataset contains the location of each accident, its date and time, and its severity (injury/fatality). The dataset is used to identify high-risk areas for pedestrian accidents and to inform the prioritization of crosswalks for safety improvements. The dataset can be viewed at the `Vision Zero Dataset <https://experience.arcgis.com/experience/bae68e65908f45e1bcc86fe5f089d266/page/>`_.

.. figure:: ../_static/images/Vision_Zero_Network_Community_Map_February_2024.jpg
   :alt: Vision Zero Network Community Map
   :width: 1000px

   Vision Zero Network Community Map (February 2024)

Vision Zero initiatives are a nationwide effort to eliminate traffic fatalities and severe injuries. A growing number of cities have contributed to this effort and collected data, which should help make this project applicable outside of Boston as well.
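
The documentation above says the accident data is used to identify high-risk areas and prioritize crosswalks, but does not spell out how. One possible approach, sketched here under stated assumptions, is to count past accidents within a fixed radius of each crosswalk and rank crosswalks by that count. The function names, the 50 m radius, the use of a single centroid point per crosswalk (the real crosswalk dataset contains polygons), and the sample records are all hypothetical and are not part of this commit.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_accidents_near(crosswalks, accidents, radius_m=50.0):
    """Return {crosswalk_id: (n_accidents, n_fatal)} within radius_m."""
    counts = {}
    for cw in crosswalks:
        total = fatal = 0
        for acc in accidents:
            d = haversine_m(cw["lat"], cw["long"], acc["lat"], acc["long"])
            if d <= radius_m:
                total += 1
                fatal += int(acc["is_fatal"])
        counts[cw["id"]] = (total, fatal)
    return counts


# Hypothetical sample data (assumption: not real Vision Zero records).
crosswalks = [
    {"id": "cw-1", "lat": 42.3601, "long": -71.0589},
    {"id": "cw-2", "lat": 42.3500, "long": -71.0800},
]
accidents = [
    {"lat": 42.3602, "long": -71.0590, "is_fatal": False},
    {"lat": 42.3601, "long": -71.0588, "is_fatal": True},
    {"lat": 42.3400, "long": -71.1000, "is_fatal": False},
]

counts = count_accidents_near(crosswalks, accidents)
# Rank crosswalks by (total accidents, fatal accidents), highest first.
ranked = sorted(counts, key=lambda k: counts[k], reverse=True)
```

The nested loop is quadratic, so for city-scale data a spatial join on projected geometries would likely be a better fit; the sketch only illustrates the idea of turning point records into a per-crosswalk risk signal.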
31 changes: 31 additions & 0 deletions src/night_light/past_accidents/vision_zero.py
@@ -0,0 +1,31 @@
import geopandas as gpd
import requests
import pandas as pd

BOSTON_VISION_ZERO_URL = "https://data.boston.gov/api/3/action/datastore_search_sql"
BOSTON_VISION_ZERO_PED_CRASHES = '"e4bfe397-6bfc-49c5-9367-c879fac7401d"'
BOSTON_VISION_ZERO_PED_FATALITIES = '"92f18923-d4ec-4c17-9405-4e0da63e1d6c"'


def _boston_vision_zero_ped_accidents(resource_id: str) -> gpd.GeoDataFrame:
    params = {"sql": "SELECT * from " + resource_id + " WHERE mode_type = 'ped'"}
    response = requests.get(BOSTON_VISION_ZERO_URL, params=params)
    response.raise_for_status()
    data = response.json()["result"]["records"]
    gdf = gpd.GeoDataFrame.from_records(data)
    gdf.drop(["_full_text"], axis=1, inplace=True)
    gdf.set_geometry(
        gpd.points_from_xy(gdf["long"], gdf["lat"]), inplace=True, crs="EPSG:4326"
    )
    return gdf


def boston_vision_zero_ped_accidents():
    gdf_crashes = _boston_vision_zero_ped_accidents(BOSTON_VISION_ZERO_PED_CRASHES)
    gdf_fatalities = _boston_vision_zero_ped_accidents(
        BOSTON_VISION_ZERO_PED_FATALITIES
    )
    gdf_crashes["is_fatal"] = False
    gdf_crashes.rename(columns={"dispatch_ts": "date_time"}, inplace=True)
    gdf_fatalities["is_fatal"] = True
    return pd.concat([gdf_crashes, gdf_fatalities], ignore_index=True)
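
To make the request in `_boston_vision_zero_ped_accidents` easier to inspect without hitting the network, the sketch below builds the same SQL string and shows the URL it would be sent as. `build_query_url` is a hypothetical helper, not part of this commit, and it uses the standard library's `urlencode` as a stand-in for the encoding that `requests` applies to `params`. The resource ID constants keep their embedded double quotes, presumably because the ID is used as a SQL identifier that contains hyphens.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BOSTON_VISION_ZERO_URL = "https://data.boston.gov/api/3/action/datastore_search_sql"
BOSTON_VISION_ZERO_PED_CRASHES = '"e4bfe397-6bfc-49c5-9367-c879fac7401d"'


def build_query_url(resource_id: str, mode_type: str = "ped") -> str:
    """Hypothetical helper: return the full GET URL for one resource."""
    # Same SQL text as in the commit, with mode_type made a parameter.
    sql = "SELECT * from " + resource_id + " WHERE mode_type = '" + mode_type + "'"
    return BOSTON_VISION_ZERO_URL + "?" + urlencode({"sql": sql})


url = build_query_url(BOSTON_VISION_ZERO_PED_CRASHES)
# Decode the query string again to confirm the SQL survives the round trip.
decoded_sql = parse_qs(urlsplit(url).query)["sql"][0]
print(decoded_sql)
```

Because the SQL is assembled by string concatenation, `mode_type` and `resource_id` should only ever come from trusted constants, as they do in the commit.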
3 changes: 0 additions & 3 deletions src/night_light/socioeconomic/population.py
@@ -1,9 +1,6 @@
from pygris import tracts
from pygris.data import get_census
from night_light.utils.fips import StateFIPS

mass_tracts = tracts(state="MA", cb=True, cache=True, year=2021)


def get_population(
    year: int = 2021, state: StateFIPS = StateFIPS.MASSACHUSETTS, **kwargs
43 changes: 43 additions & 0 deletions tests/test_boston_vision_zero_mapping.py
@@ -0,0 +1,43 @@
import os
import time

import folium

from night_light.past_accidents.vision_zero import boston_vision_zero_ped_accidents
from night_light.utils import (
    create_folium_map,
    LAYER_STYLE_DICT,
    LAYER_HIGHLIGHT_STYLE_DICT,
    Tooltip,
)
from night_light.utils.mapping import open_html_file
from tests.conftest import BOSTON_CENTER_COORD


def test_boston_vision_zero_map():
    """Test creating a map of the Boston Vision Zero accidents"""
    boston_vision_zero_accidents = boston_vision_zero_ped_accidents()
    map_filename = "test_boston_vision_zero.html"
    accidents_layer = folium.GeoJson(
        boston_vision_zero_accidents,
        name="Vision Zero Accidents",
        style_function=lambda x: LAYER_STYLE_DICT,
        highlight_function=lambda x: LAYER_HIGHLIGHT_STYLE_DICT,
        smooth_factor=2.0,
        tooltip=Tooltip(
            fields=["_id", "is_fatal"],
            aliases=["Accident ID", "Fatality"],
            max_width=800,
        ),
    )
    create_folium_map(
        layers=[accidents_layer],
        zoom_start=12,
        center=BOSTON_CENTER_COORD,
        map_filename=map_filename,
    )
    assert os.path.exists(map_filename)
    open_html_file(map_filename)
    time.sleep(1)
    os.remove(map_filename)
    assert not os.path.exists(map_filename)
33 changes: 33 additions & 0 deletions tests/test_boston_vision_zero_query.py
@@ -0,0 +1,33 @@
import os

from night_light.past_accidents.vision_zero import boston_vision_zero_ped_accidents
from night_light.utils import query_geojson


def test_query_boston_vision_zero_ped_accidents():
    gdf_accidents = boston_vision_zero_ped_accidents()
    assert not gdf_accidents.empty
    assert "mode_type" in gdf_accidents.columns
    assert gdf_accidents.geometry.geom_type.unique() == "Point"
    assert gdf_accidents.crs == "EPSG:4326"
    assert gdf_accidents["mode_type"].unique() == "ped"
    assert not gdf_accidents["lat"].isnull().any()
    assert not gdf_accidents["long"].isnull().any()
    assert not gdf_accidents["is_fatal"].isnull().any()


def test_save_boston_vision_zero_ped_accidents():
    """Test saving the Boston Vision Zero accidents data to a GeoJSON file"""
    gdf_accidents = boston_vision_zero_ped_accidents()
    geojson_filename = "test_boston_vision_zero.geojson"

    query_geojson.save_geojson(gdf_accidents, geojson_filename)
    saved_gdf = query_geojson.gpd.read_file(geojson_filename)

    assert gdf_accidents.crs == saved_gdf.crs
    assert set(gdf_accidents.columns) == set(saved_gdf.columns)
    assert gdf_accidents.index.equals(saved_gdf.index)
    assert gdf_accidents.shape == saved_gdf.shape

    os.remove(geojson_filename)
    assert not os.path.exists(geojson_filename)
