Learning Multimodal Embeddings for Traffic Accident Prediction and Causal Estimation

Northeastern University

Example satellite images showing different types of roads. Each image is centered around a road network node and captures both the physical characteristics of the road, such as layout, width, and intersections, and the surrounding context, including vegetation, buildings, and terrain.

Abstract

We analyze traffic accident patterns using both road networks and satellite images aligned to the road graph nodes.

Previous work on predicting accident occurrences has relied on graph-structural features extracted from road networks, which do not capture the physical and environmental aspects of the road. This work constructs MMTraCE, a large-scale dataset across six U.S. states, comprising nine million traffic accident records from official sources and one million high-resolution satellite images, each centered on a node of the road network. Additionally, every node is annotated with features such as the region's weather statistics, traffic volume, and road type (e.g., residential vs. motorway).

Using this dataset, we conduct a comprehensive evaluation of multimodal learning methods that integrate both visual and network embeddings. Our findings show that, by combining network and visual features, multimodal learning predicts accident occurrences accurately, with an average AUROC of 90.1%, outperforming graph-based methods by 3.7% on average. Building on the improved accuracy of the multimodal embeddings, we conduct a causal analysis based on a matching estimator to examine the contributing factors of traffic accidents. The findings suggest that accident frequency increases by 24% under higher precipitation and by 22% on roads with higher speed limits, such as motorways, after adjusting for other confounding factors through the embeddings. Seasonal factors increase accident rates by 29%. Ablation studies validate the importance of satellite imagery features for achieving accurate prediction.

MMTraCE Dataset

Statistics of the total number of edges, average edge length in meters, road network density, availability of traffic volume, period of accident records, total number of accident records, and total number of satellite images.

The proportion of different road types among six states' road networks. Residential roads account for the vast majority of the total, making up approximately 74.5% of all roads. Other types, such as tertiary, secondary, and primary, contribute much smaller proportions by comparison.

Total accident count and average accident count by road type: motorway (M), motorway link (M_L), primary (Pri), primary link (Pri_L), residential (Res), secondary (Sec), secondary link (Sec_L), tertiary (Ter), tertiary link (Ter_L), trunk (Tru), trunk link (Tru_L), living street, road, and trailhead.

The average number of accidents per month for each year in Massachusetts, Iowa, Delaware, and Maryland. The sharp drop in 2020 is due to the impact of COVID-19.

Main Results

Main results of GNNs, vision models, and multimodal fusion strategies. Performance is evaluated using the mean absolute error (MAE) and the area under the ROC curve (AUROC) on the test split. A leave-one-out analysis is also included. To account for variability, each experiment is repeated with three different random seeds, and we report the averaged results along with standard deviations.
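As a rough illustration of how the two metrics and the seed averaging described above fit together, the following minimal sketch computes MAE, a pairwise-ranking form of AUROC, and the mean and sample standard deviation over three runs. The function names, labels, and scores are hypothetical toy values, not taken from MMTraCE or from the authors' evaluation code.

```python
from statistics import mean, stdev


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize(per_seed_values):
    """Mean and sample standard deviation across seeds."""
    return mean(per_seed_values), stdev(per_seed_values)


# Hypothetical toy data: one label vector, three runs with different seeds.
labels = [0, 0, 1, 1]
runs = [
    [0.1, 0.4, 0.35, 0.8],
    [0.2, 0.3, 0.6, 0.7],
    [0.1, 0.5, 0.4, 0.9],
]
aurocs = [auroc(labels, scores) for scores in runs]
avg_auroc, sd_auroc = summarize(aurocs)
```

The pairwise form is quadratic in the number of samples, so it is only suitable for small examples; a rank-based implementation would be the usual choice at dataset scale.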

Causal Analysis

Seasonal accident counts of Massachusetts (MA) and Iowa (IA), shown for spring and winter in each state. Accident points are more densely distributed in winter, indicating a higher frequency of incidents, likely due to adverse weather conditions.

Average treatment effect on the treated (ATT) across all six states. We analyze the effects of seasonal variation, road type, and precipitation. We repeat the estimation over different years to compute the mean and standard deviation.
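To make the idea of a matching estimator for the ATT concrete, here is a minimal sketch under stated assumptions: each treated unit is paired with the control unit whose embedding is nearest in Euclidean distance (matching with replacement), and the ATT is the average outcome difference over those pairs. The page states only that a matching estimator is used with embeddings as the adjustment; the one-nearest-neighbor rule, the distance metric, the function name, and the toy data below are all illustrative assumptions.

```python
import math


def att_nearest_neighbor(embeddings, treated, outcomes):
    """Estimate the ATT with 1-nearest-neighbor matching (with replacement).

    Each treated unit is matched to the control unit with the closest
    embedding (Euclidean distance); the estimate is the mean difference
    between each treated outcome and its matched control outcome.
    """
    treated_idx = [i for i, t in enumerate(treated) if t]
    control_idx = [i for i, t in enumerate(treated) if not t]
    if not treated_idx or not control_idx:
        raise ValueError("need both treated and control units")

    diffs = []
    for i in treated_idx:
        # Closest control unit in embedding space.
        j = min(control_idx, key=lambda c: math.dist(embeddings[i], embeddings[c]))
        diffs.append(outcomes[i] - outcomes[j])
    return sum(diffs) / len(diffs)


# Hypothetical toy data: 2-D embeddings, treatment flags, accident counts.
embeddings = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)]
treated = [True, False, True, False]
outcomes = [5.0, 4.0, 9.0, 6.0]

att = att_nearest_neighbor(embeddings, treated, outcomes)
```

In this toy example, the first treated node is matched to the second node and the third to the fourth, giving outcome differences of 1.0 and 3.0. Repeating such an estimate on data from different years would yield the mean and standard deviation that the caption describes.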

Contact

If you have any questions, feel free to contact zhang.zini@northeastern.edu.