Peer-Reviewed Journal Details
Mandatory Fields
Credit, K
2021
January
Geographical Analysis
Spatial Models or Random Forest? Evaluating the Use of Spatially Explicit Machine Learning Methods to Predict Employment Density around New Transit Stations in Los Angeles
Published
Optional Fields
LIGHT RAIL TRANSIT; RAPID-TRANSIT; PROPERTY-VALUES; LAND-USE; IMPACTS; WASHINGTON; PROXIMITY; PHOENIX
The increasing use of "new" machine learning techniques, such as random forest, provides an impetus to researchers to better understand the role of space in these models. Thus, this article develops an approach for constructing spatially explicit random forest models by including spatially lagged variables to mirror various spatial econometric specifications in order to test their comparative performance against traditional spatial and nonspatial regression models for predicting block-level employment density around new transit stations in Los Angeles. This article employs a "post hoc" testing approach to isolate the impact of a particular variable (transit proximity), together with supplemental diagnostics (such as partial dependence plots and permutation importances), to help inform explanatory relationships. The results indicate that random forest models slightly outperform spatial econometric models, and the inclusion of spatial lag parameters modestly improves random forest model accuracy: the best-fit spatial random forest model demonstrates 84.61% accuracy in predicting post-construction employment density around newly built transit stations, compared to 81.88% for the best-fit spatial econometric model and 84.37% for the nonspatial random forest model. However, given these somewhat small differences, it is not possible to conclude that the random forest approach is clearly superior to traditional spatial econometric models from these results alone.
HOBOKEN
0016-7363
10.1111/gean.12273
Grant Details