Abstract

Objective: This meta-analysis aims to evaluate machine learning methods in remote sensing applications for monitoring Sustainable Development Goals (SDGs). Specifically, it aims to (1) estimate the average performance (summary effect size); (2) determine the degree of heterogeneity within and across studies; (3) assess whether specific study features influence model performance; and (4) compare the sample-weighted and unweighted estimate summary effect.

Methods: The meta-analysis used the PRISMA guidelines. A search was performed across multiple academic databases to identify peer-reviewed studies that applied machine learning models to remote sensing data for SDG monitoring. A random sample of 200 relevant studies was selected for abstract screening, which was reduced to \(n = 20\) studies with \(k = 86\) effect sizes for the analysis. To estimate the overall accuracy of machine learning models both a three-level random-effects model and an unweighted model were used.

Results: The average overall accuracy of the unweighted model, \(\hat{\mu}_{_\text{unweighted}} =\) 0.90 (95% CI [0.85; 0.94]), which is not substantially different from the weighted model, \(\hat{\mu}_{_\text{weighted}} =\) 0.89 (CI 95% [0.85, 0.94]. The weighted models found substantial heterogeneity between results. Unsurprisingly, the proportion of the majority class was identified as the most important factor affecting the overall accuracy, followed by the inclusion of ancillary data. However, machine learning model group (i.e., neural networks, tree-based models) or SDG goal did not have a significant effect on the reported overall accuracy.

Conclusion: This study demonstrates the high variability model performance in remote sensing applications. As well as the impact class imbalance has on the reported overall accuracy. These findings suggest the need for precise metrics to assess model performance, particularly in imbalanced datasets. Future research should examine a broader range of performance metrics and explore additional study features to explore further what features affect the outcomes. In addition, the robustness of the random-effects meta-analysis methods application to this field should be further examined.