Assessing Urban-Rural Income Disparities in the USA: A Data-Driven Approach Using Predictive Analytics

Keywords

Income Inequality
Urban-Rural Disparities
Predictive Analytics
Machine Learning
Economic Data
Demographic Analysis
Geographic Information Systems

How to Cite

Hossain, M. I., Khan, M. N. M., Fariha, N., Tasnia, R., Sarker, B., Doha, M. Z., Kawsar, M., Jui, A. H., & Siam, M. A. (2025). Assessing Urban-Rural Income Disparities in the USA: A Data-Driven Approach Using Predictive Analytics. Journal of Ecohumanism, 4(4), 300 –. https://doi.org/10.62754/joe.v4i4.6733

Abstract

Income inequality in the US is a persistent problem, and addressing it requires a multifaceted understanding of the differences between urban and rural regions. This study uses predictive analytics and machine learning to analyze these disparities in depth. The principal objective was to design machine learning models that classify and analyze the urban-rural income gap using a blend of demographic, geographic, and economic variables. The data for measuring urban-rural income inequality in the USA were assembled from a range of trusted sources to ensure broad coverage and reliability. The U.S. Census Bureau provides critical demographic and economic data through the American Community Survey (ACS), which offers detailed information on income, education, and employment by geographic location. Three models, XGBoost, Logistic Regression, and SVM, were chosen for the classification task based on their individual strengths and suitability. Model performance was evaluated with accuracy, precision, recall, F1-score, and ROC-AUC. XGBoost has the highest accuracy, followed by Logistic Regression, with SVM lower than both. The comparison shows that both XGBoost and Logistic Regression outperform SVM in classifying the dataset, although SVM, the least accurate of the three, still performs robustly. Combining machine learning with social science holds considerable promise for producing evidence-based policy recommendations that target socioeconomic inequalities in the USA. With the predictive power of machine learning algorithms, researchers can study large datasets and reveal patterns and insights that other methodologies may miss.
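
The five evaluation metrics named in the abstract can be illustrated with a minimal Python sketch for a binary urban/rural classifier. This is not the authors' code: the function names (`classification_metrics`, `roc_auc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the study.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, precision, recall, F1 and ROC-AUC for binary labels (1 = positive)."""
    # The threshold is an assumed default, not a value taken from the paper.
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "roc_auc": roc_auc(y_true, y_score),
    }


# Hypothetical toy data: 1 = urban, 0 = rural; scores are made-up model outputs.
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1, 0.8, 0.3]
toy_metrics = classification_metrics(toy_labels, toy_scores)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.3f}")
```

Computing the same dictionary for each model's predictions on a shared held-out set would give a side-by-side comparison of the kind the abstract describes. In practice a library such as scikit-learn would normally supply these metrics.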

https://doi.org/10.62754/joe.v4i4.6733
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.