Abstract
Drinking water treatment requires hourly measurements to ensure that the water meets potability standards. This study classifies water potability in Gowa and applies feature selection to identify the most informative parameters. SVM and XGBoost are used for classification, and RFECV and SelectKBest for feature selection. Most correlations between the parameters and the target are weak, which suggests that each parameter acts largely independently and contributes distinct information to determining potability. Both models achieve high accuracy: 95.8% for SVM and 97.8% for XGBoost. For SVM, SelectKBest maintains 95.8% accuracy with 3 selected features (turbidity, free chlorine, and temperature), while RFECV reaches 96% with 5 selected features (turbidity, temperature, free chlorine, alkalinity, and TDS). For XGBoost, SelectKBest maintains 97.8% accuracy with 5 selected features (turbidity, free chlorine, temperature, pH, and alkalinity), and RFECV also maintains 97.8%. Overall, XGBoost performs slightly better than SVM. RFECV slightly improves SVM accuracy and preserves XGBoost accuracy, while SelectKBest preserves accuracy for both models.
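
As a rough illustration of the univariate ranking behind SelectKBest, the sketch below scores each feature with a one-way ANOVA F-statistic and keeps the top k. This is the default scoring function of scikit-learn's SelectKBest (`f_classif`); the abstract does not state which scoring function the study used, so that choice is an assumption. The helpers `anova_f` and `select_k_best` are hypothetical names written for this example, and the data are synthetic values that only borrow the parameter names, not the study's measurements.

```python
import random
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature against class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand_mean = mean(values)
    n, k = len(values), len(groups)
    # Variation of the class means around the overall mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_k_best(table, labels, k):
    """Return the k feature names with the highest F-score, plus all scores."""
    scores = {name: anova_f(column, labels) for name, column in table.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Synthetic data (assumption): 1 = potable, 0 = not potable.
rng = random.Random(0)
labels = [i % 2 for i in range(200)]
table = {
    # These three columns are made to depend on the label.
    "turbidity": [rng.gauss(1.0 + 2.0 * y, 0.5) for y in labels],
    "free_chlorine": [rng.gauss(0.5 + 1.0 * y, 0.5) for y in labels],
    "temperature": [rng.gauss(27.0 + 1.0 * y, 1.0) for y in labels],
    # These two columns are pure noise with respect to the label.
    "pH": [rng.gauss(7.0, 0.3) for _ in labels],
    "TDS": [rng.gauss(150.0, 20.0) for _ in labels],
}

selected, scores = select_k_best(table, labels, k=3)
print(selected)
```

A univariate score like this ranks each parameter in isolation. RFECV instead repeatedly refits the classifier and drops the weakest features under cross-validation, which is one reason the two methods can select different feature sets for the same model.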

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.