TY - JOUR
T1 - Electricity Theft Detection using Machine Learning
AU - Petrlik, Ivan
AU - Lezama, Pedro
AU - Rodriguez, Ciro
AU - Inquilla, Ricardo
AU - Reyna-González, Julissa Elizabeth
AU - Esparza, Roberto
N1 - Publisher Copyright: © 2022, International Journal of Advanced Computer Science and Applications. All Rights Reserved.
PY - 2022
Y1 - 2022
N2 - This research dealt with the indiscriminate theft of electric power, reported as a non-technical loss, which affects electric distribution companies and customers and can trigger serious consequences including fires and blackouts. The research focused on recommending the best Machine Learning prediction model for detecting electrical energy theft. The source data, covering the electricity consumption of 42,372 consumers, was a dataset published by the State Grid Corporation of China. The method used data imputation, data balancing (oversampling and undersampling), and feature extraction to improve energy theft detection. Five Machine Learning models were tested. The resulting accuracy was 81% for SVM, 79% for K-Nearest Neighbors, 80% for Random Forest, 69% for Logistic Regression, and 68% for Naive Bayes. The best performance, with an accuracy of 81%, was obtained with the SVM model.
AB - This research dealt with the indiscriminate theft of electric power, reported as a non-technical loss, which affects electric distribution companies and customers and can trigger serious consequences including fires and blackouts. The research focused on recommending the best Machine Learning prediction model for detecting electrical energy theft. The source data, covering the electricity consumption of 42,372 consumers, was a dataset published by the State Grid Corporation of China. The method used data imputation, data balancing (oversampling and undersampling), and feature extraction to improve energy theft detection. Five Machine Learning models were tested. The resulting accuracy was 81% for SVM, 79% for K-Nearest Neighbors, 80% for Random Forest, 69% for Logistic Regression, and 68% for Naive Bayes. The best performance, with an accuracy of 81%, was obtained with the SVM model.
KW - Energy theft
KW - Machine learning
KW - Non-technical losses
KW - Support vector machine
UR - http://www.scopus.com/inward/record.url?scp=85146668448&partnerID=8YFLogxK
U2 - 10.14569/IJACSA.2022.0131251
DO - 10.14569/IJACSA.2022.0131251
M3 - Article
AN - SCOPUS:85146668448
SN - 2158-107X
VL - 13
SP - 420
EP - 425
JO - International Journal of Advanced Computer Science and Applications
JF - International Journal of Advanced Computer Science and Applications
IS - 12
ER -