Application of Data Mining and Machine Learning Techniques to Predict Loan Approval and Payment Time

Document Type: Research Paper

Authors

1 PhD Student of Information Technology Management, Department of Management, Hamedan Branch, Islamic Azad University, Hamedan, Iran.

2 Associate Professor, Department of Computer Engineering, Hamedan Branch, Islamic Azad University, Hamedan, Iran.

3 Assistant Professor, Department of Knowledge and Information Science, Hamedan Branch, Islamic Azad University, Hamedan, Iran.

4 Professor, Department of Knowledge and Information Science, Hamedan Branch, Islamic Azad University, Hamedan, Iran.

5 Assistant Professor, Department of Industrial Engineering, Sharif University of Technology, Tehran, Iran.

Abstract

One of the most important issues for banks is understanding their customers, the customers' behavior, and the decisions banks make in response to customer preferences. A bank's main task is to provide banking facilities (loans), and these facilities carry the risk of default in repayment. Failure to evaluate the factors related to repayment can cause significant damage to banks. At the same time, investment in the private sector and in various industries is increasingly important, because it can lead to economic growth, higher employment, and higher national income. This research aims to identify the influential features in the fixed capital facility data of one of the active banks in Iran, in order to classify customers into two categories, good customers and overdue customers, and to predict the duration of facility payment. The proposed five-step method is based on data mining techniques. Its most important steps are data preparation, analysis with rough set methods, and common classification techniques such as artificial neural networks (ANN), decision tree variants, Bayesian classifier variants, and support vector machines. One of the most important results of this research is the identification of the features that affect the repayment and duration of fixed capital facilities. In addition, the ANN method showed the best performance in evaluating credit risk, with an accuracy of 70.27%, and the J48 technique showed the best performance in predicting the duration of facility payment, with an accuracy of 72.54%.
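The rough set analysis step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' actual procedure, so what follows is only one common way to use rough sets for feature selection: compute the dependency degree of the decision attribute on a set of condition attributes, then greedily add attributes until the dependency of the full set is reached (often called QuickReduct). The attribute names (collateral, sector, history, status) and the six toy records are hypothetical and are not taken from the bank data used in the paper.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, attrs, decision):
    """Rough-set dependency degree: share of rows whose equivalence class
    (under attrs) maps to a single decision value (the positive region)."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        labels = {rows[i][decision] for i in block}
        if len(labels) == 1:
            positive += len(block)
    return positive / len(rows)


def quick_reduct(rows, attrs, decision):
    """Greedy forward selection: add the attribute that raises the dependency
    degree most, until it matches the dependency of the full attribute set."""
    target = dependency(rows, attrs, decision)
    chosen = []
    while dependency(rows, chosen, decision) < target:
        best = max(
            (a for a in attrs if a not in chosen),
            key=lambda a: dependency(rows, chosen + [a], decision),
        )
        chosen.append(best)
    return chosen


# Hypothetical toy records; "status" is the decision attribute
# (good customer vs. overdue customer).
rows = [
    {"collateral": "high", "sector": "industry", "history": "clean", "status": "good"},
    {"collateral": "high", "sector": "services", "history": "clean", "status": "good"},
    {"collateral": "low", "sector": "industry", "history": "late", "status": "overdue"},
    {"collateral": "low", "sector": "services", "history": "clean", "status": "good"},
    {"collateral": "high", "sector": "industry", "history": "late", "status": "overdue"},
    {"collateral": "low", "sector": "services", "history": "late", "status": "overdue"},
]
ATTRS = ["collateral", "sector", "history"]

selected = quick_reduct(rows, ATTRS, "status")
print(selected)
```

In this toy table the repayment history alone separates the two classes, so the greedy search stops after one attribute. On real data the selected subset would then be passed to the classifiers named in the abstract.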
