Credit scoring: Does XGBoost outperform logistic regression? A test on Italian SMEs

Zedda, Stefano
First
2024-01-01

Abstract

The old-fashioned logistic regression is still the most widely used method for credit scoring. Recent developments in machine learning have produced new instruments, including random forests. In this paper, we test the efficiency of logistic regression and XGBoost for default forecasting on a sample of 35,535 cases from 7 business sectors of Italian SMEs, using 28 banking variables and 55 balance sheet ratios, to verify which approach better supports lending decisions. To this end, we develop an efficiency index that measures each model's capability to correctly select good borrowers, balancing the different effects of refusing a loan to a good customer and lending to a defaulter. We also compute the balancing spread to quantify the models' efficiency in terms of credit costs for the borrower firms. Results differ across sectors. Overall, however, the two methods show similar capabilities, while the cutoff setting can make a substantial difference when those models are actually used for lending decisions.
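The role of the cutoff can be illustrated with a small sketch. It is not the paper's efficiency index or balancing spread, whose formulas are not given in this record. It is a generic cost-weighted comparison, and the function names (`lending_cost`, `best_cutoff`), the cost weights, and the toy scores are all assumptions made up for illustration.

```python
# Illustrative only: the cost weights and scores are made-up assumptions,
# not values or formulas taken from the paper.

def lending_cost(pd_scores, defaulted, cutoff,
                 cost_refuse_good=1.0, cost_lend_bad=5.0):
    """Total misclassification cost when loans are refused at pd >= cutoff."""
    cost = 0.0
    for pd, bad in zip(pd_scores, defaulted):
        refused = pd >= cutoff
        if refused and not bad:
            cost += cost_refuse_good   # good borrower turned away
        elif not refused and bad:
            cost += cost_lend_bad      # loan granted to a defaulter
    return cost


def best_cutoff(pd_scores, defaulted, cutoffs, **costs):
    """Return the candidate cutoff with the lowest total cost."""
    return min(cutoffs,
               key=lambda c: lending_cost(pd_scores, defaulted, c, **costs))


# Toy predicted default probabilities and observed outcomes (True = defaulted).
pd_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.80]
defaulted = [False, False, False, True, False, True, True]
cutoffs = [0.1, 0.3, 0.5, 0.7]

for c in cutoffs:
    print(c, lending_cost(pd_scores, defaulted, c))
print("best cutoff:", best_cutoff(pd_scores, defaulted, cutoffs))
```

With the same scores, the total cost changes considerably from one cutoff to the next, which is the kind of effect the abstract attributes to the cutoff setting. In practice the scores would come from a fitted logistic regression or XGBoost model rather than a hand-written list.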
2024
English
70
28
Anonymous experts
international
scientific
Credit scoring; Logistic regression; XGBoost; Bank lending; SMEs
no
1.1 Journal article
info:eu-repo/semantics/article
1 Journal contribution::1.1 Journal article
262
1
none