Subject. This article addresses the validation of the consistency of forecasts produced by rating-based models.
Objectives. The article aims to give developers and validators of rating-based models a practical, fundamental benchmark test for the estimated probabilities of default produced by the models used in a rating system.
Methods. The study uses the classical interval approach to statistical hypothesis testing, applied to the calibration of rating systems.
Results. In addition to the generally accepted tests of whether the predicted probabilities of default of credit risk objects correspond to historically realized values, the article proposes a new statistical test that corrects their shortcomings and focuses on "diagnosing" the consistency of the discrimination of objects achieved by the rating model. Examples show how to recognize the reasons for a negative test result and the negative consequences for lending if the current settings of the rating model are kept. Beyond detecting bias in the estimate of the overall default frequency in the loan portfolio, the proposed method makes it possible to objectively reveal inadequate discrimination between borrowers by a calibrated rating model, that is, to diagnose the "disease" of the rating model.
Conclusions and Relevance. The new practical benchmark test makes it possible to reject the hypothesis that the rating model's estimates of the probability of default are consistent, at a given confidence level and with the available historical data. The test has the advantage of practical interpretability: its results indicate the direction in which the model should be corrected. The proposed test can be used by a bank in the internal validation of its own rating models, which the Bank of Russia requires for approaches based on internal ratings.
Keywords: credit risk, probability of default, statistical test, Gini, ROC curve