Econometric models and Machine Learning models in the Banking Industry from Regulatory Perspective

By Anand Pandey, Domain Consultant, CRO Unit, TCS, Tata Consultancy Services, Bengaluru, India

Debashish Jana, Domain Consultant, CRO Unit, TCS, Tata Consultancy Services, Kolkata, India

Abstract— Banks play a crucial role in the financial system, and models help banks detect risk in time and apply preventive measures. Econometric models have been widely used in banks for risk assessment and forecasting. With recent advances in technology, machine learning methods are receiving more attention as a way to generate insights from data. In this paper we analyze the conduct risk of banks using several machine learning models and an econometric model. Banks that fail to bring conduct risk in line pay hefty fines and damage their reputation. We compare the traditionally used econometric model, logistic regression, with more advanced machine learning based models, and we evaluate the advantages of machine learning models in terms of predictive power.

Although the financial sector is inclined to adopt machine learning tools to manage credit risk, banks are under pressure to ensure compatibility with regulatory requirements. Any use of predictive models in the financial sector calls for thorough bias testing. Otherwise, banks risk increasing injustice instead of reducing it. The main advantage of machine learning models over econometric models is that they work not only with large volumes of structured data, but also with unstructured data.

Keywords— Machine Learning, Econometric Models, Risk, Banks

I.     Introduction

Financial Institutions are fundamentally built around information and data flows. They have become storehouses of large volumes of diversified data, and when these data are analyzed promptly, they can highlight meaningful patterns and support more informed decisions in a useful and usable way. To analyze such volumes of data, Financial Institutions depend on different kinds of models.

Models play an important role in risk management practice because they serve as a simplified representation of reality. However, the assumptions used to build a model strip away various aspects of reality's complexity and can produce unexpected results. Model results are expected to follow a reasonable probability distribution, not deviate significantly from the actual results, and remain within an acceptable margin. This margin defines the overall risk appetite of the Financial Institution.

The risk of a Financial Institution lies in the health of its assets, which needs to agree with the bank's risk appetite. In recent times, the number of models has been increasing at a rapid pace, mostly at large Financial Institutions, to support an ever-broadening scope of decision making. Regulatory models are gaining precedence in order to meet a variety of regulatory and financial reporting standards (such as IFRS 9, CECL and the Basel guidelines).

Currently, a transition is underway within Financial Institutions in which traditional econometric models, built by domain experts based on theories, statistics, and assumptions, are being challenged by Machine Learning and Artificial Intelligence modeling frameworks. Models in the new framework are trained to predict a required output from input observations. More emphasis is placed on the input data than on experts' assumptions.

Financial Institutions develop Challenger Models to benchmark performance where a model is already in place. The purpose of a challenger model is to challenge the model in use, and ML can contribute more advanced challenger models within a limited span of time. Since challenger models are not put into production, they can remain unexplainable, which makes black box models such as neural networks all the more suitable for this role.

Within the financial risk management framework, ML can add value to, or supersede, existing models and can be used across various model types. However, one should remain aware of how each technique works when applying ML models, as every technique has its limitations. Applied with practical judgment, most ML models will deliver sound modeling solutions. ML can be a capable tool for refining modeling activities within financial risk management.

II.    Model requirements from Regulatory perspective

Credit scoring models play an important role in risk management practice at most Financial Institutions. They use statistics and machine learning to quantify credit risk, which is the risk of a borrower not repaying a loan, credit card balance or any other type of credit, in the different phases of the credit cycle (e.g., application, behavioral and collection models). Credit scoring models adhere to the New Basel Capital Accord. The Basel I accord was introduced in 1988 to focus on credit risk and the capital adequacy ratio, which is also known as the Capital to Risk Assets Ratio. The Basel II accord was introduced in June 2004 to address the limitations of Basel I. It focused on operational and market risk along with credit risk.

Basel III incorporates several risk measures to counter issues that were identified and highlighted in the 2008 financial crisis. It emphasizes revised capital standards (such as leverage ratios), stress testing and tangible equity capital, which is the component with the greatest loss-absorbing capacity. The International Accounting Standards Board (IASB) has introduced a new standard on impairment, IFRS 9, with a three-stage approach that in general replaces the current incurred impairment model with a new expected loss model.

Probability of Default Model from IFRS-9 point of view

| Parameters | Basel III | IFRS 9 |
|---|---|---|
| Objective | Expected + Unexpected Loss | Expected Loss |
| PD | One-year PD | 12-month PD for stage 1 assets; lifetime PD for stage 2 and 3 assets |
| Rating Philosophy | TTC rating philosophy | PIT rating philosophy |
| LGD | Downturn LGD (both direct + indirect costs) | Best estimate LGD (only direct costs) |
| EAD | Downturn EAD | Best estimate EAD |
| Expected Loss / Expected Credit Loss (ECL) | EL = PD * LGD * EAD | EL = PD * PV of cash shortfalls |
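
The two expected-loss formulas in the last row can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the loan figures, and the simplification that the present value of cash shortfalls is approximated as LGD times the exposure in each year, discounted at a flat rate. For brevity the same LGD is used in both calculations, although the table notes that Basel uses a downturn LGD and IFRS 9 a best estimate.

```python
def basel_expected_loss(pd_one_year, lgd, ead):
    """Basel-style expected loss: EL = PD * LGD * EAD."""
    return pd_one_year * lgd * ead


def ifrs9_ecl(marginal_pds, lgd, eads, discount_rate):
    """Discounted ECL: sum over years t of PD_t * LGD * EAD_t / (1 + r)^t.

    marginal_pds[t-1] is the assumed probability of default in year t;
    eads[t-1] is the assumed exposure at default in year t.
    """
    return sum(
        pd_t * lgd * ead_t / (1.0 + discount_rate) ** t
        for t, (pd_t, ead_t) in enumerate(zip(marginal_pds, eads), start=1)
    )


# Hypothetical loan: all numbers are illustrative, not taken from the article.
marginal_pds = [0.02, 0.03, 0.04]        # default probability in years 1..3
eads = [100_000.0, 80_000.0, 60_000.0]   # exposure in years 1..3
lgd = 0.45
rate = 0.05

basel_el = basel_expected_loss(marginal_pds[0], lgd, eads[0])
stage1_ecl = ifrs9_ecl(marginal_pds[:1], lgd, eads[:1], rate)  # 12-month horizon
lifetime_ecl = ifrs9_ecl(marginal_pds, lgd, eads, rate)        # stage 2 and 3

print(f"Basel EL:     {basel_el:,.2f}")
print(f"12-month ECL: {stage1_ecl:,.2f}")
print(f"Lifetime ECL: {lifetime_ecl:,.2f}")
```

Under these assumptions, the lifetime figure exceeds the 12-month figure because it accumulates discounted losses over every remaining year of the loan.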

III.  Trending Models in the Banking Industry

For Financial Institutions, credit rating systems help decide whether or not to grant credit to consumers. Econometrics and machine learning share the same objective, which is to build a predictive model, but there is a fundamental difference between the two approaches. Econometric models are probabilistic in nature and aim to explain economic phenomena, whereas machine learning uses algorithms that learn from their mistakes.

ML methods draw on a range of techniques, such as neural networks, SVM, Naïve Bayes, Markov Chains, HMM, Bayesian Networks, KNN, Decision Trees, Bayesian Ensembles, Hybrid models and Ensemble Models.

Hybrid models combine parts of two or more algorithms. There are various types of hybrid models depending on the combination method, such as 'Classification + Classification', 'Classification + Clustering', 'Clustering + Classification', and 'Clustering + Clustering' techniques. It is difficult to say which hybrid machine learning model performs best, but the study by Tsai and Chen (2010)[i] shows that the 'Classification + Classification' hybrid model based on the combination of logistic regression and neural networks provides the highest prediction accuracy and maximizes profit.

In general, as a well-known statistical method, the logistic regression model is used to discriminate between good and bad customers because of its simplicity and the transparency of its predictions. Most of the literature indicates that advanced machine learning models tend to outperform logistic regression, but there are a few challenges: the inability of some machine learning models to explain their predictions, and the issue of imbalanced datasets. This article investigates econometric and machine learning models in credit scoring to fill this gap in the literature. An empirical study of customer complaints was performed using both econometric and machine learning techniques, with the following objectives:

  • Identify systemic risk based on the similarity of the complaints
  • Improve complaint redressal by identifying, prioritizing and addressing high-risk complaints
  • Develop visualization tools for a better understanding of the dataset
  • Compare various regression techniques using both econometrics and machine learning.
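
The first objective relies on a measure of similarity between complaints. The article does not say which measure was used; a minimal sketch, assuming cosine similarity over simple word counts and using made-up complaint snippets, could look like this:

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up complaint snippets for illustration only.
complaints = [
    "unauthorized charge on my credit card",
    "credit card shows an unauthorized charge",
    "mortgage payment was applied late",
]
vectors = [bag_of_words(c) for c in complaints]

sim_01 = cosine_similarity(vectors[0], vectors[1])
sim_02 = cosine_similarity(vectors[0], vectors[2])
print(f"complaints 0 and 1: {sim_01:.2f}")
print(f"complaints 0 and 2: {sim_02:.2f}")
```

Many complaints with high pairwise similarity would point to a recurring issue, which is one plausible reading of systemic risk in this context.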

IV.   Data Sources and Methodology

The study uses the Consumer Complaint dataset released by the Consumer Financial Protection Bureau (CFPB), published as US Consumer Finance Complaints (URL: https://www.kaggle.com/cfpb/us-consumer-finance-complaints). This dataset provides information about each dispute, its location and its nature.

For data preprocessing, NLP libraries were applied to the consumer complaints to tokenize the text, remove stopwords, lemmatize and perform spell correction in order to obtain clean, usable data (Figure 1). To improve redressal, the departments under which the complaints were filed were reclassified using machine learning algorithms after generating TF-IDF (term frequency-inverse document frequency) vectors. TF-IDF is a text vectorizer that transforms text into a usable vector by combining term frequency and inverse document frequency. Different classification techniques, namely logistic regression, SVM (a supervised machine learning algorithm) and XGBoost, were applied to the TF-IDF transformed data.
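
To make the TF-IDF step concrete, here is a minimal pure-Python sketch. It assumes one common textbook weighting (term count divided by document length, multiplied by the natural log of the number of documents over the document frequency); library vectorizers usually add smoothing and normalization, and the article does not state which variant was used. The token lists are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(tokenized_docs):
    """Return one {term: weight} dict per non-empty tokenized document.

    Uses tf = count / document length and idf = ln(N / df).
    """
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document

    vectors = []
    for tokens in tokenized_docs:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical, already-cleaned complaint tokens.
docs = [
    ["card", "charge", "dispute"],
    ["card", "fee", "late"],
    ["loan", "payment", "late"],
]
weights = tfidf_vectors(docs)
for doc_weights in weights:
    print({term: round(w, 3) for term, w in doc_weights.items()})
```

Terms that appear in few documents, such as "dispute" here, receive a higher weight than terms shared across documents, such as "card".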


Fig-1: Process Flow Adopted

V.    Results and Discussion

In the data pre-processing, necessary actions were performed such as upper-case conversion, punctuation removal, stopwords removal and lemmatization.
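
A minimal sketch of these cleaning steps, using only the Python standard library, is shown below. The stop-word list is a tiny illustrative one, the function name is hypothetical, and case is normalized to lower case here (upper-case conversion serves the same purpose). Lemmatization is omitted because it needs an NLP library.

```python
import string

# Tiny illustrative stop-word list; real NLP libraries ship far larger ones.
STOPWORDS = {"a", "an", "the", "is", "was", "on", "my", "and", "to", "of"}


def clean_complaint(text):
    """Normalize case, strip punctuation, and drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [token for token in text.split() if token not in STOPWORDS]


print(clean_complaint("The bank charged a LATE fee on my account!"))
```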

When logistic regression, SVM and XGBoost were applied to the TF-IDF transformed data, the general accuracy trend observed was XGBoost > Logistic > SVM, while the time taken to train was Logistic <= SVM < XGBoost.

Accuracy: XGBoost (85.09)  >  Logistic Regression (84.57) >=  SVM (84.51)

Speed: Logistic Regression >=  SVM >   XGBoost
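
For readers unfamiliar with the econometric baseline, the following sketch fits a logistic regression by batch gradient descent and reports training accuracy. It is a simplified binary illustration, not the authors' setup: the feature rows are made-up stand-ins for TF-IDF vectors, and the learning rate and epoch count are arbitrary assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(sum(w * v for w, v in zip(weights, xi)) + bias) - yi
            for j, v in enumerate(xi):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


def predict(weights, bias, xi):
    """Return class 1 when the predicted probability is at least 0.5."""
    score = sum(w * v for w, v in zip(weights, xi)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Toy feature vectors standing in for TF-IDF rows (values are made up).
X = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.0], [0.1, 0.9], [0.2, 0.8], [0.0, 0.7]]
y = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(X, y)
accuracy = sum(predict(weights, bias, xi) == yi for xi, yi in zip(X, y)) / len(y)
print(f"training accuracy: {accuracy:.2f}")
```

The fitted weights remain directly interpretable (a positive weight raises the predicted probability of class 1), which is the transparency that makes logistic regression attractive from a regulatory perspective.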

To compare the econometric model with machine learning based models, further options were explored, including KNN, Random Forest, Naïve Bayes and Gradient Boosting. Figure 2 shows the ROC curves for the various modeling techniques. KNN is the best model among them because it has balanced precision and recall values, and hence a good F1 score, as well as the maximum AUC.

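
The metrics named above can be computed as in the following sketch. The labels and scores are made up for illustration and do not reproduce the study's results; AUC is computed here as the share of positive-negative pairs that the model ranks correctly, which is equivalent to the area under the ROC curve.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half.
    Assumes both classes are present in y_true.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up labels and model scores for illustration.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} auc={auc:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a model with balanced values on both scores well on F1, which is the reasoning behind the KNN result reported above.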

Fig-2: ROC Curve of various Modeling Techniques

VI.   Conclusion and Limitation

In this study, the solution approach applies various econometric and machine learning methods to crunch and extract information from large text-based datasets. Machine learning models are observed to be more effective in terms of accuracy than traditional econometric methods. Another advantage of machine learning methods is their capability to handle much larger datasets.

[i] Tsai, C.-F. and Chen, M.-L. (2010). Credit rating by hybrid machine learning techniques. Applied Soft Computing, Vol. 10, pp. 374-380. https://doi.org/10.1016/j.asoc.2009.08.003

Global Banking & Finance Review

 
