Augment Credit Risk Modeling for One of the Banking Majors in South East Asia

Industry: Banking


Our client already had an in-house analytics team, which was responsible for credit risk modelling for retail and corporate customers.

However, over the last 5 years the number of defaulters had increased by 8.3% on average among retail customers and by 13.25% among corporate customers.

This was a serious problem, and it clearly indicated that the existing methodologies were not equipped to minimize the number of defaulters.

Project Journey

In addressing the issue, Abzooba acted as a consulting partner to solve the client's business problem, not merely as an analytics-as-a-service provider.

To start with, we held multiple meetings with stakeholders from both the business and the technology sides. After several rounds of discussion, we found that the current set of explanatory variables was not robust enough to bring out the necessary insights.

We chose to run a pilot on a sample of retail customers from different industries. At our suggestion, the client entered into a contract with LinkedIn to buy the public data of its customers.


We implemented Apache Spark as the crawling infrastructure to collect data continuously from LinkedIn. Because the bank's database already held each customer's name, current organization and contact details, we were able to retrieve customer data with more than 92% accuracy. Although the project team was not very familiar with Apache Spark, the experienced faculty and rich course content of Abzooba Innovation Academy (AIA) helped associates become project-ready within 4 weeks.
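
The matching step can be illustrated with a minimal sketch. It assumes each bank record and each collected profile carries a name and an organization, and it links them on a normalized (name, organization) key. The field names, the sample records and the rule of skipping ambiguous matches are all hypothetical and are not taken from the actual project pipeline.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_profiles(bank_records, profiles):
    """Return {customer_id: profile_id} for unambiguous name + organization matches."""
    index = {}
    for p in profiles:
        key = (normalize(p["name"]), normalize(p["organization"]))
        index.setdefault(key, []).append(p["profile_id"])

    matches = {}
    for r in bank_records:
        key = (normalize(r["name"]), normalize(r["organization"]))
        candidates = index.get(key, [])
        if len(candidates) == 1:  # skip missing or ambiguous matches
            matches[r["customer_id"]] = candidates[0]
    return matches


# Hypothetical sample records, for illustration only.
bank_records = [
    {"customer_id": "C1", "name": "José  Tan", "organization": "Acme Pte. Ltd."},
    {"customer_id": "C2", "name": "Mei Lin", "organization": "Globex"},
]
profiles = [
    {"profile_id": "P9", "name": "jose tan", "organization": "ACME Pte Ltd"},
    {"profile_id": "P7", "name": "Mei Lin", "organization": "Initech"},
]
matches = match_profiles(bank_records, profiles)
```

Here only customer C1 is linked, because C2's organization differs between the two sources. In a real setting, contact details would typically serve as a further tie-breaker.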

Once the data arrived in our environment, Abzooba’s NLP tool XPRESSO™ was used to derive each customer’s credibility factor from the recommendation text written by their connections.

The output of XPRESSO™, along with other KPIs such as years of experience, frequency of job change and key skill endorsements, was added as explanatory variables in the final credit scoring model. We first checked for multicollinearity to remove redundant variables, and then built a machine learning (logistic regression) model to identify the defaulters.
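
The two modelling steps can be sketched in plain Python. This is an illustration under stated assumptions, not the project's implementation: the case study does not say which multicollinearity test was used, so a simple pairwise Pearson screen with an assumed threshold of 0.9 stands in for it (variance inflation factors are a common alternative), and the logistic regression is fitted with basic batch gradient descent. All feature names and data values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(columns, threshold=0.9):
    """Keep a column only if |r| with every already-kept column is below threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - target
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Made-up pilot data: months_experience duplicates years_experience on purpose.
columns = {
    "years_experience": [2, 4, 6, 8, 2, 4, 6, 8],
    "months_experience": [24, 48, 72, 96, 24, 48, 72, 96],
    "job_changes_per_year": [0.2, 0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.8],
    "credibility_score": [0.9, 0.3, 0.3, 0.9, 0.7, 0.1, 0.1, 0.7],
}
defaulted = [0, 1, 1, 0, 1, 1, 1, 1]

kept = drop_correlated(columns)  # the redundant months column is removed

# Standardize each kept feature so gradient descent behaves well.
stats = {}
for k in kept:
    mean = sum(columns[k]) / len(columns[k])
    std = math.sqrt(sum((v - mean) ** 2 for v in columns[k]) / len(columns[k]))
    stats[k] = (mean, std)


def to_row(customer):
    """Turn a {feature: value} dict into a standardized feature vector."""
    return [(customer[k] - stats[k][0]) / stats[k][1] for k in kept]


X = [to_row({k: columns[k][i] for k in kept}) for i in range(len(defaulted))]
weights, bias = fit_logistic(X, defaulted)


def default_probability(customer):
    """Predicted probability of default for one customer."""
    return sigmoid(sum(w * x for w, x in zip(weights, to_row(customer))) + bias)


p_risky = default_probability(
    {"years_experience": 5, "job_changes_per_year": 0.8, "credibility_score": 0.1}
)
p_safe = default_probability(
    {"years_experience": 5, "job_changes_per_year": 0.2, "credibility_score": 0.9}
)
```

On this toy data the screen drops the duplicated experience column, and the fitted model assigns a higher default probability to the customer with frequent job changes and a low credibility score. In practice a statistics library would normally replace the hand-written fitting loop.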

Once the pilot proved successful, we deployed the model for the entire base of retail customers in 3 specific regions.

Business Benefits

The rate of defaulters decreased by 3.23%, preventing a potential loss of $1.2 million in the first year alone.

The client appreciated our approach of acting as a consulting partner to understand and bring out the exact business problem, rather than merely fulfilling their analytical requirements.

Future Scope

We are currently developing a cognitive search module using artificial intelligence, deep learning and NLP to help salespeople quickly find relevant material in the large internal repository. It will be implemented as a search bot that understands natural language and provides the necessary collateral to salespeople.