Combating Insurance Fraud With Machine Learning

By Georgios Kapetanvasileiou, Analytical Consultant at SAS

Most insurance companies depend on human expertise and business rules-based software to protect themselves from fraud. However, people move on. And the drive for digital transformation and process automation means data and scenarios change faster than you can update the rules.

Machine learning has the potential to allow insurers to move from the current state of “detect and react” to “predict and prevent.” It excels at automating the process of taking large volumes of data, analysing multiple fraud indicators in parallel – which taken individually may often be quite normal – and finding potential fraud. Generally, there are two ways to teach or train a machine learning algorithm, which depend on the available data: supervised and unsupervised learning.

Predictive modelling

In predictive modelling, or supervised learning, algorithms make predictions based on a set of examples from historical data. You can present an algorithm with historical claims information and the associated outcomes, often called labelled data, and it will attempt to identify the underlying patterns in fraudulent cases. Once the algorithm has been trained on past examples, you can use it to infer the probability of a new claim being fraudulent. AKSigorta Insurance, for example, is using advanced predictive modelling as part of its investigation process. The company has managed to increase its fraud detection rate by 66% and prevent fraud in real time.
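The train-then-score workflow described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a tiny made-up set of labelled claims and a hand-written logistic regression; the feature names, the data and the training settings are all hypothetical and do not represent the method of any insurer or vendor mentioned in this article.

```python
import math

# Hypothetical labelled claims: (claim amount in thousands, prior claims) -> 1 = fraud, 0 = legitimate
TRAINING = [
    ((1.0, 0.0), 0), ((2.0, 1.0), 0), ((1.5, 0.0), 0), ((3.0, 1.0), 0),
    ((8.0, 4.0), 1), ((9.0, 3.0), 1), ((7.5, 5.0), 1), ((2.5, 0.0), 0),
]

def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train(examples, epochs=2000, learning_rate=0.05):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            predicted = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = predicted - label
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

def fraud_probability(model, features):
    """Score a claim: estimated probability that it is fraudulent."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

model = train(TRAINING)
print(fraud_probability(model, (8.0, 4.0)))  # large claim, several prior claims
print(fraud_probability(model, (1.0, 0.0)))  # small claim, no prior claims
```

In practice the score would feed an investigation queue rather than trigger an automatic decision, and a real model would be trained on far more data and features than this toy set.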

There is a wide variety of predictive modelling algorithms to choose from, so users should take into account issues such as accuracy, interpretability, training time and ease of use. There is no single approach that works universally. Even experienced data scientists have to try different methods to find the right algorithm for a specific problem. It is, therefore, best to start simple and explore more advanced machine learning methodologies later. Decision trees, for example, are an excellent way to start exploring complex relationships within data. They are relatively easy to implement and fast to train on large volumes of data. More importantly, they are very easy to understand or interpret, and can be a good starting point for new business rules.
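To illustrate why decision trees translate so readily into business rules, the sketch below fits the simplest possible tree, a single split on one feature, by minimising Gini impurity. The feature (days between policy start and first claim) and the data are invented for illustration only.

```python
# Hypothetical data: days between policy start and first claim, and whether the claim was fraudulent
DAYS_TO_FIRST_CLAIM = [3, 5, 8, 40, 90, 120, 200, 365]
IS_FRAUD = [1, 1, 1, 0, 0, 0, 0, 0]

def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return the threshold (and weighted impurity) that best separates the two classes."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_impurity = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no split possible between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity

threshold, impurity = best_split(DAYS_TO_FIRST_CLAIM, IS_FRAUD)
print(f"Candidate rule: refer claims made within {threshold} days of policy start")
```

A full decision tree repeats this search recursively on each side of the split, and every path from root to leaf reads as a candidate rule in the same way.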

Other options for more accuracy

Decision trees can, however, become unstable over time. When accuracy becomes a priority, practitioners should look at other options. Support vector machines (SVMs) and neural networks are capable of learning complex class boundaries and generalise well to unseen cases. They have been extensively used for fraud detection. Tree-based algorithms, such as gradient boosting and random forests, have also become more popular in recent years. Ideally, analysts should try multiple approaches in parallel before deciding what works best.

Supervised learning is effective in identifying familiar cases of fraudulent activity but cannot uncover new patterns. Another challenge is the limited number of fraud examples with which to train the algorithm. Fraud is a relatively rare event, after all. The ratio of fraud to nonfraud cases can sometimes be as low as 1 to 10,000. This means that predictive algorithms tend to be overwhelmed by the sheer volume of nonfraud cases, and may miss the fraudulent ones. Labelling new data for training a model can also be time-consuming and expensive.
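A quick calculation shows why this imbalance is a problem. Assuming, purely for illustration, a portfolio of 10,000 claims with a single fraudulent one, a "model" that labels every claim as legitimate looks almost perfect on accuracy while catching no fraud at all.

```python
# Illustrative portfolio at the 1-in-10,000 ratio mentioned above
total_claims = 10_000
fraud_claims = 1

labels = [1] * fraud_claims + [0] * (total_claims - fraud_claims)
predictions = [0] * total_claims  # naive model: always predict "not fraud"

accuracy = sum(p == y for p, y in zip(predictions, labels)) / total_claims
true_positives = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
recall = true_positives / fraud_claims  # share of actual fraud that was caught

print(f"accuracy={accuracy:.4f}, recall={recall:.1f}")
```

This is why measures such as recall and precision on the fraud class are generally more informative than overall accuracy when fraud is rare.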

Unsupervised learning

Unsupervised learning algorithms are trained against data with no historical labels. In other words, the algorithm is not given the answer or outcome beforehand. It is merely asked to explore the data and uncover any “interesting” structures within it. For example, given certain behavioural information, unsupervised learning algorithms can identify groups (or clusters) of customer transactions that appear similar. Anything that appears different or rare could be flagged as an anomaly (or an outlier) for further investigation.

Unsupervised learning methods can, therefore, identify both existing and new types of fraud. They are not restricted to predefined labels, so they can adapt quickly to new and emerging patterns of dishonest behaviour. For example, a New Zealand health insurer used unsupervised learning methods to identify cases where practitioners were deliberately overcharging patients for a particular procedure or providing unnecessary treatment for certain diagnoses.

Unsupervised anomaly detection methods include univariate outlier analysis or clustering-based methods such as k-means. However, the recent move towards digitalisation means more data, at higher volumes, from a wider range of data sources. New algorithms, such as Support Vector Data Description, Isolation Forest or Auto-encoders, have been introduced to address this. These may be a more efficient way of detecting anomalies and allow for faster reaction to new fraud.
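As a small illustration of the simplest technique in this list, univariate outlier analysis, the sketch below flags unusual billed amounts without any fraud labels. It assumes the commonly used modified z-score (based on the median and the median absolute deviation, with the conventional 0.6745 constant and 3.5 cut-off); the amounts are made up, and this is not the approach of any insurer mentioned here.

```python
import statistics

# Hypothetical amounts billed by different practitioners for the same procedure
BILLED_AMOUNTS = [100, 105, 98, 110, 102, 97, 104, 480]

def modified_z_outliers(values, cutoff=3.5):
    """Flag values whose modified z-score (median/MAD based) exceeds the cutoff."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to measure against; nothing can be flagged this way
    return [v for v in values if 0.6745 * abs(v - median) / mad > cutoff]

flagged = modified_z_outliers(BILLED_AMOUNTS)
print(flagged)
```

Using the median rather than the mean keeps the baseline from being dragged towards the very values the check is meant to find.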

Social network analysis

The methods described so far are useful for identifying opportunistic fraud. However, many fraudsters today operate as part of professional, organised rings. Activity may include staged motor accidents to collect insurance payouts, ghost brokering, or collusion between patients and health practitioners to inflate claim amounts. These career fraudsters can repeatedly disguise their identities and evolve their way of operating over time.

Social network analysis is a tool for analysing and visually representing relationships between known entities. Examples of such relationships include different applicants using the same telephone number or IP address, or a motor accident involving multiple people. Social network methods can automate the process of drawing connections from disparate data sources and visually representing them as a network. This significantly reduces the investigation time – in one case, from 10 days to just two hours. In the UK, a large P&C insurer saved £7 million per annum by using network analytics to uncover groups of collaborating fraudsters.
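The core idea of linking claims through shared details can be sketched as a connected-components search. The example below is a minimal illustration using a union-find structure; the claim identifiers, phone numbers and IP addresses are invented, and real network analytics tools do considerably more (entity resolution, scoring, visualisation).

```python
from collections import defaultdict

# Hypothetical claims with contact details
CLAIMS = {
    "C1": {"phone": "555-0101", "ip": "10.0.0.1"},
    "C2": {"phone": "555-0101", "ip": "10.0.0.2"},
    "C3": {"phone": "555-0199", "ip": "10.0.0.2"},
    "C4": {"phone": "555-0142", "ip": "10.0.0.9"},
}

def linked_groups(claims, min_size=2):
    """Group claims that are connected, directly or indirectly, by a shared attribute value."""
    parent = {claim_id: claim_id for claim_id in claims}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (attribute, value) -> first claim that used it
    for claim_id, attributes in claims.items():
        for key, value in attributes.items():
            owner = first_seen.setdefault((key, value), claim_id)
            union(owner, claim_id)

    groups = defaultdict(set)
    for claim_id in claims:
        groups[find(claim_id)].add(claim_id)
    return sorted(sorted(group) for group in groups.values() if len(group) >= min_size)

rings = linked_groups(CLAIMS)
print(rings)
```

Here C1 and C3 share nothing directly, yet they end up in the same group because each is linked to C2. That indirect connection is exactly what is hard to spot when claims are reviewed one at a time.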

A hybrid approach

No single technique, however, is capable of systematically identifying all complex fraud schemes. Instead, insurers need to combine sophisticated business rules with advanced machine learning approaches. This allows them to cast the net wide while still improving accuracy and reducing false positives, making fraud detection more efficient.
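One possible way to combine the two, sketched below, is to let business rules cast the wide net and use a model score to decide which flagged claims are actually referred. The rules, thresholds and field names are hypothetical, and the model score is assumed to come from a separately trained model; this is an illustration of the idea rather than a recommended design.

```python
# Hypothetical business rules: name -> test applied to a claim record
RULES = {
    "claim_soon_after_policy_start": lambda claim: claim["days_since_policy_start"] <= 30,
    "high_claim_amount": lambda claim: claim["amount"] >= 10_000,
}

def triage(claim, model_score, refer_threshold=0.8, rule_threshold=0.3):
    """Return ("refer" or "pass", names of the rules that fired) for one claim."""
    fired = [name for name, rule in RULES.items() if rule(claim)]
    if model_score >= refer_threshold:
        return "refer", fired  # the model alone is confident enough
    if fired and model_score >= rule_threshold:
        return "refer", fired  # a rule fired and the model does not contradict it
    return "pass", fired       # rule hits with a very low score are treated as false positives

early_claim = {"days_since_policy_start": 10, "amount": 500}
print(triage(early_claim, model_score=0.5))
print(triage(early_claim, model_score=0.1))
```

Returning the names of the rules that fired keeps the decision explainable to investigators, even when the model score is what tips the outcome.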