Jason Robson is Head of Software Development at Equiniti Riskfactor
Machine Learning (ML) is a branch of the more commonly understood field of Artificial Intelligence (AI), the subject of many dystopian Hollywood ‘rise-of-the-machines’ movies.
In essence, Artificial Intelligence attempts to mimic human intelligence or behaviours. Machine Learning attempts to analyse and associate patterns of behaviour in diverse data sets to support data-driven decision making based on new knowledge and understanding.
Traditional risk models have used statistical or expert-driven heuristics, but now the next generation of risk analytics is taking advantage of the work being done in this growing field of Data Science.
As fraud is thankfully a relatively rare occurrence within an organisation, developing simulation tools is key to understanding the lifecycle of a fraud. Using real-world examples, we are now able to model the patterns of behaviour surrounding a fraud in order to reproduce the event under a diverse range of changing dynamics. This allows us to represent and understand the fraud over a range of time periods and at differing levels of funding.
Most of the work of a Data Scientist is at this (slightly unglamorous) end of the workflow – essentially the acquisition of test data and its transformation into more suitable forms for use in data analytics.
‘Data Munging’ is the delightful phrase that has been given to this activity.
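To give a flavour of what this activity involves, here is a minimal sketch in Python using only the standard library. The raw export, its column names (`client`, `date`, `amount`) and its two date formats are invented for illustration and are not taken from any real system; the point is simply the shape of the work: tidy up inconsistent records, then roll them up into a form an analytics tool can use.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical raw export: mixed date formats, currency symbols,
# inconsistent client names and an empty row.
RAW = """client,date,amount
Acme Ltd,03/01/2023,"£1,200.50"
acme ltd ,2023-01-17,800
Beta plc,2023-02-05,£950.00
,,
Acme Ltd,2023-02-11,"£2,000"
"""


def parse_date(text):
    """Try each date format we assume appears in the export."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


def parse_amount(text):
    """Strip the currency symbol and thousands separators."""
    return float(text.replace("£", "").replace(",", "").strip())


def munge(raw_text):
    """Return total amount per (client, year-month) from the raw export."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(raw_text)):
        if not row["client"].strip():
            continue  # skip rows with no client name
        client = row["client"].strip().lower()
        month = parse_date(row["date"]).strftime("%Y-%m")
        totals[(client, month)] += parse_amount(row["amount"])
    return dict(totals)


monthly = munge(RAW)
for key in sorted(monthly):
    print(key, monthly[key])
```

Real data is rarely this tidy, of course, and a production pipeline would more likely lean on a dedicated library, but the steps (normalise, validate, aggregate) are much the same.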
Aside from a background in probability and statistics, the Data Scientist’s toolbox consists of technologies such as the programming languages Python and R, which can be tailored to accommodate statistical computing and graphics.
Cloud computing providers such as Microsoft Azure and Amazon also have services dedicated to Machine Learning problem domains.
Machine Learning algorithms allow the matching of patterns and connections that can’t be expressed easily, or even at all, by people. Imagine the field of speech recognition, where devices from Google, Amazon and Apple can not only identify what is being said, but which person in a household is saying it.
A person’s unique patterns of speech can be recognised even though the reasons why could never easily be conveyed to that person in words. Now swap the rises and falls in pitch and amplitude with time series metrics derived from a commercial finance facility, and you will immediately see the future possibilities we are exploring.
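As a very rough illustration of treating a financial metric like a signal, the sketch below scores each point in a time series against the points immediately before it (a rolling z-score) and flags anything that strays too far. This is a simple statistical baseline rather than a Machine Learning model, and it is not a description of any particular product; the metric values, the window of six periods and the threshold of three standard deviations are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def rolling_zscores(series, window=6):
    """Score each point against the `window` points that precede it."""
    scores = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        spread = stdev(history)
        # A flat history has no spread; report 0.0 rather than divide by zero.
        score = (series[i] - mean(history)) / spread if spread else 0.0
        scores.append((i, score))
    return scores


def flag_unusual(series, window=6, threshold=3.0):
    """Return the positions whose score exceeds the threshold."""
    return [i for i, z in rolling_zscores(series, window) if abs(z) > threshold]


# Hypothetical monthly metric with a sudden jump at position 8.
metric = [100, 102, 98, 101, 99, 103, 100, 102, 160, 101]
flagged = flag_unusual(metric)
print(flagged)
```

A learned model would aim to pick up far subtler and more varied patterns than a single jump, but the underlying idea of comparing current behaviour with what came before is the same.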
The abundance of data that surrounds us covers not only our work lives and business connections, but also information about our social interests and friends. This rich picture will play a hugely important role in fully understanding the events we wish to model.
The wealth of data in the world we inhabit today is moving the bar above mere fraud detection, towards future fraud prediction. (And yes, if you are thinking ‘Minority Report’, Hollywood does seem to have got there first.)