By Christopher J.A. Messina
Data are overwhelmingly abundant throughout the financial services and capital markets ecosystems. Over time, processes and software have been developed to bite off discrete chunks of that data for use in specific tasks. Due to demand from market participants with deep pockets, the supply of well-crafted and increasingly data-enabled systems has grown robustly over the last 30 years.
But crucially, the 'white space' on the financial technology supply-demand curve is for quantitatively based systems that meet the needs of non-quantitatively inclined professionals who are suddenly highly reliant on the information contained in quantitative data.
That white space has become more glaring following the passage of the Dodd-Frank Act in the US and EMIR in Europe. Sudden legislative fiat means that traditionally 'non-quantitative' lawyers and compliance experts are tasked with massive amounts of data digestion along with reporting to a slew of new audiences. The financial penalties for failing to comply with the new reporting requirements are harsh and mandated by statute, with no room for rational compromise between mature, experienced adults in developed markets.
There has always been a tension in the capital markets between the differing comprehension that market professionals and regulators can have of events, markets and rules. That tension most often manifests itself in repetitive anecdotes about the conflicts between regulators and market participants, where each 'side' of a regulatory dispute is genuinely astonished that the other side interprets the same event utterly differently. That well-trodden path I leave for someone else to walk down. Besides, as one of my statistics professors was wont to repeat with little to no prodding: The plural of anecdote is not 'data.'
Today, the more fundamental issue is the unprecedented amount of data that many financial industry entities now have to gather, maintain, analyze and report on to meet a burgeoning regulatory mandate. This flood of compliance reporting has hit the industry with huge associated costs. But, as many firms are discovering, hiring a sufficient number of educated, experienced staff to handle these new mandates does not ensure that you will fulfill your obligations properly or in a timely manner.
It is so very challenging to select just one illustrative example from the constellation of data-reliant reports which impassioned, crisis-wrought legislation has brought into being. We can gain some insights by examining the practical impacts of one filing required in the United States, which has hit 30 Bank Holding Companies since 2012. The Federal Reserve Board requires that banks file periodic reports for its Comprehensive Capital Analysis and Review (CCAR), which is in large part the result of stress testing numerous operational data sets.
There can be up to 300 factors examined by the CCAR, which means first off huge new data collection, retention, access and analysis needs by staff throughout the enterprise. This massive data governance problem has attracted a great deal of investment in personnel and systems, with a number of well-thought-out data management solutions offered by a range of capable technology vendors. Most of the reporting requires varying levels of granularity and places varying emphasis on different subject matter experts across the business – from trading to risk management to middle office functions, and including compliance and legal departments.
The white space still exists, however, because the output of all these very powerful data aggregation, governance and analysis tools is still a dashboard or a table or graph. Moving, managing and analyzing vast amounts of data are not trivial tasks, but once all that data manipulation is done, the same bottleneck exists: To wit, a human expert then needs to perform further analyses and write up the results. Given the vast array of new report types that need to be created and the number of times each report needs to be customized for different audiences, that write-up time alone is a massive cost sink for firms struggling to comply.
As well as 'the doing' of turning reams of data into meaningful monthly or quarterly reporting, there is 'the demonstrating'. Under this mandate, banks also need to be able to show in detail how they are complying with the legislation. This is part of the regulatory reality that must be addressed, and is another layer of costs incurred in report-writing.
Lurking at the back of every executive's and every risk manager's mind is also the fear that maybe something crucial was missed. Maybe it happened in the choice of data sampling methodology, perhaps it was further down the chain in the way a certain factor among 300 was (potentially) mischaracterized – the list of possible errors in the translation from machine output through human expert processing is vast, unnerving and something to be driven out of your conscious mind as firmly as possible.
This topic is of huge interest to the data scientists at Arria NLG. The entire suite of IP underlying Arria's technology happens to fill in significant pieces of that very painful white space. The Arria NLG Engine helps banks deal with exactly this headache. The NLG Engine unlocks the bottleneck of human time constraints by doing with quantitative outputs what a human SME does, including, crucially, writing the final range of required reports. Arria's artificial intelligence engine acts as a translation layer between hard data and the wide range of audiences who need to act on that data, many of whom do not have significant quantitative training.
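To make the idea of a 'translation layer' concrete, here is a deliberately simple, template-based sketch in Python of turning one quantitative result into sentences for two different audiences. It is not Arria's engine or method, which this article does not describe in detail. The class `FactorResult`, the functions `describe` and `write_report`, the audience labels and all figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FactorResult:
    """One stress-test factor. All names and numbers are hypothetical."""
    name: str
    baseline: float  # value (in percent) under the baseline scenario
    stressed: float  # value (in percent) under the stressed scenario
    limit: float     # minimum acceptable value (in percent)


def describe(result: FactorResult, audience: str) -> str:
    """Render one factor as a sentence tailored to the given audience."""
    breached = result.stressed < result.limit

    if audience == "risk":
        # Risk managers get the movement and the numbers behind it.
        status = "breaching" if breached else "staying at or above"
        if result.stressed == result.baseline:
            movement = f"holds at {result.stressed:.1f}%"
        else:
            verb = "falls" if result.stressed < result.baseline else "rises"
            movement = (f"{verb} from {result.baseline:.1f}% "
                        f"to {result.stressed:.1f}%")
        return (f"{result.name} {movement} under stress, "
                f"{status} the {result.limit:.1f}% limit.")

    if audience == "compliance":
        # Compliance staff get the pass/fail conclusion in plain language.
        if breached:
            return (f"{result.name} is below the required minimum of "
                    f"{result.limit:.1f}% under stress and needs follow-up.")
        return (f"{result.name} meets the required minimum of "
                f"{result.limit:.1f}% under stress.")

    raise ValueError(f"unknown audience: {audience}")


def write_report(results: list[FactorResult], audience: str) -> str:
    """Join the per-factor sentences into one short paragraph."""
    return " ".join(describe(r, audience) for r in results)


if __name__ == "__main__":
    sample = [FactorResult("Tier 1 capital ratio", 12.0, 7.5, 8.0)]
    print(write_report(sample, "risk"))
    print(write_report(sample, "compliance"))
```

The same record yields a numbers-first sentence for the risk desk and a conclusion-first sentence for compliance. A production NLG system would go well beyond fixed templates, but the split between analysing the data and wording it for a given reader is the point being illustrated.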
Importantly for real business use, the NLG Engine produces articulate, customized texts that replicate the language required by the firm's SMEs, keeping a critical sense of brand consistency and tone that can be adapted according to the user's needs. It's in no way what one would call 'robotext' – quite the contrary: Arria's systems currently in place write texts which the clients' in-house owners cannot distinguish from those written by their human colleagues.
Thirty years ago, Ehud Reiter and Robert Dale began a process of exploration that started with a very simple question: Can we build computer systems that will not only perform highly sophisticated analysis of big, complex data sets, but will take that data analysis to its logical conclusion and generate articulate, grammatically correct written summaries in human language out the other end?
Halfway through that journey, in 2000 they co-authored Building Natural Language Generation Systems, which remains the definitive text on the subject. Twenty-seven years into that journey, sufficient market demand had arisen to turn the academic project into a commercial offering.
We have all experienced this in one way or another: Some of the best technology we now use every day was created by tinkerers, creative types and academics – not with a potential application in mind, but as an intellectual exercise, a way to solve a problem that an inventor finds interesting.
Arria's Computational Linguistics solution was not designed to meet the needs of time-pressed lawyers, risk managers and compliance officers dealing with an unprecedented torrent of regulatory reporting, but it has turned into one of the most powerful tools available to make that compliance effort as efficient and painless as possible.
For more color on how Arria's NLG Engine is applicable to risk management and compliance, please see a recent interview with Larry Tabb: http://tabbforum.com/videos/finding-risk-before-it's-a-problem.
About the Author
Christopher Messina is SVP, Business Development for Arria NLG plc. Mr. Messina has 20 years of experience in the global capital markets, private equity and financial technology, including a deep understanding of the evolution of financial regulatory regimes. He has worked in the Americas, Africa, Europe, Australia, and the Middle East. A member of Business Executives for National Security, Mr. Messina is a graduate of the University of Chicago and the Australian Graduate School of Management. He is a contributing author to Shari'ah Compliant Private Equity: A Primer for the Executive (Euromoney Books, 2010), and has lectured on derivatives, commodities and Shari'ah finance at law schools and conferences globally.