COMPLEXITY – THE BIG DATA CHALLENGE
By Adrian Carr, Vice President EMEA, MarkLogic
When it comes to big data, financial services firms are struggling to manage not only massive volumes of data, but data in a variety of formats – from structured and semi-structured trade messages to unstructured client onboarding data, contracts, news and social media content. Driven by mounting regulatory requirements, it’s this growing data complexity that’s bringing new challenges to the fore and putting the industry into fire-fighting mode.
Many organisations are using hierarchical data structures such as XML and JSON to tackle the problem. And, while they have seen some benefit, many have also found themselves constrained by the underlying relational database management system platforms used to manage the data.
This is because the relational approach to handling hierarchical information requires that data be “shredded” into tables. A customer record, derivative trade or legal document, with all its hierarchical attributes, has to be shoehorned into a model that satisfies the referential integrity of the underlying relational database system. While this can work, it has clear limitations: each new document type or attribute has to be fitted into the existing schema before the data can be stored.
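To make the idea of shredding concrete, here is a minimal Python sketch. The trade document, its field names and the three target "tables" are illustrative assumptions, not taken from any real schema or product.

```python
import json

# Hypothetical trade document; all field names are illustrative only.
trade = {
    "trade_id": "T-1001",
    "counterparty": {"name": "Example Bank", "lei": "LEI-EXAMPLE-0001"},
    "legs": [
        {"leg_id": 1, "type": "fixed", "rate": 0.025},
        {"leg_id": 2, "type": "floating", "index": "EURIBOR-6M"},
    ],
}


def shred(doc):
    """Split one nested trade document into flat, table-like rows."""
    trade_row = {"trade_id": doc["trade_id"]}
    counterparty_row = {"trade_id": doc["trade_id"], **doc["counterparty"]}
    # Each leg becomes its own row; columns a leg lacks are filled with None.
    leg_columns = ["leg_id", "type", "rate", "index"]
    leg_rows = [
        {"trade_id": doc["trade_id"], **{c: leg.get(c) for c in leg_columns}}
        for leg in doc["legs"]
    ]
    return {
        "trades": [trade_row],
        "counterparties": [counterparty_row],
        "legs": leg_rows,
    }


tables = shred(trade)

# The document-style alternative: one self-contained record, no shredding.
as_is = json.dumps(trade)
```

Note that the fixed leg ends up with an empty `index` column and the floating leg with an empty `rate` column. In this sketch, a new leg type with a new attribute would mean changing `leg_columns`, which mirrors the schema change a relational store would need.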
Many financial services firms are finding that next-generation Enterprise NoSQL (Not Only SQL) is a faster and more efficient way to manage data in a variety of formats. Data can be loaded into a NoSQL platform as is – providing an alternative to the shredding required by relational database management systems.
And, with some of today’s Enterprise NoSQL offerings, financial services organisations do not have to go without support for ACID transactions, as well as other enterprise qualities such as fine-grained entitlements, point-in-time recovery and high availability, all of which are expected in enterprise and mission-critical systems.
Here are three examples of how financial services organisations are using Enterprise NoSQL to address their big data challenges.
The Operational Trade Store
Once a trade is made, it needs to be processed by the back office and reported to the regulators. Trade data is typically read off the message bus connecting the trading systems and persisted into a relational database, which becomes the system of record for post-trade processing and compliance.
The original data formats are either XML (FpML, FIXML) or text-based (FIX), and have to be transformed into a normalised relational representation. This may sound easy, but in practice it is getting more complicated because of the high rate of innovation in the front office. New, complex instruments are frequently introduced, making it harder to continually push data into a relational store and leading to a proliferation of schemas and databases. Even worse, interim workarounds are often put in place that allow data to be forced into existing schemas – such as flags that indicate a record is of a different type than expected, or an empty shell into which any variable can be fitted. These workarounds can create expensive trade exceptions downstream, which then need to be resolved manually. The ensuing expense is then compounded by the high maintenance costs of complex relational database management systems, leading to higher costs per trade.
This can be solved using NoSQL, by persisting trade messages ‘as-is’, without the need for transforming them into a normalised relational schema.
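As a rough illustration of persisting messages as-is, the Python sketch below stores simplified FIX-style messages unchanged and parses their tag=value fields only to support lookups. The `TradeStore` class is a hypothetical in-memory stand-in for a document store, and the messages are shortened for readability (they omit required fields such as body length and checksum).

```python
SOH = "\x01"  # standard FIX field delimiter


def parse_fix(raw):
    """Parse a FIX tag=value message into a dict (repeated tags keep the last value)."""
    fields = {}
    for part in raw.strip(SOH).split(SOH):
        tag, _, value = part.partition("=")
        fields[tag] = value
    return fields


class TradeStore:
    """Hypothetical in-memory stand-in for a document store."""

    def __init__(self):
        self._docs = []

    def persist(self, raw):
        # Keep the original message untouched; parsed fields only serve lookups.
        self._docs.append({"raw": raw, "fields": parse_fix(raw)})

    def find(self, tag, value):
        """Return the raw messages whose given tag has the given value."""
        return [d["raw"] for d in self._docs if d["fields"].get(tag) == value]


store = TradeStore()
# Tag 35 is the message type and tag 55 the symbol; 9999 is an illustrative custom tag.
msg_a = SOH.join(["8=FIX.4.4", "35=8", "55=VOD.L", "38=1000"]) + SOH
msg_b = SOH.join(["8=FIX.4.4", "35=8", "55=BARC.L", "38=500", "9999=custom"]) + SOH
store.persist(msg_a)
store.persist(msg_b)
```

The second message carries a field the first does not, yet both are stored without any schema change. A call such as `store.find("55", "VOD.L")` returns the original message text rather than a normalised copy.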
Customer data is often spread throughout the organisation, with different systems having different notions and data models encapsulating a customer. The drive to incorporate web and social media data is making this even tougher. As a result, obtaining the elusive 360-degree customer view, whether for revenue purposes, fraud prevention and risk mitigation, or as a result of regulations, can be extremely difficult.
Again, typical enterprise data warehousing methodologies, which are relational in nature, don’t solve the problem. There’s also an added issue, as customer data is often much less structured – consider customer onboarding documents, call centre notes and web server logs, which represent just as much of a challenge as social media data when the goal is to deliver a coherent customer view.
It makes much more sense to use a non-relational database for these types of data. But the advantages of NoSQL also become apparent when it comes to highly structured data, as it can be loaded as-is, removing the need to harmonise and normalise the data before it can be aggregated. With NoSQL it isn’t a problem to have different representations of a customer, which can be unified based on certain attributes without needing to create a single, over-arching data model. New data can therefore be easily incorporated from disparate systems, and then linked and enhanced with non-relational text data.
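The idea of unifying different customer representations on a shared attribute can be sketched in a few lines of Python. The three records, their shapes and the choice of email address as the linking attribute are all assumptions made for illustration; a real system would likely match on several attributes.

```python
from collections import defaultdict

# Hypothetical records from three systems, each with its own shape.
crm = {"source": "crm", "full_name": "Jane Doe", "email": "Jane.Doe@example.com"}
onboarding = {
    "source": "onboarding",
    "applicant": {"name": "J. Doe", "contact": {"email": "jane.doe@example.com"}},
}
call_note = {"source": "call_centre", "email": "other@example.com", "note": "Asked about fees"}


def email_of(record):
    """Find the email in either of the illustrative shapes above."""
    if "email" in record:
        email = record["email"]
    else:
        email = record["applicant"]["contact"]["email"]
    return email.strip().lower()


def link_by_email(records):
    """Group records by normalised email, leaving each record as loaded."""
    views = defaultdict(list)
    for record in records:
        views[email_of(record)].append(record)
    return dict(views)


views = link_by_email([crm, onboarding, call_note])
```

Here the CRM and onboarding records end up in the same customer view even though they share no common data model. Only the linking attribute has to be located in each shape.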
Data about traded instruments and the legal entities related to them has historically been a huge data management challenge. Most banks have been through several rounds of M&A and other organisational changes that resulted in multiple reference data management systems across the firm. This introduces data inconsistencies (which lead to trade exceptions), complexity and costs.
Many firms have been trying to rationalise their reference data systems to create a single enterprise data management platform. This has usually been a colossal task because of the level of effort involved in creating a single, unified data model to handle all the different incoming data-vendor feeds and address all the different concerns of the downstream data consumers. This is typical of enterprise data management efforts of this scale that rely on a relational database as their core platform.
As before, NoSQL provides an attractive alternative, allowing for the persistence of data-vendor feeds in their original format. The data can then be fed to downstream consumers in the appropriate formats, with transformation occurring at the time it’s needed.
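A minimal Python sketch of transformation at read time follows. The two vendor payloads, their formats and the common output shape are hypothetical; the point is only that the stored feeds stay in their original form and are converted when a consumer asks for them.

```python
import csv
import io
import json

# Hypothetical vendor payloads, stored exactly as received.
stored_feeds = [
    {
        "vendor": "vendor_a",
        "format": "csv",
        "payload": "isin,name\nXS0000000001,Example Bond 2030\n",
    },
    {
        "vendor": "vendor_b",
        "format": "json",
        "payload": '{"instrument": {"id": "XS0000000002", "label": "Example Note 2028"}}',
    },
]


def read_instruments(feed):
    """Transform one stored feed into a common shape, only when it is read."""
    if feed["format"] == "csv":
        rows = csv.DictReader(io.StringIO(feed["payload"]))
        return [
            {"isin": r["isin"], "name": r["name"], "vendor": feed["vendor"]}
            for r in rows
        ]
    if feed["format"] == "json":
        inst = json.loads(feed["payload"])["instrument"]
        return [{"isin": inst["id"], "name": inst["label"], "vendor": feed["vendor"]}]
    raise ValueError(f"unsupported format: {feed['format']}")


# Consumers see one shape; the stored payloads are never rewritten.
instruments = [row for feed in stored_feeds for row in read_instruments(feed)]
```

Supporting a new vendor in this sketch means adding one more branch to `read_instruments`, without touching the feeds already stored.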
These are just a few examples of how financial services organisations are using Enterprise NoSQL to tackle data complexity, demonstrating that the best approach to data management isn’t always relational. Today, Enterprise NoSQL is providing new levels of flexibility and agility without compromising on key enterprise features such as ACID transactions, government-grade security, high availability, elasticity and disaster recovery. Choosing the right tool for the job just got easier.
Adrian joined MarkLogic in 2012 as the Vice President for EMEA. Prior to this, Adrian was VP Enterprise for Juniper Networks in Europe. Adrian also worked as VP EMEA and Australasia at Chicago-based analytics software company SPSS, where he grew the business to represent nearly half of the global revenues prior to their acquisition by IBM. This followed two years at Mercator Software prior to their acquisition by IBM. Adrian’s career began in IT services companies, with 11 years at EDS before time at Atos Origin and Cambridge Technology Partners.