Michael Gould, Founder and CTO at Anaplan
Let’s not delude ourselves that Big Data is the new kid on the block (the term has been floating around since 2011). And, please, let’s also forget about defining the characteristics of Big Data with a mnemonic string of Vs, which typically starts out with these four: volume, variety, velocity and veracity. The Vs are an ever-growing list; anytime folks feel a deep need to convince the world of their expertise, a few more magically get added, such as viability and value. Frankly, you could pick out another dozen or so words beginning with ‘V’ and it still wouldn’t get us any nearer to the essence of Big Data.
Big Data is Relative
To me, Big Data is a relative term: data is only ‘Big’ if it becomes so large and complex that it is difficult to process using currently available technology. By that definition, Finance has actually always had Big Data. With ledgers and databases that take so long to process and query, Finance has only ever had access to data at the consolidated level; the granular detail remains dark and impenetrable. For instance, twenty-five years ago, when I was working for a transportation company, a request for quarterly product, customer and route profitability reports took days to process, and the results were delivered to my desk as waist-high piles of sprocket-driven printouts on a trolley that could only be moved by two people. Needless to say, it took months to analyse them and identify the individual accounts that needed renegotiating – with constant interruptions from an over-zealous facilities manager insisting that such a large amount of paper constituted a fire risk.
Big Data in Budgeting Today
Although today’s technology means business users themselves can query databases of that magnitude with sub-second response times, Finance is continually confronted with new Big Data challenges. As commercial pressures mount, Finance departments must continually up the ante in managing financial performance. For instance, in order to improve the accuracy of their forecasts, companies are increasing the granularity of their budgets, which can challenge legacy systems. In certain instances, such as in China, where state-owned enterprises have to budget many thousands of line items, some of the West’s most successful enterprise planning and budgeting solutions struggle to cope. At the same time, the companies that saw value in taking their budgeting beyond traditional bottom-up/top-down line-item consolidation and explored driver-based approaches, or their close cousin, Integrated Business Planning (IBP), soon found themselves in the domain of ‘Big Data.’ Unfortunately, the architecture and calculation engines that underpin traditional budgeting tools were simply not designed for Big Data’s volume and complexity: calculating results could take many hours, and queries demanded considerable patience. This lack of cost-effective computing power has deterred many organisations from adopting a holistic approach to IBP. Many work at a higher level of granularity than they would prefer, or partition the process across disjointed models, compromising the real value of IBP.
Computing Power and Architecture: the two essentials for success with Big Data
In-memory computing is clearly one element in solving the ‘Big Data’ problems that many Finance functions are grappling with today. But don’t be fooled into thinking it’s the only one. Traditional data structures, which have served so well for analysis and reporting, are ill-suited to the dynamic modelling needed for driver-based budgeting, IBP and scenario analysis. Layering new technology on top of old ideas is no way to achieve a breakthrough; there comes a point at which you have to throw it all away and start afresh – something the mega-vendors that built their EPM suites by acquisition struggle to do.
Without such baggage, alternatives to the traditional data model have emerged. HyperBlock™ architecture, for example, is a hybrid approach combining the best aspects of three architectures. First, highly multidimensional cubes hold a great deal of data in a very compact form. Second, storing data in columns rather than rows makes it easier to represent large volumes of transactional data and to perform inserts and deletes. Third, cell-level dependencies between individual data points are tracked, much as in a spreadsheet, so that when a model changes, the in-memory engine recalculates only the dependent values, following the shortest calculation sequence to deliver millisecond responses to queries. Because it was built for modelling rather than simply analysis, such an architecture makes optimal use of in-memory calculation, with parallel processing across multiple threads speeding responses by a factor of 10 or more. In turn, this gives organisations the ability to recalculate and query massive models online or via mobile devices with spreadsheet-like response times. It’s this marrying of cutting-edge technology and a unique architecture which will be the key to unlocking the Big Data opportunity.
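The cell-level dependency tracking described above can be sketched in a few lines. This is a minimal, illustrative model – not Anaplan’s actual engine – assuming a simple dependency graph in which each formula cell lists the cells it reads; when an input changes, only the cells downstream of it are recomputed, in dependency order:

```python
# Illustrative sketch of spreadsheet-style cell-level dependency tracking:
# an input change triggers recalculation of only its dependent cells.
from collections import defaultdict

class Model:
    def __init__(self):
        self.values = {}                     # cell -> current value
        self.formulas = {}                   # cell -> (function, input cells)
        self.dependents = defaultdict(set)   # cell -> cells that read it

    def set_input(self, cell, value):
        """Change a raw input and recalculate only its downstream cells."""
        self.values[cell] = value
        for dep in self._calc_order(cell):
            func, inputs = self.formulas[dep]
            self.values[dep] = func(*(self.values[i] for i in inputs))

    def define(self, cell, func, inputs):
        """Register a formula cell and compute its initial value."""
        self.formulas[cell] = (func, inputs)
        for src in inputs:
            self.dependents[src].add(cell)
        self.values[cell] = func(*(self.values[i] for i in inputs))

    def _calc_order(self, changed):
        # Depth-first post-order over dependents, reversed, yields a
        # topological order: every cell is recomputed after its inputs.
        seen, order = set(), []
        def visit(cell):
            for dep in self.dependents[cell]:
                if dep not in seen:
                    seen.add(dep)
                    visit(dep)
                    order.append(dep)
        visit(changed)
        return reversed(order)

m = Model()
m.set_input("units", 10)
m.set_input("price", 5.0)
m.define("revenue", lambda u, p: u * p, ["units", "price"])
m.define("margin", lambda r: r * 0.4, ["revenue"])
m.set_input("units", 12)   # recomputes only revenue, then margin
```

Here, changing units touches only revenue and margin, leaving every unrelated cell alone – the property that keeps response times flat as models grow, and the same reason a spreadsheet stays responsive after a single cell edit.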