Relational databases can no longer tackle white collar crime

By Martin Darling, Regional Manager for EMEA, TigerGraph.

It’s hard to remember a time when tax evasion has been such a hot topic. Now, promises to tackle tax evasion fill every political manifesto, stories of celebrity tax dodgers fill headlines and apparently stateless tech companies are hauled in front of televised parliamentary tribunals to explain why their tax bills are so low.

And while the will is there, the way is less clear. Tax evasion is still a drain on state budgets.

The UK’s tax authority figures for 2019 show a total loss of £35 billion, an increase of 17 percent from 2016. The US Internal Revenue Service (IRS) estimates that tax evasion costs the US government $458 billion annually.

The last few years have seen a cascade of leaks from offshore tax havens, which have given a glimpse of the true extent of global tax evasion. First there were the Offshore Leaks, then, in 2014, the Luxembourg Leaks. The Swiss Leaks followed the next year, and then came the Panama Papers and the Paradise Papers.

Each leak has revealed thousands of documents and gigabytes of data, representing billions of pounds of tax avoidance and evasion. What is more, the leaks have revealed illegality, high-level corruption and the manoeuvrings of organised crime.

It makes sense that, in the wake of the global financial crisis of 2008 and the ensuing austerity and contraction of credit, tax evasion and white collar crime have reached new levels of importance in the public imagination. It was massive corporate malfeasance in the finance sector that caused a crisis on a scale not seen since the Great Depression. Following that, governments around the world enacted often-harsh regimes of austerity on the basis that there simply was not any money left.

It has, at least in part, led to the political instability we now see in the Western world. Mistrust of elites, the rise of extremist populism and flagging faith in political institutions all have at least one root in the failure to properly tackle tax evasion and white collar crime.

It’s for that reason that these subjects have risen to the top of public imagination. But as I said before – a will does not always mean a way.

Tax havens are a big part of the problem. While much of the money there is not held illicitly, tax havens are estimated to hold up to 10% of global assets (Zucman, 2014). Tax havens are prized not just for their low taxes but also for their bank secrecy laws. Territories like Switzerland, Panama and Belize allow clients to hold money there pretty much anonymously. This is where the money trails so often disappear and investigations go cold.

Another massively complicating factor is the use of shell companies. The complex chains of shell companies through which illicit capital flows make it tough for tax authorities, law enforcement bodies and civilian investigators such as journalists and NGOs to follow the money.

It is common practice to set up a shell company which will obfuscate illicit activity. In fact, a thriving industry has grown up around this very practice. For a small fee, you can set up companies with fake or paid directors, concealing the actual beneficiary of the funds. Set up enough companies, and you have a confusing chain of names, addresses and business registrations which funnel funds from the tax evader back into their pocket. That’s enough to demotivate plenty of investigations.

And it’s here where our current tools are letting us down. Relational databases are still one of the cornerstones of this kind of investigation. And they’re failing us.

As it stands, relational databases can only analyse a known relationship or path. Researchers cannot ask whether two separate entities have a relationship; they must already know that the relationship exists.

In a situation where money hides in multiple phantom entities and trickles down into ever more complex layers, relational databases cannot dig much further down than three levels. They find it hard to identify patterns among disparate datasets and to carry out deep analytics. Queries often become overly complex and slow.

Every level the investigator descends comes with more computational expense and a more time-consuming workload. Dig as few as four levels down and problems start arising. If the query cannot establish the path, the analysis will time out and investigators will be faced with a dead end.
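
To make that cost concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. The owns table, its columns, the company names and the helpers n_hop_query and entities_at_depth are all invented for illustration; none of them comes from a real investigation tool. It shows one common way of following ownership in a relational table, where each extra level needs another self-join and the depth has to be fixed before the query runs.

```python
import sqlite3


def n_hop_query(levels):
    """Build a SQL query that follows `levels` ownership hops via self-joins.

    Hypothetical helper: every level beyond the first adds one more JOIN.
    """
    joins = " ".join(
        f"JOIN owns o{i} ON o{i}.owner = o{i - 1}.owned"
        for i in range(2, levels + 1)
    )
    return f"SELECT o{levels}.owned FROM owns o1 {joins} WHERE o1.owner = ?"


# Invented sample data: a person hiding behind a chain of four shells.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE owns (owner TEXT, owned TEXT)")
conn.executemany(
    "INSERT INTO owns VALUES (?, ?)",
    [
        ("Person X", "Shell 1"),
        ("Shell 1", "Shell 2"),
        ("Shell 2", "Shell 3"),
        ("Shell 3", "Shell 4"),
    ],
)


def entities_at_depth(levels, start="Person X"):
    """Return the entities exactly `levels` hops below `start`."""
    return [row[0] for row in conn.execute(n_hop_query(levels), (start,))]


# The depth must be chosen up front; a guess that overshoots returns nothing.
print(n_hop_query(4))
print(entities_at_depth(4))
print(entities_at_depth(5))
```

On four rows this is instant. On a registry with millions of rows, each added join multiplies the candidate rows the database has to consider, which is where the slowdowns described above tend to appear.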

These chains of obfuscation aren’t static either. Criminal entities and tax dodgers can shut down these fraudulent subsidiaries as quickly as they set them up, running off once their funds are safely transferred offshore.

Investigators need tools which can accommodate not only the depth of these scams but also their frenetic, ever-shifting nature.

Native parallel graph databases may be useful here. They can dig deep through money trails, going as many levels as needed, 10, 20 or more, across datasets on a global scale. They are designed to traverse unknown paths to find connections, no matter how deep that connection goes.
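
The idea of traversing an unknown path can be sketched in a few lines of plain Python. This is not the API of any graph database product; the LINKS data, the entity names and the functions build_graph and find_connection are assumptions made up for this example. It uses an ordinary breadth-first search to ask whether two entities are connected at all, without saying in advance how many levels apart they are.

```python
from collections import deque

# Invented links between people and companies (ownership or payments).
LINKS = [
    ("Alice Smith", "Shell A Ltd"),
    ("Shell A Ltd", "Shell B SA"),
    ("Shell B SA", "Shell C Inc"),
    ("Shell C Inc", "Shell D GmbH"),
    ("Shell D GmbH", "Offshore Trust"),
    ("Bob Jones", "Unrelated Co"),
]


def build_graph(links):
    """Store each link in both directions so a connection can be found either way."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def find_connection(graph, start, target):
    """Return the shortest chain of entities from start to target, or None."""
    if start not in graph or target not in graph:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbour in sorted(graph[node]):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


GRAPH = build_graph(LINKS)
print(find_connection(GRAPH, "Alice Smith", "Offshore Trust"))
print(find_connection(GRAPH, "Alice Smith", "Bob Jones"))
```

The search stops as soon as it reaches the target, however many hops that takes, and returns None when no chain exists. A production graph database would run this kind of traversal in parallel over far larger data, but the question being asked is the same.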

They can also use temporal analysis to identify changes in company structure over time and help flag suspicious behaviour in the rapid opening and closing of subsidiaries set up specifically to funnel illicit funds.
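
A simple form of that temporal check can be sketched as follows. Everything here is hypothetical: the RECORDS data, the flag_parents function and the thresholds of 90 days and three subsidiaries are arbitrary values chosen for illustration, not figures used by any tax authority or vendor. The sketch flags parent entities with several subsidiaries that were closed shortly after they were opened.

```python
from datetime import date

# Invented registry records: (company, parent, opened, closed or None if still open).
RECORDS = [
    ("Shell A Ltd", "Holding Co", date(2019, 1, 10), date(2019, 2, 20)),
    ("Shell B Ltd", "Holding Co", date(2019, 3, 1), date(2019, 4, 5)),
    ("Shell C Ltd", "Holding Co", date(2019, 5, 15), date(2019, 6, 1)),
    ("Bakery Ltd", "Family Group", date(2010, 6, 1), None),
    ("Cafe Ltd", "Family Group", date(2015, 1, 1), date(2019, 1, 1)),
]


def flag_parents(records, max_days=90, min_count=3):
    """Return parents with at least min_count subsidiaries closed within max_days of opening."""
    counts = {}
    for _company, parent, opened, closed in records:
        if closed is not None and (closed - opened).days <= max_days:
            counts[parent] = counts.get(parent, 0) + 1
    return sorted(parent for parent, n in counts.items() if n >= min_count)


print(flag_parents(RECORDS))
```

A flag like this is only a prompt for a human investigator to look closer, since legitimate businesses also open and close subsidiaries.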

Chiefly, native parallel graph databases are good at establishing relationships between the fine details. They can trace the flows of money and help establish the patterns which indicate tax evasion or money laundering. They can even incorporate data from multiple internal and external sources. OpenCorporates, the largest corporate information database in the world, is already using graph technology to do just that. The corporate information giant migrated its database earlier this year to help marshal its vast amounts of data (over 170 million entries) and to handle the complexity of the investigations such a massive database necessitates.

China Construction Bank (CCB) has started to use this technology to tackle its concerns around money laundering and credit fraud. As the world’s second largest bank, CCB generates over five terabytes of data a year. It is now using native parallel graph databases to analyse the millions of personal and corporate accounts it interacts with every day.

CCB is just one organisation that is set to gain from this technology. In February 2019, Gartner predicted that graph processing and graph database management systems will grow at a rate of 100 percent annually through 2022. Much currently used technology cannot keep up with the demands made of it, the analyst firm added: “Graph analytics will grow in the next few years due to the need to ask complex questions across complex data, which is not always practical or even possible at scale using SQL queries.” Furthermore, Gartner believes graph technology will be a critical underpinning of artificial intelligence and machine learning, enabling them to find connections and relationships in complex datasets, which is crucial in fraud detection.

Tax evasion has now become a central political issue for democracies in the Western world, but the tools we use to fight it are failing us. Native parallel graph databases provide the ability to turn wills into ways.

Gabriel Zucman (August 2014). “Taxing across Borders: Tracking Personal Wealth and Corporate Profits”. Journal of Economic Perspectives. 28 (4): 121–48. doi:10.1257/jep.28.4.121.