Technology

Posted By Jessica Weisman-Pitts

Posted on December 16, 2021


By Michael Sanky, Global Industry Leader, Healthcare and Life Sciences, Databricks

The pandemic has not only highlighted the importance of speed for medical discoveries, but also how data science and artificial intelligence (AI) can aid this acceleration. Machine learning in medicine, for example, has taken significant strides in recent years, with drug molecules discovered through AI now being used in human trials. Despite this, a recent report from the Alan Turing Institute revealed that difficulties with data collection, use, storage, processing and integration across different systems – in other words, the lack of a robust data architecture – hindered efforts to build helpful AI tools in response to the pandemic.

To tap into the full potential of AI, organisations, especially in healthcare and pharmaceuticals, need to get their data in order. The question is, how?

The growing importance of data 

While great efforts have gone into drug and medical discovery, particularly in light of recent events, it remains a lengthy, complex and costly process with low success rates – only a couple of years ago, the overall failure rate of drug development was reported to sit at 96%. This is where data is stepping in, beginning to update methods and transform the potential of drug development to bring that percentage down.

Without human data, particularly genomic data, we cannot comprehensively capture all elements of a disorder or disease to gain a wider and deeper picture. This calls for sequencing on a very large scale to discover and validate key genetic variants. The more information and insight gathered, the better-informed the steps organisations can take to counteract a major cause of drug development failure – a lack of efficacy. Training machine learning (ML) algorithms on this data is also enabling drug development pipelines to be automated – not only offering greater understanding but also accelerating drug discovery.
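As a purely illustrative sketch of this idea, the snippet below trains a simple classifier to prioritise genetic variants from hypothetical annotation features. The features, labels and thresholds are invented for demonstration and do not reflect any particular platform or dataset.

```python
# Minimal, illustrative sketch: a supervised model over hypothetical per-variant
# annotation features (e.g. allele frequency, conservation, predicted impact)
# used to prioritise variants for follow-up. All data here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_variants = 1_000

# Hypothetical annotation features for each sequenced variant
X = rng.random((n_variants, 3))
# Toy "associated with disease" label, loosely driven by the third feature
y = (X[:, 2] + 0.1 * rng.standard_normal(n_variants) > 0.5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```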

Quantitative structure-activity relationship (QSAR) models, as another example, can improve predictive accuracy on novel chemical structures while lowering costs and timelines by reducing the number of compounds that need to be synthesised. Predictive analytics can also be used in drug development and manufacturing by transferring knowledge and incorporating learnings from rich historical data. This data can then be used to predict new compounds and accelerate the experiment lifecycle.
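To make the QSAR idea concrete, here is a minimal, hedged sketch of an activity-prediction model over synthetic molecular descriptors. In practice the descriptors would be computed from real chemical structures with a cheminformatics toolkit; the values and coefficients below are placeholders.

```python
# Illustrative QSAR-style sketch: regress a synthetic activity value on simple
# molecular descriptors (molecular weight, logP, polar surface area). The
# descriptor values and the "true" relationship are randomly generated.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_compounds = 500

# Hypothetical descriptors scaled to plausible ranges
descriptors = rng.random((n_compounds, 3)) * [500, 5, 150]
activity = 0.01 * descriptors[:, 0] - 0.8 * descriptors[:, 1] + rng.standard_normal(n_compounds)

model = Ridge(alpha=1.0)
scores = cross_val_score(model, descriptors, activity, cv=5, scoring="r2")
print("Cross-validated R^2:", scores.mean())
```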

AI is already playing, and will continue to play, a significant role in drug development, discovery and the clinical trials process. There are clear opportunities to accelerate clinical research with a modern approach to data and analytics.

The data challenge

Despite these steps forward, all this data brings its own challenges. With so much biological and medical data now available, extracting the necessary insights – and quickly – is harder than ever. There is no point having all this data if it cannot be properly utilised. Moreover, genomic data in particular requires a huge amount of storage and specialised software to analyse, and it raises many data management, data sharing, privacy and security issues – it is important to remember that this is highly sensitive and private information.

The problem for many organisations is that all this data is often highly decentralised. And while they are dealing with new, fresh data, they are working with legacy architecture that is difficult to scale to support analytics across so many data points and such large volumes of diverse data. Simply finding the right data to use for analytics can take weeks.

Biotech company Regeneron was facing precisely these problems, grappling with poor processing performance and scalability issues. As a result, its data teams did not have what they needed to analyse the petabytes of genomic and clinical data available, failing to make the best use of what was at their fingertips. While organisations can now collect more data than ever before, they are struggling to process these massive data sets.
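As a hedged sketch of what distributed processing of such data can look like, the snippet below uses PySpark to run a simple aggregation over a large variant dataset. The file path and column names are placeholders; the point is that the same few lines scale from a laptop-sized sample to a cluster working through petabytes.

```python
# Hypothetical sketch of distributed processing of a large variant dataset with
# PySpark. The table path and column names are placeholders; on a real cluster
# the same code runs in parallel across many nodes.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("variant-summary").getOrCreate()

variants = spark.read.parquet("/data/genomics/variants")  # placeholder path

# Count rare variants per gene as a simple, scalable aggregation
rare_per_gene = (
    variants
    .filter(F.col("allele_frequency") < 0.01)
    .groupBy("gene")
    .count()
    .orderBy(F.col("count").desc())
)

rare_per_gene.show(10)
```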

The role of data architecture 

This is where data lakehouses have a huge part to play. It is vital that health organisations simplify their infrastructure and operations to increase productivity and the probability of success. Data can only be used to its full potential if it is all centralised in one unified, easy-to-access data analytics platform, such as a lakehouse. The simplified lakehouse infrastructure allows for greater scalability and automation, and for machine learning to be done at scale to accelerate drug pipelines. A unified platform also enables the creation of interactive workspaces for greater transparency and collaboration through all stages of the drug lifecycle. Data and insights can be easily shared between teams, whilst ensuring reliability and upholding security to protect sensitive data. As a result, drug target identification is sped up for faster discovery of drugs and therapies, and teams can work in more disease areas simultaneously.
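As an illustrative sketch only, the snippet below shows the lakehouse pattern of landing curated data in an open, transactional Delta table that any downstream team can then query consistently. The paths are hypothetical, and it assumes a Spark environment with Delta Lake available, as on Databricks.

```python
# Hedged sketch: land curated data in a Delta table so multiple teams query the
# same governed copy. Paths are placeholders; assumes Spark with Delta Lake.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("lakehouse-demo").getOrCreate()

clinical = spark.read.parquet("/raw/clinical_measurements")  # placeholder source

# Write once into an open, transactional Delta table...
clinical.write.format("delta").mode("overwrite").save("/lakehouse/clinical_measurements")

# ...and any downstream team reads the same consistent version
curated = spark.read.format("delta").load("/lakehouse/clinical_measurements")
print(curated.count())
```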

Having to deal with legacy architecture and complicated infrastructure, on the other hand, is a significant drain on time, particularly when it comes to setting up and maintaining the infrastructure needed to support analytics. This draws teams away from carrying out the vital analysis itself. Through increased automation – for example, cluster management that automatically switches operations over in the case of a system failure – teams can spend less time on DevOps and instead concentrate on higher-value tasks, namely drug development and discovering new treatments, safe in the knowledge that there will be no disruptions. When Regeneron moved to a new platform offering a more robust data architecture, finding the right data to use for analytics went from taking three weeks to two days, helping support a much broader range of studies. Data architecture is the key to making data usable and to answering the questions that improve drug discovery.

In addition to enabling clinical predictability and access to data lineage, the lakehouse platform allows researchers to take advantage of reproducible, ML-based systems for generating and validating hypotheses, which then allows them to make more targeted decisions about their time and research.
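One common way to make such ML-driven hypothesis generation reproducible is experiment tracking. The sketch below uses MLflow (an open source project originally created at Databricks) to log the parameters, metrics and model of a toy run so the experiment can be re-run and compared later; the data and run name are invented for illustration.

```python
# Illustrative sketch of a reproducible hypothesis-testing run tracked with
# MLflow: parameters, metrics and the fitted model are logged to the tracking
# store. The data, run name and model choice here are synthetic placeholders.
import mlflow
import mlflow.sklearn
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
X = rng.random((200, 4))
y = (X[:, 0] > 0.5).astype(int)

with mlflow.start_run(run_name="target-association-check"):
    C = 0.5
    model = LogisticRegression(C=C, max_iter=1000).fit(X, y)
    mlflow.log_param("C", C)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, "model")
```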

Truly harnessing the potential of data 

The vital role of data in healthcare, particularly for drug and medical discovery, may be widely recognised, but organisations must now go further to harness the full potential of that data. Without a robust data architecture, the high failure rates in areas like drug discovery will not come down any time soon; with a centralised, scalable platform that simplifies operations, organisations can gain the insights they need and accelerate drug discovery. Data is only the first step – having the necessary data architecture in place is the next.
