North East London NHS Foundation Trust Unlocks the Value in Unstructured Data with Intel and Santana Big Data Analytics

The elderly account for two million unplanned admissions (68% of total admissions) per annum in the UK, and the number is growing. In some areas of the country each 65+ year old spends 4 days per annum as an unplanned admission in a hospital bed. Care of the elderly, in this regard, costs the NHS £8.3bn per annum. This is a small amount in comparison with the social care costs and the wider personal and economic costs, such as the loss of economic productivity due to carer commitments.

Most of the cost arises because issues only emerge when patients present in an acute care setting. The issues associated with overcrowded geriatric wards, lack of capacity in social care beds, problems reintroducing patients back into their own home settings, higher than optimal length of stay and exacerbation of co-morbidities are all well known. Patients often present with falls and subsequent breaks, and this is what the structured record in the Electronic Medical Record (EMR) sees.

Managing Avoidable Admissions

Almost always, the fall or break is not the underlying cause of the presentation, and the admission is therefore avoidable. The information that allows clinicians, service designers and payers to address this issue does not lie fully in the structured EMR data; it lies in Case Notes. Santana Big Data Analytics is a company geared to unlocking the value of these notes, and our project with North East London NHS Foundation Trust (NELFT) is an example of our transformative technology in action.

NELFT provides integrated community and mental health services for a diverse population of almost 1.5 million people living in the London Boroughs of Waltham Forest, Redbridge, Barking & Dagenham and Havering. The Trust also manages community health services in south-west Essex. NELFT has an annual budget of more than £325 million (2013/2014), employs around 5,500 staff and is a recognised research leader and innovator, partnering with diverse academic and private-sector leaders to explore new approaches to improving the quality of its services.

High Quality, Succinct Case Notes are Key to Success

As a Community Provider, part of NELFT’s role is to provide services that prevent admissions and allow people to live longer, healthier lives by consuming services away from hospitals. These include services designed to be preventative, rehabilitative and quality-of-life preserving. Clinicians in NELFT recognize that high quality, succinct case notes are key to the way they operate their services.

Savings are made and benefits gained by changing clinical and operational practice. Although large volumes of data must be processed using Big Data tools and techniques, this is really a small data problem: what information can be delivered to clinicians in a consumable format to inform care?

Trigger Alerts, Identify Unmet Needs and Prioritize Care

Clinical requirements for information are often expressed as a need for integrated care records that go beyond the coded EMR / commissioned care pathway data set and present data in a more timely way to integrated, yet virtual, teams. Clinicians need to focus quickly on what is important, without having the time to “train importance” into their data.

In NELFT there are a number of EMR systems, which contain data that needs to be seen in the context of Social Care data from the local Council and other care providers, including the independent sector and primary care. NELFT have an award-winning business intelligence platform which is well used by many staff; however, it relies on structured and often coded (thus latent) EMR data. Clinicians like the idea of a single source of clinical and operational truth in a web-based, mobile / private cloud environment, but need it to do more. They need it to produce alerts, identify unmet need, prioritize care and, most importantly, give a complete overview of all the organisation knows about the patient.

Extracting Structured Information from Unstructured Text

Enter Santana BDA. Santana Big Data Analytics brings together proven expertise in the fields of business and clinical intelligence, data analytics, big data and natural language processing (NLP). They worked in conjunction with the NELFT Performance Team to address the above issues using NLP, a technique for automatically extracting structured information from unstructured text. It provides a way of generating large amounts of coded clinical data without additional data entry requirements. The potential uses of this data are enormous: case summaries, monitoring performance, critiquing clinical decisions, screening risk, etc. NLP also provides a way of readily combining information from different electronic records systems.
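To make the idea of turning free text into coded data concrete, here is a minimal Python sketch. It is not Santana's engine: the concept names, the regular expressions, the negation cue list and the sample note are all invented for illustration, and a production clinical NLP system would be far more sophisticated.

```python
import re
from dataclasses import dataclass

# Hypothetical concept patterns, invented for this illustration only.
CONCEPT_PATTERNS = {
    "fall": re.compile(r"\b(fell|fall|falls)\b", re.IGNORECASE),
    "lives_alone": re.compile(r"\blives alone\b", re.IGNORECASE),
    "confusion": re.compile(r"\b(confused|confusion)\b", re.IGNORECASE),
}

# Assumed negation cues; real systems use much richer context rules.
NEGATION_CUES = {"no", "not", "denies", "without"}


@dataclass
class Finding:
    concept: str   # coded concept name
    text: str      # the words in the note that triggered it
    negated: bool  # True if a negation cue closely precedes the match


def is_negated(note, start):
    """Check the three words before the match, within the same sentence."""
    sentence_prefix = re.split(r"[.;!?]", note[:start])[-1]
    preceding = [w.strip(",:").lower() for w in sentence_prefix.split()[-3:]]
    return any(word in NEGATION_CUES for word in preceding)


def extract_findings(note):
    """Return one Finding per concept, using its first mention in the note."""
    findings = []
    for concept, pattern in CONCEPT_PATTERNS.items():
        match = pattern.search(note)
        if match:
            findings.append(
                Finding(concept, match.group(0), is_negated(note, match.start()))
            )
    return findings


if __name__ == "__main__":
    sample = "Patient fell at home last night. Lives alone. No confusion noted."
    for finding in extract_findings(sample):
        print(finding)
```

Each `Finding` is a structured record that could sit alongside coded EMR data, which is the sense in which no additional data entry is needed. The negation check shows why simple keyword search is not enough: "No confusion noted" must not be coded as confusion.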

Intel Implementation of Cloudera Technology Stack

Intel have been instrumental in this project. Having worked previously with the team that founded Santana BDA in Leeds, Intel were well positioned to provide a wide range of inputs. These ranged from user experience design, big data processing and technology optimization support, alongside clinical input, to assisting the Santana BDA team with hardware provision. Having understood the collective needs of the team of NELFT and Santana BDA staff, Intel worked to implement the Cloudera technology stack which is driving the project.

Ultimately, this resulted in the design of an NLP appliance in which the technology is optimized for the fast processing of Case Notes and the derivation of clinical meaning through the design and implementation of sets of data classifiers. The classifiers are created using machine learning and recognized, high-quality clinical research.
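The source does not say which machine learning methods the classifiers use. As one possible illustration of a text classifier learned from labelled notes, the sketch below implements a small multinomial naive Bayes model in plain Python. The class name, the labels and the four training notes are invented for the example and carry no clinical authority.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesNoteClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, examples):
        """Count label and per-label word frequencies from (text, label) pairs."""
        self.label_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, label in examples:
            self.label_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)
        return self

    def predict(self, text):
        """Return the label with the highest log-probability for the text."""
        total_notes = sum(self.label_counts.values())
        # Words never seen in training are ignored.
        words = [w for w in tokenize(text) if w in self.vocabulary]
        scores = {}
        for label, count in self.label_counts.items():
            denominator = sum(self.word_counts[label].values()) + len(self.vocabulary)
            score = math.log(count / total_notes)
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denominator)
            scores[label] = score
        return max(scores, key=scores.get)


# Invented training notes and labels, for illustration only.
TRAINING_NOTES = [
    ("unsteady on feet, fell twice this month", "falls_risk"),
    ("dizzy when standing, near fall in kitchen", "falls_risk"),
    ("mobile and independent, walking daily", "low_risk"),
    ("steady gait, no concerns with mobility", "low_risk"),
]

if __name__ == "__main__":
    classifier = NaiveBayesNoteClassifier().fit(TRAINING_NOTES)
    print(classifier.predict("fell in the kitchen, feels dizzy"))
```

In practice a classifier like this would be trained on far more notes and validated against clinical research before its output informed care.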

The Santana Big Data Analytics engine is architected to run in a secure cloud or on a server cluster, either on premises or externally hosted. The initial implementation of the Santana NLP engine used SQL Server technology to process the data. This worked well at NELFT for processing batches of 100,000 patient records. To create a solution that can process larger volumes of historical data, the Santana team are working with Cloudera to utilise the power of Apache Hadoop.

They have implemented the NLP engine as a scalable appliance running on Cloudera Distribution for Apache Hadoop (CDH) Enterprise. Both implementations run on scalable infrastructure powered by Intel® Xeon® processors.

A Flexible, Affordable and Scalable Platform for Analyzing Unstructured Data

Apache Hadoop is an open-source software framework that allows massive data sets to be distributed and processed across clusters of computers. Hadoop offers a flexible, affordable, and scalable platform for analyzing unstructured data, as well as for other analytics scenarios where the velocity, volume, and variety of data make them impractical for traditional databases. Cloudera CDH provides enterprise capabilities for Apache Hadoop processing along with system management capabilities that make it well-suited to deployment in healthcare and other enterprise environments.
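The processing model that Hadoop distributes across a cluster can be sketched in a few lines of single-process Python. The example below does not use any Hadoop or Cloudera API; it only mimics the map, shuffle and reduce phases on small in-memory partitions. The term list, function names and sample notes are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical terms of interest, invented for this illustration.
TERMS = ("fall", "confusion", "dizziness")


def map_note(note):
    """Map step: emit a (term, 1) pair for each term mentioned in one note."""
    text = note.lower()
    return [(term, 1) for term in TERMS if term in text]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_mentions(partitions):
    """Run map over every note in every partition, then shuffle and reduce."""
    mapped = chain.from_iterable(
        map_note(note) for partition in partitions for note in partition
    )
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    # Each inner list stands in for the notes held on one cluster node.
    partitions = [
        ["Fall at home", "Reports dizziness"],
        ["Second fall this month, some confusion"],
    ]
    print(count_mentions(partitions))
```

Because each note is mapped independently, the map step can run on whichever machine holds the data, and only the small (term, count) pairs need to move between machines. That independence is what lets the approach scale to large volumes of historical notes.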

Reducing the Complexities of Existing Infrastructure

The close collaboration among NELFT, Santana, and Intel, coupled with Santana’s methods and tools, meant that the appliance could be installed quickly in the NELFT infrastructure. Within a few weeks the team had gathered the data, and the Santana engine running on the Cloudera and Intel appliance churned through, in seconds, records that would previously have taken months of labour to read and analyse manually.

If architected well, modern Big Data technologies can reduce many of the complexities of existing infrastructures. Combined with NLP, they can give a clinical or operational organisation forward-looking information and can augment the Business Intelligence (BI) systems in use today, which rely on rearward-looking dashboards.