The use of Electronic Health Records (EHRs) is increasing in primary care practices, partially driven in the United States by the Health Information Technology for Economic and Clinical Health Act. In 2011, 55% of all physicians and 68% of family physicians were using an EHR system. In 2013, 78% of office-based physicians reported adopting an EHR system. EHRs can, however, be a source of frustration for physicians. A 2012 survey of family physicians revealed that only 38% were highly satisfied with their EHR. Barriers to EHR adoption and satisfaction include problems with usability and readability, loss of efficiency and productivity, and divergent stakeholder information needs, all of which have to be served through a single, small form factor.

Cost of EHR Systems

The American Recovery and Reinvestment Act incentivizes expanding the meaningful use of electronic health record systems, but this comes at a cost. A recent study reported the cost of implementing an electronic health record system in twenty-six primary care practices in a physician network in north Texas, taking into account hardware and software costs as well as the time and effort invested in implementation. For an average five-physician practice, the implementation cost is estimated at around USD 162,000, with USD 85,500 in maintenance expenses during the first year. The study also estimates that the HealthTexas network implementation team and the practice implementation team needed an average of 611 hours to prepare for and implement the electronic health record system, and that the end users (physicians, other clinical staff, and nonclinical staff) needed 134 hours per physician, on average, to prepare to use the record system in clinical encounters.
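
To put these figures side by side, here is a rough back-of-the-envelope sketch for the average five-physician practice. It only does arithmetic on the numbers quoted above. The variable names are my own, and adding implementation and first-year maintenance into a single first-year outlay is my assumption, not a figure the study reports.

```python
# Figures quoted above for an average five-physician practice.
PHYSICIANS = 5
IMPLEMENTATION_COST_USD = 162_000
FIRST_YEAR_MAINTENANCE_USD = 85_500
TEAM_HOURS = 611                      # network and practice implementation teams
END_USER_HOURS_PER_PHYSICIAN = 134    # physicians, clinical and nonclinical staff

# Assumption: treat implementation plus first-year maintenance as one outlay.
first_year_outlay = IMPLEMENTATION_COST_USD + FIRST_YEAR_MAINTENANCE_USD
outlay_per_physician = first_year_outlay / PHYSICIANS

# End-user preparation time scales with the number of physicians.
end_user_hours = END_USER_HOURS_PER_PHYSICIAN * PHYSICIANS
total_hours = TEAM_HOURS + end_user_hours

print(f"First-year outlay: USD {first_year_outlay:,}")
print(f"Per physician:     USD {outlay_per_physician:,.0f}")
print(f"Total hours:       {total_hours}")
```

Under these assumptions the practice spends roughly USD 247,500 in the first year and well over a thousand staff hours before the system is in routine use.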

The Opportunity

This clearly has opened up an opportunity to innovate. Despite slower-than-expected growth, the global market for EHR is estimated to have reached USD 22.3 billion by the end of 2015, with the North American market projected to account for USD 10.1 billion or 47%, according to research released by Accenture (NYSE:ACN).

The worldwide EHR market is projected to grow at 5.5% annually through 2015, which, according to Accenture’s previous research, would represent a slowdown from roughly 9% growth during 2010. Despite the slower pace of growth globally, the combined EHR market in North and South America is expected to have reached USD 11.1 billion by the end of 2015, compared to an estimated USD 4 billion in the Asia Pacific region and USD 7.1 billion in Europe, the Middle East and Africa.

The Challenge

EHRs have the potential to improve outcomes and quality of care, yield cost savings, and increase engagement of patients with their own healthcare. When successfully integrated into clinical practice, EHRs automate and streamline clinician workflows and narrow the gap between information and action that can result in delayed or inadequate care. Although there is evolving evidence that EHRs can modestly improve clinical outcomes, one fundamental problem is that EHR systems were principally designed to support the transactional needs of administrators and billers, and less so to nurture the relationship between patients and their providers. Nowhere is this more apparent than in the ability of EHRs to handle unstructured, free-text data of the sort found in the history of present illness (HPI). Current EHR systems are not designed to capture the nature of the HPI, an open-ended interview eliciting patient input whose information is summarized as free text within the patient record. There are huge untapped areas for innovation in exploiting the HPI to execute care plans and to document a foundational reference for subsequent encounters. In addition, the HPI could be converted directly by an AI-driven automated system, replacing the current manual model that relies on clinical coding specialists, into more structured data linked to payment and reimbursement.

“Although the market is growing, the ability of healthcare leaders to achieve sustained outcomes and proven returns on their investments pose a significant challenge to the adoption of electronic health records,” said Kaveh Safavi, global managing director of Accenture Health. “However, as market needs continue to change, we’re beginning to see innovative solutions emerge that can better adapt and scale electronic health records to meet the needs of specific patient populations as well as the business needs of health systems.”

In summary, with the adoption of EHRs came the challenge of data: finding the right information quickly and efficiently. During a typical 5-day hospital stay, many doctors and nurses work on the same patient and create a huge amount of overlapping data, so that by the third day it becomes almost impossible to get a clear picture of what is happening with the patient. The traditional EHR model is not effective in this setting.

Ripe for Innovation

One solution to the problem is to use human-augmented machine learning to generate an insightful, patient-specific narrative that simplifies all of this data, especially in the case of an inpatient encounter. Such a system would use Natural Language Processing (NLP) to process the free-format text (“unstructured data”) stored within Patient Notes and aggregate it with the information located within the various Tables and Charts (the “structured data”). In a way, this is in line with the overall trend in work automation, the use of innovative technologies to facilitate the transition from paper-based to electronic records, applied specifically to healthcare providers. NLP is one technology that can fundamentally change the way we interact with patient records and help improve clinical outcomes.

Let’s look a little more closely at the data captured within an EHR system. Data is captured in one of four ways: entering data directly (including through templates), scanning documents, transcribing text reports created with dictation or speech recognition, and interfacing data from other information systems such as laboratory systems, radiology systems, blood pressure monitors, or electrocardiographs. This captured data, in turn, can be represented in either structured or unstructured form. Structured data is, by definition, created through constrained choices offered by data entry devices such as drop-down menus, check boxes, and pre-filled templates. The advantages of this format are obvious: structured data is easily searched, aggregated, analyzed, reported, and linked to other information resources. It suffers, however, from data compression and, more importantly, loss of context, which makes it unsuitable for individualizing the EHR and too fragmented for the kind of intelligent, holistic treatment that is possible with unstructured data.
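
The trade-off can be made concrete with a small sketch. Everything in it is hypothetical: the `SmokingStatus` choices, the `StructuredEntry` fields, and the sample note are invented for illustration and are not taken from any real EHR schema.

```python
from dataclasses import dataclass
from enum import Enum


class SmokingStatus(Enum):
    # A constrained choice, like a drop-down menu in the EHR.
    NEVER = "never"
    FORMER = "former"
    CURRENT = "current"


@dataclass
class StructuredEntry:
    smoking_status: SmokingStatus
    chest_pain: bool  # a check box


entries = [
    StructuredEntry(SmokingStatus.FORMER, chest_pain=True),
    StructuredEntry(SmokingStatus.NEVER, chest_pain=False),
    StructuredEntry(SmokingStatus.FORMER, chest_pain=False),
]

# Structured fields are trivial to search and aggregate...
former_smokers = sum(e.smoking_status is SmokingStatus.FORMER for e in entries)
print(former_smokers)

# ...but a free-text note carries context that the fields cannot hold.
free_text = "Quit smoking 3 months ago after father's MI; pain only on stairs."
```

The count is a one-liner, yet nothing in the structured entry can express when the patient quit, why, or under what circumstances the pain appears. That is the loss of context described above.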

Unstructured clinical data, on the other hand, exists in the form of free-text narratives. Provider and patient encounters are commonly recorded in free-form clinical notes. Free-text entries in the patient’s health record give the provider flexibility to note observations and concepts that are not supported or anticipated by the constrained choices associated with structured data. It is important to note that some data are inherently suitable for a structured format, while others are not. NLP can be a powerful tool in achieving this balance: some parts of the unstructured narrative can be transformed into structured data, while the rest stays as free text enriched with derived annotations and semantic analytics, so that EHR data models real-life situations more closely.
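
As a minimal sketch of that idea, the snippet below pulls two vital signs out of a free-text note with regular expressions and leaves the rest of the note untouched. The note, the `Vitals` class, and the patterns are all made up for illustration. Real clinical NLP systems rely on far more than pattern matching, so treat this as a toy rather than a recommended approach.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vitals:
    systolic: Optional[int] = None
    diastolic: Optional[int] = None
    heart_rate: Optional[int] = None


# Hypothetical patterns; real notes vary far more than these cover.
BP_PATTERN = re.compile(r"\bBP\s*:?\s*(\d{2,3})\s*/\s*(\d{2,3})\b", re.IGNORECASE)
HR_PATTERN = re.compile(r"\b(?:HR|heart rate)\s*:?\s*(\d{2,3})\b", re.IGNORECASE)


def extract_vitals(note: str) -> Vitals:
    """Pull the first blood pressure and heart rate readings out of a note."""
    vitals = Vitals()
    bp = BP_PATTERN.search(note)
    if bp:
        vitals.systolic, vitals.diastolic = int(bp.group(1)), int(bp.group(2))
    hr = HR_PATTERN.search(note)
    if hr:
        vitals.heart_rate = int(hr.group(1))
    return vitals


# Invented example note, written in the telegraphic style of clinical text.
note = "Pt c/o chest tightness x2 days. BP 142/90, HR 88. Denies SOB."
structured = extract_vitals(note)
print(structured)
```

The blood pressure and heart rate become searchable fields, while the complaint and its duration remain in the narrative, where a downstream annotator could still work on them.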

Not the Silver Bullet

NLP is not a silver bullet, and clinical text poses significant challenges to it. This text is often ungrammatical, consists of bullet-point telegraphic phrases with limited context, and lacks complete sentences. Clinical notes make heavy use of acronyms and abbreviations, making them highly ambiguous. Word sense disambiguation also poses a challenge in extracting meaningful data from unstructured text. Clinical notes often contain terms or phrases that have more than one meaning. For example, “discharge” can signify either bodily excretion or release from a hospital; “cold” can refer to a disease, a temperature sensation, or an environmental condition. Similarly, the abbreviation “MD” can be interpreted as the credential for “Doctor of Medicine” or as an abbreviation for “mental disorder.” This underscores the need to understand and model the context more closely, and NLP practitioners are working toward practical solutions to these challenges.
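
To illustrate why context matters, here is a deliberately simple disambiguation sketch: it picks the sense whose cue words overlap most with the surrounding sentence. The cue-word lists are my own invention for this example. A production system would learn such associations from annotated clinical text instead of hard-coding them.

```python
import re

# Hypothetical cue words for each sense, hard-coded for illustration only.
SENSE_CUES = {
    "discharge": {
        "bodily excretion": {"wound", "purulent", "nasal", "drainage", "yellow"},
        "release from hospital": {"home", "planned", "instructions", "summary", "tomorrow"},
    },
    "cold": {
        "disease": {"cough", "congestion", "symptoms", "caught"},
        "temperature sensation": {"feels", "extremities", "touch", "hands"},
        "environmental condition": {"weather", "outside", "exposure"},
    },
}


def disambiguate(term: str, sentence: str) -> str:
    """Return the sense whose cue words overlap most with the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {
        sense: len(cues & words) for sense, cues in SENSE_CUES[term].items()
    }
    best_sense = max(scores, key=scores.get)
    # With no supporting context at all, admit the ambiguity instead of guessing.
    return best_sense if scores[best_sense] > 0 else "ambiguous"


print(disambiguate("discharge", "Purulent discharge noted from the wound site."))
print(disambiguate("discharge", "Discharge planned for tomorrow with home care."))
print(disambiguate("discharge", "Discharge noted."))
```

The last call shows the core difficulty: a telegraphic phrase such as “Discharge noted.” gives the algorithm nothing to work with, which is exactly the limited-context problem described above.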

One such solution is the standardization of medical language through resources such as the Unified Medical Language System (UMLS). The UMLS is a set of files and software that brings together many health and biomedical vocabularies and standards to enable interoperability between computer systems. We can use the UMLS to enhance or develop applications such as electronic health records, classification tools, dictionaries and language translators. Specifically, the UMLS Metathesaurus, a repository of over 100 biomedical vocabularies including CPT®, ICD-10-CM, LOINC®, MeSH®, RxNorm, and SNOMED CT®, is an excellent tool for standardizing this variation. Within the Metathesaurus, terms across vocabularies are grouped together by meaning to form concepts, allowing us to capture and account for the huge variations in language and expression.
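
The grouping of terms into concepts can be pictured with a tiny lookup table. This is only a sketch of the idea: the concept identifiers and synonym sets below are placeholders I made up, not real Metathesaurus content, and the code does not use any UMLS file format or API.

```python
from typing import Optional

# Hypothetical concept identifiers and synonym sets, for illustration only;
# they are NOT real UMLS concept identifiers or Metathesaurus content.
CONCEPT_SYNONYMS = {
    "CONCEPT_A": {"myocardial infarction", "heart attack", "mi"},
    "CONCEPT_B": {"hypertension", "high blood pressure", "htn"},
}

# Invert the table so that each surface term points to its concept.
TERM_TO_CONCEPT = {
    term: concept_id
    for concept_id, terms in CONCEPT_SYNONYMS.items()
    for term in terms
}


def normalize(term: str) -> Optional[str]:
    """Map a surface term to its concept ID, or None if it is unknown."""
    return TERM_TO_CONCEPT.get(term.strip().lower())


# Different wordings from different notes resolve to the same concept.
print(normalize("Heart attack"))
print(normalize("MI"))
print(normalize("HTN"))
print(normalize("chest pain"))
```

Once every note is normalized this way, a query for one concept finds all of its surface forms, regardless of which vocabulary or shorthand the clinician happened to use.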

This obviously helps, but even such exhaustive approaches have their limitations. Given the nature of language itself, each individual concept is often assigned multiple semantic type categories from the UMLS Semantic Network, making the meaning context-sensitive. For example, within the UMLS, 33.1% of abbreviations have multiple meanings. Abbreviation ambiguity is even more common in clinical notes, with a rate of 54.3%. This makes subjectivity a big factor in understanding clinical notes and makes it that much more difficult to derive actionable intelligence.

The Growing Market

Irrespective of these challenges, the NLP market is growing steadily and is forecast to keep growing for some time, as shown in the Figure below.

According to a recent report, the NLP market for the healthcare and life sciences industry will be worth USD 2.67 billion by 2020. The report, titled “Natural Language Processing Market for Health Care and Life Sciences Industry by Type (Rule-Based, Statistical, & Hybrid NLP Solutions), Region (North America, Europe, Asia-Pacific, Middle East and Africa, Latin America) – Global Forecast to 2020,” defines the NLP market, divides it into segments, and provides in-depth analysis and revenue forecasts.

The global NLP market for the healthcare and life sciences industry is expected to grow from USD 1.10 billion in 2015 to USD 2.67 billion by 2020, at a compound annual growth rate (CAGR) of 19.2%. North America is currently expected to be the largest market in terms of spending and adoption of NLP solutions for the healthcare and life sciences industry.
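
For readers who want to check how these numbers fit together, the sketch below applies the standard compound annual growth rate formula to the rounded figures quoted above. The function names are my own. From the rounded endpoints the implied rate comes out at roughly 19.4%, and compounding USD 1.10 billion at 19.2% for five years gives about USD 2.65 billion. Both are close to the reported values, and the small gap may simply reflect rounding in the published figures.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding `rate` annually for `years` years."""
    return start_value * (1 + rate) ** years


# Values in USD billions, 2015 to 2020, as quoted from the report.
implied_rate = cagr(1.10, 2.67, 5)
projected_2020 = project(1.10, 0.192, 5)

print(f"Implied CAGR from rounded endpoints: {implied_rate:.1%}")
print(f"2020 value at 19.2% from USD 1.10B:  {projected_2020:.2f}B")
```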

What Next?

The EHR is here to stay. Now is the time to innovate by introducing better ways to capture clinical data, better ways to interact with the data and better ways to use the data to improve clinical outcomes. NLP and machine learning are obvious candidates to make this happen. That is why investments are piling up in this area. Jorge Conde is Andreessen Horowitz’s newest general partner and leads the firm’s investments at the cross section of biology, computer science, and healthcare. He was recently asked: “… you were an undergrad biology at Johns Hopkins, but you have an MBA from Harvard and also worked as an investment banker at Morgan Stanley! How does that all add up?” Jorge’s answer was simple and to the point: “I went to finance to see if I could understand … what drives an industry, how does the operation actually work? But then I realized … that I wanted to build and do. And so … I did additional graduate work in the sciences at the medical school at Harvard and at MIT.” The article I am quoting this from is aptly titled The Century of Biology. Computational biology and, by extension, healthcare are going to be the most exciting fields of the 21st century, and we will need to build the tools to support them. Well, we had better get to work!