Health Data Analytics and Electronic Health Records for Better Healthcare

Jonathan Silverstein, MD, of the University of Pittsburgh School of Medicine Department of Biomedical Informatics, during the Health Record Research Request office hours at sciVelo.

University of Pittsburgh’s Department of Biomedical Informatics Partners Across Campus and the Region

Health information systems are collecting large amounts of patient data in the form of electronic health records (EHRs). By applying modern analytical methods to EHRs, university research investigators are increasingly making predictions that are critical to many healthcare decisions. These predictions can detect the presence of disease, assess the risk of developing disease, forecast the progression or regression of disease and project the response to therapy. As better predictive models are likely to generate better decisions, even small improvements in predictions can revolutionize patient care leading to improved health outcomes at lower costs. While other industries have applied artificial intelligence (AI) techniques such as machine learning to large amounts of data to improve their services, the healthcare industry has until now been lagging behind. A key bottleneck has been the ability to extract and structure EHRs in a consistent format that allows the large-scale application of machine learning and the effective sharing of the data in a secure fashion.

EHRs span a range of patient data including patient demographics, medication history, laboratory test results and clinical notes. The University of Pittsburgh Medical Center (UPMC), one of the largest academic health centers in the country, is one of the earliest and most sophisticated adopters of EHR systems. Since 2012, UPMC has spent more than $1.5 billion on hospital information systems to advance clinical excellence and administrative efficiency.1 In 2015, UPMC’s Children's Hospital of Pittsburgh was given the Davies Award by the Healthcare Information and Management Systems Society for early adoption of EMR systems.2 In 2017, UPMC was named one of the nation’s “most wired” health systems by the American Hospital Association’s Health Forum.1 Today, UPMC deploys advanced EHR systems across >40 hospitals and >500 outpatient facilities, and these systems contain EHRs for >4.5 million patients.

There is a long history of using EHRs for research by investigators at the University of Pittsburgh (Pitt) and UPMC. One of the earliest warehouses of clinical data is the Medical ARchival System (MARS), which began collecting clinical information on patients in 1986. However, rapid and large-scale analyses require EHRs to be transformed into consistent and standardized formats. For this reason, in 2017, Dr. Shyam Visweswaran at the Department of Biomedical Informatics (DBMI) established a new research warehouse called Neptune for the University of Pittsburgh’s Clinical and Translational Science Institute (CTSI). Neptune integrates data from several sources into a consolidated longitudinal health record for each UPMC patient. Furthermore, Neptune’s data on patient demographics, diagnoses, procedures, medications and laboratory test results are harmonized to standard terminologies: ICD-9 and ICD-10 for diagnoses; ICD-9, ICD-10, CPT-4, and HCPCS for procedures; RxNorm for medications; and LOINC for laboratory test results. Currently, Neptune contains 15 years of data, from January 2004 to December 2018, from >4.5 million patients, including 241 million diagnoses, 104 million procedures, 1.14 billion laboratory test results and 72 million medication orders. Data in Neptune is updated monthly from UPMC EHR systems.
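As a rough illustration of what harmonizing to standard terminologies and consolidating a longitudinal record can involve, the Python sketch below maps source-specific codes to a standard vocabulary and groups each patient's events in date order. The source system names, local codes, mapping table and field names are all hypothetical, chosen only for illustration; this is not Neptune's actual schema or pipeline.

```python
from collections import defaultdict
from datetime import date

# Hypothetical lookup: (source system, local code) -> (standard vocabulary, standard code).
LOCAL_TO_STANDARD = {
    ("lab_system_a", "HBA1C"): ("LOINC", "4548-4"),    # hemoglobin A1c
    ("lab_system_b", "A1C_PCT"): ("LOINC", "4548-4"),  # same test, different local name
    ("billing", "DM2_NOS"): ("ICD-10", "E11.9"),       # type 2 diabetes without complications
}


def harmonize(event):
    """Return the event re-coded to a standard vocabulary, or None if no mapping exists."""
    key = (event["source"], event["local_code"])
    if key not in LOCAL_TO_STANDARD:
        return None
    vocabulary, code = LOCAL_TO_STANDARD[key]
    return {
        "patient_id": event["patient_id"],
        "date": event["date"],
        "vocabulary": vocabulary,
        "code": code,
        "value": event.get("value"),
    }


def build_longitudinal_records(events):
    """Group harmonized events by patient, sorted by date; collect unmapped events separately."""
    records = defaultdict(list)
    unmapped = []
    for event in events:
        harmonized = harmonize(event)
        if harmonized is None:
            unmapped.append(event)
        else:
            records[harmonized["patient_id"]].append(harmonized)
    for timeline in records.values():
        timeline.sort(key=lambda e: e["date"])
    return dict(records), unmapped


# Made-up events arriving from three hypothetical source systems.
raw_events = [
    {"patient_id": "p1", "source": "lab_system_b", "local_code": "A1C_PCT",
     "date": date(2018, 3, 1), "value": 7.1},
    {"patient_id": "p1", "source": "billing", "local_code": "DM2_NOS",
     "date": date(2017, 6, 15)},
    {"patient_id": "p1", "source": "lab_system_a", "local_code": "HBA1C",
     "date": date(2016, 1, 10), "value": 8.2},
    {"patient_id": "p2", "source": "lab_system_a", "local_code": "XYZ",
     "date": date(2018, 1, 1)},
]

records, unmapped = build_longitudinal_records(raw_events)
for event in records["p1"]:
    print(event["date"], event["vocabulary"], event["code"], event["value"])
```

In this sketch, the two differently named laboratory results end up under one LOINC code, which is what makes a single query work across source systems. Unmapped events are kept aside rather than dropped so that they can be reviewed.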

Creating resources for translational research: Centers at Pitt that support use of EHRs

The Center for Clinical Research Informatics

The Center for Clinical Research Informatics (CCRI) is a core informatics center that is housed in the Department of Biomedical Informatics (DBMI). The mission of CCRI is to do innovative research that is enabled by clinical, mobile health, molecular and research data. By applying novel AI techniques to large amounts of EHRs, the CCRI team, led by Dr. Shyam Visweswaran, facilitates reuse of clinical and research data to empower translational research and clinical informatics. For instance, the Learning Electronic Medical Record (LEMR) project uses machine learning to enhance EHR systems to intelligently display the right data at the right time by adapting both to the physician user and the patient’s condition. As another example, personalized predictive modeling derives predictive models that are tailored to an individual patient as opposed to population-based approaches.
“A personalized model is developed using factors that are specific to the current patient and such a model can be simple and yet more accurate in predicting clinical outcomes for that patient.”

Shyam Visweswaran, MD, PhD, Associate Professor of Biomedical Informatics, the Director of the Center for Clinical Research Informatics (CCRI) and the Director of the Biomedical Informatics Core of the University of Pittsburgh Clinical and Translational Science Institute (CTSI).

Another important research program is the discovery of causal relationships from large amounts of data including EHRs. The Center for Causal Discovery (CCD), led by Dr. Gregory F. Cooper at DBMI, develops advanced causal discovery algorithms. The CCD is a partnership among data science experts from Pitt, Carnegie Mellon University (CMU), and the Pittsburgh Supercomputing Center (PSC) with exceptional collaborators from Yale University, California Institute of Technology, Rutgers University, Stanford University, the University of Crete, and the University of North Carolina.

The Research Informatics Office

The Research Informatics Office (RIO) was established in 2017 by Dr. Jonathan C. Silverstein, who joined Pitt as its first Chief Research Informatics Officer. RIO’s principal mission is “to support investigators by innovative collection and use of biomedical data”, in other words, to do science-as-a-service. The RIO provisions EHR data from Neptune and UPMC EHR systems through its Health Record Research Request (R3) service, often within days, and at affordable costs to investigators through support from the Clinical and Translational Science Institute (CTSI), led by Dr. Steven Reis. The RIO extracts data from multiple EHR systems at UPMC, de-identifies the data to protect patient privacy, and provisions the data approved by the Pitt Institutional Review Board (IRB) for use in research. Data provisioned by the RIO includes structured EHR data, clinical reports, and clinical images. R3 is both a process and a policy that the RIO executes on behalf of UPMC to allow UPMC data to be used for research. R3 is also a service provided by the RIO staff, in which health informatics researchers help other researchers solve their clinical and translational research problems.
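To give a sense of what a de-identification step can look like, here is a minimal Python sketch that drops direct identifiers, replaces the medical record number with a keyed pseudonym, and shifts each patient's dates by a consistent offset. The field names, the secret key and the 30-day shift window are assumptions made for illustration. This is not the RIO's actual procedure, and real de-identification involves considerably more than this.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical names of fields that identify a patient directly.
DIRECT_IDENTIFIERS = {"name", "mrn", "address", "phone"}


def pseudonym(secret_key: bytes, mrn: str) -> str:
    """Derive a stable study ID from the record number using a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, mrn.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def date_shift_days(secret_key: bytes, mrn: str, max_shift: int = 30) -> int:
    """Derive a per-patient offset in [-max_shift, max_shift] days from the key."""
    digest = hmac.new(secret_key, b"shift:" + mrn.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") % (2 * max_shift + 1) - max_shift


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers, add a pseudonymous study ID, and shift all dates."""
    mrn = record["mrn"]
    shift = timedelta(days=date_shift_days(secret_key, mrn))
    out = {"study_id": pseudonym(secret_key, mrn)}
    for field_name, value in record.items():
        if field_name in DIRECT_IDENTIFIERS:
            continue
        # Every date for one patient moves by the same offset, so intervals are preserved.
        out[field_name] = value + shift if isinstance(value, date) else value
    return out


# Made-up record and key, for illustration only.
key = b"example-secret-key"
record = {
    "name": "Jane Doe",
    "mrn": "000123",
    "phone": "555-0100",
    "admit_date": date(2018, 5, 1),
    "discharge_date": date(2018, 5, 4),
    "diagnosis_code": "E11.9",
}
clean = deidentify(record, key)
print(clean)
```

Deriving the pseudonym and the date offset from a secret key means repeated extracts for the same patient line up with each other, while anyone without the key cannot recompute the mapping.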

As an example, Dr. Kevin Kraemer is the principal investigator of a large project funded by the Patient-Centered Outcomes Research Institute (PCORI), a nonprofit, nongovernmental organization located in Washington, DC. The study is designed to help determine unsafe opioid prescription practices and treatment of acute pain. Currently 48 clinics are involved in Pennsylvania and Utah, with 13 of them being located in Pittsburgh. Data is collected from each UPMC clinic on a weekly basis, and R3 helps to identify the patients eligible to be involved in the study. Dr. Kraemer and his team can then track data for enrolled patients related to initial visit summaries, referrals to pain clinics, prescriptions, how long the patients are treated with opioids and how they respond to the treatment. Dr. Kraemer’s team can also send monthly feedback reports to the clinics on how frequently they have prescribed opioids compared to their peers based on data extracted through R3. Dr. Kraemer said: “My team would not be able to do any of this work without R3.”

Multiple Pittsburgh institutions and units, including UPMC Enterprises (UPMCE), the Institute for Precision Medicine (IPM), and the Pittsburgh Supercomputing Center (PSC), have also been working collaboratively with the RIO to help accomplish its mission. For instance, R3 is authorized by UPMC’s Chief Medical Information Officer, Dr. Robert Bart, to access other types of UPMC data that normally would not exist in Neptune, such as genomics. Due to the efforts of the RIO team and the deep collaborations across UPMCE, the IPM and the CTSI, the efficiency of EHR data provisioning for research has significantly improved at Pitt and UPMC.
“As the data types and the data sources R3 is provisioning are expanding, the numbers of investigators and departments that RIO is serving are rapidly growing. R3 has received 800 requests from investigators within its first year of operation.”

Jonathan C. Silverstein, MD, MS, FACS, FACMI, Chief Research Informatics Officer of the Research Informatics Office (RIO) of the University of Pittsburgh’s Department of Biomedical Informatics.

Another remarkable resource coming to researchers in the spring of 2019 is the Cancer Registry Records for Research (CR3). CR3 will enable cancer investigators to search UPMC Network Cancer Registry data including detailed diagnoses, general treatment, and outcomes to identify cohorts of patients for studies. These data can be combined with other medical data from EHRs, creating innovative opportunities in cancer research within the University and the UPMC network. CR3 was created by the RIO in collaboration with the UPMC Network Cancer Registry and the University of Chicago under National Cancer Institute (NCI) funding. CR3 is supported by the IPM, which is led by Dr. Adrian Lee and has Drs. Arthur S. Levine, Steven Shapiro and Jeremy Berg on its senior advisory board. Under related funding from NCI’s Information Technology for Cancer Research (ITCR) program to DBMI investigators, the Text Information Extraction System (TIES) enables text data from multiple EHRs to be structured and searched by using natural language processing tools. Currently, TIES holds approximately 30 million de-identified pathology and radiology documents.

Collaborations for Better Healthcare: National EHR Data Networks

Several national efforts are ongoing to create EHR data networks to make patient data more readily available for clinical, translational, and informatics research. The CCRI and RIO, in collaboration with the CTSI Director Dr. Steven Reis, contribute to the following:

  • The PaTH network is a component of PCORnet, the National Patient-Centered Clinical Research Network, and is funded by PCORI. The goal of PCORnet is to make it faster, easier, and less costly to conduct clinical research across institutions by harnessing the power of large amounts of EHR and other health data. While PaTH enables investigation across diseases, the specific focus of PaTH is to share data on three targeted conditions (idiopathic pulmonary fibrosis, atrial fibrillation and obesity) across six mid-Atlantic health centers and nationally. Principal Investigator: Kathleen McTigue, MD, MPH.
  • The Accrual of patients to Clinical Trials (ACT) network is a nationwide network of sites that share EHR data to significantly increase participant accrual to the nation’s highest priority clinical trials. The ACT network is a federated network with common standards, data terminology and shared resources. The network provides the ability to query data in real-time on >80 million patients across 32 academic health centers. ACT is funded by the National Center for Advancing Translational Sciences (NCATS) of the NIH. Principal Investigator: Steven Reis, MD.
  • The All of Us Research Program of the Precision Medicine Initiative is a historic effort to gather data from 1 million people living in the U.S. and is funded by the NIH. The goal of the program is to revolutionize how disease is prevented and treated based on individual differences in lifestyle, environment and genetics. Pitt’s program, called the All of Us Pennsylvania Research Program, will enroll 120,000 participants. Principal Investigators: Steven Reis, MD, Shyam Visweswaran, MD, PhD and Oscar C. Marroquin, MD.
  • The Human BioMolecular Atlas Program (HuBMAP) is another program funded by the NIH, with its Infrastructure Component co-led by PSC and DBMI. HuBMAP aims to help scientists learn more about biological processes in the human body, such as aging or disease development, and to explore the relationship between cellular organization and function through the use of imaging and genomics at the single-cell level within human tissues. Principal Investigators: Jonathan C. Silverstein, MD and Nicholas A. Nystrom, PhD.
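
The federated design described for the ACT network in the list above can be pictured with a short Python sketch: each site keeps its patient data locally, runs the same query against it, and returns only an aggregate count to the coordinator. The class, function and field names are hypothetical, and this is not the ACT network's actual software. The shared diagnosis code stands in for the "common standards and data terminology" that make one query meaningful at every site.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Site:
    """A participating health center that holds its own patient records locally."""
    name: str
    patients: List[dict] = field(default_factory=list)

    def count(self, criteria: Callable[[dict], bool]) -> int:
        """Run the query locally and return only an aggregate count, never patient rows."""
        return sum(1 for patient in self.patients if criteria(patient))


def federated_count(sites: List[Site], criteria: Callable[[dict], bool]) -> Dict[str, object]:
    """Send the same criteria to every site and combine the per-site counts."""
    per_site = {site.name: site.count(criteria) for site in sites}
    return {"per_site": per_site, "total": sum(per_site.values())}


def eligible(patient: dict) -> bool:
    """Example criteria: age 65 or older with an atrial fibrillation diagnosis.

    I48.91 is the ICD-10-CM code for unspecified atrial fibrillation, used here
    only as an example of a shared terminology.
    """
    return patient["age"] >= 65 and "I48.91" in patient["diagnoses"]


# Made-up sites and patients, for illustration only.
sites = [
    Site("Site A", [
        {"age": 70, "diagnoses": {"I48.91"}},
        {"age": 50, "diagnoses": {"I48.91"}},
    ]),
    Site("Site B", [
        {"age": 80, "diagnoses": {"I48.91", "E11.9"}},
        {"age": 66, "diagnoses": {"E11.9"}},
    ]),
]

result = federated_count(sites, eligible)
print(result)
```

Returning counts rather than records lets an investigator judge whether enough eligible participants exist for a trial without any site releasing patient-level data.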

Transforming Pittsburgh and the Future of Health Care

Under the vision and leadership of Dr. Arthur S. Levine, the Senior Vice Chancellor for the Health Sciences and Dean of the School of Medicine, and in partnership with UPMC, Pitt has become one of the leaders in health sciences improving healthcare on a global scale. Successful commercial translation of basic and clinical research framed around unmet healthcare needs extends the impact of Pitt’s research through improved health and more efficient healthcare spending. The Pittsburgh Health Data Alliance (PHDA) Initiative was formed in 2015 as a collaboration among UPMC, Pitt, and CMU to cultivate translational research projects that intersect major unmet healthcare needs for purposes of commercial translation. Housed in the University of Pittsburgh’s Department of Biomedical Informatics, the Center for Commercial Applications of Healthcare Data (CCA) is part of the PHDA and is operated by sciVelo and co-directed by Dr. Michael J. Becich and Dr. Donald P. Taylor.
Michael J. Becich, MD, PhD, the Chairman and Distinguished University Professor at DBMI, is focused on making health care data and health analytics an asset for the region.
“…creating a data commons for the region for commercialization, in partnership with UPMC and UPMCE, will eventually act as an attractive force for major biotech companies and pharma to cohabitate in Oakland. Oakland is the heart of Pittsburgh’s translational science ecosystem supported by UPMC, CMU and Pitt. This cohabitation will further advance innovation in healthcare in areas such as: immuno-oncology, clinical decision support, mobile health, behavioral health, and opioid mitigation - all important areas that matter to national health and wellness…”

Michael J. Becich, MD, PhD

Since its formation, the CCA has demonstrated the ability to translate scientific breakthroughs into market-oriented solutions. Through the CCA, over 60 digital health projects were pitched to UPMCE for translational research funding in less than 3 years. These projects stem from a portfolio spanning over 300 researchers across 10 schools and 69 departments at the University of Pittsburgh. Among these projects, Spatial Pathology Powers Cancer Diagnostics (SPDx) formed a new company called SpIntellx that develops new machine learning software tools to computationally guide pathologists’ decisions. In addition, the CCA project, Tumor Driver Identification, received substantial follow-on funding from the UPMC Immune Transplant and Therapy Center (ITTC) to perform clinical trials for validation of its ability to predict response to immunotherapy in melanoma patients. The University of Pittsburgh Innovation Institute has been an instrumental partner to the CCA as it is the organization on campus responsible for filing intellectual property, licensing out technology, and advancing academic entrepreneurship.
Donald P. Taylor, PhD, MBA, CLP, Assistant Vice Chancellor of Health Sciences Translation, Associate Professor of Biomedical Informatics and Executive Director of sciVelo, Innovation Institute.
“The exquisite coordination and collaboration across these health data and health analytics initiatives were largely informed by the Plan for Pitt’s Goal 2: Engage in Research of Impact. Here we’re executing on two strategies that support Goal 2: ‘expanding our computational capacity’ and ‘Extend the Impact of our Research through application to practice, policy development, and commercial translation’. One example of executing on these goals is a recently funded CCA project within the Swanson School of Engineering (Principal Investigator: Dr. David Vorp) that’s liberated a decades-long academic research project toward a tractable market solution. This project addresses the significant unmet clinical need of accurately predicting the risk of abdominal aortic aneurysms rupturing beyond the current standard of care that brings little comfort to patients and providers. With our expanded computational capacity in accessing and processing digital radiology images through R3 and our translational research design on solving the unmet clinical need we’re advancing the science toward saving over 200,000 lives per year.”

Donald P. Taylor, PhD, MBA, CLP

Author/Photographer: Ceren Tuzmen, PhD