(From the July 2015 issue of Research Now)
Informatics for Integrating Biology and the Bedside (i2b2) is an interactive medical informatics system that is widely used for patient cohort identification, which is the process of finding patients with shared characteristics. This is an important first step both in the early identification of disease risks for patients and in participant recruitment for clinical trials. However, i2b2 has its limitations, and other cohort identification methods, such as searching a large clinical database for prospective subjects or conducting manual chart reviews, can be time-consuming and expensive. Now, a multidisciplinary team at Nationwide Children’s Hospital has found a way to expedite this process for sleep disorder research by combining existing i2b2 functionality with natural language processing, which enables quick analysis of large amounts of text.
“Searching for patients that met requirements of a specific research study involved a lengthy and personnel-driven process of sifting through 16,000 sleep disorder medical summary documents,” says Yungui Huang, PhD, MBA, director of Research Information Solutions and Innovations (RISI) at The Research Institute at Nationwide Children’s and senior author of the study, which was published in Applied Clinical Informatics.
The study was made possible by the collaborative efforts of a multidisciplinary team that also included Wei Chen, PhD, senior systems programmer from the big data team of RISI; Robert Kowatch, MD, PhD, principal investigator in the Center for Innovation in Pediatric Practice; Simon Lin, MD, MBA, chief research information officer at Nationwide Children’s; and Mark Splaingard, MD, director of the Sleep Disorders Center at Nationwide Children’s.
According to Dr. Huang, the manual process used to take about two weeks on average. A primary motivation for this study was to find ways to speed up the identification of research patients.
From January 2004 to September 2014, 15,683 sleep study reports were collected at Nationwide Children’s. The researchers note that with so many documents, physicians urgently needed an informatics system that could automate cohort identification and overcome the poor performance of traditional manual methods.
Dr. Huang and her team developed an i2b2 application to speed the process of cohort identification by converting textual information from sleep study documents into data that is more organized and useful for researchers.
“We extracted essential information from the sleep disorder summary documents and stored them in a database,” explains Dr. Huang. “Using the i2b2 platform, which comes with a user-friendly drag-and-drop interface, combined with a customized ontology or structure of searchable terms, we enabled cohort identification in real time – and that’s an important first step for clinical research.”
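As a rough illustration of the idea Dr. Huang describes (pull key fields out of free-text summaries, store them in a database, then query for a cohort), here is a minimal Python sketch. It is not the team's implementation: simple regular expressions stand in for the natural language processing used in the study, an in-memory SQLite table stands in for the i2b2 data store, and the report text, field names (`age_years`, `ahi`, `diagnosis`), and function names are all hypothetical.

```python
import re
import sqlite3

# Hypothetical report snippets; the real summary layout is not given in the article.
REPORTS = {
    "r001": "Age: 7 years. AHI: 12.4 events/hr. Diagnosis: obstructive sleep apnea.",
    "r002": "Age: 15 years. AHI: 0.8 events/hr. Diagnosis: narcolepsy.",
    "r003": "Age: 9 years. AHI: 6.1 events/hr. Diagnosis: obstructive sleep apnea.",
}

# Assumed patterns and converters, one per field of interest.
PATTERNS = {
    "age_years": (re.compile(r"Age:\s*(\d+)\s*years", re.I), int),
    "ahi": (re.compile(r"AHI:\s*(\d+(?:\.\d+)?)", re.I), float),
    "diagnosis": (re.compile(r"Diagnosis:\s*([^.]+)\.", re.I), str.strip),
}


def extract_fields(text):
    """Return the fields found in one report; fields not found are None."""
    fields = {}
    for name, (pattern, convert) in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = convert(match.group(1)) if match else None
    return fields


def build_database(reports):
    """Store the extracted fields in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sleep_study ("
        "report_id TEXT PRIMARY KEY, age_years INTEGER, ahi REAL, diagnosis TEXT)"
    )
    for report_id, text in reports.items():
        f = extract_fields(text)
        conn.execute(
            "INSERT INTO sleep_study VALUES (?, ?, ?, ?)",
            (report_id, f["age_years"], f["ahi"], f["diagnosis"]),
        )
    conn.commit()
    return conn


def find_cohort(conn, diagnosis, min_ahi, max_age):
    """Return report IDs that match every criterion, sorted by ID."""
    rows = conn.execute(
        "SELECT report_id FROM sleep_study "
        "WHERE diagnosis = ? AND ahi >= ? AND age_years <= ? "
        "ORDER BY report_id",
        (diagnosis, min_ahi, max_age),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_database(REPORTS)
    print(find_cohort(conn, "obstructive sleep apnea", 5.0, 10))
```

Once the text has been reduced to structured rows, changing a criterion (say, raising the minimum AHI) is just another query rather than another pass through the documents, which is the kind of interactivity the i2b2 drag-and-drop interface offers researchers.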
In addition to reducing the labor cost of cohort identification, the i2b2 system made data more accessible than before. Rather than relying on traditional keyword-based document searches or waiting weeks for a manual chart review, researchers could ask “what if” questions directly and get answers in real time.
“Our study shortened the research patient identification process from two weeks to about 15 minutes,” says Dr. Huang. “This is the first study to customize i2b2 specifically for sleep disorder research, and it is also the first i2b2 study initiated by Nationwide Children’s.”
This paves the way for clinicians and investigators to take advantage of this simple yet powerful platform for their patient care and research needs, adds Dr. Huang.
As for next steps, Dr. Huang explains that the team plans to roll out this institutional i2b2 platform organization-wide, in the hope that many other researchers will learn about it and use it to prepare for their clinical studies.
“We are also in the process of linking our i2b2 platform with other institutions,” says Dr. Huang. “This will further enhance researchers’ capabilities to search for eligible research patients not only within Nationwide Children’s, but across institutional boundaries.”
Chen W, Kowatch R, Lin S, Splaingard M, Huang Y. Interactive cohort identification of sleep disorder patients using natural language processing and i2b2. Applied Clinical Informatics. 2015 May 27;6:345-363.