Smart Healthcare

iHi Data Platform

CMUH Big Data Center: From iHi Data Platform to Practical Healthcare Intelligence

Introduction

The Big Data Center (BDC) of China Medical University Hospital (CMUH) established the Clinical Research Data Repository in 2016. BDC manages the largest phenome-genome-environmental data platform in Asia, encompassing 19 years of EMR and environmental exposure data from 3 million patients and genetic information from 230 thousand patients, which forms a solid foundation for generating high-resolution clinical data. To make full use of this valuable medical big data, BDC developed the smart data platform, the iHi Platform, in 2020 to ignite hyper-intelligent data applications. The iHi Platform is the only data platform in Taiwan that combines clinical, genetic, and environmental data (Fig. 1).

▲ Figure 1. The iHi Platform provides more than 100 theme-based databases that integrate multi-dimensional data types, including clinical EMR, SNP array genotyping, environmental exposure, medical images, and medical device records.

 

The iHi Platform provides clean, integrated, and de-identified data to clinical researchers through a cloud-based system. This data architecture not only makes the data ecosystem interoperable and sustainable, but also solves the problems of cluttered data and low accessibility and creates a venue for unlimited artificial intelligence applications. Through the iHi Platform services, we aim to expand the use of multi-omics clinical data for education, research, and clinical or business applications. Ultimately, the insights inspired by the iHi Platform feed back into clinical settings to improve medical quality and patient health (Fig. 2).

▲ Figure 2. Deep-cleaned, integrated, and full-spectrum big data ecosystem.

The features of the iHi Platform

The iHi Platform was designed as a patient-centered medical data ecosystem that provides clinical researchers with accessible, reliable, and diverse data. Several AI/data tools have received US and Taiwan patents, and one AI tool has been approved by the US FDA or Taiwan FDA, which supports the validity and quality of the iHi Platform. The iHi Platform encompasses an innovative data structure (data LEGO) and systematic data annotation workflows (data chip). We have further established the iHi Genomics Analytic Platform to speed up translational research discovery. With its deep-cleaned, comprehensive, and accessible data service, the iHi Platform can open research to a wide range of intelligent applications.


Innovative data structure: Data LEGO

We aim to build a full-spectrum big data ecosystem that not only integrates EMR, health insurance, genomic, and environmental data, but also combines them with real-world data from patient-centered systems and multi-omics data such as microbiome and exosome data. Most importantly, these diverse and heterogeneous data must be linkable and traceable so that they remain sustainable and reusable. Therefore, we process all data through a standardized data management pipeline, which provides users with high-quality and protected clinical datasets. Furthermore, we modularize multi-omics and multi-dimensional datasets into data LEGO bricks, which are deep-cleaned and sorted by their characteristics. The iHi Platform, a data LEGO pool, stores diverse data sources such as EHR, medical images, and examination reports. Researchers can select the data bricks of interest to build their own unique data castles and perform analyses on the iHi Platform (Fig. 3).

▲ Figure 3. Data LEGO: The modular-designed datasets can be selected to build a customized theme database and analysis pipeline.
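
The brick-and-castle idea described above can be illustrated with a minimal Python sketch. It is not the iHi Platform's implementation: the `DataBrick` class, the `build_castle` function, the patient IDs, and the field names are all hypothetical, and it assumes only that every brick is keyed by the same de-identified patient ID so that bricks stay linkable and traceable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataBrick:
    """One modular dataset keyed by a de-identified patient ID (hypothetical layout)."""
    name: str
    records: dict  # patient_id -> {field: value}


def build_castle(bricks, required=None):
    """Join the selected bricks on patient ID.

    Patients missing from any brick listed in `required` are dropped.
    Each field is prefixed with its brick name so its source stays traceable.
    """
    required = set(required or [])
    patient_ids = set()
    for brick in bricks:
        patient_ids.update(brick.records)
    for brick in bricks:
        if brick.name in required:
            patient_ids.intersection_update(brick.records)

    castle = {}
    for pid in sorted(patient_ids):
        row = {}
        for brick in bricks:
            for field, value in brick.records.get(pid, {}).items():
                row[f"{brick.name}.{field}"] = value
        castle[pid] = row
    return castle


# Illustrative bricks with made-up values.
emr = DataBrick("emr", {"P001": {"egfr": 58}, "P002": {"egfr": 91}})
genotype = DataBrick("genotype", {"P001": {"snp_example": "AG"}})
exposure = DataBrick("exposure", {"P001": {"pm25": 21.4}, "P002": {"pm25": 17.9}})

# Keep only patients who have genotype data.
castle = build_castle([emr, genotype, exposure], required=["genotype"])
```

In this sketch, prefixing each field with its brick name is one simple way to keep the origin of every value visible after bricks are combined.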

 


Smart Data Augmented Annotation

All data provided in the iHi Platform are processed through the standardized data management pipeline, which provides users with high-quality and protected clinical datasets. To perform systematic data cleaning, validation, and integration, we established a unique smart data chip fabrication process that controls the quality of each processing step.

The process runs from data source acquisition and data architecture design, through data polishing, standardization, and refinement, to data validation and stacking, and yields a smart data chip of qualified and certified datasets (Fig. 4).

▲ Figure 4. Smart Data Chip: The workflow of smart data augmented annotation.
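
A staged process with a quality check after every step can be sketched as follows. This is only an illustration under stated assumptions: the stage functions, the unit conversion, the plausibility range, and the sample rows are hypothetical stand-ins, and only three of the stages named above (polishing, standardization, validation) are shown.

```python
def polish(rows):
    """Drop rows with a missing value (stand-in for data polishing)."""
    return [r for r in rows if r.get("value") is not None]


def standardize(rows):
    """Express every value in mg/dL (stand-in for standardization)."""
    out = []
    for r in rows:
        if r["unit"] == "g/L":
            # 1 g/L equals 100 mg/dL.
            out.append({**r, "value": r["value"] * 100, "unit": "mg/dL"})
        else:
            out.append(dict(r))
    return out


def validate(rows):
    """Keep values inside an assumed plausible range (stand-in for validation)."""
    return [r for r in rows if 0 < r["value"] < 1000]


def run_pipeline(rows, stages):
    """Run the stages in order and log row counts so each step is traceable."""
    trace = []
    for name, stage, check in stages:
        rows_in = len(rows)
        rows = stage(rows)
        if not check(rows):
            raise ValueError(f"quality gate failed after stage {name!r}")
        trace.append({"stage": name, "rows_in": rows_in, "rows_out": len(rows)})
    return rows, trace


# Made-up input rows for illustration.
raw = [
    {"patient": "P001", "value": 1.5, "unit": "g/L"},
    {"patient": "P002", "value": None, "unit": "mg/dL"},
    {"patient": "P003", "value": 95, "unit": "mg/dL"},
    {"patient": "P004", "value": -4, "unit": "mg/dL"},
]

# Each stage is paired with the quality gate that its output must pass.
stages = [
    ("polishing", polish, lambda rows: all(r["value"] is not None for r in rows)),
    ("standardization", standardize, lambda rows: all(r["unit"] == "mg/dL" for r in rows)),
    ("validation", validate, lambda rows: len(rows) > 0),
]

chip, trace = run_pipeline(raw, stages)
```

Here `trace` records how many rows entered and left each stage, which is one lightweight way to make every processing step auditable.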

 

Through this standard, pre-built smart data chip fabrication process, we can easily manage and trace each processing step in the iHi Platform. In addition, we are the only platform in Taiwan that provides de-identified data certified under both ISO and CNS standards (Fig. 5).

▲ Figure 5. ISO 29100 & 29191 and CNS 29100-2 certifications for medical data de-identification.

 

This new concept applies continuous, scaled flow production to data processing: it deep-cleans data and supports high-quality AI solutions that fit into the real-world clinical workflow. At the same time, high-performance AI can help extract important new data features and insights, which enhances data diversity and further nurtures the smart data ecosystem (Fig. 6).

▲ Figure 6. Smart data augmented annotation: A systematic data processing flow enhances data quality and nurtures the smart data ecosystem.

 


iHi Genomics Analytic Platform

In 2021, we launched the iHi Genomics Analytic Platform, an easy-to-use analytic platform for exploring data and extracting insights from datasets of interest. iHi Genomics provides disease cohort selection and genome-wide association study (GWAS) analysis within a few clicks. It can generate a full GWAS report, including quality control details and a Manhattan plot of the significant SNPs associated with the disease (Fig. 7).

▲ Figure 7. The iHi Genomics Analytic Platform.
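
To show what a Manhattan plot of significant SNPs encodes, the following Python sketch converts association p-values into plot coordinates. It is not the iHi Genomics pipeline: the function name, the SNP labels, and the p-values are made up, and the cutoff of 5e-8 is the conventional genome-wide significance threshold, assumed here because the source does not state which threshold the platform uses.

```python
import math

# Conventional genome-wide significance threshold (an assumption, not from the source).
GENOME_WIDE_P = 5e-8


def manhattan_points(results, threshold=GENOME_WIDE_P):
    """Turn (snp, chromosome, position, p_value) tuples into Manhattan plot points.

    The y-axis of a Manhattan plot is -log10(p), so smaller p-values plot higher.
    Points are ordered by chromosome and position, which gives the x-axis order.
    """
    points = []
    for snp, chrom, pos, p_value in results:
        points.append({
            "snp": snp,
            "chrom": chrom,
            "pos": pos,
            "neg_log10_p": -math.log10(p_value),
            "significant": p_value < threshold,
        })
    return sorted(points, key=lambda pt: (pt["chrom"], pt["pos"]))


# Made-up association results for illustration only.
results = [
    ("snp_a", 1, 1000, 3e-9),
    ("snp_b", 1, 500, 0.2),
    ("snp_c", 2, 42, 4e-8),
    ("snp_d", 2, 10, 1e-5),
]

points = manhattan_points(results)
hits = [pt["snp"] for pt in points if pt["significant"]]
```

With these inputs, `hits` contains the two SNPs whose p-values fall below the assumed threshold, which are the points that would sit above the significance line on the plot.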

 

Using virtual desktop infrastructure (VDI), users can remotely access de-identified data certified by ISO and CNS in a highly secure environment (Fig. 8).

▲ Figure 8. Comprehensive Data Security Guarding Operations (CDSGO) enables data security and protection.

 

The iHi Genomics analytic pipeline provides a full report on each identified gene or SNP, with a detailed description and links to external information, which allows researchers without coding skills to perform basic GWAS analysis painlessly.

 


International Recognitions

With the full support of the CMUH board, the BDC manages the EHR of more than 3 million patients, linked with genetic and environmental data. With its deep-cleaned, multi-omics, and integrated data, the iHi Platform can provide both macro-level and micro-level resolution for discovering clinical insights. Based on these iHi services, more than 80 SCI papers have been published (Fig. 9).

▲ Figure 9. Current publication performance based on iHi databases.

 

Owing to our extensive experience and the high-quality data of the iHi Platform, we have been collaborating with 19 international institutions, including universities, medical centers, and national institutes of health. In 2018 and 2019, we were invited by the American Society of Nephrology (ASN) to present our big data research and applications in kidney diseases (Fig. 10).

▲ Figure 10. Invited interview by the American Society of Nephrology to present big data applications in kidney diseases.

 

The entire workflow and infrastructure of the iHi Platform have also been highly recognized by many leading experts worldwide, such as Dr. Nick Bryan, former president of the Radiological Society of North America (Fig. 11).

▲ Figure 11. International scholars from diverse professions recognize BDC's achievements.

 

Since 2022, our global cooperation in clinical and intelligent medical projects has grown stronger, including collaborations with universities and hospitals in the US and Japan. In the future, we will continue to nurture the iHi data ecosystem and integrate our worldwide collaborations.

 


We look forward to further cooperation.
