Abstract: Healthcare organizations increasingly rely on data warehouses to centralize and manage electronic health record (EHR) data for operational, clinical, and research purposes. These repositories integrate patient care information, administrative records, and financial data while maintaining strict compliance with regulatory requirements and protecting the privacy of health information. Modern data warehouses serve as decision support systems, facilitating business analytics, quality improvement initiatives, and strategic planning across healthcare organizations. Research data warehouses (RDWs), supported by specialized IT infrastructure and governance structures, make it possible to reuse EHR data effectively for scientific investigations. Advanced machine learning techniques, including Random Forest, AdaBoost, and XGBoost regression algorithms, are used to analyse complex healthcare datasets and extract predictive insights. These ensemble methods improve predictive accuracy while reducing the risk of overfitting in predictive modelling applications. The evolution towards cloud-based repositories requires comprehensive data governance strategies covering security, integrity, and regulatory compliance to ensure consistent health analytics capabilities.
Keywords: Data warehousing, Electronic health records (EHR), Health analytics, Research data warehousing (RDW), Machine learning regression, Data governance, Regulatory compliance, Ensemble methods, Clinical decision support, Predictive modelling
