DATA LAKEHOUSE EVOLUTION: A FRAMEWORK FOR IMPLEMENTING DOMAIN-DRIVEN DATA MESH ARCHITECTURE IN ENTERPRISE SYSTEMS
Keywords:
Data Lakehouse, Data Mesh Architecture, Serverless Infrastructure, Enterprise Data Management, Federated Data Governance
Abstract
Modern enterprises struggle to manage and scale their data infrastructure while keeping it interoperable across diverse platforms. This article presents a framework for implementing a Data Lakehouse architecture that incorporates Data Mesh principles and runs on serverless cloud infrastructure. The approach combines domain-oriented data ownership with federated governance while maintaining global standards for data quality and interoperability. The implementation framework relies on open-source technologies and standardized formats to ensure platform independence and prevent vendor lock-in, and it adopts serverless infrastructure to optimize operational costs and scalability. Through a detailed case study, the article demonstrates how this architecture enables organizations to manage distributed data systems effectively while reducing data duplication and improving time-to-insight. The results indicate improved data accessibility, reduced operational overhead, and enhanced analytical capabilities. The article contributes a practical blueprint for organizations transitioning to modern data architectures, emphasizing the balance between decentralized ownership and centralized governance. The framework is validated through a real-world implementation and offers insights for practitioners and researchers in enterprise data management.