Metadata Management to Accelerate Big Data Implementation

Authors

  • Alivia Yulfitri, Universitas Esa Unggul
  • Dana Indra Sensuse, Universitas Indonesia
  • M. Bahrul Ulum, Universitas Esa Unggul
  • Yunita Fauzia Achmad, Universitas Esa Unggul

DOI:

https://doi.org/10.52661/jict.v6i2.362

Keywords:

Metadata, DMBOK 2 Framework, Metadata Management Maturity, Big Data

Abstract

Big Data development often runs into data quality obstacles even when an organization already has a data warehouse. One root cause is the failure to implement metadata management, which results in non-standardized data, no common understanding of the meaning and content of a data element, and the use of different data formats and formulas. These gaps produce a range of data problems, including duplicated, inconsistent, inaccurate, outdated, and unreliable data. Companies feel the impact, especially data warehouse managers, who still struggle to clean, organize, and manage data. Research on metadata management is therefore crucial for determining the preparations needed for big data development and for ensuring that high-quality data is available. This research uses the DMBOK 2 framework, maturity instruments from Stanford, and the technical framework from CMMI. The results can help companies use metadata management to improve big data implementation.
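
To make the data issues named in the abstract concrete, the following is a minimal sketch of how a shared metadata registry could flag undefined data elements, inconsistent formats, and duplicated keys. It is an illustration only and not the method used in this research: the names ElementDefinition, REGISTRY, check_record, and find_duplicates, as well as the sample elements customer_id and signup_date, are assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ElementDefinition:
    """Hypothetical metadata entry describing one data element."""
    name: str                          # standardized element name
    meaning: str                       # agreed business definition
    date_format: Optional[str] = None  # expected strptime format, if the element is a date


# Hypothetical registry: the single shared source of element definitions.
REGISTRY = {
    "customer_id": ElementDefinition(
        "customer_id", "Unique identifier assigned to a customer"
    ),
    "signup_date": ElementDefinition(
        "signup_date", "Date the customer account was created", "%Y-%m-%d"
    ),
}


def check_record(record: dict) -> list:
    """Return a list of metadata violations found in one record."""
    problems = []
    for field, value in record.items():
        definition = REGISTRY.get(field)
        if definition is None:
            # No shared definition exists, so the element's meaning is unknown.
            problems.append(f"{field}: no definition in registry")
            continue
        if definition.date_format is not None:
            try:
                datetime.strptime(value, definition.date_format)
            except ValueError:
                problems.append(
                    f"{field}: {value!r} does not match {definition.date_format}"
                )
    return problems


def find_duplicates(records: list, key: str) -> set:
    """Return the values of `key` that occur in more than one record."""
    seen, duplicates = set(), set()
    for record in records:
        value = record[key]
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return duplicates
```

In this sketch, a record that uses an unregistered field name or a date written in a different format is reported rather than silently loaded, which is the kind of check that becomes possible once element definitions are managed in one place.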

Published

2025-01-08

How to Cite

Yulfitri, A., Indra Sensuse, D., Ulum, M. B., & Fauzia Achmad, Y. (2025). Metadata Management to Accelerate Big Data Implementation. Journal of Informatics and Communication Technology (JICT), 6(2). https://doi.org/10.52661/jict.v6i2.362

Section

Articles