Searching for just a few words should be enough to get started. If you need to make more complex queries, use the tips below to guide you.
Article type: Research Article
Authors: Mallek, Hanaa; * | Ghozzi, Faizaa | Gargouri, Faiezb
Affiliations: [a] MIRACL Laboratory, ISIMS, Technology Center of Sfax, Sakiet Ezzit, Sfax, Tunisia | [b] MIRACL Laboratory, University of Sfax, Sfax, Tunisia
Correspondence: [*] Corresponding author: Hana Mallek, MIRACL Laboratory, ISIMS, Technology Center of Sfax, Sakiet Ezzit, Sfax, 3021, Tunisia. E-mail: [email protected].
Abstract: As the amount of information exceeds the management and storage capacity of traditional data management systems, several domains need to take into account this growth of data, in particular the decision-making domain known as Business Intelligence (BI). Since the accumulation and reuse of these massive data stands for a gold mine for businesses, several insights that are useful and essential for effective decision making have to be provided. However, it is obvious that there are several problems and challenges for the BI systems, especially at the level of the ETL (Extraction-Transformation-Loading) as an integration system. These processes are responsible for the selection, filtering and restructuring of data sources in order to obtain relevant decisions. In this research paper, our central focus is especially upon the adaptation of the extraction phase inspired from the first step of MapReduce paradigm in order to prepare the massive data to the transformation phase. Subsequently, we provide a conceptual model of the extraction phase which is composed of a conversion operation that guarantees obtaining NoSQL structure suitable for Big Data storage, and a vertical partitioning operation for presenting the storage mode before submitting data to the second ETL phase. Finally, we implement through Talend for Big Data our new component which helps the designer extract data from semi-structured data.
Keywords: Extraction phase, ETL processes, big data, conceptual modeling
DOI: 10.3233/HIS-230008
Journal: International Journal of Hybrid Intelligent Systems, vol. 19, no. 3,4, pp. 167-182, 2023
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
USA
Tel: +1 703 830 6300
Fax: +1 703 830 2300
[email protected]
For editorial issues, like the status of your submitted paper or proposals, write to [email protected]
IOS Press
Nieuwe Hemweg 6B
1013 BG Amsterdam
The Netherlands
Tel: +31 20 688 3355
Fax: +31 20 687 0091
[email protected]
For editorial issues, permissions, book requests, submissions and proceedings, contact the Amsterdam office [email protected]
Inspirees International (China Office)
Ciyunsi Beili 207(CapitaLand), Bld 1, 7-901
100025, Beijing
China
Free service line: 400 661 8717
Fax: +86 10 8446 7947
[email protected]
For editorial issues, like the status of your submitted paper or proposals, write to [email protected]
如果您在出版方面需要帮助或有任何建, 件至: [email protected]