Impact Factor 2024: 0.9
Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing.
In particular, preference is given to papers that discuss the development of new AI-related data analysis architectures, methodologies, and techniques and their applications to various domains.
Papers published in this journal are geared heavily towards applications, with an anticipated split of approximately 70% applications-oriented research and 30% more theoretical research. Manuscripts should be submitted in *.pdf format only. Please prepare your manuscripts in single spacing, and include figures and tables in the body of the text where they are referred to. For all enquiries regarding the submission of your manuscript, please contact the IDA journal editor: [email protected]
Authors: Yan, Cairong | Zhu, Ziyang | Zhang, Yiwei | Guan, Xiaopeng | Wan, Yongquan
Article Type: Research Article
Abstract: Multi-behavior recommendation models excel in extracting abundant information from user-item interactions to enhance performance; however, they encounter challenges in accuracy due to noise disturbance and ambiguous weight allocation. In this paper, we propose cd-MBRec, a novel model designed to amplify commonality among various behaviors, thereby minimizing noise interference while preserving behavior diversity to highlight semantic variations in feedback across distinct scenarios. Specifically, the model begins by constructing behavior matrices that model separate behaviors, along with an interaction matrix offering a broad overview of user behaviors. It employs graph neural networks to extract higher-order semantic and structural information from the input data. Concurrently, the model integrates principles of the Weber-Fechner Law for the adaptive allocation of initial weights to the multiple behaviors and utilizes matrix factorization techniques for efficient behavior embedding. Extensive experiments on two real-world datasets demonstrate that cd-MBRec surpasses existing state-of-the-art models in recommendation performance, achieving notable average improvements of 4.96% in HR@10 and 7.75% in NDCG@10.
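As a rough illustration of the Weber-Fechner-style weighting described above, the sketch below assigns each behavior an initial weight proportional to the logarithm of its interaction count and then normalizes; the behavior names, counts, and scaling constant are hypothetical, and this is not the authors' cd-MBRec implementation.

```python
import numpy as np

# Hypothetical interaction counts per behavior type.
behavior_counts = {"view": 120_000, "cart": 9_500, "purchase": 2_100}

def weber_fechner_weights(counts, k=1.0, s0=1.0):
    """Assign initial behavior weights proportional to k * ln(S / S0),
    following the Weber-Fechner intuition that perceived intensity grows
    logarithmically with stimulus magnitude, then normalize to sum to 1."""
    names = list(counts)
    raw = np.array([k * np.log(counts[n] / s0) for n in names])
    weights = raw / raw.sum()
    return dict(zip(names, weights.round(3)))

print(weber_fechner_weights(behavior_counts))
```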
Keywords: Multi-behavior recommendation, graph neural networks, matrix factorization
DOI: 10.3233/IDA-240393
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-17, 2024
Authors: Li, Xiaoyong | Cheng, Huimin | An, Sufang | Zhang, Yanjun | Zhang, Yong
Article Type: Research Article
Abstract: Social relationships among students at campus are closely related to their mental health and academic performance. Therefore, it is a very important task for educators to analyze students' social relationships. However, existing studies have focused on one-to-one social relationships between students; few have explored the high-order community relationships hidden in social networks, especially in a visual manner. To solve this problem, a visual analysis system called ViSSR is proposed in this paper, which utilizes the Louvain algorithm to detect the hierarchical community structure of students' social network at campus, and then provides four coordinated views to visualize the detection results. Among the views, the hierarchical hypergraph view visualizes the hierarchical community structure, which greatly breaks through the limitations of first-order relationships available in a traditional node-link social network; the community analysis view and individual analysis view show the social characteristics of a community and an individual student, respectively; and the matrix view displays the behavioral features of students. Case studies and expert evaluations have been conducted to demonstrate the usability of the system.
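For readers unfamiliar with the community-detection step, the following sketch runs Louvain on a public benchmark graph with networkx (assuming networkx 2.8 or later, which provides louvain_communities and louvain_partitions); the karate-club graph merely stands in for the real campus social network.

```python
import networkx as nx

# Stand-in for the campus social network; the real system uses student
# interaction data rather than this benchmark graph.
G = nx.karate_club_graph()

# Flat partition at the final Louvain level.
communities = nx.community.louvain_communities(G, seed=42)
print("top-level communities:", [sorted(c) for c in communities])

# louvain_partitions yields the partition found at each level of the Louvain
# hierarchy, which is what a hierarchical hypergraph view would visualize.
for level, partition in enumerate(nx.community.louvain_partitions(G, seed=42)):
    print(f"level {level}: {len(partition)} communities")
```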
Keywords: Student social relationships, visual analysis, community detection, hierarchical hypergraph
DOI: 10.3233/IDA-230263
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-26, 2024
Authors: Bera, Somenath | Varish, Naushad | Yaqoob, Syed Irfan | Rafi, Mudassir | Shrivastava, Vimal K.
Article Type: Research Article
Abstract: Joint spectral-spatial feature extraction has been proven to be the most effective part of hyperspectral image (HSI) classification. However, due to the mixing of informative and noisy bands in HSI, joint spectral-spatial feature extraction using a convolutional neural network (CNN) may lead to information loss and high computational cost. More specifically, joint spectral-spatial feature extraction from excessive bands may cause loss of spectral information due to the involvement of convolution operations on non-informative spectral bands. Therefore, we propose a simple yet effective deep learning model, named deep hierarchical spectral-spatial feature fusion (DHSSFF), where spectral-spatial features are exploited separately to reduce the information loss and the deep features are fused to learn the semantic information. It makes use of the abundant spectral bands and a few informative bands of HSI for spectral and spatial feature extraction, respectively. The spectral and spatial features are extracted through a 1D CNN and a 3D CNN, respectively. To validate the effectiveness of our model, experiments have been performed on five well-known HSI datasets. Experimental results demonstrate that the proposed method outperforms other state-of-the-art methods, achieving 99.17%, 98.84%, 98.70%, 99.18%, and 99.24% overall accuracy on the Kennedy Space Center, Botswana, Indian Pines, University of Pavia, and Salinas datasets, respectively.
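The separate spectral/spatial branches can be pictured with the toy PyTorch sketch below, which feeds a per-pixel spectral vector through a 1D CNN and a small informative-band patch through a 3D CNN before concatenating the deep features; all layer sizes, band counts, and the 9 × 9 patch geometry are illustrative assumptions, not the DHSSFF architecture.

```python
import torch
import torch.nn as nn

class SpectralSpatialFusion(nn.Module):
    """Toy sketch of separate spectral (1D CNN) and spatial (3D CNN) branches
    whose deep features are concatenated before classification."""
    def __init__(self, n_classes=16):
        super().__init__()
        self.spectral = nn.Sequential(            # input: (B, 1, all_bands)
            nn.Conv1d(1, 16, kernel_size=7), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten())
        self.spatial = nn.Sequential(             # input: (B, 1, informative_bands, 9, 9)
            nn.Conv3d(1, 8, kernel_size=3), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten())
        self.head = nn.Linear(16 + 8, n_classes)

    def forward(self, spectral_pixel, spatial_patch):
        fused = torch.cat([self.spectral(spectral_pixel),
                           self.spatial(spatial_patch)], dim=1)
        return self.head(fused)

model = SpectralSpatialFusion()
logits = model(torch.randn(4, 1, 200),            # 200 spectral bands per pixel
               torch.randn(4, 1, 30, 9, 9))       # 30 informative bands, 9x9 patch
print(logits.shape)                               # torch.Size([4, 16])
```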
Keywords: CNN, deep learning, feature fusion, feature extraction, hyperspectral image classification, informative bands
DOI: 10.3233/IDA-230927
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-25, 2024
Authors: Weng, Wei | Hou, Fengxia | Gong, Shengchao | Chen, Fen | Lin, Dongsheng
Article Type: Research Article
Abstract: Graph clustering is a crucial technique for partitioning graph data. Recent research has concentrated on integrating topology and attribute information from attribute graphs to generate node embeddings, which are subsequently clustered using classical algorithms. However, these methods have some limitations, such as insufficient information inheritance in shallow networks or inadequate quality of reconstructed nodes, leading to suboptimal clustering performance. To tackle these challenges, we introduce two normalization techniques within the graph attention autoencoder framework, coupled with an MSE loss, to facilitate node embedding learning. Furthermore, we integrate Transformers into the self-optimization module to refine node embeddings and clustering outcomes. Our model can induce appropriate node embeddings for graph clustering in a shallow network. Our experimental results demonstrate that our proposed approach outperforms the state-of-the-art in graph clustering over multiple benchmark datasets. In particular, we achieved 76.3% accuracy on the Pubmed dataset, an improvement of at least 7% compared to other methods.
Keywords: Attribute graph clustering, transformer, autoencoder, node embedding
DOI: 10.3233/IDA-230647
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-16, 2024
Authors: Swetha, A.V.S. | Bala, Manju | Sharma, Kapil
Article Type: Research Article
Abstract: Breast cancer poses a significant threat to women's health, emphasizing the crucial role of timely detection. Traditional pathology reports, though widely used, face challenges prompting the development of automated Deep Learning (DL) tools. DL models, gaining traction in radiology, offer precise diagnoses; however, issues with generalization on varying dataset sizes persist. This paper introduces a computationally efficient DL framework that addresses dataset imbalance through a hybrid model design, ensuring both accuracy and speed in breast cancer image classification. The proposed model's novel design excels in accuracy and generalization across medical imaging datasets, providing a robust tool for precise diagnostics. The proposed model integrates features from two classifiers, Inception ResNet V2 and Vision Transformers (ViT), to enhance the classification of breast cancer. This synergistic blend enhances adaptability, ensuring consistent performance across diverse dataset scales. A key contribution is the introduction of an Efficient Attention Mechanism within one of the classifiers, optimizing focus on critical features for improved accuracy and computational efficiency. Further, a resource-efficient optimization model through feature selection is proposed, streamlining computational usage without compromising accuracy. Addressing the inherent heterogeneity within classifiers, our framework integrates high-dimensional features comprehensively, leading to more accurate tumor class predictions. This consideration of heterogeneity marks a significant leap forward in precision for breast cancer diagnosis. An extensive analysis of the inherently imbalanced BreakHis and BACH datasets is conducted, evaluating complexity, performance, and resource usage. Comprehensive evaluation using these datasets and standard performance metrics (accuracy, precision, recall, F1-score, and MCC) reveals the model's high efficacy, achieving testing accuracies of 0.9936 and 0.994, with precision, recall, F1-score, and MCC of 0.9919, 0.987, 0.9898, and 0.9852 on BreakHis and 0.989, 1.0, 0.993, and 0.988 on BACH, respectively. Our proposed model outperforms state-of-the-art techniques, demonstrating superior accuracy across different datasets, with improvements ranging from 0.25% to 15% on the BACH dataset and from 0.36% to 15.02% on the BreakHis dataset. Our results position the framework as a promising solution for advancing breast cancer prediction in both clinical and research applications. The collective contributions, from framework and hybrid model design to feature selection and classifier heterogeneity consideration, establish a holistic and state-of-the-art approach, significantly improving accuracy and establishing optimization in breast cancer classification from MRI images. Future research for the DL framework in breast cancer image classification includes enhancing interpretability, integrating multi-modal data, and developing personalized treatments.
Keywords: InceptionResNetV2, vision transformer, Multi-scale weight adaptive Nystrom attention mechanism, global features, local features
DOI: 10.3233/IDA-240334
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-36, 2024
Authors: Shin, Nakyung | Lee, Yulhee | Moon, Heesung | Kim, Joonhui | Jung, Hohyun
Article Type: Research Article
Abstract: The exponential growth of academic papers necessitates sophisticated classification systems to effectively manage and navigate vast information repositories. Despite the proliferation of such systems, traditional approaches often rely on embeddings that do not allow for easy interpretation of classification decisions, creating a gap in transparency and understanding. To address these challenges, we propose an innovative explainable paper classification system that combines Latent Semantic Analysis (LSA) for topic modeling with explainable artificial intelligence (XAI) techniques. Our objective is to identify which topics significantly influence the classification outcomes, incorporating Shapley additive explanations (SHAP) as a key XAI technique. Our system extracts topic assignments and word assignments from paper abstracts using latent semantic analysis (LSA) topic modeling. Topic assignments are then employed as embeddings in a multilayer perceptron (MLP) classification model, with the word assignments further utilized alongside SHAP for interpreting the classification results at the corpus, document, and word levels, enhancing interpretability and providing a clear rationale for each classification decision. We applied our model to a dataset from the Web of Science, specifically focusing on the field of nanomaterials. Our model demonstrates superior classification performance compared to several baseline models. Ultimately, our proposed model offers a significant advancement in both the performance and explainability of the system, validated by case studies that illustrate its effectiveness in real-world applications.
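A minimal pipeline in the same spirit is sketched below with scikit-learn and the shap library: TF-IDF followed by TruncatedSVD as the LSA topic model, an MLP on the topic assignments, and KernelExplainer for per-document SHAP values. The 20 Newsgroups corpus stands in for the Web of Science data, and all hyperparameters are arbitrary.

```python
import shap
from sklearn.datasets import fetch_20newsgroups
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier

# Small public corpus as a stand-in for the Web of Science abstracts.
data = fetch_20newsgroups(subset="train",
                          categories=["sci.med", "sci.space"],
                          remove=("headers", "footers", "quotes"))

tfidf = TfidfVectorizer(max_features=2000, stop_words="english")
X_tfidf = tfidf.fit_transform(data.data)

lsa = TruncatedSVD(n_components=20, random_state=0)   # LSA topic assignments
X_topics = lsa.fit_transform(X_tfidf)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=0)
clf.fit(X_topics, data.target)

# SHAP values over the topic features explain each classification decision.
explainer = shap.KernelExplainer(clf.predict_proba, X_topics[:50])
shap_values = explainer.shap_values(X_topics[:5])
print("computed SHAP topic attributions for", len(X_topics[:5]), "documents")
```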
Keywords: Paper classification, topic modeling, latent semantic analysis, explainable artificial intelligence, Shapley value
DOI: 10.3233/IDA-240075
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-27, 2024
Authors: Li, Wei
Article Type: Research Article
Abstract: With the modernization of cities, public sculptures are constantly being designed and constructed. The artistic form and image expression effect of sculpture based on intelligent and parametric design need to be developed to guide and assist the construction of sculpture. This paper applies the NAS architecture search method to explore the field of image expression effect models. Through the end-to-end search of the experiment designed in this paper, a lightweight design based on separable convolutions is used, and the new model AestheticNet is used to predict the image form effect score distribution. Secondly, this paper proposes optimization strategies that combine image expression effect theory with convolutional neural networks, covering several aspects: an improved self-weighted loss function, a two-dimensional attention mechanism (the introduction of CBAM), an adaptive pooling layer, and adaptive input. Finally, the validation set is compared with other existing image-morphological effect model algorithms, which demonstrates the effectiveness of the customized search scheme and the efficacy of the AestheticNet model compared to other algorithms by validating its prediction of public sculpture image form effect ratings. The artistic form of sculptures may be improved using intelligent and parametric design methodologies, and the image expression of sculptures may be enhanced by applying the image form effect model, which should become pervasive. We can use it to intelligently and parametrically guide the design and manufacture of sculptures.
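The two-dimensional attention mechanism referred to above is CBAM; the sketch below is a generic CBAM block in PyTorch using the commonly cited defaults (reduction ratio 16, 7 × 7 spatial kernel), not the configuration searched for AestheticNet.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Minimal CBAM-style block: channel attention (shared MLP over global
    avg/max pooled features) followed by spatial attention (7x7 conv over
    channel-wise avg/max maps)."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)      # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))             # spatial attention

feat = torch.randn(2, 64, 32, 32)
print(CBAM(64)(feat).shape)  # torch.Size([2, 64, 32, 32])
```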
Keywords: Public sculpture, intelligent parameterization, image expression, deep learning, NAS architecture
DOI: 10.3233/IDA-240497
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-21, 2024
Authors: Xu, Jie | Zhang, Luo Jia | Zhao, De Chun | Ji, Gen Lin | Li, Pei Heng
Article Type: Research Article
Abstract: Long-term time series forecasting (LTSF) has become an urgent requirement in many applications, such as wind power supply planning. This is a highly challenging task because it requires considering both the complex frequency-domain and time-domain information in long-term time series simultaneously. However, existing work only considers potential patterns in a single domain (e.g., time or frequency domain), whereas a large amount of time-frequency domain information exists in real-world LTSF. In this paper, we propose a multi-scale hierarchical network (MHNet) based on time-frequency decomposition to solve the above problem. MHNet first introduces a multi-scale hierarchical representation, extracting and learning features of time series in the time domain, and gradually builds up a global understanding and representation of the time series at different time scales, enabling the model to process time series over lengthy periods of time with lower computational complexity. Then, the robustness to noise is enhanced by employing a transformer that leverages frequency-enhanced decomposition to model global dependencies and integrates attention mechanisms in the frequency domain. Meanwhile, forecasting accuracy is further improved by designing a periodic trend decomposition module for multiple decompositions to reduce input-output fluctuations. Experiments on five real benchmark datasets show that the forecasting accuracy and computational efficiency of MHNet outperform state-of-the-art methods.
Keywords: Long-term series forecasting, multi-scale modeling, time-frequency representation, time series decomposition
DOI: 10.3233/IDA-240455
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-23, 2024
Authors: Kong, Lingkai | Zhao, Boying | Li, Hongyu | He, Wei | Cao, You | Zhou, Guohui
Article Type: Research Article
Abstract: Medical assisted decision-making plays a key role in providing accurate and reliable medical advice, but medical decision-making is often accompanied by various uncertainties. The belief rule base (BRB) has a strong nonlinear modeling capability and can handle uncertainties well. However, BRB suffers from combinatorial explosion, and its explainability tends to be affected during the optimization process. Therefore, an interval belief rule base with explainability (IBRB-e) is explored in this paper. Firstly, pre-processing using extreme gradient boosting (XGBoost) is performed to filter out features with lower importance. Secondly, based on the filtered features, an explainability criterion is defined. Thirdly, the evidence reasoning (ER) rule is chosen as the inference tool, while the projection covariance matrix adaptive evolutionary strategy (P-CMA-ES) algorithm with explainability constraints is chosen as the optimization algorithm. Lastly, the model is validated through a breast cancer case. The experimental results show that IBRB-e has good explainability while maintaining high accuracy.
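The XGBoost-based pre-processing step can be sketched as follows, using a public breast cancer dataset as a stand-in; the mean-importance threshold is an assumption rather than the rule used in IBRB-e.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from xgboost import XGBClassifier

# The paper validates on a breast cancer case, so the sklearn breast cancer
# dataset is used here as a convenient public stand-in.
X, y = load_breast_cancer(return_X_y=True)

booster = XGBClassifier(n_estimators=200, max_depth=3)
booster.fit(X, y)

# Keep only features whose importance exceeds the mean importance.
importance = booster.feature_importances_
keep = np.where(importance > importance.mean())[0]
print(f"retained {len(keep)} of {X.shape[1]} features:", keep)
X_filtered = X[:, keep]   # input for the downstream interval belief rule base
```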
Keywords: Belief rule base, decision-making, medical assistant, explainability, evidence reasoning
DOI: 10.3233/IDA-230648
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-25, 2024
Authors: Chen, Hongwei | Liu, Luanxuan | Chen, Zexi
Article Type: Research Article
Abstract: In recent years, sequential recommendation has received widespread attention for its role in enhancing user experience and driving personalized content recommendations. However, it also encounters challenges, including the limitations of modeling information and the variability of user preferences. A novel time-aware Long-Short Term Transformer (TLSTSRec) for sequential recommendation is introduced in this paper to address these challenges. TLSTSRec has two major innovative features. (1) Accurate modeling of users is achieved by fully leveraging temporal information. Time information is modeled by creating a trainable timestamp matrix from both the perspectives of time duration and time spectrum. (2) A novel time-aware Transformer model is proposed. To address the inherent variability of user preferences over time, the model combines long-term and short-term temporal information and adjusts the personalized trade-offs between long-term and short-term sequences using adaptive fusion layers. Subsequently, newly designed encoders and decoders are employed to model timestamps and interaction items. Finally, extensive experiments substantiate the effectiveness of TLSTSRec relative to various state-of-the-art sequential recommendation models based on MC/RNN/GNN/SA across a spectrum of widely used metrics. Furthermore, experiments are conducted to validate the rationality of the TLSTSRec structure.
Keywords: Sequential recommendation, transformer, self-attention, time-aware model
DOI: 10.3233/IDA-240051
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-21, 2024
Authors: Ren, Ziqian | Xun, Yaling | Cai, Jianghui | Yang, Haifeng
Article Type: Research Article
Abstract: Periodic high-utility sequential patterns (PHUSPs) mining is one of the research hotspots in data mining, which aims to discover patterns that not only have high utility but also regularly appear in sequence datasets. Traditional PHUSP mining mainly focuses on mining patterns from a single sequence, which often results in some interesting patterns being discarded due to strict constraints, and most of the discovered patterns are unstable and difficult to use for decision-making. In response to this issue, a novel algorithm called TKSPUS (top-k stable periodic high-utility sequential pattern mining) is proposed to discover stable top-k periodic high-utility sequential patterns that co-occur in multi-sequences. TKSPUS extends traditional periodic high-utility sequential pattern mining and designs two new metrics, namely the utility stability coefficient (usc) and the periodic stability coefficient (sr), to determine the periodic stability and utility stability of patterns in multi-sequences, respectively. Additionally, the TKSPUS algorithm adopts the projection mechanism to mine stable periodic high-utility patterns over multi-sequences, while a new data structure called pusc and two corresponding pruning strategies are also introduced to boost the mining process. Experiments show that compared with the other four related algorithms, the TKSPUS algorithm has better performance in memory consumption and execution time, and the stability of the mining results is improved by 47% on average compared with the traditional periodic high-utility pattern mining algorithm.
Keywords: High-utility pattern mining, periodic pattern, pattern stability, multi-sequences
DOI: 10.3233/IDA-230672
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-24, 2024
Authors: Chen, Mingcai | Du, Yuntao | Tang, Wei | Zhang, Baoming | Wang, Chongjun
Article Type: Research Article
Abstract: Real-world machine learning applications seldom provide perfect labeled data, posing a challenge in developing models robust to noisy labels. Recent methods prioritize noise filtering based on the discrepancies between model predictions and the provided noisy labels, assuming samples with minimal classification losses to be clean. In this work, we capitalize on the consistency between the learned model and the complete noisy dataset, employing the data's rich representational and topological information. We introduce LaplaceConfidence, a method that obtains label confidence (i.e., clean probabilities) by utilizing the Laplacian energy. Specifically, it first constructs graphs based on the feature representations of all noisy samples and minimizes the Laplacian energy to produce a low-energy graph. Clean labels should fit well into the low-energy graph while noisy ones should not, allowing our method to determine the data's clean probabilities. Furthermore, LaplaceConfidence is embedded into a holistic method for robust training, where a co-training technique generates unbiased label confidence and a label refurbishment technique better utilizes it. We also explore dimensionality reduction techniques to accommodate our method on large-scale noisy datasets. Our experiments demonstrate that LaplaceConfidence outperforms state-of-the-art methods on benchmark datasets under both synthetic and real-world noise. Code is available at https://github.com/chenmc1996/LaplaceConfidence.
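A rough, self-contained sketch of the underlying intuition (not the paper's optimization) is shown below: build a kNN graph over feature representations, take the graph Laplacian, and treat each sample's contribution to the Laplacian energy of the one-hot label matrix as an inverse confidence signal. The dataset, noise model, and normalization are all toy assumptions.

```python
import numpy as np
from scipy.sparse.csgraph import laplacian
from sklearn.datasets import make_blobs
from sklearn.neighbors import kneighbors_graph

# Toy features with clean cluster structure and ~20% randomly reassigned labels.
X, y_clean = make_blobs(n_samples=300, centers=3, random_state=0)
rng = np.random.default_rng(0)
y_noisy = y_clean.copy()
flip = rng.random(len(y_noisy)) < 0.2
y_noisy[flip] = rng.integers(0, 3, flip.sum())

# Symmetric kNN affinity graph over the feature representations.
W = kneighbors_graph(X, n_neighbors=10, mode="connectivity").toarray()
W = 0.5 * (W + W.T)
L = laplacian(W)                                  # L = D - W

# One-hot labels; y_c^T L y_c is the Laplacian energy of class c. A sample's
# contribution is high when its label disagrees with its neighbours, so we use
# the per-sample disagreement as an (inverse) confidence score.
Y = np.eye(3)[y_noisy]
per_sample_energy = np.einsum("ic,ij,jc->i", Y, L, Y)
confidence = 1.0 - per_sample_energy / per_sample_energy.max()

print("mean confidence (clean labels):  ", confidence[~flip].mean().round(3))
print("mean confidence (flipped labels):", confidence[flip].mean().round(3))
```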
Keywords: Learning with noisy labels, graph energy, label refurbishment
DOI: 10.3233/IDA-230818
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-17, 2024
Authors: Fabra-Boluda, Raül | Ferri, Cèsar | Hernández-Orallo, José | Ramírez-Quintana, M. José | Martínez-Plumed, Fernando
Article Type: Research Article
Abstract: The quest for transparency in black-box models has gained significant momentum in recent years. In particular, discovering the underlying machine learning technique type (or model family) from the performance of a black-box model is an important problem, both for better understanding its behaviour and for developing strategies to attack it by exploiting the weaknesses intrinsic to the learning technique. In this paper, we tackle the challenging task of identifying which kind of machine learning model is behind the predictions when we interact with a black-box model. Our innovative method involves systematically querying a black-box model (oracle) to label an artificially generated dataset, which is then used to train different surrogate models using machine learning techniques from different families (each one trying to partially approximate the oracle's behaviour). We present two approaches based on similarity measures, one selecting the most similar family and the other using a conveniently constructed meta-model. In both cases, we use both crisp and soft classifiers and their corresponding similarity metrics. By experimentally comparing all these methods, we gain valuable insights into the explanatory and predictive capabilities of our model family concept. This provides a deeper understanding of black-box models and increases their transparency and interpretability, paving the way for more effective decision making.
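The query-and-surrogate procedure can be illustrated with the scikit-learn sketch below: a toy oracle is queried on random inputs, one surrogate per candidate family is trained on the oracle's labels, and Cohen's kappa on held-out queries serves as one possible crisp similarity measure. The oracle, candidate families, and measure are illustrative choices, not the paper's exact protocol.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Pretend black box: in practice this is only reachable through queries.
X_private = rng.normal(size=(500, 5))
y_private = (X_private[:, 0] * X_private[:, 1] > 0).astype(int)   # hidden concept
oracle = RandomForestClassifier(random_state=0).fit(X_private, y_private)

# 1) Query the oracle on artificially generated inputs to label a dataset.
X_query = rng.normal(size=(2000, 5))
y_oracle = oracle.predict(X_query)

# 2) Train one surrogate per candidate family and measure agreement with the
#    oracle on held-out queries (a crisp similarity measure).
X_train, X_test = X_query[:1500], X_query[1500:]
y_train, y_test = y_oracle[:1500], y_oracle[1500:]
families = {"tree": DecisionTreeClassifier(random_state=0),
            "knn": KNeighborsClassifier(),
            "linear": LogisticRegression(max_iter=1000),
            "svm": SVC()}
scores = {name: cohen_kappa_score(y_test, m.fit(X_train, y_train).predict(X_test))
          for name, m in families.items()}
print(scores)
print("predicted family:", max(scores, key=scores.get))
```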
Keywords: Machine learning, family identification, adversarial, black-box, surrogate models
DOI: 10.3233/IDA-230707
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-21, 2024
Authors: Lu, Xiangyi | Tian, Feng | Shen, Yumeng | Zhang, Xuejun
Article Type: Research Article
Abstract: Traffic flow prediction can improve transportation efficiency, which is an important part of intelligent transportation systems. In recent years, prediction methods based on graph convolutional recurrent neural networks have been widely used in traffic flow prediction. However, in real application scenarios, the spatial dependence of graph signals changes with time, and a filter using a fixed graph displacement operator cannot accurately predict traffic flow at the current moment. To improve the accuracy of traffic flow prediction, a two-layer graph convolutional recurrent neural network based on a dynamic graph displacement operator is proposed. The framework of our proposal is to use the first layer of static graph convolutional recurrent neural network to generate the sequence wave vector of the graph displacement operator. The sequence wave vector is passed through a deconvolutional neural network to obtain the sequence dynamic graph displacement operator, and then the second-layer dynamic graph convolutional recurrent neural network is used to predict the traffic flow at the next moment. The model is evaluated on the METR-LA and PEMS-BAY datasets. Experimental results demonstrate that our model significantly outperforms other baseline models.
Keywords: Traffic flow prediction, graph convolution, deep neural network
DOI: 10.3233/IDA-230174
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-17, 2024
Authors: Hu, Yi-Chung | Wu, Geng
Article Type: Research Article
Abstract: Empirical evidence has shown that forecast combination can improve the prediction accuracy of tourism demand forecasting. This paper aimed to develop a more accurate grey forecast combination method (GFCM) with multivariate grey prediction models. In light of the practical applicability of grey prediction, which does not require any statistical test to examine the data series, this research features the use of multivariate grey models, through the genetic algorithm, to synthesize forecasts from univariate grey prediction models commonly used in tourism forecasting into composite forecasts. Empirical results showed that the proposed GFCM significantly outperformed the other combination methods considered. The results also suggested that the risk of forecast failures caused by selecting an inappropriate single model for tourism demand forecasting can be reduced by using the GFCM.
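The univariate grey models that such combination methods draw on are typically of the GM(1,1) type; the sketch below implements a standard GM(1,1) forecaster on hypothetical arrival figures. It illustrates grey prediction in general, not the authors' multivariate models or the genetic-algorithm combination step.

```python
import numpy as np

def gm11_forecast(x0, horizon=3):
    """Univariate GM(1,1) grey model: accumulate the series (AGO), fit the
    whitening equation dx1/dt + a*x1 = b by least squares, then restore
    forecasts by inverse accumulation."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                                   # accumulated series
    z1 = 0.5 * (x1[1:] + x1[:-1])                        # background values
    B = np.column_stack([-z1, np.ones(n - 1)])
    Y = x0[1:]
    a, b = np.linalg.lstsq(B, Y, rcond=None)[0]          # developing coefficient, grey input
    k = np.arange(n + horizon)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    x0_hat = np.diff(x1_hat, prepend=x1_hat[0])
    x0_hat[0] = x0[0]
    return x0_hat[n:]                                    # out-of-sample forecasts

# Hypothetical yearly tourist arrivals (in thousands).
arrivals = [580, 612, 655, 701, 742, 790]
print(gm11_forecast(arrivals, horizon=3).round(1))
```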
Keywords: Tourism demand, forecast combination, tourist arrivals, grey system, genetic algorithm
DOI: 10.3233/IDA-230565
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-14, 2024
Authors: Abidi, Mustufa Haider | Khare, Neelu | D., Preethi | Alkhalefah, Hisham | Umer, Usama
Article Type: Research Article
Abstract: The emergence of the novel COVID-19 virus has had a profound impact on global healthcare systems and economies, underscoring the imperative need for the development of precise and expeditious diagnostic tools. Machine learning techniques have emerged as a promising avenue for augmenting the capabilities of medical professionals in disease diagnosis and classification. In this research, the EFS-XGBoost classifier model, a robust approach for the classification of patients afflicted with COVID-19, is proposed. The key innovation in the proposed model lies in the Ensemble-based Feature Selection (EFS) strategy, which enables the judicious selection of relevant features from the expansive COVID-19 dataset. Subsequently, the power of the eXtreme Gradient Boosting (XGBoost) classifier to make precise distinctions among COVID-19-infected patients is harnessed. The EFS methodology amalgamates five distinctive feature selection techniques, encompassing correlation-based, chi-squared, information gain, symmetric uncertainty-based, and gain ratio approaches. To evaluate the effectiveness of the model, comprehensive experiments were conducted using a COVID-19 dataset procured from Kaggle, and the implementation was executed using Python programming. The performance of the proposed EFS-XGBoost model was gauged by employing well-established metrics that measure classification accuracy, including accuracy, precision, recall, and the F1-score. Furthermore, an in-depth comparative analysis was conducted by considering the performance of the XGBoost classifier under various scenarios: employing all features within the dataset without any feature selection technique, and utilizing each feature selection technique in isolation. The meticulous evaluation reveals that the proposed EFS-XGBoost model excels in performance, achieving an astounding accuracy rate of 99.8%, surpassing the efficacy of other prevailing feature selection techniques. This research not only advances the field of COVID-19 patient classification but also underscores the potency of ensemble-based feature selection in conjunction with the XGBoost classifier as a formidable tool in the realm of medical diagnosis and classification.
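A compressed sketch of the ensemble feature selection idea is given below: three of the five rankers (chi-squared, information gain via mutual information, and absolute correlation) are aggregated by mean rank and the top features are fed to XGBoost. The scikit-learn breast cancer data stands in for the Kaggle COVID-19 dataset, and the ranker subset, aggregation rule, and top-k value are assumptions.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import chi2, mutual_info_classif
from sklearn.preprocessing import MinMaxScaler
from xgboost import XGBClassifier

# Public stand-in for the Kaggle COVID-19 dataset used in the paper.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Three of the five rankers; symmetric uncertainty and gain ratio would be
# added the same way, each contributing one more ranking column.
X_pos = MinMaxScaler().fit_transform(X)            # chi2 needs non-negative input
ranks = pd.DataFrame({
    "chi2": pd.Series(chi2(X_pos, y)[0], index=X.columns).rank(ascending=False),
    "info_gain": pd.Series(mutual_info_classif(X, y, random_state=0),
                           index=X.columns).rank(ascending=False),
    "correlation": X.corrwith(y).abs().rank(ascending=False),
})

# Aggregate the rankings and keep the top-k features for XGBoost.
top = ranks.mean(axis=1).nsmallest(10).index
model = XGBClassifier(n_estimators=200).fit(X[top], y)
print("selected features:", list(top))
```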
Keywords: COVID-19, machine learning, classification, ensemble-based feature selection, XGBoost
DOI: 10.3233/IDA-230854
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-18, 2024
Authors: Kim, Dokyun | Cho, Sukhyun | Chae, Heewoong | Park, Jonghun | Huh, Jaeseok
Article Type: Research Article
Abstract: While time series data are prevalent across diverse sectors, the data labeling process still remains resource-intensive. This results in a scarcity of labeled data for deep learning, emphasizing the importance of semi-supervised learning techniques. Applying semi-supervised learning to time series data presents unique challenges due to its inherent temporal complexities. Efficient contrastive learning for time series requires specialized methods, particularly in the development of tailored data augmentation techniques. In this paper, we propose a single-step, semi-supervised contrastive learning framework named nearest neighbor contrastive learning for time series (NNCLR-TS). Specifically, the proposed framework incorporates a support set to store representations, including their label information, enabling a pseudo-labeling of the unlabeled data based on nearby samples in the latent space. Moreover, our framework presents a novel data augmentation method, which selectively augments only the trend component of the data, effectively preserving their inherent periodic properties and facilitating effective training. For training, we introduce a novel contrastive loss that utilizes the nearest neighbors of augmented data for positive and negative representations. By employing our framework, we unlock the ability to attain high-quality embeddings and achieve remarkable performance in downstream classification tasks, tailored explicitly for time series. Experimental results demonstrate that our method outperforms state-of-the-art approaches across various benchmarks, validating the effectiveness of our proposed method.
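One simple way to realize a trend-only augmentation, sketched below, is to estimate the trend with a centered moving average, jitter only the trend, and recombine it with the untouched residual; the window size and noise level are assumptions, and the paper's actual augmentation may differ.

```python
import numpy as np

def augment_trend_only(x, window=25, jitter_std=0.05, seed=0):
    """Decompose a series into trend (centered moving average) and residual,
    perturb only the trend, and recombine, so periodic structure carried by
    the residual is left untouched."""
    rng = np.random.default_rng(seed)
    pad = window // 2
    padded = np.pad(x, pad, mode="edge")
    trend = np.convolve(padded, np.ones(window) / window, mode="valid")[: len(x)]
    residual = x - trend
    trend_aug = trend + rng.normal(0.0, jitter_std * x.std(), size=len(x))
    return trend_aug + residual

t = np.arange(500)
series = 0.01 * t + np.sin(2 * np.pi * t / 50)        # trend + periodic component
view = augment_trend_only(series)
print(np.corrcoef(series, view)[0, 1].round(4))       # views stay highly correlated
```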
Keywords: Deep learning, machine learning, representation learning, self-supervised learning, semi-supervised learning, time series analysis
DOI: 10.3233/IDA-240002
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-25, 2024
Authors: Zhang, Haifei | Ge, Hongwei | Li, Ting | Zhou, Lujie | Su, Shuzhi | Tong, Yubing
Article Type: Research Article
Abstract: In order to alleviate urban congestion, improve vehicle mobility, and improve logistics delivery efficiency, this paper establishes a practical multi-objective and multi-constraint logistics delivery mathematical model based on graphs, and proposes a solution algorithm framework that combines a decomposition strategy and deep reinforcement learning (DRL). Firstly, taking into account practical constraints such as customer distribution, vehicle load constraints, and time windows in urban logistics distribution regions, a multi-constraint and multi-objective urban logistics distribution mathematical model was established with the goal of minimizing the total length, cost, and maximum makespan of urban logistics distribution paths. Secondly, based on the decomposition strategy, a DRL framework for optimizing urban logistics delivery paths based on a Graph Capsule Network (G-Caps Net) was designed. This framework takes the node information of the VRP as input in the form of a 2D graph, modifies the graph attention capsule network by considering multi-layer features, edge information, and residual connections between layers in the graph structure, and replaces probability calculation with the module length of the capsule vector as output. Then, the baseline REINFORCE algorithm with rollout is used for network training, and a 2-opt local search strategy and a sampling search strategy are used to improve the quality of the solution. Finally, the performance of the proposed method was evaluated on standard instances of different scales. The experimental results showed that the constructed model and solution framework can improve logistics delivery efficiency. This method achieved the best comprehensive performance, surpassing the most advanced existing methods, and has great potential in practical engineering.
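The 2-opt local search mentioned above is a standard route-improvement step; the sketch below applies it to a small random closed tour. The instance, distance matrix, and naive initial tour are hypothetical.

```python
import numpy as np

def route_length(route, dist):
    return sum(dist[route[i], route[i + 1]] for i in range(len(route) - 1))

def two_opt(route, dist):
    """Classic 2-opt: repeatedly reverse a segment whenever doing so shortens
    the route, until no improving move remains."""
    best = list(route)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 2):
            for j in range(i + 1, len(best) - 1):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if route_length(candidate, dist) < route_length(best, dist):
                    best, improved = candidate, True
    return best

rng = np.random.default_rng(0)
points = rng.random((12, 2))                              # depot + 11 customers
dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
initial = list(range(12)) + [0]                           # naive closed tour
refined = two_opt(initial, dist)
print(round(route_length(initial, dist), 3), "->", round(route_length(refined, dist), 3))
```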
Keywords: Urban logistics distribution, multi-objective optimization, deep reinforcement learning, decomposition strategy, graph capsule network, attention mechanism
DOI: 10.3233/IDA-230480
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-28, 2024
Authors: Liu, Shi-Tong | Liu, Yong | Ding, Jia-Ming
Article Type: Research Article
Abstract: Product ranking based on online reviews is often based only on initial reviews and does not consider additional consumer reviews, yet additional review information can sometimes directly affect consumers' final decisions. To fully characterize the rich emotional preferences of consumers embedded in two-stage online customer review information, considering consumers' individual preferences and objective product evaluation information, we construct a combination weighting method to calculate comprehensive weights of product attributes, and then exploit the sentiment analysis technique, the interval-valued probabilistic linguistic term set (IVPLTS), and the preference ranking organization method for enrichment evaluations (PROMETHEE) to establish a product ranking method based on compound reviews, which is then used to identify the sentiment orientation of the reviews and derive the ranking results. Finally, a real-life case illustrates a real-world application of the proposed method.
Keywords: Product ranking, additional review information, sentiment analysis, interval-valued probabilistic linguistic term set, customer reviews information
DOI: 10.3233/IDA-230865
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-23, 2024
Authors: Zhao, Zhaolin | Bo, Kaiming | Hsu, Chih-Yu | Liao, Lyuchao
Article Type: Research Article
Abstract: With the rapid development of unmanned aerial vehicle (UAV) technology and computer vision, real-time object detection in UAV aerial images has become a current research hotspot. However, the detection tasks in UAV aerial images face challenges such as disparate object scales, numerous small objects, and mutual occlusion. To address these issues, this paper proposes the ASM-YOLO model, which enhances the original model by replacing the Neck part of YOLOv8 with efficient bidirectional cross-scale connections and adaptive feature fusion (ABiFPN). Additionally, a Structural Feature Enhancement Module (SFE) is introduced to inject features extracted by the backbone network into the Neck part, enhancing inter-network information exchange. Furthermore, the MPDIoU bounding box loss function is employed to replace the original CIoU bounding box loss function. A series of experiments was conducted on the VisDrone-DET dataset, and comparisons were made with the baseline network YOLOv8s. The experimental results demonstrate that the proposed model achieved reductions of 26.1% and 24.7% in parameter count and model size, respectively. Additionally, during testing on the evaluation set, the proposed model exhibited improvements of 7.4% and 4.6% in the AP50 and mAP metrics, respectively, compared to the YOLOv8s baseline model, thereby validating the practicality and effectiveness of the proposed model. Subsequently, the generalizability of the algorithm was validated on the DOTA and DIOR datasets, which share similarities with aerial images captured by drones. The experimental results indicate significant enhancements on both datasets.
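A sketch of an MPDIoU-style bounding box loss is given below, following the commonly cited formulation (IoU penalized by the image-normalized squared distances between corresponding top-left and bottom-right corners); the box format and normalization choice are assumptions, and this is not the exact ASM-YOLO implementation.

```python
import torch

def mpdiou_loss(pred, target, img_w, img_h, eps=1e-7):
    """Sketch of an MPDIoU-style box loss: IoU penalized by the normalized
    squared distances between the two boxes' top-left and bottom-right
    corners. Boxes are (x1, y1, x2, y2); normalization uses the image size."""
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    d2_tl = (pred[:, 0] - target[:, 0]) ** 2 + (pred[:, 1] - target[:, 1]) ** 2
    d2_br = (pred[:, 2] - target[:, 2]) ** 2 + (pred[:, 3] - target[:, 3]) ** 2
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d2_tl / norm - d2_br / norm
    return (1.0 - mpdiou).mean()

pred = torch.tensor([[10., 10., 60., 80.]])
target = torch.tensor([[12., 14., 58., 84.]])
print(mpdiou_loss(pred, target, img_w=640, img_h=640))
```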
Keywords: Computer vision, drone aerial images, multi-scale object detection, real-time object detection, feature fusion
DOI: 10.3233/IDA-230929
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-22, 2024
Authors: Wang, Xin | Zhang, Yong | Xu, Junfeng | Gao, Jun
Article Type: Research Article
Abstract: Capturing images through semi-reflective surfaces, such as glass windows and transparent enclosures, often leads to a reduction in visual quality and can adversely affect the performance of computer vision algorithms. As a result, image reflection removal has garnered significant attention among computer vision researchers. With the growing application of deep learning methods in various computer vision tasks, such as super-resolution, inpainting, and denoising, convolutional neural networks (CNNs) have become an increasingly popular choice for image reflection removal. The purpose of this paper is to provide a comprehensive review of learning-based algorithms designed for image reflection removal. Firstly, we provide an overview of the key terminology and essential background concepts in this field. Next, we examine various datasets and data synthesis methods to assist researchers in selecting the most suitable options for their specific needs and targets. We then review existing methods with qualitative and quantitative results, highlighting their contributions and significance in this field. Finally, some considerations about challenges and future scope in image reflection removal techniques are discussed.
Keywords: Deep learning, reflection removal, reflection separation, systematic literature review
DOI: 10.3233/IDA-230904
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-27, 2024
Authors: Devi, M. Shyamala | Aruna, R. | Almufti, Saman | Punitha, P. | Kumar, R. Lakshmana
Article Type: Research Article
Abstract: Bones collaborate with muscles and joints to sustain and maintain our freedom of mobility. The proper musculoskeletal activity of bone protects and strengthens the brain, heart, and lung function. When a bone is subjected to a force greater than its structural capacity, it fractures. Bone fractures should be detected with the appropriate type and should be treated early to avoid acute neurovascular complications. The manual detection of bone fracture may lead to highly delayed complications like malunion, joint stiffness, contractures, myositis ossificans, and avascular necrosis. A proper classification system must be integrated with deep learning technology to classify bone fractures accurately. This motivates us to propose a Systematized Attention Gate UNet (SAG-UNet) that classifies the type of bone fracture with high accuracy. The main contribution of this research is two-fold. The first contribution focuses on dataset preprocessing through feature extraction using unsupervised learning by adapting the Growing Neural Gas (GNG) method. The second contribution deals with refining the supervised learning Attention UNet model that classifies the ten types of bone fracture. The attention gate of the Attention UNet model is refined and applied to the upsampling decoding layer of Attention UNet. The Kaggle Bone Break Classification dataset was processed to extract only the essential features using GNG extraction. The quantized significant-feature RGB X-ray images were divided into 900 training and 230 testing images in the ratio of 80:20. The training images were fitted with existing CNN models like DenseNet, VGG, AlexNet, MobileNet, EfficientNet, Inception, Xception, UNet and Attention UNet to choose the best CNN model. Experiment results show that Attention UNet offers the classification of bone fractures with an accuracy of 89% when testing bone break images. Attention UNet was therefore chosen for refining the attention gate of the decoding upsampling layer that occurs after the encoding layer. The attention gate of the proposed SAG-UNet forms the gating coefficient from the input feature map and gate signal. The gating coefficient is then processed with batch normalization that centers the aligned features in the active region, thereby leaving the focus on the unaligned weights of the feature maps. Then, the ReLU activation function is applied to introduce nonlinearity in the aligned features, thereby learning the complex representation in the feature vector. Then, dropout is used to exclude the error noise in the aligned weights of the feature map. Then, a 1 × 1 linear convolution transformation is done to form the vector concatenation-based attention feature map. This vector is applied to the sigmoid activation to create the attention coefficient feature map with weights assigned as ‘1’ for the aligned features. The attention coefficient feature map is grid-resampled using trilinear interpolation to form the spatial attention weight map, which is passed to the skip connection of the next decoding layer. The implementation results reveal that the proposed SAG-UNet deep learning model classifies the bone fracture types with a high accuracy of 98.78% compared to the existing deep learning models.
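The refined attention gate described above can be pictured with the PyTorch sketch below: 1 × 1 projections of the skip feature map and gating signal, batch normalization, ReLU, dropout, a 1 × 1 convolution with sigmoid to form the attention coefficients, and trilinear resampling before gating the skip connection. The 3D/trilinear choice mirrors the abstract's wording; channel sizes, the dropout rate, and the exact operation order are assumptions rather than the SAG-UNet specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGate3D(nn.Module):
    """Sketch of an attention gate along the lines described in the abstract:
    project the skip feature map x and gating signal g with 1x1x1 convolutions,
    combine them, apply BatchNorm, ReLU and dropout, map to a single-channel
    coefficient map via sigmoid, and trilinearly resample to x's resolution
    before gating the skip connection."""
    def __init__(self, x_channels, g_channels, inter_channels, p_drop=0.1):
        super().__init__()
        self.theta_x = nn.Conv3d(x_channels, inter_channels, kernel_size=1)
        self.phi_g = nn.Conv3d(g_channels, inter_channels, kernel_size=1)
        self.bn = nn.BatchNorm3d(inter_channels)
        self.drop = nn.Dropout3d(p_drop)
        self.psi = nn.Conv3d(inter_channels, 1, kernel_size=1)

    def forward(self, x, g):
        g_up = F.interpolate(self.phi_g(g), size=x.shape[2:],
                             mode="trilinear", align_corners=False)
        a = self.drop(F.relu(self.bn(self.theta_x(x) + g_up)))
        alpha = torch.sigmoid(self.psi(a))            # attention coefficients
        return x * alpha                              # gated skip connection

x = torch.randn(2, 32, 16, 32, 32)                    # encoder skip feature
g = torch.randn(2, 64, 8, 16, 16)                     # coarser gating signal
print(AttentionGate3D(32, 64, 16)(x, g).shape)        # torch.Size([2, 32, 16, 32, 32])
```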
Keywords: Activation, attention gate, CNN, classification, convolution, dropout, feature map, normalization, ReLU
DOI: 10.3233/IDA-240431
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-29, 2024
Authors: Anbarasan, M. | Ramesh, K.
Article Type: Research Article
Abstract: The pharmaceutical supply chain, which ensures that drugs are accessible to patients in a trusted process, is a complex arrangement in the healthcare industry. For that, a secure pharmachain framework is proposed. Primarily, the users register their details. Then, the details are converted into cipher text and stored in the blockchain. If a user requests an order, the manufacturer receives the request, and the order is handed to the distributor. Labeling is performed through Hypergeometric Distribution Centroid Selection K-Medoids Clustering (HDCS-KMC) to track the drugs. The healthcare pharmachain architecture uses IoT to control the supply chain and provide safe medication tracking. The framework includes security with a classifier and a block-mining consensus method, boosts performance with a decision controller, and protects user and medication information with encryption mechanisms. After that, the drugs are assigned to vehicles, where the vehicle ID and Internet of Things (IoT) sensor data are collected and pre-processed. Afterward, the pre-processed data is analyzed in the fog node by utilizing a decision controller. Now, the status ID is generated based on vehicle ID and location. The generated status ID is meant for fragmentation, encryption, and block-mining processes. If a user requests to view the drug's status ID, then the user needs to be authenticated. The user's forking behavior and request activities are extracted and given to the classifier present in the block-mining consensus algorithm for authentication purposes. Block mining happens after authentication, thereby providing the status ID. Furthermore, the framework demonstrates efficacy in identifying attacks with a low False Positive Rate (FPR) of 0.022483% and a low False Negative Rate (FNR) of 1.996008%. Additionally, compared to traditional methods, the suggested strategy exhibits good precision (97.869%), recall (97.0039%), accuracy (98%), and F-measure (97.999%).
Keywords: Double Transposed-Prime Key-Columnar Transposition Cipher (DT-PK-CTC), Internet of Things (IoT), Hypergeometric Distribution Centroid Selection K-Medoids Clustering (HDCS-KMC), healthcare, pharmachain, Radial Basis Function (RBF)
DOI: 10.3233/IDA-240087
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-25, 2024
Authors: S, Sharmiladevi | S, Siva Sathya
Article Type: Research Article
Abstract: Air pollution is an alarming problem in many cities and countries around the globe. The ability to forecast air pollutant levels plays a crucial role in implementing necessary prevention measures to curb its effects in advance. There are many statistical, machine learning, and deep learning models available to predict air pollutant values, but only a limited number of models take into account the spatio-temporal factors that influence pollution. In this study, a novel deep learning model that is augmented with Spatio-Temporal Co-Occurrence Patterns (STEEP) is proposed. The deep learning model uses the Closed Spatio-Temporal Co-Occurrence Pattern mining (C-STCOP) algorithm to extract non-redundant/closed patterns and the Diffusion Convolution Recurrent Neural Network (DCRNN) for time series prediction. By constructing a graph based on the co-occurrence patterns obtained from C-STCOP, the proposed model effectively addresses the spatio-temporal association among monitoring stations. Furthermore, the sequence-to-sequence encoder-decoder architecture captures the temporal dependencies within the time series data. The STEEP model is evaluated using the Delhi air pollutants dataset and shows an average improvement of 8%–13% in the RMSE, MAE, and MAPE metrics compared to the baseline models.
Keywords: PM2.5 pollutant, spatio-temporal patterns, deep learning, time series prediction, Delhi, air pollution
DOI: 10.3233/IDA-240028
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-18, 2024
Authors: Jain, Sachin | Jain, Vishal
Article Type: Research Article
Abstract: There has been extensive use of machine learning (ML) based tools for mathematical symbol and phrase categorization and prediction. Aiming to thoroughly analyze the existing methods for categorizing brain tumors, this paper considers both machine-learning and non-machine-learning approaches. From 2013 to 2023, the writers compiled and reviewed research papers on brain tumor detection. Wiley, IEEE Xplore, ScienceDirect, Scopus, the ACM Digital Library, and others provide the relevant data. A systematic literature review examines the efficacy of research methodologies over the last ten years or more by compiling relevant publications and studies from various sources. Accuracy, sensitivity, specificity, and computing efficiency are some of the criteria that researchers use to evaluate these methods. The availability of labeled data, the required degree of automation and accuracy in the classification process, and the unique dataset are generally the deciding factors in the method choice. This work integrates previous research findings to summarize the current state of brain tumor categorization. This paper summarizes 169 research papers in brain tumor detection between 2013 and 2023 and explores the application and development of machine learning methods in brain tumor detection, which has significant research implications and value in the field of brain tumor classification research. All research findings of previous studies are arranged in this paper in a research question and answer format.
Keywords: Brain tumor classification, machine learning, deep learning, SVM, CNN
DOI: 10.3233/IDA-240069
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-32, 2024
Authors: He, Xinyu | Kan, Manfei | Ren, Yonggong
Article Type: Research Article
Abstract: Relation extraction is one of the core tasks of natural language processing, which aims to identify entities in unstructured text and judge the semantic relationships between them. In traditional methods, the extraction of rich features and the judgment of complex semantic relations are inadequate. Therefore, in this paper, we propose a relation extraction model, HAGCN, based on a heterogeneous graph convolutional neural network and a graph attention mechanism. We construct two different types of nodes, words and relations, in the heterogeneous graph convolutional neural network, which are used to extract different semantic types and attributes and further extract contextual semantic representations. By incorporating the graph attention mechanism to distinguish the importance of different information, the model gains stronger representation ability. In addition, an information update mechanism is designed in the model. Relation extraction is performed after iteratively fusing the node semantic information to obtain a more comprehensive node representation. The experimental results show that the HAGCN model achieves good relation extraction performance, with an F1 value of 91.51% on the SemEval-2010 Task 8 dataset. In addition, the HAGCN model also performs well on the WebNLG dataset, verifying the generalization ability of the model.
Keywords: Relation extraction, heterogeneous graph convolution, graph attention, information update
DOI: 10.3233/IDA-240083
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-17, 2024
Authors: Anwar, Muhammad | He, Zhiquan | Cao, Wenming
Article Type: Research Article
Abstract: At the core of Deep Learning-based Deformable Medical Image Registration (DMIR) lies a strong foundation. Essentially, this network compares features in two images to identify their mutual correspondence, which is necessary for precise image registration. In this paper, we use three novel techniques to improve the registration process and enhance the alignment accuracy between medical images. First, we propose cross attention over multiple layers of pairs of images, allowing us to extract the correspondences between them at different levels and improve registration accuracy. Second, we introduce a skip connection with residual blocks between the encoder and decoder, helping information flow and enhancing overall performance. Third, we propose the utilization of cascade attention with residual block skip connections, which enhances information flow and empowers feature representation. Experimental results on the OASIS and LPBA40 datasets show the effectiveness and superiority of our proposed mechanism. These novelties contribute to the enhancement of 3D DMIR based on unsupervised learning, with potential implications in clinical practice and research.
Keywords: Deformable medical image registration, similarity measures, deep learning, convolutional neural networks
DOI: 10.3233/IDA-230692
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-19, 2024
Authors: Strada, Silvia | Costantini, Emanuele | Formentin, Simone | Savaresi, Sergio M.
Article Type: Research Article
Abstract: The Usage-Based Insurance paradigm, which has received a lot of attention in recent years, envisages computing the car policy premium based on accident risk probability, evaluated by observing past driving history and habits. However, Usage-Based Insurance strategies are usually based on simple empirical decision rules built on travelled distance. The development of intelligent systems for smart risk prediction using the stored overall driving behaviour, without the need for other insurance or socio-demographic information, is still an open challenge. This work aims at exploring a comprehensive machine learning-based approach solely based on driving-related data of private vehicles. The anonymized dataset employed in this study is provided by the telematics company UnipolTech, and contains space/time densely measured data related to trips of almost 100000 vehicles uniformly spread over the Italian territory, recorded every 2 km by on-board fixed telematics devices (black boxes), from February 2018 to February 2020. An innovative feature engineering process is proposed, with the aim of uncovering novel informative quantities able to disclose complex aspects of driving behaviour. Recent and powerful learning techniques are explored to develop advanced predictive models, able to provide a reliable accident probability for each vehicle, automatically managing the critical class imbalance intrinsic to this kind of dataset.
Keywords: Mobility data, usage-based insurance, machine learning, driving behavior, accident risk prediction
DOI: 10.3233/IDA-230971
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-18, 2024
Authors: P, ThamilSelvi C | S, Vinoth Kumar | Asaad, Renas Rajab | Palanisamy, Punitha | Rajappan, Lakshmana Kumar
Article Type: Research Article
Abstract: Technological developments in medical image processing have created a state-of-the-art framework for accurately identifying and classifying brain tumors. To improve the accuracy of brain tumor segmentation, this study introduces VisioFlow FusionNet, a robust neural network architecture that combines the best features of DeepVisioSeg and SegFlowNet. The proposed system uses deep learning to identify the cancer site from medical images and provides doctors with valuable information for diagnosis and treatment planning. This combination provides a synergistic effect that improves segmentation performance and addresses challenges encountered across various tumor shapes and sizes. In parallel, robust brain tumor classification is achieved using NeuraClassNet, a classification component optimized with a dedicated catfish optimizer. NeuraClassNet's convergence and generalization capabilities are powered by the catfish optimizer, which draws inspiration from the adaptive properties of aquatic predators. By complementing a comprehensive diagnostic pipeline, this classification module helps clinicians accurately classify brain tumors based on various morphological and histological features. The proposed framework outperforms current approaches regarding segmentation accuracy (99.2%) and loss (2%) without overfitting.
Keywords: VisioFlow FusionNet, brain tumor segmentation, NeuraClassNet, cat fish optimizer, medical image analysis, deep learning
DOI: 10.3233/IDA-240108
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-26, 2024
Authors: Megala, G. | Swarnalatha, P.
Article Type: Research Article
Abstract: Video grounding intends to perform temporal localization in multimedia information retrieval. The temporal bounds of the target video span are determined for the given input query. A novel interactive multi-head self-attention (IMSA) transformer is proposed to localize an unseen moment in an untrimmed video for a given image. A new semantic-trained self-supervised approach is considered in this paper to perform cross-domain learning to match the image query with the video segment. It normalizes the convolution function, enabling efficient correlation and collection of semantically related video segments across time based on the image query. A double hostile contrastive learning method with Gaussian distribution parameters is advanced to learn video representations. The proposed approach operates dynamically on various video components to achieve exact semantic synchronization and localization between queries and video. In the proposed approach, the IMSA model localizes frames well compared to other approaches. Experiments on benchmark datasets show that the proposed model can significantly increase temporal grounding accuracy. The moment occurrence is identified in the video with start and end boundaries, achieving an average recall of 86.45% and a mAP of 59.3%.
Keywords: Contrastive learning, gaussian parameter, self-attention transformer, temporal localization, video grounding
DOI: 10.3233/IDA-240138
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-18, 2024
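For readers unfamiliar with how such moment-localization results are scored, here is an illustrative sketch of temporal IoU and Recall@IoU. The threshold and toy predictions are made up for the example; this is not the paper's evaluation code.

```python
# Illustrative sketch: temporal IoU and Recall@IoU for moment localization.
# Thresholds and toy (start, end) predictions are assumptions for the example.
import numpy as np

def temporal_iou(pred: tuple, gt: tuple) -> float:
    """IoU between two (start, end) segments in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    return inter / union if union > 0 else 0.0

def recall_at_iou(preds, gts, threshold: float = 0.5) -> float:
    hits = [temporal_iou(p, g) >= threshold for p, g in zip(preds, gts)]
    return float(np.mean(hits))

preds = [(2.0, 9.5), (14.0, 20.0), (31.0, 38.0)]
gts   = [(3.0, 10.0), (15.0, 22.0), (40.0, 45.0)]
print("R@0.5:", recall_at_iou(preds, gts, 0.5))   # 2 of 3 predictions hit, about 0.67
```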
Authors: K, Vijay | Jayashree, K.
Article Type: Research Article
Abstract: Content-Based Image Retrieval (CBIR) uses sophisticated algorithms to analyze visual attributes and retrieve relevant images from large databases. CBIR is essential to a privacy-preserving feature extraction and protection method for outsourced image representation. SecureImageSec combines essential methods with the system's key entities to ensure secure, private, and protected image feature processing during outsourcing. For the system to be implemented effectively, these techniques must be seamlessly integrated across critical entities: the client, the outsourced cloud server, the secure feature protection component, the privacy-preserving communication component, access control and authorization, and system integration and evaluation. The client entity initiates outsourcing using advanced encryption techniques to protect privacy. SecureImageSec protects outsourced data using techniques such as Fully Homomorphic Encryption (FHE) and Secure Multi-Party Computation (SMPC). Cloud servers host secure feature protection entities and preserve the privacy and security of outsourced features. SecureImageSec uses AES and FPE to protect the data format, and its cloud-outsourced privacy-preserving communication relies on SSL/TLS and QKD to protect data transmission. Attribute-Based Encryption (ABE) and Functional Encryption (FE) in SecureImageSec limit access to outsourced features based on user attributes and allow fine-grained access control over decrypted data. SecureImageSec's Information Leakage Rate (ILR) of 0.02 for a 1000-feature dataset shows its efficacy. SecureImageSec also achieves 4.5 bits of entropy, ensuring strong cryptographic strength and randomness of the encrypted feature set. Finally, SecureImageSec provides secure and private feature extraction and protection, including CBIR capabilities, for image representation outsourcing.
Keywords: Content-based image retrieval, Homomorphic Encryption, SecureImageSec, Quantum Key Distribution, Cloud computing
DOI: 10.3233/IDA-240265
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-22, 2024
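As a small illustration of the symmetric-encryption step such a pipeline involves, the sketch below encrypts a feature vector with AES-GCM using the `cryptography` package before it would be outsourced. This is only a generic AES example under assumed names (the feature vector and associated-data tag are placeholders); it is not the SecureImageSec protocol and shows none of the FHE, SMPC, FPE, ABE, or QKD components.

```python
# Hedged sketch: AES-GCM protection of an extracted feature vector before
# outsourcing. Generic example only, not the SecureImageSec protocol.
import os
import numpy as np
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

features = np.random.rand(512).astype(np.float32)   # stand-in CBIR feature vector

key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)
nonce = os.urandom(12)                               # 96-bit nonce, never reused per key
ciphertext = aesgcm.encrypt(nonce, features.tobytes(), b"image-0001")

# The client keeps the key; the cloud stores only (nonce, ciphertext).
recovered = np.frombuffer(aesgcm.decrypt(nonce, ciphertext, b"image-0001"),
                          dtype=np.float32)
assert np.array_equal(features, recovered)
print("ciphertext bytes:", len(ciphertext))
```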
Authors: Shantal, Mohammed | Othman, Zalinda | Abu Bakar, Azuraliza
Article Type: Research Article
Abstract: Missing data is one of the challenges a researcher encounters while attempting to draw information from data, and the first step in addressing it is preparing the data for processing. Much effort has been made in this area; removing instances with missing data is a popular handling method, but it has drawbacks, including bias, and can negatively affect the results. How missing values should be handled depends on several factors, including data types, missing rates, and missing mechanisms, which cover missing data patterns as well as missing at random, missing completely at random, and missing not at random. Other approaches rely on imputation techniques, which fall into categories such as statistical and machine learning methods. One strategy to improve a model's output is to weight the feature values to improve the performance of classification or regression approaches. This research develops a new imputation technique called correlation coefficient min-max weighted imputation (CCMMWI), which combines the correlation coefficient and min-max normalization techniques to balance the feature values. The proposed technique seeks to increase the contribution of features by considering how they relate to the target. To assess the findings, we evaluated several established techniques, including the statistical techniques mean and EM imputation, the machine learning imputation techniques k-NNI and MICE, and the imputation techniques CBRL, CBRC, and ExtraImpute. We use datasets of various sizes, missing rates, and random patterns. Finally, we compare the imputed datasets with the original data and assess the results using the root mean squared error (RMSE), mean absolute error (MAE), and R2. According to the findings, the proposed CCMMWI performs better than most other solutions in practically all missing-rate scenarios.
Keywords: Missing data, imputation method, feature weighting, correlation coefficient, data standardization, Min-Max normalization, classification method, regression method
DOI: 10.3233/IDA-230140
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-15, 2024
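To make the general idea of correlation-weighted, min-max-normalized imputation concrete, here is a hedged sketch that weights min-max-scaled features by their absolute correlation with a target column and then applies kNN imputation in the weighted space. The function name, the kNN choice, and the toy data are assumptions for illustration; this approximates the idea rather than reproducing the authors' exact CCMMWI algorithm.

```python
# Hedged sketch of correlation-weighted, min-max-normalized imputation.
# Approximates the idea behind CCMMWI; not the authors' exact algorithm.
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

def ccmmwi_like_impute(df: pd.DataFrame, target: str, k: int = 5) -> pd.DataFrame:
    X = df.drop(columns=[target])
    # 1. Min-max normalize each feature to [0, 1] (NaNs are ignored by min/max).
    X_norm = (X - X.min()) / (X.max() - X.min())
    # 2. Weight features by |Pearson correlation| with the target so that
    #    strongly related features dominate the neighbour distances.
    w = X_norm.apply(lambda col: abs(col.corr(df[target]))).fillna(0.0)
    X_weighted = X_norm * w
    # 3. kNN imputation in the weighted space, then undo weighting/normalization.
    imputed = KNNImputer(n_neighbors=k).fit_transform(X_weighted)
    X_back = pd.DataFrame(imputed, columns=X.columns, index=X.index)
    X_back = X_back / w.replace(0.0, 1.0)                # undo weights
    X_back = X_back * (X.max() - X.min()) + X.min()      # undo min-max scaling
    return pd.concat([X_back, df[target]], axis=1)

# Toy frame with missing values.
df = pd.DataFrame({"a": [1.0, 2.0, np.nan, 4.0],
                   "b": [10.0, np.nan, 30.0, 40.0],
                   "y": [0.0, 1.0, 1.0, 0.0]})
print(ccmmwi_like_impute(df, target="y", k=2))
```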
Authors: Liao, Shu-Hsien | Widowati, Retno | Liao, Shu-Ting
Article Type: Research Article
Abstract: A recommender system is an information filtering system used to predict a user's rating or preference for an item. Dietary preferences are often influenced by etiquette and culture, such as appetite, the selection of ingredients, menu development, cooking methods, choice of tableware, seating arrangement of diners, order of eating, etc. Food delivery is a courier service in which restaurants, stores, or independent delivery companies deliver food to customers. With continuous advances in information systems and data science, recommender systems are gradually developing towards intentional and behavioral recommendations. Behavioral recommendation is an extension of peer-to-peer recommendation, where merchants find the people who want to buy the product and deliver it. Intentional recommendation is a mindset that seeks to understand the life of consumers by continuously collecting information about their actions on the internet and displaying events and information that match their life and purchase preferences. This study considers data targeting to be a method by which food delivery service platforms can understand consumers' dietary preferences and individual lifestyles, so that the platform can effectively recommend food to consumers. Thus, this study implements a two-stage data mining analysis, comprising clustering analysis and association rules, on Taiwanese food consumers (n = 2,138) to investigate dietary and food delivery service behaviors and preferences and to find knowledge profiles/patterns/rules for intentional and behavioral food recommendations. Finally, a discussion and implications are presented.
Keywords: Dietary preference, food delivery services, food intentional and behavioral recommendation, clustering analysis, association rules
DOI: 10.3233/IDA-240664
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-29, 2024
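The following is an illustrative, hypothetical sketch of such a two-stage analysis: k-means clustering of respondents followed by simple one-to-one association rules mined within each cluster. The survey columns, thresholds, and rule-mining shortcut (plain co-occurrence counts instead of a full Apriori run) are assumptions, not the study's design.

```python
# Illustrative two-stage sketch: k-means segmentation of survey respondents,
# then simple "A -> B" association rules mined inside each cluster.
# Column names and thresholds are hypothetical, not taken from the study.
from itertools import permutations
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
n = 500
survey = pd.DataFrame({
    "orders_per_week": rng.poisson(3, n),
    "avg_spend":       rng.normal(15, 5, n),
    "prefers_spicy":   rng.random(n) < 0.4,
    "orders_dessert":  rng.random(n) < 0.3,
    "late_night":      rng.random(n) < 0.25,
})

# Stage 1: cluster consumers on numeric ordering behaviour.
km = KMeans(n_clusters=3, n_init=10, random_state=1)
survey["cluster"] = km.fit_predict(survey[["orders_per_week", "avg_spend"]])

# Stage 2: mine simple rules over binary preference columns per cluster.
items = ["prefers_spicy", "orders_dessert", "late_night"]
for c, grp in survey.groupby("cluster"):
    for a, b in permutations(items, 2):
        support = (grp[a] & grp[b]).mean()
        confidence = (grp[a] & grp[b]).sum() / max(grp[a].sum(), 1)
        if support >= 0.1 and confidence >= 0.5:
            print(f"cluster {c}: {a} -> {b} "
                  f"(support={support:.2f}, conf={confidence:.2f})")
```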
Authors: Sethi, Priyanshi | Bhardwaj, Rhythm | Sharma, Nonita | Sharma, Deepak Kumar | Srivastava, Gautam
Article Type: Research Article
Abstract: Neural style transfer is an optimization technique that combines two different images, a content image and a style reference image, to produce an output image that retains the appearance of the content image but has been modified to match the style of the style reference image. This is achieved by optimizing the output image to match the content statistics of the content image and the style statistics of the style reference image, where these statistics are extracted from the images using a convolutional network. Primitive models such as WCT were improved upon by models such as PhotoWCT, whose spatial and temporal limitations were in turn addressed by Deep Photo Style Transfer. Eventually, wavelet transforms were introduced to perform photorealistic style transfer. A wavelet-corrected transfer based on whitening and colouring transforms, i.e., WCT2, was proposed that allowed the preservation of core content and eliminated the need for any post-processing steps and constraints. A model called Domain-Aware Universal Style Transfer followed, supporting both artistic and photorealistic style transfer. This study provides an overview of the neural style transfer technique. Recent advancements and improvements in the field, including the development of multi-scale and adaptive methods and the integration of semantic segmentation, are discussed and elaborated upon. Experiments have been conducted to determine the roles of the encoder-decoder architecture and Haar wavelet functions, and the optimum levels at which these can be leveraged for effective style transfer are ascertained. The study also contrasts the VGG-16 and VGG-19 structures and analyzes various performance parameters to establish which works more efficiently for particular use cases. On comparing quantitative metrics across Gatys, AdaIN, and WCT, a gradual upgrade was seen across the models, with AdaIN performing 99.92 percent better than the primitive Gatys model in terms of processing time. Over 1000 iterations, we found that VGG-16 and VGG-19 have comparable style loss metrics, but there is a difference of 73.1 percent in content loss. VGG-19, however, displays better overall performance since it keeps both content and style losses low.
Keywords: Content image, style image, VGG, photorealism
DOI: 10.3233/IDA-230765
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-15, 2024
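As a concrete reference point for the Gatys-style losses discussed above, here is a minimal sketch of a Gram-matrix style loss computed on VGG-19 features from torchvision. The layer indices and equal layer weights are illustrative choices, not the configuration used in the study.

```python
# Minimal sketch of a Gram-matrix style loss on VGG-19 features, the core
# ingredient of Gatys-style transfer. Layer indices and weights are
# illustrative choices, not the paper's configuration.
import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

vgg = vgg19(weights=VGG19_Weights.DEFAULT).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

STYLE_LAYERS = {1, 6, 11, 20, 29}   # relu1_1 .. relu5_1 (approximate indices)

def gram(feat: torch.Tensor) -> torch.Tensor:
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_loss(generated: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
    loss, x, s = 0.0, generated, style
    for i, layer in enumerate(vgg):
        x, s = layer(x), layer(s)
        if i in STYLE_LAYERS:
            loss = loss + F.mse_loss(gram(x), gram(s))
    return loss

# Usage: optimize the generated image w.r.t. this loss (plus a content loss).
gen = torch.rand(1, 3, 224, 224, requires_grad=True)
sty = torch.rand(1, 3, 224, 224)
print(style_loss(gen, sty).item())
```

In a full transfer loop, this style term would be combined with a content loss on deeper VGG features and minimized with an optimizer such as L-BFGS or Adam over the pixels of the generated image.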
Authors: Chen, Hongwei | Zhang, Man | Liu, Fangrui | Chen, Zexi
Article Type: Research Article
Abstract: Due to the rapid development of industrial manufacturing technology, modern mechanical equipment involves complex operating conditions and structural characteristics of hardware systems, so the state of components directly affects the stable operation of mechanical parts. To ensure improved engineering reliability and economic benefits, bearing diagnosis has always been a concern in the field of mechanical engineering. This article therefore studies an effective machine learning method to extract useful fault feature information from actual bearing vibration signals and identify bearing faults. Firstly, variational mode decomposition decomposes the source signal into several intrinsic mode functions according to the actual situation, and the vibration signal of the bearing is decomposed and reconstructed. By iteratively solving the variational model, the optimal mode functions can be obtained, which better describe the characteristics of the original signal. Then, the feature subset is efficiently searched using a wrapper method of feature selection and the improved binary salp swarm algorithm (IBSSA), which strengthens the global search for potential solutions, to effectively reduce redundant feature vectors and thereby accurately extract fault feature frequency signals. Finally, support vector machines are used to classify and identify fault types, and their advantages are verified through extensive experiments. The experimental findings demonstrate the superior fault recognition performance of the IBSSA algorithm, with a highest recognition accuracy of 97.5%. By comparing different recognition methods, it is concluded that this method can accurately identify bearing failures.
Keywords: Fault diagnosis, salp swarm algorithm, feature selection, variational mode decomposition
DOI: 10.3233/IDA-230994
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-26, 2024
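The sketch below illustrates the wrapper idea only: a feature mask is evaluated by cross-validated SVM accuracy and improved iteratively. A simple random bit-flip search stands in for the IBSSA update, the data are synthetic, and the features would in practice come from the VMD-based decomposition described above; none of this is the authors' implementation.

```python
# Hedged sketch of wrapper feature selection with an SVM evaluator. A simple
# random bit-flip search stands in for the improved binary salp swarm
# algorithm (IBSSA); data and features are synthetic placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(42)
X, y = make_classification(n_samples=300, n_features=20, n_informative=6,
                           random_state=42)

def fitness(mask: np.ndarray) -> float:
    """Cross-validated SVM accuracy on the selected feature subset."""
    if not mask.any():
        return 0.0
    return cross_val_score(SVC(kernel="rbf"), X[:, mask], y, cv=5).mean()

best_mask = rng.random(X.shape[1]) < 0.5
best_fit = fitness(best_mask)
for _ in range(50):                      # stand-in for IBSSA iterations
    cand = best_mask.copy()
    flip = rng.integers(0, X.shape[1])   # flip one feature in/out of the subset
    cand[flip] = ~cand[flip]
    f = fitness(cand)
    if f > best_fit:
        best_mask, best_fit = cand, f

print("selected features:", np.flatnonzero(best_mask))
print("cv accuracy:", round(best_fit, 3))
```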
Authors: Wang, Qian | Li, Tie-Qiang | Sun, Haicheng | Yang, Hao | Li, Xia
Article Type: Research Article
Abstract: Magnetic Resonance Imaging (MRI) is a cornerstone of modern medical diagnosis due to its ability to visualize intricate soft tissues without ionizing radiation. However, noise artifacts significantly degrade image quality, hindering accurate diagnosis. Traditional denoising methods struggle to preserve details while effectively reducing noise, and while deep learning approaches show promise, they often focus on local information and neglect long-range dependencies. To address these limitations, this study proposes the deep and shallow feature fusion denoising network (DAS-FFDNet) for MRI denoising. DAS-FFDNet combines shallow and deep feature extraction with a tailored fusion module, effectively capturing both local and global image information. This approach surpasses existing methods in preserving details and reducing noise, as demonstrated on publicly available T1-weighted and T2-weighted brain image datasets. The proposed model offers a valuable tool for enhancing MRI image quality and subsequent analyses.
Keywords: Magnetic resonance imaging (MRI), image denoising, deep learning, UNet, convolutional neural networks (CNNs)
DOI: 10.3233/IDA-240613
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-13, 2024
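To illustrate the general shallow-plus-deep fusion idea, here is a toy PyTorch module with a single-layer shallow branch, a dilated-convolution deep branch, and a fusion convolution that predicts the residual noise. It is a minimal sketch under assumed layer choices, not the authors' DAS-FFDNet architecture.

```python
# Minimal PyTorch sketch of the "shallow + deep feature fusion" idea for
# denoising. Illustrative toy module only, not the authors' DAS-FFDNet.
import torch
import torch.nn as nn

class ToyFusionDenoiser(nn.Module):
    def __init__(self, channels: int = 1, width: int = 32):
        super().__init__()
        # Shallow branch: keeps local detail with a single conv layer.
        self.shallow = nn.Conv2d(channels, width, kernel_size=3, padding=1)
        # Deep branch: larger receptive field via stacked dilated convs.
        self.deep = nn.Sequential(
            nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=2, dilation=2), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=4, dilation=4), nn.ReLU(inplace=True),
        )
        # Fusion: concatenate both branches and predict the residual noise.
        self.fuse = nn.Conv2d(2 * width, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([self.shallow(x), self.deep(x)], dim=1)
        return x - self.fuse(feats)      # residual learning: subtract estimated noise

noisy = torch.randn(1, 1, 64, 64)
print(ToyFusionDenoiser()(noisy).shape)  # torch.Size([1, 1, 64, 64])
```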
Authors: S, Sadesh | Chandrasekaran, Gokul | Thangaraj, Rajasekaran | Kumar, Neelam Sanjeev
Article Type: Research Article
Abstract: The promising Network-on-Chip (NoC) model is replacing the existing system-on-chip (SoC) model for complex VLSI circuits, and testing the embedded cores through a NoC incurs additional costs in these SoC models. NoC models consist of network interface controllers, intellectual property (IP) cores, routers, and network connections. Technological advancements enable the production of more complex chips, but longer testing times pose a potential problem. NoC packet-switching networks provide high-performance interconnection, a significant benefit for IP cores. A multi-objective approach is created by integrating the benefits of the Whale Optimization Algorithm (WOA) and Grey Wolf Optimization (GWO). To minimize the duration of testing, the approach implements optimization algorithms predicated on the behavior of grey wolves and whales. The P22810 and D695 benchmark circuits are considered, and the test time is compared with existing optimization techniques. We assess the effectiveness of the proposed hybrid WOA-GWO algorithm using fourteen established benchmark functions and an NP-hard problem. The proposed method reduces the time needed to test the P22810 benchmark circuit by 69%, 46%, 60%, 19%, and 21% compared to the Modified Ant Colony Optimization, Modified Artificial Bee Colony, WOA, and GWO algorithms; in the same vein, it reduces the testing time for the D695 benchmark circuit by 72%, 49%, 63%, 21%, and 25% in comparison to the same algorithms. We experimented to determine the time savings achieved by adhering to the suggested procedure throughout the testing process.
Keywords: Test time, whale optimization algorithm, test access mechanism, grey wolf optimization, test scheduling, network-on-chip
DOI: 10.3233/IDA-240878
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-20, 2024
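For reference, the sketch below shows the standard Grey Wolf Optimizer position update on the sphere benchmark function, i.e. one building block of a WOA-GWO hybrid. The population size, iteration count, and test function are arbitrary; this is not the paper's full hybrid test-scheduling algorithm.

```python
# Sketch of the standard Grey Wolf Optimizer position update on the sphere
# benchmark. Shows one building block of a WOA-GWO hybrid, not the paper's
# full test-scheduling method.
import numpy as np

def sphere(x: np.ndarray) -> float:
    return float(np.sum(x ** 2))

def gwo(obj, dim=10, n_wolves=20, iters=200, lb=-10.0, ub=10.0, seed=0):
    rng = np.random.default_rng(seed)
    wolves = rng.uniform(lb, ub, (n_wolves, dim))
    for t in range(iters):
        fit = np.array([obj(w) for w in wolves])
        alpha, beta, delta = wolves[np.argsort(fit)[:3]]   # three best wolves
        a = 2.0 - 2.0 * t / iters                          # linearly decreasing coefficient
        for i in range(n_wolves):
            new_pos = np.zeros(dim)
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2 * a * r1 - a, 2 * r2
                D = np.abs(C * leader - wolves[i])
                new_pos += leader - A * D
            wolves[i] = np.clip(new_pos / 3.0, lb, ub)     # average of the three pulls
    fit = np.array([obj(w) for w in wolves])
    return wolves[np.argmin(fit)], float(fit.min())

best, best_val = gwo(sphere)
print("best objective:", best_val)
```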
Authors: Wu, Renhui | Xu, Hui | Rui, Xiaobin | Wang, Zhixiao
Article Type: Research Article
Abstract: With the rapid development and popularization of smart mobile devices, users tend to share their visited points-of-interest (POIs) online together with the attached location information, which forms a location-based social network (LBSN). LBSNs contain a wealth of valuable information, including the geographical coordinates of POIs and the social connections among users. Many trust-enhanced approaches now fuse users' trust relationships with other auxiliary information to provide more accurate recommendations. However, in traditional trust-aware approaches, the embedding processes for graphs with different properties (e.g., the user-user graph is homogeneous, while the user-POI graph is heterogeneous) are independent of each other, and the different embeddings are fused directly without guidance, which limits performance. More effective information fusion strategies are needed to improve trust-enhanced recommendation. To this end, we propose a Trust Enhanced POI recommendation approach with Collaborative Learning (TECL) to merge geographic information and social influence. Our proposed model integrates two modules: a GAT-based graph autoencoder as the trust relationship embedding module and a multi-layer deep neural network as the user-POI graph learning module. By applying a collaborative learning strategy, these two modules interact with each other: the trust embedding module guides the selection of users' potential features, and in turn the user-POI graph learning module enhances the embedding of trust relationships. Different information is thus fused through two-way interaction rather than flowing in one direction. Extensive experiments on real-world datasets illustrate that the proposed approach outperforms state-of-the-art methods.
Keywords: Collaborative learning, graph attention network, location-based social network, point-of-interest recommendation
DOI: 10.3233/IDA-230897
Citation: Intelligent Data Analysis, vol. Pre-press, no. Pre-press, pp. 1-19, 2024
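As background for the GAT-based trust-embedding module mentioned above, here is a simplified single-head graph-attention layer written in plain PyTorch. The layer, the toy trust graph, and the dimensions are assumptions for illustration; this is not the TECL implementation.

```python
# Hedged sketch of a single-head graph-attention (GAT-style) layer in plain
# PyTorch. Simplified illustration only, not the TECL implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGATLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        h = self.W(x)                                   # (N, out_dim)
        N = h.size(0)
        # Pairwise attention logits e_ij = a([h_i || h_j]).
        hi = h.unsqueeze(1).expand(N, N, -1)
        hj = h.unsqueeze(0).expand(N, N, -1)
        e = F.leaky_relu(self.attn(torch.cat([hi, hj], dim=-1)).squeeze(-1))
        e = e.masked_fill(adj == 0, float("-inf"))      # attend only to neighbours
        alpha = torch.softmax(e, dim=-1)
        return alpha @ h                                # weighted neighbour aggregation

# Toy trust graph with 4 users (self-loops keep the softmax well defined).
adj = torch.eye(4) + torch.tensor([[0, 1, 0, 0],
                                   [1, 0, 1, 0],
                                   [0, 1, 0, 1],
                                   [0, 0, 1, 0]], dtype=torch.float)
x = torch.randn(4, 8)
print(SimpleGATLayer(8, 16)(x, adj).shape)   # torch.Size([4, 16])
```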
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
USA
Tel: +1 703 830 6300
Fax: +1 703 830 2300
[email protected]
For editorial issues, like the status of your submitted paper or proposals, write to [email protected]
IOS Press
Nieuwe Hemweg 6B
1013 BG Amsterdam
The Netherlands
Tel: +31 20 688 3355
Fax: +31 20 687 0091
[email protected]
For editorial issues, permissions, book requests, submissions and proceedings, contact the Amsterdam office [email protected]
Inspirees International (China Office)
Ciyunsi Beili 207(CapitaLand), Bld 1, 7-901
100025, Beijing
China
Free service line: 400 661 8717
Fax: +86 10 8446 7947
[email protected]
For editorial issues, like the status of your submitted paper or proposals, write to [email protected]
If you need assistance with publishing or have any suggestions, please write to: [email protected]