Article type: Research Article
Authors: Saleem, Saima [a],* | Khattar, Anuradha [b] | Mehrotra, Monica [a]
Affiliations: [a] Department of Computer Science, Jamia Millia Islamia, New Delhi, India | [b] Department of Computer Science, Miranda House, University of Delhi, Delhi, India
Correspondence: [*] Corresponding author. Saima Saleem, Department of Computer Science, Jamia Millia Islamia, New Delhi, India. E-mail: [email protected].
Abstract: Rapidly classifying disaster-related social media (SM) images during a catastrophic event is critical for enhancing disaster response efforts. The biggest challenge, however, lies in acquiring labeled data for an ongoing (target) disaster to train supervised learning-based models, since the labeling process is both time-consuming and costly. In this study, we address this challenge by proposing a new multimodal transfer learning framework for the real-time classification of SM images of the target disaster. The proposed framework is based on the Contrastive Language-Image Pretraining (CLIP) model, jointly pretrained on a dataset of image-text pairs via contrastive learning. We propose two distinct methods to design our classification framework: (1) Zero-Shot CLIP, which learns visual representations from images paired with natural language descriptions of classes. By exploiting the vision and language capabilities of CLIP, it extracts meaningful features from unlabeled target-disaster images and maps them to semantically related textual class descriptions, enabling image classification without training on disaster-specific data. (2) Linear-Probe CLIP, which further enhances performance by training a linear classifier on top of the pretrained CLIP model's features, tailored specifically to the disaster image classification task. By optimizing the linear-probe classifier, we improve the model's ability to discriminate between classes and achieve higher performance without needing labeled data of the target disaster. Both methods are evaluated on a benchmark X (formerly Twitter) dataset comprising images of seven real-world disaster events. The experimental outcomes showcase the efficacy of the proposed methods, with Linear-Probe CLIP achieving a remarkable 7% improvement in average F1-score relative to state-of-the-art methods.
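The zero-shot decision rule the abstract describes reduces to a cosine-similarity comparison between a CLIP image embedding and the embeddings of natural-language class prompts. The following is a minimal sketch of that rule only, assuming the embeddings have already been computed by a CLIP encoder; the toy 3-dimensional vectors, the prompt strings, and the function names are illustrative placeholders, not the authors' implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def zero_shot_classify(image_emb, text_embs):
    """Return the index of the class prompt whose embedding is most
    similar (cosine) to the image embedding."""
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_embs)
    sims = txt @ img  # one cosine similarity per class prompt
    return int(np.argmax(sims))

# Toy embeddings (3-dim for illustration; real CLIP embeddings are 512/768-dim).
prompts = ["a photo of flood damage", "a photo of wildfire damage"]
text_embs = np.array([[0.9, 0.1, 0.0],
                      [0.1, 0.9, 0.1]])
image_emb = np.array([0.8, 0.2, 0.1])  # lies closer to the first prompt

print(prompts[zero_shot_classify(image_emb, text_embs)])
# → a photo of flood damage
```

The Linear-Probe variant replaces the argmax over prompt similarities with a learned linear classifier fitted on the same frozen image features, which is what yields the reported performance gain.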
Keywords: Transfer learning, CLIP, social media, image classification, disaster response
DOI: 10.3233/JIFS-241271
Journal: Journal of Intelligent & Fuzzy Systems, vol. Pre-press, no. Pre-press, pp. 1-18, 2024