Affiliations: [a] Department of Electrical Engineering, Faculty of Engineering, Diponegoro University, Jl. Prof. Soedarto, SH Tembalang, Semarang, Indonesia | [b] Department of Electrical Engineering & Information Technology, Gadjah Mada University, Jl. Grafika No. 2, Yogyakarta, Indonesia
Abstract: The growth of negative websites harms Internet users, particularly teenagers. One way to block such sites is to maintain a list of sites categorized as negative. However, new sites that are not yet on the list appear every day. An intelligent system that can detect negative content and update the list automatically is therefore needed. Negative content on a website can consist of text, images, and video, each of which requires its own parsing technique and classifier. Each classifier produces a probability, so an algorithm that combines these probabilities is required. A fusion algorithm can combine the probabilities of the text, image, and video contents. However, this algorithm does not work on websites that have an equal proportion of negative and positive images, i.e. grey websites. These websites require specific handling, such as a cascade fusion algorithm that changes the sensitivity so that the level of over-blocking is reduced. The results show that, after the fusion algorithm was modified, the accuracy of the classifier increased from 91.62% to 98.49% because the rate of over-blocking was reduced.
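The idea of fusing per-modality probabilities and then handling grey websites in a second stage can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, the grey-website test, or any thresholds, so everything below is an assumption made for illustration: the names `PageScores`, `fused_score`, `is_grey`, `simple_fusion`, and `cascade_fusion` are hypothetical, the baseline is taken to be a plain average of the modalities present, and the cascade is taken to apply a stricter threshold to grey pages.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class PageScores:
    """Hypothetical container: probabilities that each item on a page is negative."""
    text: float
    images: List[float] = field(default_factory=list)  # one score per image
    videos: List[float] = field(default_factory=list)  # one score per video


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean; callers only pass non-empty sequences."""
    return sum(values) / len(values)


def fused_score(page: PageScores) -> float:
    """Assumed baseline fusion: average the modalities that are present."""
    parts = [page.text]
    if page.images:
        parts.append(mean(page.images))
    if page.videos:
        parts.append(mean(page.videos))
    return mean(parts)


def simple_fusion(page: PageScores, threshold: float = 0.5) -> bool:
    """Block the page when the fused score reaches the (assumed) threshold."""
    return fused_score(page) >= threshold


def is_grey(page: PageScores, cut: float = 0.5, band: float = 0.2) -> bool:
    """Assumed grey test: the share of negative images is close to one half."""
    if not page.images:
        return False
    negative_share = sum(p >= cut for p in page.images) / len(page.images)
    return abs(negative_share - 0.5) <= band


def cascade_fusion(page: PageScores,
                   threshold: float = 0.5,
                   grey_threshold: float = 0.7) -> bool:
    """Assumed cascade: grey pages must clear a stricter threshold,
    which lowers sensitivity and so reduces over-blocking."""
    limit = grey_threshold if is_grey(page) else threshold
    return fused_score(page) >= limit


# Illustrative inputs only; these scores are not data from the paper.
grey_page = PageScores(text=0.6, images=[0.9, 0.9, 0.1, 0.1])
negative_page = PageScores(text=0.9, images=[0.95, 0.9, 0.85], videos=[0.8])
clean_page = PageScores(text=0.1, images=[0.05, 0.1])

for name, page in [("grey", grey_page), ("negative", negative_page), ("clean", clean_page)]:
    print(name, simple_fusion(page), cascade_fusion(page))
```

In this sketch the two rules differ only on the grey page, where the stricter second-stage threshold is the assumed mechanism for changing the sensitivity. The threshold and band values are arbitrary and would have to be tuned on labelled data.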