Type of Publication

Book Chapter

Date:

12/2015

Status

Published

DOI:

10.1007/978-3-319-27030-2_18

Automatic Web Page Classification Using Visual Content for Subjective and Functional Variables

Featured in:

Monfort, V., Krempels, KH. (eds) Web Information Systems and Technologies

Authors:

Nuno Gonçalves and António Videira

Abstract

Automatic classification of webpages has several applications in industry, including digital marketing, search engines and content filtering. Traditionally, this classification has used only the textual information of webpages, which includes the HTML code, tags, title and, more recently, the URL. The aim of this paper is to prove that, for some subjective variables that are very important to the applications mentioned, the visual information of webpages as rendered by the browser is extremely rich content for the classification task. The variables studied are aesthetic value (whether pages are beautiful or ugly) and design recency (whether pages look old-fashioned or modern). We then proved that automatic classifiers that rely only on the visual look and feel can achieve very high accuracies. By using several low-level and mid-level features and studying several criteria for selection and classification, our classifiers improved on the state of the art. Finally, we applied this framework to classify webpages by topic (content aware) and to classify whether or not pages are blogs (functional aware).

Citation
Nuno Gonçalves and António Videira (2015). Automatic Web Page Classification Using Visual Content for Subjective and Functional Variables. In: Monfort, V., Krempels, KH. (eds) Web Information Systems and Technologies. WEBIST 2014. Lecture Notes in Business Information Processing, vol 226. Springer, Cham. DOI: 10.1007/978-3-319-27030-2_18



Institute of Systems and Robotics, Department of Electrical and Computers Engineering, University of Coimbra