Featured in:
2020 IEEE International Conference on Autonomous Robot Systems and Competitions, Ponta Delgada, Portugal
Authors:
Ricardo Pereira, Nuno Gonçalves, Luís Garrote, Tiago Barros, Ana Lopes and Urbano J. Nunes
This paper focuses on the task of RGB indoor scene classification. A single scene may contain various configurations and points of view, but there are a small number of objects that can characterize the scene. In this paper we propose a deeplearning based Global and Semantic Feature Fusion Approach (GSF2App) with two branches. In the first branch (top branch), a CNN model is trained to extract global features from RGB images, taking leverage from the ImageNet pre-trained model to initialize our CNN’s weights. In the second branch (bottom branch), we develop a semantic feature vector that represents the objects in the image, which are detected and classified through the COCO dataset pre-trained YOLOv3 model. Then, both global and semantic features are combined in an intermediate feature fusion stage. The proposed approach was evaluated on the SUN RGB-D Dataset and NYU Depth Dataset V2 achieving state-of-the-art results on both datasets.
© 2024 VISTeam | Made by Black Monster Media
Institute of Systems and Robotics Department of Electrical and Computers Engineering University of Coimbra