Featured in:
2020 IEEE International Conference on Autonomous Robot Systems and Competitions, Ponta Delgada, Portugal
Ricardo Pereira, Nuno Gonçalves, Luís Garrote, Tiago Barros, Ana Lopes and Urbano J. Nunes
This paper focuses on the task of RGB indoor scene classification. A single scene may appear in various configurations and from various points of view, but a small number of objects can characterize it. In this paper we propose a deep learning-based Global and Semantic Feature Fusion Approach (GSF2App) with two branches. In the first branch (top branch), a CNN model is trained to extract global features from RGB images, with its weights initialized from an ImageNet pre-trained model. In the second branch (bottom branch), we build a semantic feature vector that represents the objects in the image, which are detected and classified by a YOLOv3 model pre-trained on the COCO dataset. Both global and semantic features are then combined in an intermediate feature fusion stage. The proposed approach was evaluated on the SUN RGB-D Dataset and the NYU Depth Dataset V2, achieving state-of-the-art results on both datasets.
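The two-branch idea can be illustrated with a minimal sketch. It is not the authors' implementation: the per-class count encoding of the semantic vector, the confidence threshold, the use of plain concatenation as the fusion step, and all function names and toy values below are assumptions made for illustration. The only detail taken from the dataset itself is that COCO defines 80 object categories.

```python
from typing import List, Sequence, Tuple

NUM_CLASSES = 80  # COCO defines 80 object categories


def semantic_vector(
    detections: Sequence[Tuple[int, float]],
    num_classes: int = NUM_CLASSES,
    min_conf: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[float]:
    """Count confident detections per class (a hypothetical encoding)."""
    vec = [0.0] * num_classes
    for class_id, conf in detections:
        if conf >= min_conf:
            vec[class_id] += 1.0
    return vec


def fuse(global_features: Sequence[float],
         semantic_features: Sequence[float]) -> List[float]:
    """Fuse both branches by concatenation (assumed fusion operator)."""
    return list(global_features) + list(semantic_features)


# Toy inputs: (class index, confidence) pairs standing in for detector output,
# and a 4-value vector standing in for the CNN's global features.
detections = [(56, 0.9), (56, 0.8), (60, 0.7), (62, 0.3)]
global_features = [0.12, -0.40, 0.88, 0.05]

sem = semantic_vector(detections)
fused = fuse(global_features, sem)
```

In a full pipeline the fused vector would feed a classification head that predicts the scene label; the class indices used here are arbitrary placeholders.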
Institute of Systems and Robotics Department of Electrical and Computers Engineering University of Coimbra