Nuno Gonçalves

Prof. Nuno Gonçalves is a researcher at the Institute of Systems and Robotics - University of Coimbra and a Tenured Assistant Professor at the Department of Electrical and Computer Engineering of the University of Coimbra. He received his MSc and PhD degrees from the University of Coimbra in 2002 and 2008, respectively.
His main research areas are computer vision, biometrics, machine-readable codes, security printing, computer graphics and machine learning, with special emphasis on biometrics and steganography for ID and travel documents. He has scientific publications on the following topics: visual coding (machine-readable codes), steganography, object recognition, facial recognition and diagnosis, biometrics, document security, augmented and virtual reality, reflections for image rendering, light field cameras, omnidirectional vision, non-central cameras, calibration, optics, camera models, motion estimation, pose estimation, web information systems, sports vision and legged robotics, amongst others. He was the Principal Investigator of a completed project, funded by the Portuguese Science and Technology Foundation, on non-central camera models for computer graphics and computer-aided surgery, and he is currently the coordinator of seven industry projects (funded by INCM - Portuguese Mint and Official Printing Office) in the area of security elements, involving biometrics, machine-readable codes (with applications in virtual and augmented reality), unique security marks in printed labels, unique security marks in assay hallmarks on precious metal artefacts, and 3D processing of human faces using several types of cameras. He is the inventor of four patents. Since 2018 he has also been Innovation Manager at INCM, where he coordinates projects in areas such as authentication and security printing, biometrics, robotics and Industry 4.0.

Projects

Card3DFace

This project intends to create a 3D face printing system on cards. As for printing on polymer cards,...

TrustStamp

This project intends to develop verification tools to be applied on INCM trust stamps, to confirm au...

UniqueMark

This project aims to improve the security of INCM's hallmarks (assay marks) in precious metal artefacts (the...

TrustFaces

The TrustFaces project derives from the TrustStamp project, completed in June 2018, in partnership w...

FACING

The main objectives of this project are to carry out exhaustive batteries of tests of tool...

UniQode

This project is the continuation of the TrustStamp project, widening its scope and making it possible to respond...

UniqueMark Pilot

This is a research and development project funded by the Imprensa Nacional Casa da Moeda (INCM) R...

TruIM – Trust Image Understanding

TruIm Project aims at developing technologies to authenticate objects in certified images, encoded u...

Publications

An Application of a Halftone Pattern Coding in Augmented Reality

Presentation of a coding system using a halftone pattern (with black and white pixels) that can be integrated into markers which encode information retrievable a posteriori and used in the creation of augmented reality applications. These markers can be easily detected in a photo, and the encoded information is the basis for parameterizing various types of augmented reality applications.

  • Date: 30/11/2017
  • Featured In: SIGGRAPH Asia 2017
  • Publication Type: Conference Papers
  • Author(s): Bruno Patrão, Leandro Cruz, Nuno Gonçalves
  • DOI: 10.1145/3145690.3145705

Halftone Pattern: A New Steganographic Approach

Presentation of a steganographic technique to hide textual information in an image, inspired by the use of dithering to create halftone images. Starting from a base image, it creates the coded image by associating each base-image pixel with a set of two-colour pixels (a halftone) forming an appropriate pattern. The coded image is machine-readable information with good aesthetics, secure, and featuring data redundancy and compression.
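
As a rough illustration of the pattern-substitution idea only (a hypothetical simplification, not the published algorithm): each message bit selects one of two 2x2 black-and-white patterns with equal ink density, so the bit stream is recoverable from the printed dots; the real method additionally matches local density to the base image so the result looks like a halftoned photo.

```python
# Hypothetical sketch: one 2x2 halftone pattern per message bit.
# Both patterns use the same ink density; the published method also
# modulates the pattern choice by the base image's local intensity.
PATTERNS = {
    0: ((1, 0), (0, 1)),  # bit 0 -> diagonal dots
    1: ((0, 1), (1, 0)),  # bit 1 -> anti-diagonal dots
}

def encode(bits, width):
    """Lay out one 2x2 pattern per bit, `width` bits per output row."""
    rows = [bits[i:i + width] for i in range(0, len(bits), width)]
    out = []
    for row in rows:
        for di in range(2):
            out.append([PATTERNS[b][di][dj] for b in row for dj in range(2)])
    return out

def decode(coded):
    """Read the top-left cell of each 2x2 block to recover the bits."""
    return [0 if coded[i][j] else 1
            for i in range(0, len(coded), 2)
            for j in range(0, len(coded[0]), 2)]
```

A real decoder would first detect and rectify the marker in a photo before sampling the blocks; here the bit grid is read directly.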

  • Date: 03/07/2018
  • Featured In: Eurographics 2018
  • Publication Type: Conference Papers
  • Author(s): Bruno Patrão, Leandro Cruz, Nuno Gonçalves

Exemplar Based Filtering of 2.5D Meshes of Faces

Presentation of content-aware filtering for 2.5D meshes of faces: an exemplar-based filter that corrects each point of a given mesh through local model-exemplar neighborhood comparison, taking advantage of prior knowledge of the models (faces) to improve the comparison.

  • Date: 25/03/2018
  • Featured In: Eurographics 2018 Posters
  • Publication Type: Poster
  • Author(s): Leandro Dihl, Leandro Cruz, Nuno Gonçalves

Use of Epipolar Images Towards Outliers Extraction in Depth Images

A method for filtering depth models reconstructed from light field cameras, based on removing low-confidence reconstructed values and replacing them using an inpainting method. This approach has shown good results for outlier removal.

  • Date: 26/10/2018
  • Featured In: RECPAD 2018 - 24th Portuguese Conference on Pattern Recognition
  • Publication Type: Poster
  • Author(s): Dirce Celorico, Leandro Cruz, Leandro Dihl, Nuno Gonçalves

A Content-aware Filtering for RGBD Faces

A content-aware filtering for 2.5D meshes of faces that preserves their intrinsic features. We take advantage of prior knowledge of the models (faces) to improve the comparison. The model is invariant to depth translation and scale. The proposed method is evaluated on a public 3D face dataset with different levels of noise. The results show that the method is able to remove noise without smoothing the sharp features of the face.
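
The depth-translation and scale invariance mentioned above can be illustrated by normalising each patch before the exemplar comparison; the normalisation below is an assumed formulation, not the paper's exact one.

```python
import numpy as np

# Sketch of patch normalisation for exemplar matching (assumed details):
# subtracting the mean removes depth translation, dividing by the spread
# removes scale, so a patch matches its shifted/scaled copies exactly.
def normalise_patch(patch):
    p = patch.astype(float)
    p = p - p.mean()
    scale = np.abs(p).max()
    return p / scale if scale > 0 else p

def best_exemplar(patch, exemplars):
    """Index of the exemplar closest to `patch` in normalised L2 distance."""
    q = normalise_patch(patch)
    dists = [np.linalg.norm(q - normalise_patch(e)) for e in exemplars]
    return int(np.argmin(dists))
```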

  • Date: 25/02/2019
  • Featured In: GRAPP 2019 - International Conference on Computer Graphics Theory and Applications
  • Publication Type: Conference Papers
  • Author(s): Leandro Dihl, Leandro Cruz, Nuno Monteiro, Nuno Gonçalves

Graphic Code: Creation, Detection and Recognition

Graphic Code is a new Machine Readable Coding (MRC) method. It creates coded images by arranging available primitive graphic units according to predefined patterns. Some of these patterns are associated in advance with symbols used to compose the messages and to define a dictionary.

  • Date: 26/10/2018
  • Featured In: RECPAD 2018 - 24th Portuguese Conference on Pattern Recognition
  • Publication Type: Poster
  • Author(s): Leandro Cruz, Bruno Patrão, Nuno Gonçalves

Large Scale Information Marker Coding for Augmented Reality Using Graphic Code

The main advantage of using this approach as an Augmented Reality marker is the possibility of creating generic applications that can read and decode these Graphic Code markers, which might contain 3D models and complex scenes encoded in them. Additionally, the resulting marker has strong aesthetic characteristics associated with it, since it is generated from any chosen base image.

  • Date: 10/12/2018
  • Featured In: IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)
  • Publication Type: Conference Papers
  • Author(s): Leandro Cruz, Bruno Patrão, Nuno Gonçalves

An Augmented Reality Application Using Graphic Code Markers

Presentation of applications of Graphic Code, exploiting its large-scale information coding capabilities applied to Augmented Reality.

  • Date: 10/12/2018
  • Featured In: IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)
  • Publication Type: Poster
  • Author(s): Leandro Cruz, Bruno Patrão, Nuno Gonçalves

Uniquemark: A computer vision system for hallmarks authentication

Uniquemark is a vision system for authentication based on random marks, particularly hallmarks. Hallmarks are used worldwide to authenticate and attest the legal fineness of precious metal artefacts. Our authentication method is based on a multiclass classifier model that uses a mark descriptor composed of several geometric features of the particles.

  • Date: 26/10/2018
  • Featured In: RECPAD 2018 - 24th Portuguese Conference on Pattern Recognition
  • Publication Type: Conference Papers
  • Author(s): Ricardo Barata, Leandro Cruz, Bruno Patrão, Nuno Gonçalves

Improving Facial Depth Data by Exemplar-based Comparisons

We present a filtering method for meshes of faces that preserves their intrinsic features. It is based on exemplar-based neighborhood matching where all models are in a frontal position, avoiding rotation and perspective drawbacks. Moreover, the model is invariant to depth translation and scale.

  • Date: 26/10/2018
  • Featured In: RECPAD 2018 - 24th Portuguese Conference on Pattern Recognition
  • Publication Type: Poster
  • Author(s): Leandro Dihl, Leandro Cruz, Nuno Gonçalves

Graphic Code: a New Machine Readable Approach

Graphic Code has two major advantages over classical MRCs: aesthetics and larger coding capacity. It opens new possibilities for several purposes such as identification, tracking (using a specific border), and transfer of content to the application. This paper focuses on presenting how Graphic Code can be used for industry applications, emphasizing its uses in Augmented Reality (AR).

  • Date: 10/12/2018
  • Featured In: IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR)
  • Publication Type: Conference Papers
  • Author(s): Leandro Cruz, Bruno Patrão, Nuno Gonçalves
  • DOI: 10.1109/AIVR.2018.00036

UniqueMark - A method to create and authenticate a unique mark in precious metal artefacts

The UniqueMark project aims at creating a system to provide a precious metal artefact with a unique, unclonable and irreproducible mark, and at building a system for validating its authenticity. The system verifies the mark's authenticity using a microscope, or a smartphone camera with an attached macro lens.

  • Date: 24/07/2019
  • Featured In: Jewellery Materials Congress 2019
  • Publication Type: Conference Papers
  • Author(s): Nuno Gonçalves, Leandro Cruz

Deep Facial Diagnosis: Deep Transfer Learning From Face Recognition to Facial Diagnosis

The relationship between face and disease has been discussed for thousands of years, which led to the emergence of facial diagnosis. The objective here is to explore the possibility of identifying diseases from uncontrolled 2D face images using deep learning techniques. In this paper, we propose using deep transfer learning from face recognition to perform computer-aided facial diagnosis of various diseases. In the experiments, we perform computer-aided facial diagnosis of a single disease (beta-thalassemia) and multiple diseases (beta-thalassemia, hyperthyroidism, Down syndrome, and leprosy) with a relatively small dataset. The overall top-1 accuracy of deep transfer learning from face recognition can reach over 90%, outperforming both traditional machine learning methods and clinicians in the experiments. In practice, collecting disease-specific face images is complex, expensive and time-consuming, and imposes ethical limitations due to personal data treatment. Therefore, the datasets of facial-diagnosis research are private and generally small compared with those of other machine learning application areas. The success of deep transfer learning applications in facial diagnosis with a small dataset could provide a low-cost and noninvasive way for disease screening and detection.

  • Date: 16/06/2020
  • Featured In: IEEE Access, vol. 8, pp. 123649-123661
  • Publication Type: Journal Articles
  • Author(s): Bo Jin, Leandro Cruz, Nuno Gonçalves
  • DOI: 10.1109/ACCESS.2020.3005687

Biometric System for Mobile Validation of ID And Travel Documents

Current trends in the security of ID and travel documents require portable and efficient validation applications that rely on biometric recognition. Such tools can allow any authority and citizen to validate documents and authenticate citizens without the need for expensive and sometimes unavailable proprietary devices. In this work, we present a novel, compact and efficient approach to validating ID and travel documents for offline mobile applications. The approach employs an in-house biometric template that is extracted from the original portrait photo (either full frontal or token frontal) and then stored on the ID document using a machine-readable code (MRC). The ID document can then be validated with the developed application on a mobile device with a digital camera. The similarity score is estimated using an artificial neural network (ANN). Results show that we achieve validation accuracy up to 99.5%, with a corresponding false match rate of 0.0047 and false non-match rate of 0.00034. (CITATION: I. Medvedev, N. Gonçalves and L. Cruz, "Biometric System for Mobile Validation of ID And Travel Documents," 2020 International Conference of the Biometrics Special Interest Group (BIOSIG), 2020, pp. 1-5.)
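
The verification step can be sketched as follows. The paper scores similarity with a trained ANN; plain cosine similarity with a threshold stands in for it here, and the threshold value is an arbitrary assumption.

```python
import numpy as np

# Toy sketch of template verification (the published system uses a trained
# ANN to produce the similarity score; cosine similarity stands in here).
def cosine_similarity(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def validate(template_from_mrc, template_from_live_photo, threshold=0.8):
    """Accept the document if the template decoded from the machine-readable
    code matches the template extracted from the live face photo."""
    return cosine_similarity(template_from_mrc, template_from_live_photo) >= threshold
```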

  • Date: 01/10/2020
  • Featured In: 2020 International Conference of the Biometrics Special Interest Group (BIOSIG), Darmstadt, Germany, pp. 1-5
  • Publication Type: Conference Papers
  • Author(s): Iurii Medvedev, Nuno Gonçalves, Leandro Cruz

Multimodal Deep-Learning for Object Recognition Combining Camera and LIDAR Data

Object detection and recognition is a key component of autonomous robotic vehicles, as evidenced by the continuous efforts made by the robotic community on areas related to object detection and sensory perception systems. This paper presents a study on multisensor (camera and LIDAR) late fusion strategies for object recognition. In this work, LIDAR data is processed as 3D points and also by means of a 2D representation in the form of a depth map (DM), which is obtained by projecting the LIDAR 3D point cloud onto a 2D image plane followed by an upsampling strategy which generates a high-resolution 2D range view. A CNN (Inception V3) is used as the classification method on the RGB images and on the DMs (LIDAR modality). A 3D network (PointNet), which directly performs classification on the 3D point clouds, is also considered in the experiments. One of the motivations of this work is to incorporate the distance to the objects, as measured by the LIDAR, as a relevant cue to improve the classification performance. A new range-based average weighting strategy is proposed, which considers the relationship between the deep models' performance and the distance of objects. A classification dataset, based on the KITTI database, is used to evaluate the deep models and to support the experimental part. We report extensive results in terms of single modality, i.e., using RGB and LIDAR models individually, and late fusion multimodality approaches. (CITATION: Gledson Melotti, Cristiano Premebida, Nuno Gonçalves, "Multimodal Deep-Learning for Object Recognition Combining Camera and LIDAR Data" 2020 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), Ponta Delgada, Portugal, pp. 177-182)
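
A range-aware late fusion can be sketched as below. The exact weighting function is the paper's contribution; the linear ramp between the `near` and `far` distances is purely an illustrative assumption.

```python
import numpy as np

# Sketch of range-based late fusion (assumed weighting): blend the RGB and
# depth-map classifier scores, trusting the LIDAR branch more for nearby
# objects and the camera branch more for distant ones.
def fuse_scores(rgb_scores, dm_scores, distance, near=10.0, far=50.0):
    t = np.clip((distance - near) / (far - near), 0.0, 1.0)
    w_dm = 1.0 - t  # LIDAR weight decays linearly with distance
    fused = w_dm * np.asarray(dm_scores) + (1.0 - w_dm) * np.asarray(rgb_scores)
    return fused / fused.sum()  # renormalise to a probability vector
```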

  • Date: 01/10/2020
  • Featured In: 2020 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), Ponta Delgada, Portugal, pp. 177-182
  • Publication Type: Conference Papers
  • Author(s): Gledson Melotti, Cristiano Premebida, Nuno Gonçalves
  • DOI: 10.1109/ICARSC49921.2020.9096138

Deep-Learning based Global and Semantic Feature Fusion for Indoor Scene Classification

This paper focuses on the task of RGB indoor scene classification. A single scene may contain various configurations and points of view, but there is a small number of objects that can characterize the scene. In this paper we propose a deep-learning based Global and Semantic Feature Fusion Approach (GSF2App) with two branches. In the first branch (top branch), a CNN model is trained to extract global features from RGB images, leveraging the ImageNet pre-trained model to initialize our CNN's weights. In the second branch (bottom branch), we develop a semantic feature vector that represents the objects in the image, which are detected and classified through the COCO dataset pre-trained YOLOv3 model. Then, both global and semantic features are combined in an intermediate feature fusion stage. The proposed approach was evaluated on the SUN RGB-D Dataset and NYU Depth Dataset V2, achieving state-of-the-art results on both datasets.
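
The semantic branch can be sketched as follows. The class list, the confidence-weighted counting, and plain concatenation as the fusion stage are illustrative assumptions, not the published design.

```python
import numpy as np

# Sketch of the semantic branch (assumed encoding): detector outputs are
# turned into a fixed-length vector over known object classes, then
# concatenated with the global CNN feature vector.
CLASSES = ["chair", "table", "bed", "sofa", "monitor"]  # illustrative subset

def semantic_vector(detections):
    """detections: list of (class_name, confidence) pairs from a detector."""
    v = np.zeros(len(CLASSES))
    for name, conf in detections:
        if name in CLASSES:
            v[CLASSES.index(name)] += conf  # confidence-weighted count
    return v

def fuse(global_features, detections):
    """Concatenate global CNN features with the semantic vector."""
    return np.concatenate([np.asarray(global_features, float),
                           semantic_vector(detections)])
```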

  • Date: 15/04/2020
  • Featured In: 2020 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), Ponta Delgada, Portugal
  • Publication Type: Conference Papers
  • Author(s): Ricardo Pereira, Nuno Gonçalves, Luís Garrote, Tiago Barros, Ana Lopes, Urbano J. Nunes
  • DOI: 10.1109/ICARSC49921.2020.9096068

Object detection in traffic scenarios - a comparison of traditional and deep learning approaches

In the area of computer vision, research on object detection algorithms has grown rapidly, as it is the fundamental step for automation, specifically for self-driving vehicles. This work presents a comparison of traditional and deep learning approaches for the task of object detection in traffic scenarios. A handcrafted feature descriptor, Histogram of Oriented Gradients (HOG), with a linear Support Vector Machine (SVM) classifier is compared with deep learning approaches such as the Single Shot Detector (SSD) and You Only Look Once (YOLO), in terms of mean Average Precision (mAP) and processing speed. The SSD algorithm is implemented with different backbone architectures (VGG16, MobileNetV2 and ResNeXt50), and similarly YOLO with MobileNetV1 and ResNet50, to compare the performance of the approaches. Training and inference are performed on the PASCAL VOC 2007 and 2012 training data and the PASCAL VOC 2007 test data, respectively. We consider five classes relevant for traffic scenarios, namely bicycle, bus, car, motorbike and person, for the calculation of mAP. Both qualitative and quantitative results are presented for comparison. For the task of object detection, the deep learning approaches outperform the traditional approach in both accuracy and speed. This is achieved at the cost of requiring a large amount of data, high computation power and time to train a deep learning approach.
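
The mAP metric used in the comparison can be sketched as below, following the 11-point interpolated average precision of the PASCAL VOC 2007 protocol; this is a simplified illustration that starts from already-computed precision-recall curves.

```python
import numpy as np

# Sketch of 11-point interpolated AP (PASCAL VOC 2007 style): at each of
# 11 recall levels, take the maximum precision achieved at that recall or
# beyond, then average. mAP is the mean of per-class APs.
def average_precision(recalls, precisions):
    ap = 0.0
    for r in np.linspace(0.0, 1.0, 11):
        mask = recalls >= r
        ap += precisions[mask].max() if mask.any() else 0.0
    return ap / 11.0

def mean_average_precision(per_class_pr):
    """per_class_pr: list of (recalls, precisions) array pairs, one per class."""
    return float(np.mean([average_precision(r, p) for r, p in per_class_pr]))
```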

  • Date: 11/07/2020
  • Featured In: Computer Science & Information Technology, AIRCC Publishing Corporation
  • Publication Type: Conference Papers
  • Author(s): Gopi K. Erabati, Nuno Gonçalves, Helder Araújo
  • DOI: 10.5121/csit.2020.100918

Bio-Inspired Modality Fusion for Active Speaker Detection

Human beings have developed fantastic abilities to integrate information from various sensory sources exploring their inherent complementarity. Perceptual capabilities are therefore heightened, enabling, for instance, the well-known "cocktail party" and McGurk effects, i.e., speech disambiguation from a panoply of sound signals. This fusion ability is also key in refining the perception of sound source location, as in distinguishing whose voice is being heard in a group conversation. Furthermore, neuroscience has successfully identified the superior colliculus region in the brain as the one responsible for this modality fusion, with a handful of biological models having been proposed to approach its underlying neurophysiological process. Deriving inspiration from one of these models, this paper presents a methodology for effectively fusing correlated auditory and visual information for active speaker detection. Such an ability can have a wide range of applications, from teleconferencing systems to social robotics. The detection approach initially routes auditory and visual information through two specialized neural network structures. The resulting embeddings are fused via a novel layer based on the superior colliculus, whose topological structure emulates spatial neuron cross-mapping of unimodal perceptual fields. The validation process employed two publicly available datasets, with achieved results confirming and greatly surpassing initial expectations.

  • Date: 10/04/2021
  • Featured In: Applied Sciences
  • Publication Type: Journal Articles
  • Author(s): Gustavo Assunção, Nuno Gonçalves, Paulo Menezes
  • DOI: 10.3390/app11083397

Probabilistic Object Classification using CNN ML-MAP layers

Deep networks are currently the state of the art for sensory perception in autonomous driving and robotics. However, deep models often generate overconfident predictions, precluding proper probabilistic interpretation, which we argue is due to the nature of the SoftMax layer. To reduce the overconfidence without compromising the classification performance, we introduce a CNN probabilistic approach based on distributions calculated in the network's Logit layer. The approach enables Bayesian inference by means of ML and MAP layers. Experiments with calibrated and the proposed prediction layers are carried out on object classification using data from the KITTI database. Results are reported for camera (RGB) and LiDAR (range-view) modalities, where the new approach shows promising performance compared to SoftMax.
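
The ML/MAP idea can be sketched as follows, under the assumption that each class's logit is modelled by a per-class Gaussian fitted on training data; the Gaussian choice and uniform priors here are illustrative, not necessarily the paper's exact model.

```python
import numpy as np

# Sketch (assumed Gaussian model): fit a normal distribution to the logit
# each class produces on its own training samples, then score a test logit
# vector by likelihood (ML) or likelihood x prior (MAP) instead of SoftMax.
def fit_logit_gaussians(logits, labels, n_classes):
    """logits: (N, C) array; labels: (N,). Returns per-class (mean, std)."""
    stats = []
    for c in range(n_classes):
        own = logits[labels == c][:, c]  # logit of class c on class-c samples
        stats.append((own.mean(), own.std() + 1e-6))
    return stats

def ml_scores(logit_vec, stats):
    """Per-class Gaussian likelihood of each logit under its class model."""
    return np.array([np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
                     for x, (m, s) in zip(logit_vec, stats)])

def map_scores(logit_vec, stats, priors):
    """MAP: likelihood times class prior, normalised to sum to one."""
    s = ml_scores(logit_vec, stats) * np.asarray(priors)
    return s / s.sum()
```

Because the layers only read the Logit outputs, they can be bolted onto an already-trained network without retraining, which is the property the later IEEE Access article below exploits.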

  • Date: 24/08/2020
  • Featured In: ECCV Workshop on Perception for Autonomous Driving (PAD)
  • Publication Type: Conference Papers
  • Author(s): Gledson Melotti, Cristiano Premebida, Jordan Bird, Diego Faria, Nuno Gonçalves

Towards Facial Biometrics for ID Document Validation in Mobile Devices

Various modern security systems follow a tendency to simplify the usage of existing biometric recognition solutions and embed them into ubiquitous portable devices. In this work, we continue the investigation and development of our method for securing identification documents. The original facial biometric template, which is extracted from the trusted frontal face image, is stored on the identification document in a secured personalized machine-readable code. Such a document is protected from face photo manipulation and may be validated with an offline mobile application. We apply automatic methods of compressing the developed face descriptors to make the biometric validation system more suitable for mobile applications. As an additional contribution, we introduce several print-capture datasets that may be used for training and evaluating similar systems for mobile identification and travel document validation.

  • Date: 01/07/2021
  • Featured In: Applied Sciences
  • Publication Type: Journal Articles
  • Author(s): Iurii Medvedev, Farhad Shadmand, Leandro Cruz, Nuno Gonçalves
  • DOI: 10.3390/app11136134

QualFace: Adapting Deep Learning Face Recognition for ID and Travel Doc with Quality Assessment

Modern face recognition biometrics widely rely on deep neural networks that are usually trained on large collections of wild face images of celebrities. This choice of data is related to its public availability in a situation where existing ID document compliant face image datasets (usually stored by national institutions) are hardly accessible due to continuously increasing privacy restrictions. However, this may lead to a drop in performance in systems developed specifically for ID document compliant images. In this work we propose a novel face recognition approach for mitigating that problem. To adapt a deep face recognition network for document security purposes, we propose to regularise the training process with a specific sample mining strategy which penalises samples by their estimated quality, where the quality metric is proposed by our work and is related to the specific case of face images for ID documents. We perform extensive experiments and demonstrate the efficiency of the proposed approach for ID document compliant face images.

  • Date: 01/08/2021
  • Featured In: BIOSIG 2021
  • Publication Type: Conference Papers
  • Author(s): João Tremoço, Iurii Medvedev, Nuno Gonçalves

CodeFace: a deep learning printer-proof steganography for Face Portraits

Identity Documents (IDs) containing a facial portrait constitute a prominent form of personal identification. Photograph substitution in official documents (a genuine photo replaced by a non-genuine photo) and originally fraudulent documents with an arbitrary photograph are well-known, but unfortunately still efficient, ways of misleading the national authorities in in-person identification processes. Therefore, in order to confirm that an identity document holds a validated photo, this paper addresses a novel face image steganography technique to encode secret messages in facial portraits and then decode these hidden messages from physically printed facial photos of Identity Documents (IDs) and Machine-Readable Travel Documents (MRTDs). The encoded face image looks like the original image to the naked eye. Our architecture, called CodeFace, comprises a deep neural network that learns an encoding and decoding algorithm robust to several types of image perturbations caused by image compression, digital transfer, printer devices, environmental lighting and digital cameras. The appearance of the encoded facial photo is preserved by minimizing the distance of the facial features between the encoded and original facial images, and also through a new network architecture that improves data restoration for small images. Extensive experiments were performed with real printed documents and smartphone cameras. The results demonstrate high robustness in the decoding of hidden messages on physical polycarbonate and PVC cards, as well as the stability of the method for encoding messages up to a size of 120 bits.

  • Date: 29/10/2021
  • Featured In: IEEE Access
  • Publication Type: Journal Articles
  • Author(s): Farhad Shadmand, Iurii Medvedev, Nuno Gonçalves
  • DOI: 10.1109/ACCESS.2021.3132581

Card3DFace—An Application to Enhance 3D Visual Validation in ID Cards and Travel Documents

The identification of a person is a natural way to gain access to information or places. A face image is an essential element of visual validation. In this paper, we present the Card3DFace application, which captures a single-shot image of a person’s face. After reconstructing the 3D model of the head, the application generates several images from different perspectives, which, when printed on a card with a layer of lenticular lenses, produce a 3D visualization effect of the face. The image acquisition is achieved with a regular consumer 3D camera, either using plenoptic, stereo or time-of-flight technologies. This procedure aims to assist and improve the human visual recognition of ID cards and travel documents through an affordable and fast process while simultaneously increasing their security level. The whole system pipeline is analyzed and detailed in this paper. The results of the experiments performed with polycarbonate ID cards show that this end-to-end system is able to produce cards with realistic 3D visualization effects for humans.

  • Date: 23/09/2021
  • Featured In: Applied Sciences
  • Publication Type: Journal Articles
  • Author(s): Leandro Dihl, Leandro Cruz, Nuno Gonçalves
  • DOI: 10.3390/app11198821

Reducing Overconfidence Predictions in Autonomous Driving Perception

In state-of-the-art deep learning for object recognition, Softmax and Sigmoid layers are most commonly employed as the predictor outputs. Such layers often produce overconfident predictions rather than proper probabilistic scores, which can harm the decision-making of 'critical' perception systems applied in autonomous driving and robotics. Given this, we propose a probabilistic approach based on distributions calculated out of the Logit layer scores of pre-trained networks, which are then used to constitute new decision layers based on Maximum Likelihood (ML) and Maximum a-Posteriori (MAP) inference. We demonstrate that the hereafter-called ML and MAP layers are more suitable for probabilistic interpretations than Softmax- and Sigmoid-based predictions for object recognition. We explore distinct sensor modalities via RGB images and LiDAR (RV: range-view) data from the KITTI and Lyft Level-5 datasets, where our approach shows promising performance compared to the usual Softmax and Sigmoid layers, with the benefit of enabling interpretable probabilistic predictions. Another advantage of the approach introduced in this paper is that the so-called ML and MAP layers can be implemented in existing trained networks; that is, the approach benefits from the output of the Logit layer of pre-trained networks. Thus, there is no need to carry out a new training phase, since the ML and MAP layers are used in the test/prediction phase. The classification results are presented using reliability diagrams, while detection results are illustrated using precision-recall curves.

  • Date: 16/05/2022
  • Featured In: IEEE Access, vol. 10, pp. 54805-54821, 2022
  • Publication Type: Journal Articles
  • Author(s): Gledson Melotti, Cristiano Premebida, Jordan J. Bird, Diego R. Faria, Nuno Gonçalves
  • DOI: 10.1109/ACCESS.2022.3175195

MorDeephy: Face Morphing Detection Via Fused Classification (preprint)

Face morphing attack detection (MAD) is one of the most challenging tasks in the field of face recognition nowadays. In this work, we introduce a novel deep learning strategy for single-image face morphing detection, which implies the discrimination of morphed face images along with a sophisticated face recognition task in a complex classification scheme. It is directed onto learning the deep facial features, which carry information about the authenticity of these features. Our work also introduces several additional contributions: a public and easy-to-use face morphing detection benchmark and the results of our wild datasets filtering strategy. Our method, which we call MorDeephy, achieved state-of-the-art performance and demonstrated a prominent ability to generalise the task of morphing detection to unseen scenarios.

  • Date: 05/08/2022
  • Featured In: arXiv
  • Publication Type: Journal Articles
  • Author(s): Iurii Medvedev, Farhad Shadmand, Nuno Gonçalves
  • DOI: 10.48550/arXiv.2208.03110

Pseudo RGB-D Face Recognition

In the last decade, advances in and the popularity of low-cost RGB-D sensors have enabled us to acquire depth information of objects. Consequently, researchers began to solve face recognition problems by capturing RGB-D face images using these sensors. Until now, it has not been easy to acquire the depth of human faces because of limitations imposed by privacy policies, and RGB face images are still more common. Therefore, obtaining the depth map directly from the corresponding RGB image could help improve the performance of subsequent face processing tasks such as face recognition. Intelligent creatures can use a large amount of experience to obtain three-dimensional spatial information from two-dimensional scenes alone; machine learning methodology can teach computers, through training, to generate correct answers to such problems. To replace depth sensors with generated pseudo depth maps, in this paper we propose a pseudo RGB-D face recognition framework and provide data-driven ways to generate the depth maps from 2D face images. Specifically, we design and implement a generative adversarial network model named "D+GAN" to perform multi-conditional image-to-image translation with face attributes. By this means, we validate pseudo RGB-D face recognition with experiments on various datasets. With the cooperation of image fusion technologies, especially the Non-subsampled Shearlet Transform, the accuracy of face recognition has been significantly improved.

  • Date: 01/08/2022
  • Featured In: IEEE Sensors Journal
  • Publication Type: Journal Articles
  • Author(s): Bo Jin, Leandro Cruz and Nuno Gonçalves
  • DOI: 10.1109/JSEN.2022.3197235
  • Download File
  • Visit Website

Towards understanding the character of quality sampling in deep learning face recognition

Face recognition has become one of the most important modalities of biometrics in recent years. It widely utilises deep learning computer vision tools and adopts large collections of unconstrained face images of celebrities for training. Such choice of the data is related to its public availability, since existing document-compliant face image collections are hardly accessible due to security and privacy issues. Such inconsistency between the training data and the deploy scenario may lead to a drop in performance in biometric systems, which are developed specifically for dealing with ID document compliant images. To mitigate this problem, we propose to regularise the training of the deep face recognition network with a specific sample mining strategy, which penalises the samples by their estimated quality. In addition to several quality metrics considered in recent work, we also expand our deep learning strategy to other sophisticated quality estimation methods and perform experiments to better understand the nature of quality sampling. Namely, we seek the penalising manner (sampling character) that better satisfies the purpose of adapting deep learning face recognition for images of ID and travel documents. Extensive experiments demonstrate the efficiency of the approach for ID document compliant face images.
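
The penalising idea above can be illustrated with a toy quality-weighted loss. This is only a minimal sketch, not the paper's exact regulariser: the function name, the power-law weighting and all values are illustrative assumptions.

```python
import numpy as np

def quality_weighted_loss(per_sample_loss, quality, gamma=1.0):
    """Weight each sample's loss by its estimated quality raised to `gamma`.

    Toy sketch: higher-quality, more document-like samples contribute more
    to the objective, while `gamma` controls how sharply low-quality
    samples are penalised (gamma=0 recovers the plain mean).
    """
    w = np.asarray(quality, dtype=float) ** gamma
    w = w / w.mean()  # normalise so the overall loss scale is unchanged
    return float(np.mean(w * np.asarray(per_sample_loss, dtype=float)))

losses = [0.9, 0.5, 1.2]    # per-sample cross-entropy (toy values)
quality = [0.2, 0.9, 0.6]   # estimated face image quality in [0, 1]
print(quality_weighted_loss(losses, quality))
```

Varying `gamma` is one way to probe the "sampling character" the abstract refers to.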

  • Date: 05/08/2022
  • Featured In: IET Biometrics
  • Publication Type: Journal Articles
  • Author(s): Iurii Medvedev, João Tremoço, Beatriz Mano, Luís Espírito-Santo and Nuno Gonçalves
  • DOI: 10.1049/bme2.12095
  • Download File
  • Visit Website

Face depth prediction by the scene depth

A depth map, also known as a range image, directly reflects the geometric shape of objects. Due to issues such as cost, privacy and accessibility, face depth information is not easy to obtain. However, the spatial information of faces is very important in many areas of computer vision, especially in biometric identification. In contrast, scene depth information has become relatively easy to obtain with the development of autonomous driving technology in recent years. This inspired the idea of bridging the gap between scene depth and face depth. Previously, face depth estimation and scene depth estimation were treated as two completely separate domains. This paper proposes and explores utilizing knowledge learned from scene depth to estimate the depth map of faces from monocular 2D images. Through experiments, we have preliminarily verified the possibility of using scene depth knowledge to predict the depth of faces and its potential for face feature representation.

  • Date: 23/06/2021
  • Featured In: IEEE/ACIS 20th International Conference on Computer and Information Science, Shanghai, China
  • Publication Type: Conference Papers
  • Author(s): Bo Jin, Leandro Cruz, Nuno Gonçalves
  • DOI: 10.1109/ICIS51600.2021.9516598
  • Download File
  • Visit Website

Cost Volume Refinement for Depth Prediction

Light-field cameras are becoming more popular in the consumer market. Their data redundancy allows, in theory, to accurately refocus images after acquisition and to predict the depth of each point visible from the camera. Combined, these two features allow for the generation of full-focus images, which is impossible in traditional cameras. Multiple methods for depth prediction from light fields (or stereo) have been proposed over the years. A large subset of these methods relies on cost-volume estimates: 3D volumes in which each layer represents a heuristic of whether each point in the image is at a certain distance from the camera. Generally, this volume is used to regress a depth map, which is then refined for better results. In this paper, we argue that refining the cost volumes is superior to refining the depth maps in order to further increase the accuracy of depth predictions. We propose a set of cost-volume refinement algorithms and show their effectiveness.
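
The cost-volume-versus-depth-map distinction can be made concrete with a toy example. This is only a sketch of the general idea, assuming a tiny volume and a naive neighbourhood smoothing as a stand-in for the paper's actual refinement algorithms.

```python
import numpy as np

def depth_from_cost_volume(cost, depths, refine=True):
    """Regress depth as the argmin layer of a (D, H, W) cost volume.

    Toy stand-in for the refinement idea: average each pixel's cost curve
    with its 4-neighbourhood *before* the argmin, so spurious minima are
    suppressed in the volume itself rather than in the regressed map.
    """
    if refine:
        p = np.pad(cost, ((0, 0), (1, 1), (1, 1)), mode="edge")
        cost = (p[:, 1:-1, 1:-1] + p[:, :-2, 1:-1] + p[:, 2:, 1:-1]
                + p[:, 1:-1, :-2] + p[:, 1:-1, 2:]) / 5.0
    return depths[np.argmin(cost, axis=0)]

# Three depth hypotheses over a 2x2 image; layer 1 has the lowest cost.
depths = np.array([1.0, 2.0, 4.0])
cost = np.ones((3, 2, 2))
cost[1] = 0.0
print(depth_from_cost_volume(cost, depths))
```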

  • Date: 10/01/2021
  • Featured In: 25th International Conference on Pattern Recognition (ICPR), Milan, Italy
  • Publication Type: Conference Papers
  • Author(s): João L. Cardoso, Nuno Gonçalves and Michael Wimmer
  • DOI: 10.1109/ICPR48806.2021.9412730
  • Download File
  • Visit Website

Improving Performance of Facial Biometrics With Quality-Driven Dataset Filtering

Advancements in deep learning techniques and the availability of large-scale face datasets have led to significant performance gains in face recognition in recent years. Modern face recognition algorithms are trained on large-scale in-the-wild face datasets. At the same time, many facial biometric applications rely on controlled image acquisition and enrollment procedures (for instance, document security applications). That is why such face recognition approaches can exhibit a performance deficiency in the target scenario (ICAO-compliant images). However, modern approaches for face image quality estimation may help to mitigate that problem. In this work, we introduce a strategy for filtering training datasets by quality metrics and demonstrate that it can lead to performance improvements in biometric applications that rely on the face image modality. We filter the main academic datasets using the proposed filtering strategy and present performance metrics.
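
The filtering strategy reduces to a simple pattern: score every training image with a quality estimator and keep only those above a threshold. A minimal sketch, using a naive Laplacian-variance sharpness proxy in place of the actual quality metrics studied in the paper:

```python
import numpy as np

def sharpness_quality(gray):
    """Toy quality proxy: variance of a simple Laplacian response.

    Stand-in only; the paper evaluates dedicated face image quality
    estimators, not this sharpness heuristic.
    """
    lap = (np.roll(gray, 1, 0) + np.roll(gray, -1, 0)
           + np.roll(gray, 1, 1) + np.roll(gray, -1, 1) - 4 * gray)
    return float(lap.var())

def filter_by_quality(images, quality_fn, threshold):
    """Keep only training images whose estimated quality reaches the threshold."""
    return [im for im in images if quality_fn(im) >= threshold]
```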

  • Date: 05/01/2023
  • Featured In: Workshop on Interdisciplinary Applications of Biometrics and Identity Science (INTERID’2023), Hawaii
  • Publication Type: Conference Papers
  • Author(s): Iurii Medvedev and Nuno Gonçalves
  • DOI: 10.1109/FG57933.2023.10042579
  • Download File
  • Visit Website

MorDeephy: Face Morphing Detection via Fused Classification

Face morphing attack detection (MAD) is one of the most challenging tasks in the field of face recognition nowadays. In this work, we introduce a novel deep learning strategy for single image face morphing detection, which implies the discrimination of morphed face images along with a sophisticated face recognition task in a complex classification scheme. It is directed onto learning the deep facial features, which carry information about the authenticity of these features. Our work also introduces several additional contributions: the public and easy-to-use face morphing detection benchmark and the results of our wild datasets filtering strategy. Our method, which we call MorDeephy, achieved state-of-the-art performance and demonstrated a prominent ability to generalize the task of morphing detection to unseen scenarios.

  • Date: 22/02/2023
  • Featured In: 12th International Conference on Pattern Recognition Application and Methods (ICPRAM), Lisbon, Portugal
  • Publication Type: Conference Papers
  • Author(s): Iurii Medvedev, Farhad Shadmand and Nuno Gonçalves
  • Download File
  • Visit Website

Dealing with Overfitting in the Context of Liveness Detection Using FeatherNets with RGB Images

With the increased use of machine learning for liveness detection solutions come some shortcomings, like overfitting, where the model adapts so closely to the training set that it becomes unusable on the testing set, defeating the purpose of machine learning. This paper proposes how to approach overfitting without altering the model used, by focusing on the input and output information of the model. The input approach focuses on the information obtained from the different modalities present in the datasets used, as well as on how varied that information is, not only in the number of spoof types but also in the ambient conditions under which the videos were captured. The output approaches focus on the loss function, which drives the actual “learning” (it is computed from the model’s output and propagated backwards), and on the interpretation of that output to define which predictions are considered bona fide or spoof. Throughout this work, we were able to reduce the overfitting effect, lowering the difference between the best epoch and the average of the last fifty epochs from 36.57% to 3.63%.
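
The "interpretation of the output" idea can be sketched as tuning the bona fide/spoof decision threshold on validation scores instead of assuming a fixed 0.5 cut-off. A toy illustration (names and values are assumptions, not the paper's procedure):

```python
import numpy as np

def best_threshold(bonafide_scores, spoof_scores):
    """Pick the decision threshold that minimises total errors on validation data.

    Scans the observed scores as candidate thresholds, counting bona fide
    scores falling below plus spoof scores falling at or above each one.
    """
    candidates = np.sort(np.concatenate([bonafide_scores, spoof_scores]))
    errors = [np.sum(np.asarray(bonafide_scores) < t)
              + np.sum(np.asarray(spoof_scores) >= t) for t in candidates]
    return float(candidates[int(np.argmin(errors))])

t = best_threshold([0.8, 0.9], [0.1, 0.2])
label = "bonafide" if 0.85 >= t else "spoof"
```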

  • Date: 22/02/2023
  • Featured In: 12th International Conference on Pattern Recognition Application and Methods (ICPRAM), Lisbon, Portugal
  • Publication Type: Conference Papers
  • Author(s): Miguel Leão and Nuno Gonçalves
  • DOI: 10.5220/0011639600003411
  • Download File
  • Visit Website

Probabilistic Approach for Road-Users Detection

Object detection in autonomous driving applications implies the detection and tracking of semantic objects that are commonly native to urban driving environments, such as pedestrians and vehicles. One of the major challenges in state-of-the-art deep-learning-based object detection is false positives which occur with overconfident scores. This is highly undesirable in autonomous driving and other critical robotic-perception domains because of safety concerns. This paper proposes an approach to alleviate the problem of overconfident predictions by introducing a novel probabilistic layer to deep object detection networks in testing. The suggested approach avoids the traditional Sigmoid or Softmax prediction layer which often produces overconfident predictions. It is demonstrated that the proposed technique reduces overconfidence in the false positives without degrading the performance on the true positives. The approach is validated on 2D-KITTI object detection using YOLOv4 and SECOND (a lidar-based detector). The proposed approach enables interpretable probabilistic predictions without the requirement of re-training the network and is therefore very practical.
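
Why raw Softmax overconfidence matters can be shown in a few lines. Note this uses simple temperature scaling as a stand-in illustration of tempering test-time confidence; it is not the probabilistic layer proposed in the paper.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

logits = [9.0, 1.0, 0.5]                      # a typical saturated detection head
plain = softmax(logits)                       # max probability pushed near 1.0
tempered = softmax(logits, temperature=4.0)   # softer scores, same ranking
```

The class ranking is unchanged; only the (over)confidence of the top score is reduced.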

  • Date: 03/05/2023
  • Featured In: IEEE Transactions on Intelligent Transportation Systems
  • Publication Type: Journal Articles
  • Author(s): Gledson Melotti, Weihao Lu, Pedro Conde, Dezong Zhao, Alireza Asvadi, Nuno Gonçalves, Cristiano Premebida
  • DOI: 10.1109/TITS.2023.3268578
  • Download File
  • Visit Website

Noise simulation for the improvement of training deep neural network for printer-proof steganography

In the modern era, images have emerged as powerful tools for concealing information, giving rise to innovative methods like watermarking and steganography, with end-to-end steganography solutions emerging in recent years. However, these new methods present some issues regarding the hidden message and decreased image quality. This paper investigates the efficacy of noise simulation methods and deep learning methods to improve the resistance of steganography to printing. The research develops an end-to-end printer-proof steganography solution, with a particular focus on the development of a noise simulation module capable of overcoming distortions caused by transmission through the print-scan medium. Throughout the development, several approaches are employed, from combining several sources of noise present in the physical environment during printing and capture by image sensors, to introducing data augmentation techniques and self-supervised learning to improve and stabilize the resistance of the network. Through rigorous experimentation, a significant increase in the robustness of the network was obtained by adding noise combinations while maintaining the performance of the network. These experiments conclusively demonstrated that noise simulation provides a robust and efficient method to improve printer-proof steganography.
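
A print-scan noise module of the kind described combines several physical distortion sources into one augmentation. A minimal sketch with assumed parameters (box blur for ink spread, a gamma shift for tone response, additive sensor noise), not the paper's exact module:

```python
import numpy as np

def simulate_print_scan(img, rng, gamma=1.2, noise_std=0.02):
    """Toy print-scan channel for a 2-D grayscale image in [0, 1].

    Chains a 3x3 box blur (ink spread), a gamma shift (printer/scanner
    tone response) and additive Gaussian sensor noise; all parameter
    values here are illustrative assumptions.
    """
    p = np.pad(img.astype(float), 1, mode="edge")
    blurred = sum(p[i:i + img.shape[0], j:j + img.shape[1]]
                  for i in range(3) for j in range(3)) / 9.0
    toned = np.clip(blurred, 0.0, 1.0) ** gamma
    noisy = toned + rng.normal(0.0, noise_std, img.shape)
    return np.clip(noisy, 0.0, 1.0)
```

During training, the encoded image would pass through such a channel before decoding, forcing the network to survive the distortions.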

  • Date: 24/02/2024
  • Featured In: 13th International Conference on Pattern Recognition Application and Methods (ICPRAM), Rome, Italy
  • Publication Type: Conference Papers
  • Author(s): Telmo Cunha, Luiz Schirmer, João Marcos and Nuno Gonçalves
  • DOI: 10.5220/0012272300003654
  • Download File
  • Visit Website

Automatic Validation of ICAO Compliance Regarding Head Coverings: (…) Religious Circumstances

This paper contributes a dataset and an algorithm that automatically verifies compliance with the ICAO requirements related to the use of head coverings in facial images used in machine-readable travel documents. All the methods found in the literature ignore that some coverings might be accepted for religious or cultural reasons, and basically only look for the presence of hats/caps. Our approach specifically includes the religious cases and distinguishes the head coverings that might be considered compliant. We built a dataset composed of facial images of 500 identities to accommodate these types of accessories. That data was used to fine-tune and train a classification model based on the YOLOv8 framework, and we achieved state-of-the-art results with an accuracy of 99.1% and an EER of 5.7%.
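
The EER reported above is the operating point where false accepts and false rejects balance. A standard way to approximate it from validation scores (illustrative code, not the paper's evaluation script):

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Approximate EER: the point where the false accept rate (impostor
    scores at or above the threshold) meets the false reject rate
    (genuine scores below it), scanning observed scores as thresholds."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    best_gap, eer = np.inf, 1.0
    for t in np.sort(np.concatenate([genuine, impostor])):
        far = np.mean(impostor >= t)
        frr = np.mean(genuine < t)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return float(eer)
```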

  • Date: 22/09/2023
  • Featured In: 2023 International Conference of the Biometrics Special Interest Group (BIOSIG)
  • Publication Type: Conference Papers
  • Author(s): Carla Guerra, João Marcos, Nuno Gonçalves
  • DOI: 10.1109/BIOSIG58226.2023.10345995
  • Download File
  • Visit Website

Impact of Image Context for Single Deep Learning Face Morphing Attack Detection

The increase in security concerns due to technological advancements has led to the popularity of biometric approaches that utilize physiological or behavioral characteristics for enhanced recognition. Face recognition systems (FRSs) have become prevalent, but they are still vulnerable to image manipulation techniques such as face morphing attacks. This study investigates the impact of the alignment settings of input images on deep learning face morphing detection performance. We analyze the interconnections between the face contour and image context and suggest optimal alignment conditions for face morphing detection.
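
The "image context" being varied is essentially how much of the image beyond the tight face box the detector sees. A toy cropping helper makes the setting concrete (names and the margin convention are assumptions for illustration):

```python
import numpy as np

def crop_with_context(img, box, margin):
    """Crop a detected face box enlarged by `margin` (as a fraction of the
    box size) on every side, clamped to the image bounds.

    margin=0 keeps roughly the face contour only; larger margins keep
    progressively more surrounding image context.
    """
    x, y, w, h = box
    mx, my = int(round(w * margin)), int(round(h * margin))
    H, W = img.shape[:2]
    x0, y0 = max(x - mx, 0), max(y - my, 0)
    x1, y1 = min(x + w + mx, W), min(y + h + my, H)
    return img[y0:y1, x0:x1]

face = np.zeros((100, 100))
tight = crop_with_context(face, (40, 40, 20, 20), 0.0)  # contour only
loose = crop_with_context(face, (40, 40, 20, 20), 0.5)  # extra context
```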

  • Date: 22/09/2023
  • Featured In: 2023 International Conference of the Biometrics Special Interest Group (BIOSIG)
  • Publication Type: Conference Papers
  • Author(s): Joana Alves Pimenta, Iurii Medvedev and Nuno Gonçalves
  • DOI: 10.1109/BIOSIG58226.2023.10345999
  • Download File
  • Visit Website

Fused Classification for Differential Face Morphing Detection

Face morphing, a sophisticated presentation attack technique, poses significant security risks to face recognition systems. Traditional methods struggle to detect morphing attacks, which involve blending multiple face images to create a synthetic image that can match different individuals. In this paper, we focus on the differential detection of face morphing and propose an extended approach based on a fused classification method for the no-reference scenario. We introduce a public face morphing detection benchmark for the differential scenario and utilize a specific data mining technique to enhance the performance of our approach. Experimental results demonstrate the effectiveness of our method in detecting morphing attacks.
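
Differential detection compares a suspected document photo against a trusted live capture. One common pairwise input, shown here purely as an illustration (this is not the paper's architecture), stacks the two embeddings with their difference:

```python
import numpy as np

def differential_representation(emb_document, emb_live):
    """Toy differential input: concatenate the document-photo embedding,
    the trusted live-capture embedding and their difference, the kind of
    pairwise feature a differential morphing classifier could consume."""
    a = np.asarray(emb_document, dtype=float)
    b = np.asarray(emb_live, dtype=float)
    return np.concatenate([a, b, a - b])
```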

  • Date: 04/01/2024
  • Featured In: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024)
  • Publication Type: Conference Papers
  • Author(s): Iurii Medvedev, Joana Alves Pimenta and Nuno Gonçalves
  • DOI: 10.48550/arXiv.2309.00665
  • Download File
  • Visit Website

Dealing with Overfitting in the Context of Liveness Detection

With this work it was possible to:
  • Conclude on the importance of a well-rounded dataset;
  • Mitigate overfitting, reducing the difference from the best epoch to the average of the last epochs from 36.57% to 3.63%.
In the future:
  • Apply these conclusions to the making of a new model, keeping it as simple as possible;
  • Develop a dataset with as much variety as possible, be it in types of spoof, individuals or capture conditions.

  • Date: 27/10/2023
  • Featured In: RECPAD - 29th Portuguese Conference on Pattern Recognition. Coimbra (2023), Portugal
  • Publication Type: Poster
  • Author(s): Miguel Leão and Nuno Gonçalves
  • Download File
  • Visit Website

Steganography Applications of StyleGAN: A (…) Investigation from Hiding Message in Face Images

In this investigation, we delve into the latent codes, denoted w, of both original and encoded images in steganography models, projected through StyleGAN, a generative adversarial network renowned for its aesthetic synthesis. We present evidence of disentanglement and latent code alterations between the original and encoded images. This investigation has the potential to assist in concealing messages within images through the manipulation of the latent codes of the original images, resulting in the generation of encoded images. Embedding the message into the encoded renderings is performed with CodeFace, which serves as the steganography model. CodeFace comprises an encoder and decoder architecture wherein the encoder conceals a message within an image, while the decoder retrieves the message from the encoded image. By gauging the average disparities between the latent codes of the original and encoded images, optimal channels for concealing information are revealed. Precisely orchestrated manipulation of these channels enables us to generate novel encoded visual compositions.
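
The "average disparities between latent codes" step reduces to ranking latent channels by how much the encoder moves them. A minimal sketch with toy arrays (the function name and shapes are assumptions, not CodeFace internals):

```python
import numpy as np

def top_hiding_channels(w_original, w_encoded, k=3):
    """Rank latent channels by the mean absolute change the encoder
    introduces, averaged over image pairs; the largest-moving channels
    are candidate carriers for the hidden message."""
    diffs = np.mean(np.abs(np.asarray(w_encoded, dtype=float)
                           - np.asarray(w_original, dtype=float)), axis=0)
    return np.argsort(diffs)[::-1][:k]
```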

  • Date: 27/10/2023
  • Featured In: RECPAD - 29th Portuguese Conference on Pattern Recognition. Coimbra (2023), Portugal
  • Publication Type: Poster
  • Author(s): Farhad Shadmand, Luiz Schirmer and Nuno Gonçalves
  • Download File
  • Visit Website

Code

To be added soon
