Second Evaluation Report
========================

.. role:: raw-html(raw)
   :format: html

This page outlines the work that has been completed as of the second evaluation in the GSoC program. It includes code examples showing how to use the module, image results, theory, and explanations of the design. Rather than sectioning this page by week, it is sectioned according to the various submodules in the system.

To start off, the following is the work that was proposed to be completed by the second evaluation (present on the last page of the proposal):

.. image:: ../_static/eval_two/timeline.png

Most of the tasks have been completed. The few remaining tasks, which will be completed in a few days, are the DAN network (whose model size I am currently trying to reduce), documentation of the classes, and additional parameters for the Face Recognition Module. The purple highlighted portions are yet to be completed. A more intensive few days should take care of these pending tasks, which were delayed due to a medical emergency during the second evaluation period of GSoC.

Face Recognition Module (FRM, Part I)
'''''''''''''''''''''''''''''''''''''

The first task was to integrate Face Detection and Face Alignment into the Face Recognition Module. Although the default detection and alignment methods (FaceBoxes and KazemiSullivan) are used for now, the user will eventually be able to specify which detection/alignment method to use, along with the associated parameters.

The alignment and detection step is used in both the training and testing phases of FRM. This is because the alignment methods also require a bounding box, and the faces supplied for training may not necessarily be bounded by the image border, i.e., the image could contain objects apart from the face. For this reason, face detection is run in the training phase too.

Face Embeddings
'''''''''''''''

Face Identification, just like Detection and Alignment, follows the same Abstract Base Class structure for the embedding methods. The "embedders", the methods that generate vector embeddings of the faces, are separated from the identification and verification process. The interface to the ``FaceEmbedder`` just requires passing the faces (of a fixed :math:`112\times 112` size, which is also how the Alignment methods return the faces).

MobileFaceNet
`````````````

The embedder that was added is the `MobileFaceNet`_ face embedder. Like FaceBoxes, MobileFaceNet has been implemented in TensorFlow, and the trained model is shipped with the package; this is possible due to its small size (as opposed to DAN, whose model is larger than 100 MB). In our experiments, MobileFaceNet runs in real time, reaching up to 25 fps while running on the evaluator. The evaluator results are shown in the next section, and the documentation, which will be put up in a few days, will contain the same details. The following shows the ease of calling MobileFaceNet to generate embeddings:

.. code-block:: python

    import cv2

    def transform_image(im):
        # Preprocessing (to be moved into MobileFaceNet itself later):
        # resize to the fixed 112x112 input size and normalize pixel values.
        im = cv2.resize(im, (112, 112))
        im = (im - 127.5) / 128.0
        return im

    mfn = MobileFaceNet()
    emb1 = mfn._embed([transform_image(im1)])

.. _MobileFaceNet: https://arxiv.org/abs/1804.07573
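Once embeddings are available, verifying whether two faces belong to the same person reduces to comparing their vectors. The following is a minimal sketch of such a comparison, reusing ``mfn`` and ``transform_image`` from the snippet above and assuming ``_embed`` returns one embedding vector per input face. The cosine-similarity measure and the threshold value here are illustrative choices, not the module's fixed behaviour; in practice the threshold would be picked from the ROC analysis described in the next section:

.. code-block:: python

    import numpy as np

    def same_person(im_a, im_b, threshold=0.5):
        # Embed both (aligned, 112x112) faces with MobileFaceNet.
        emb_a = mfn._embed([transform_image(im_a)])[0]
        emb_b = mfn._embed([transform_image(im_b)])[0]
        # Cosine similarity between the two embedding vectors.
        sim = np.dot(emb_a, emb_b) / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b))
        # Pairs with similarity above the threshold are treated as a match.
        return sim > threshold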
Face Embeddings Evaluation
''''''''''''''''''''''''''

The next step was to evaluate the performance of each embedder. For this we used the `LFW`_ dataset, which contains thousands of faces and also comes with a testing scheme of matched and mismatched image pairs. ``FaceEmbedderEvaluator``, like the other evaluators, is designed so that it can be used with any general face verification dataset.

The testing scheme used contained a total of 3000 pairs. Although the LFW testing scheme involves averaging over multiple folds of the training set, our results are taken over the entirety of the testing pairs. This is because our model is not meant to be trained on separate folds, but rather on a fixed training set.

``FaceEmbedderEvaluator`` computes the following:

- **Accuracy**: The fraction of correctly classified pairs, i.e., true positives plus true negatives over all predicted outputs.
- **ROC Curve**: A plot of the true positive rate against the false positive rate as the decision threshold varies. Ideally the curve rises early: even at low false positive rates, the true positive rate should already be high.
- **AUC**: As said above, the ROC curve should rise fast, and thus the area under the curve (AUC) should be as large as possible.
- **EER**: The Equal Error Rate is an interesting metric used with the ROC curve for two-class problems where the cost of a wrong prediction is similar for either class. On the ROC plot, the EER is the value on the x-axis (the false positive rate) at the point where the diagonal from the top-left corner, i.e., the line from (0, 1) to (1, 0), intersects the curve. It corresponds to the threshold at which the false positive and false negative rates are equal.

Like the other datasets, the LFW dataset is added (with a modified directory structure and essential metadata) as a git LFS ``tar.gz`` file. For convenience, the above evaluation can be done directly through the notebook in the ``tests/performance/face_embedder`` directory.

.. image:: ../../../tests/performance/face_embedder/images/ROC-Curve_MobileFaceNet.png

.. _LFW: http://vis-www.cs.umass.edu/lfw/

Face Identification
'''''''''''''''''''

The ``FaceIdentifier`` class is a wrapper around the embedders and the ``FaceKNN`` class. ``FaceKNN`` performs the KNN training and testing, while ``FaceIdentifier`` calls upon the embedder to obtain the embeddings and upon ``FaceKNN`` to return the results for both the training and testing phases. ``FaceIdentifier`` can be used independently of the FRM module if detection and alignment are not necessary. A straightforward way to do so is shown below:

.. code-block:: python

    from rekognition.face_identification import FaceIdentifier

    # Train a KNN model on the embeddings of the labelled training faces.
    FI = FaceIdentifier(training_faces_list, training_faces_labels)

    # Predict labels for the (already aligned) testing faces.
    pred_results = FI(testing_faces_list)

Face Recognition Module (FRM, Part II)
''''''''''''''''''''''''''''''''''''''

The remaining work was done in completing the Face Recognition Module. It currently integrates the detection, alignment, and identification (which includes embedding) steps. The next step for FRM is to let the user customize the models used within it. At its most basic, FRM can be used as:

.. code-block:: python

    from rekognition import FRM

    frm = FRM()
    frm.train(faces, labels)
    frm.test(images)  # returns all faces detected

Another remaining task is to return more verbose details about the detected faces to the user, since that can be useful. A conceptual sketch of how FRM chains these stages together is shown below.
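Internally, FRM ties together the stages described throughout this report: detection, alignment, and identification. The following is a minimal conceptual sketch of that flow, assuming a detector with a ``detect`` method, an aligner with an ``align`` method, and a ``FaceIdentifier`` used as the identifier; these names and signatures are illustrative assumptions, not FRM's actual internals:

.. code-block:: python

    def recognize(image, detector, aligner, identifier):
        # Illustrative sketch only: "detector" and "aligner" stand in for the
        # configured detection/alignment methods (FaceBoxes and KazemiSullivan
        # by default); their method names here are assumptions.
        results = []
        for box in detector.detect(image):        # 1. detect faces in the image
            face = aligner.align(image, box)      # 2. align into a 112x112 face crop
            label = identifier([face])[0]         # 3. embed + KNN-classify the crop
            results.append((box, label))          # keep the box with its label
        return results

Seen this way, it is also clear why detection and alignment run during training too: ``frm.train`` must produce the same aligned :math:`112\times 112` crops that the embedder expects at test time.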
Next Step
'''''''''

Three tasks from the second evaluation period goals remain: documentation, some additional customization on FRM, and the addition of DAN. Adding DAN was postponed due to its large model size; one possible way to add it is through a user-defined model location (a hypothetical sketch of this is shown below), though I am currently experimenting with methods to reduce the size of the model. In parallel, I am also working on Shot Recognition for videos. These tasks should be finished in the next few days.
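To make the user-defined model location idea concrete, here is one hypothetical shape it could take. Neither the ``alignment`` nor the ``dan_model_path`` parameter exists yet; both are invented here purely to illustrate the design being considered:

.. code-block:: python

    from rekognition import FRM

    # Hypothetical sketch: the DAN weights are too large to ship with the
    # package, so the user downloads them separately and points FRM at them.
    # The parameter names below do not exist yet; they only illustrate the idea.
    frm = FRM(alignment="DAN", dan_model_path="/path/to/dan_model")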