WeVerify participation in the Deepfake Detection Challenge

Posted on July 20th, 2020 in Blog Posts, Events, News

From 11 December 2019 to 31 March 2020, WeVerify project partner CERTH, and specifically the MKLab MeVer team, participated in the Deepfake Detection Challenge. Almost three months have passed since the challenge's final deadline on the Kaggle platform, and the organizers recently finalized the private leaderboard standings (on 13 June 2020). A Kaggle staff member mentioned in a discussion that the organizers took their time to validate the winning submissions and ensure they complied with the competition rules. This process resulted in the disqualification of the top-performing team for using external data without a proper license, which caused considerable unrest in the Kaggle community, mainly because the competition rules were vague.

Performance in the challenge was measured on two leaderboards. The public leaderboard was computed on videos that share common characteristics with the training data provided by the organizers, and it was visible to participants throughout the challenge. The private leaderboard, by contrast, was computed on organic videos and on Deepfake manipulations that did not necessarily appear in the training data, and it was revealed only after the competition deadline.

After the final re-ranking, the MeVer team (which joined forces with DeepTrace) stands at the 115th position on the private leaderboard, among 2,265 teams, with a Log Loss of 0.515. Note that the team had stood at the 50th position on the public leaderboard with a Log Loss of 0.295. Evidently, the applied detection approach performs well on videos drawn from the same distribution as the training data but has relatively limited generalization ability. This observation applies to the large majority of submissions: the winning team, for example, achieved 0.203 on the public leaderboard but a much higher 0.428 on the private one. There were also cases of severe overfitting to the training distribution; the best submission on the public leaderboard scored a Log Loss of 0.192, but its private-evaluation error was 0.575, which placed it 904th on the private leaderboard.
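For reference, the evaluation metric was binary cross-entropy (log loss) over per-video probabilities of being fake; lower is better. Below is a minimal sketch of the computation in Python. The clipping constant is our choice, included because predictions must be clipped away from 0 and 1 so that a single confident mistake cannot produce an infinite penalty:

```python
import numpy as np

def log_loss(y_true, y_pred, eps=1e-15):
    """Binary cross-entropy as used on the Kaggle leaderboards.

    y_true: ground-truth labels (1 = fake, 0 = real).
    y_pred: predicted probabilities that each video is fake.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1 - y_true) * np.log(1 - y_pred)))

# Example: one confident correct call and one uncertain one.
print(log_loss([1, 0], [0.9, 0.4]))  # ~0.308
```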

To tackle the Deepfake Detection Challenge, we focused on preprocessing the training data and used a deep learning approach that combined multiple architectures. The preprocessing step mainly involved dataset cleaning and generally improved the performance of our initial baseline models. More specifically, the MTCNN face detector was used to extract bounding boxes for all faces in the video frames. Using this facial information, a similarity-based approach (shown in Figure 1) was devised to cluster similar faces and remove false detections; the preprocessing methodology is described in more detail here. Following it, we removed noisy detections and generated distinct face clusters from all frames of a video. Furthermore, DeepFake videos that were very similar to the original ones were removed by computing the face similarity between them, and the dataset was balanced by sampling more frames from the original videos. In practice, our final dataset consisted of real and fake face images: real faces were sampled from approximately 20,000 real videos by uniformly extracting 16 frames per video, and fake faces from 80,000 fake videos by uniformly extracting 4 frames per video. The images were then resized to 300×300, and various augmentations were applied, such as horizontal and vertical flipping, random cropping, rotation, image compression, Gaussian and motion blurring, and brightness, saturation, and contrast transformations.
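To illustrate the face-extraction step, here is a minimal sketch built around the facenet-pytorch implementation of MTCNN, with OpenCV for frame reading. The frame count, confidence threshold, and crop handling are illustrative assumptions, not the team's exact settings:

```python
import cv2
import numpy as np
from facenet_pytorch import MTCNN  # pip install facenet-pytorch

detector = MTCNN(keep_all=True, device="cpu")  # keep_all: return every face per frame

def extract_faces(video_path, num_frames=16, min_prob=0.9):
    """Uniformly sample frames from a video and return 300x300 face crops.

    num_frames and min_prob are illustrative choices, not the team's values.
    """
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    faces = []
    for idx in np.linspace(0, total - 1, num_frames, dtype=int):
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            continue
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # MTCNN expects RGB
        boxes, probs = detector.detect(rgb)
        if boxes is None:
            continue
        for box, prob in zip(boxes, probs):
            if prob < min_prob:  # drop low-confidence (likely false) detections
                continue
            x1, y1, x2, y2 = (int(v) for v in box)
            x1, y1 = max(x1, 0), max(y1, 0)
            faces.append(cv2.resize(rgb[y1:y2, x1:x2], (300, 300)))
    cap.release()
    return faces
```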

Initial work involved experiments with various deep learning architectures, such as XceptionNet, MesoNet, and InceptionResNet; however, the EfficientNet architecture outperformed all of them by a large margin. Based on this observation, EfficientNet-B3 and EfficientNet-B4 were trained, and although the latter scored better on the public leaderboard, a simple average of the two provided even better evaluation scores. For the inference pipeline, 40 uniformly distributed frames were extracted per video and passed through the same preprocessing methodology we devised for the training data, so each detected face could contribute up to 40 face crops at inference time. The per-face prediction was the average of the individual predictions over all frames and models, and the final video-level prediction was the maximum among the per-face predictions.
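This aggregation scheme can be expressed in a few lines. The sketch below is a simplified illustration of the averaging-then-maximum logic, where `models` is a placeholder for the trained EfficientNet-B3/B4 networks, not the team's actual code:

```python
import numpy as np

def video_prediction(face_crops_per_face, models):
    """Aggregate frame- and model-level scores into one video-level score.

    face_crops_per_face: list with one entry per distinct face in the video,
        each holding up to 40 face crops sampled from the video's frames.
    models: callables mapping a batch of crops to per-crop fake probabilities
        (placeholders for the trained networks).
    """
    face_scores = []
    for crops in face_crops_per_face:
        # Average over all frames and all models for this face.
        per_model = [np.mean(model(crops)) for model in models]
        face_scores.append(np.mean(per_model))
    # A video is flagged as fake if any one of its faces looks fake.
    return max(face_scores)
```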

To exploit the temporal nature of videos, we also experimented with the I3D model. It was used in the same way as the EfficientNet models, i.e., predictions were made at frame level, with the difference that I3D took 10 consecutive frames as input instead of a single frame. Adding I3D to the ensemble and averaging the individual frame and model predictions further improved performance on the public leaderboard. This was the final model that finished in the 115th position on the private leaderboard.
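For the I3D branch, a face's crops have to be grouped into short clips of consecutive frames before inference. A minimal sketch of how such clips might be assembled; the non-overlapping grouping is our assumption:

```python
import numpy as np

def make_clips(frames, clip_len=10):
    """Group a face's chronologically ordered crops into clips for I3D.

    frames: array of shape (T, H, W, 3) holding a face's crops in time order.
    Returns an array of shape (N, clip_len, H, W, 3). Non-overlapping clips
    are an illustrative choice, not necessarily the team's exact setup.
    """
    n = len(frames) // clip_len
    if n == 0:
        raise ValueError("not enough frames for a single clip")
    return np.stack([frames[i * clip_len:(i + 1) * clip_len] for i in range(n)])
```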

With the help of DeepTrace, we also experimented with an audio-based architecture, but it did not appear to improve the model's performance. In addition, various other techniques from the related literature were applied (residual filters, optical flow, etc.), but none of them led to further improvements on this task.

Some common practices among the top 50 solutions included the following:

  • The EfficientNet architecture and its variants were admittedly the best fit for this task. Most approaches combined EfficientNet architectures (B3-B7), some of them trained with different seeds.
  • ResNeXt was another architecture used by a top-performing solution, combined with 3D architectures such as I3D, 3D ResNet34, MC3, and R(2+1)D.
  • Input image resolution varied from 256 to 380 across these approaches, and several of them increased the margin of the detected facial bounding box.
  • To improve generalization, apart from strong augmentations, some approaches proposed domain-specific augmentations such as removing half of the face horizontally or vertically, or removing landmarks (eyes, nose, or mouth); one such augmentation is sketched below.
  • The solution ranked third on the private leaderboard reported that mixup augmentation further improved the generalization of the detection model.
  • Almost all approaches computed frame-level predictions and aggregated them into a final prediction.
  • Finally, label smoothing seemed to improve the performance of some architectures.
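As an example of the domain-specific augmentations mentioned above, here is a hypothetical re-implementation of "half face removal"; the zero fill value and the 50/50 axis choice are our assumptions:

```python
import numpy as np

def drop_half_face(face, rng=np.random.default_rng()):
    """Blank out the left/right or top/bottom half of a face crop.

    A hypothetical sketch of the 'half face removal' augmentation reported
    by top teams, forcing the model to rely on partial facial evidence.
    """
    face = face.copy()
    h, w = face.shape[:2]
    if rng.random() < 0.5:
        # Vertical split: hide the left or the right half.
        face[:, : w // 2] = 0 if rng.random() < 0.5 else face[:, : w // 2]
        face[:, w // 2 :] = 0 if np.all(face[:, : w // 2]) else face[:, w // 2 :]
    else:
        # Horizontal split: hide the top or the bottom half.
        if rng.random() < 0.5:
            face[: h // 2, :] = 0
        else:
            face[h // 2 :, :] = 0
    return face
```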

We have already deployed the described approach as a DeepFake detection API within the WeVerify platform to test the authenticity of online media, and we will focus on improving it further in the future.

Author: Polychronis Charitidis (CERTH).

Editor: Olga Papadopoulou (CERTH).

Image credits: respective persons named. Usage rights have been obtained by the authors named above for publication in this article. Copyright / IPR remains with the respective originators.

Note: This post is an adaptation of the MeVer participation in Deepfake detection challenge blog post, which was originally prepared for the CERTH Media Verification team (MeVer) website.

