WeVerify Annotation team @EUvsVirus Hackathon

Posted on May 3rd, 2020 in Blog Posts, Events, News

Europe has joined forces in order to develop innovative solutions for coronavirus-related challenges in the EUvsVirus hackathon, the official EU Commission hackathon to fight COVID-19.

Among the many challenge areas that participants were invited to work on, our team focused on “Social & Political Cohesion – Mitigating fake news spreading”, where our experience with and knowledge of disinformation helped us understand the problem and suggest solutions.

A team was formed, coordinated by the University of Sheffield, with the participation of ATC and the MeVer team of CERTH. A brainstorming session between the team members led to the problem that we would solve:

The COVID-19 pandemic has given rise to an online “disinfodemic”, with dangerous real-life consequences (burnt 5G masts, deaths from unproven and dangerous “cures”, disregard for healthcare advice).

The International Fact-Checking Network, in turn, has over 100 fact-checkers working daily in over 70 countries, who have so far debunked over 3500 false stories. Compared with the huge number of posts about COVID-19 shared daily through social media and other platforms, however, this is just a drop in the ocean (see picture below).

Fewer than 200 debunked stories a day against millions of daily social media posts on COVID-19, including 50 million COVID-19 tweets, many of which contain misinformation

Existing automatic tools for fact-checking and disinformation analysis have been optimised for accuracy on political disinformation, so they are significantly less accurate when applied to COVID-19 disinformation. Moreover, COVID-19 disinformation falls into new, distinct categories (e.g. origin, social distancing, government lockdown policies), and there are no tools that, given a social media post, will classify it according to such COVID-19-specific categories. The ability to do so automatically and reliably is paramount, as fact-checkers can then easily navigate prior debunks related only to the relevant topic (e.g. origin) and so speed up their work.

As the ‘WeVerify Annotation team’, we proposed open, scalable and cost-effective solutions that increase the efficiency of processing and classifying information around COVID-19. Solutions like these should already be in the hands of professionals like journalists, researchers, and everyone working on high volumes of COVID-19 (dis)information.

Our goals were:

  • to provide journalists, media, fact-checkers and other professionals with an AI-based COVID-19-tailored solution to help automate part of their workflow and minimise repetitive tasks
  • to provide these professionals with real-time analytics and insights into COVID-19 information like trending topics, categories with the highest volume of disinformation, etc.

To achieve this, we coordinated the collection of large amounts of high-quality human-annotated data and developed open source deep learning techniques to automatically identify and cluster COVID-19 (dis)information into 10 categories, plus a catch-all “Other”, which were derived from an extensive analysis conducted by the Reuters Institute for the Study of Journalism:

  • Public authority actions, policy, and communications
  • Community spread and impact
  • Medical advice and self-treatments
  • Claims about prominent actors
  • Conspiracy theories
  • Virus transmission
  • Virus origin and properties
  • Public preparedness
  • Vaccines, medical treatments, and tests
  • Protests and civil disobedience
  • Other
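To make the category scheme concrete, here is a minimal Python sketch that fixes the labels listed above and tallies how many posts fall into each one, which is the kind of "categories with the highest volume" insight mentioned in our goals. Note the assumptions: `classify` is a toy keyword matcher with hypothetical cue words, standing in for a trained model purely for illustration. It is not the deep learning classifier built during the hackathon, and `KEYWORD_CUES` and `category_volumes` are names invented for this example.

```python
import re
from collections import Counter

# The category labels listed above, copied verbatim.
CATEGORIES = [
    "Public authority actions, policy, and communications",
    "Community spread and impact",
    "Medical advice and self-treatments",
    "Claims about prominent actors",
    "Conspiracy theories",
    "Virus transmission",
    "Virus origin and properties",
    "Public preparedness",
    "Vaccines, medical treatments, and tests",
    "Protests and civil disobedience",
    "Other",
]

# Hypothetical cue words, for illustration only (assumption, not from the project).
KEYWORD_CUES = {
    "Conspiracy theories": {"5g", "bioweapon"},
    "Medical advice and self-treatments": {"cure", "remedy"},
    "Vaccines, medical treatments, and tests": {"vaccine", "test"},
}


def classify(post: str) -> str:
    """Toy stand-in for a trained classifier: first category with a matching cue wins."""
    tokens = set(re.findall(r"[a-z0-9]+", post.lower()))
    for category, cues in KEYWORD_CUES.items():
        if tokens & cues:
            return category
    return "Other"


def category_volumes(posts):
    """Return (category, count) pairs, highest volume first."""
    return Counter(classify(p) for p in posts).most_common()


if __name__ == "__main__":
    sample = ["5G masts spread the virus", "A miracle cure", "5g again", "Stay home"]
    for category, count in category_volumes(sample):
        print(f"{count:3d}  {category}")
```

Swapping `classify` for a real model leaves the tallying code unchanged, which is the point of keeping the label list and the analytics separate.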

We also developed an Elasticsearch/Kibana proof-of-concept interface that demonstrates the added benefits of the automatic categorisation through easy-to-use visualisations. The interface is password protected; if you are interested in obtaining access, please contact Kalina Bontcheva (k.bontcheva@sheffield.ac.uk) for details.
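For readers curious what sits behind a per-category chart, the sketch below builds the body of an Elasticsearch terms aggregation, which counts documents per distinct value of a field. It is only an illustration: the field name `category` and the aggregation name `posts_per_category` are assumptions, since the index layout of our proof-of-concept is not described here, and the snippet only constructs and prints the query rather than contacting a server.

```python
import json

# Assumed field name holding the predicted category (must be a keyword field).
CATEGORY_FIELD = "category"

query = {
    "size": 0,  # return no individual posts, only the aggregated counts
    "aggs": {
        "posts_per_category": {
            "terms": {
                "field": CATEGORY_FIELD,
                "size": 11,  # one bucket per category label, including "Other"
            }
        }
    },
}

if __name__ == "__main__":
    # This JSON could be sent as the body of a search request on the posts index.
    print(json.dumps(query, indent=2))
```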

Automatic Disinformation Classification — Demo

Post-hackathon, we envisage deployment as two main solutions: one operating as a standalone service for individual users (fact-checkers, researchers, etc.) who wish to classify and analyse a corpus of claims, and the other interfacing with Content Management Systems, aimed at organisations that wish to integrate automatic COVID-19 information classification features into their media workflows.

Our next steps will be to adapt the COVID-19 information categorisation to languages other than English (FR, DE, ES, etc.), deploy it as Software-as-a-Service, implement a full disinformation categorisation and exploration user interface, and adapt to other medical disinformation areas, e.g. anti-vaccination.

Video Pitch: Automatic Disinformation Classification based on Deep Learning

Special thanks

In addition to the team members, thanks are due to:

Johann Petrak, who hacked the web UI for annotating COVID-19 disinformation classification and took care of many methodological aspects;

Mia Polovina, Andreas Grivas, Steven Zimmerman, Zlatina Marinova, Nikos Sarris, James Wood, Diana Maynard, Julia Ive, Lamiece Hassan, Tosin Dairo, Jon Chamberlain, Francesco Lomonaco, Tasos Papastylianou, James Allen-Robertson, Themis Makedas and Symeon Papadopoulos, who all volunteered time to manually annotate disinformation examples so that we could train the machine learning models.

Authors: Olga Papadopoulous (CERTH-ITI) and Kalina Bontcheva (University of Sheffield)
Editor: Jochen Spangenberg (Deutsche Welle)
