This opinion article by Shifaan Ryklief was first published on www.coronavirusmonitor.co.za on 16 April 2020.
The spread of fake news is a crisis in itself, and false Covid-19 posts on social media have caused mass panic both locally and internationally.
Quite a few fake news posts have been circulating on WhatsApp, Facebook and Twitter, but South African lockdown regulations have made life more difficult for peddlers of fake news: those caught spreading it are liable to a fine or imprisonment for up to six months.
According to the Covid19 Infodemics Observatory, South Africa ranks second after Singapore for the most reliable Covid-19-related news and information.
The Infodemic risk analysis draws on more than 100 million public messages, taking into account news reliability based on URLs pointing to reliable news sources, unverified social bots and the average number of unverified posts per day in a country.
Here's how it works:
The classification of reliable vs potentially unreliable news sources is based on joining the work of different classifiers:
- M. Zimdars for the Washington Post (2016)
- C. Silverman for BuzzFeed News (2017)
- Fake News Watch (2015)
- PolitiFact (2017)
- Bufale.net (2018)
- Starbird et al., ICWSM (2018)
- Fletcher et al., Factsheets, Reuters Institute and U. of Oxford (2018)
- Grinberg et al., Science 363, 374 (2019)
- MediaBiasFactCheck (2020)
A few sources have been manually classified and annotated according to Wikipedia and other trusted sources.
When two classifiers disagree on the classification of the same source, the Observatory assigns the potentially more harmful of the two classifications, following a priority ordering of the categories.
For instance, if news from xyz.com is classified by two distinct data sources as POLITICAL and MSM, an algorithm will assign the label ‘POLITICAL’. Note that this does not mean that it is fake: it is just potentially unreliable according to one or more expert classifiers.
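The tie-breaking rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Observatory's own code: the `HARM_RANK` table and the `resolve_label` function are hypothetical names, the numeric ranks are placeholders, and the only ordering taken from the article is that POLITICAL outranks MSM.

```python
# Hypothetical harm ranking: a higher number means "potentially more harmful".
# Only the relative order of MSM and POLITICAL comes from the article's
# example; the numbers themselves are placeholders for illustration.
HARM_RANK = {"MSM": 0, "POLITICAL": 1}


def resolve_label(labels):
    """Return the potentially most harmful label assigned to one source."""
    if not labels:
        raise ValueError("at least one label is required")
    for label in labels:
        if label not in HARM_RANK:
            raise KeyError(f"no rank defined for label: {label}")
    # When classifiers disagree, keep the label with the highest harm rank.
    return max(labels, key=lambda label: HARM_RANK[label])


# Two data sources disagree about xyz.com, as in the article's example.
print(resolve_label(["POLITICAL", "MSM"]))
```

Taking the maximum over a rank table means the result does not depend on the order in which the classifiers are consulted.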
In this scheme, OTHER refers to URLs pointing to content that cannot be verified automatically (e.g. videos), while SHADOW refers to shortened URLs pointing to dead links. In both cases it is not possible to assess reliability, so these URLs are classified as UNKNOWN and excluded from the analysis.
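The exclusion step can be pictured the same way. The sketch below assumes each message has already been reduced to a (URL, label) pair; the function names and the sample URLs are invented for illustration and do not come from the Observatory.

```python
# Labels that cannot be assessed automatically, per the description above.
UNVERIFIABLE = {"OTHER", "SHADOW"}


def normalize(label):
    """Map OTHER and SHADOW to UNKNOWN; leave every other label unchanged."""
    return "UNKNOWN" if label in UNVERIFIABLE else label


def keep_assessable(records):
    """Drop (url, label) pairs whose normalized label is UNKNOWN."""
    return [(url, label) for url, label in records
            if normalize(label) != "UNKNOWN"]


# Hypothetical sample data, for illustration only.
sample = [
    ("https://example.org/report", "MSM"),
    ("https://example.org/video", "OTHER"),
    ("https://example.org/short", "SHADOW"),
]
print(keep_assessable(sample))
```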