You hear a lot of people claiming that Nigel Farage gets an undue amount of coverage given the number of MPs in his Reform UK party.
I wondered if that was true, so I ran a query against GDELT (Global Database of Events, Language and Tone). GDELT monitors mentions of specific people across broadcast, print, and web news.
And… it is!
You often hear this criticism levelled specifically at the BBC, so I did a cut just for that. And it’s true there too.
Technical Appendix
I used BigQuery and the GDELT gkg_partitioned table to get the data. The lines are smoothed with statsmodels’ lowess function (Locally Weighted Scatterplot Smoothing).
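To give a feel for what that smoothing does, here is a minimal pure-Python sketch of the idea behind LOWESS: for each point, fit a straight line to its nearest neighbours, weighting closer points more heavily. This is not the statsmodels implementation (which lives at statsmodels.nonparametric.smoothers_lowess.lowess and also runs robustness iterations to downweight outliers); the function name lowess_sketch and the frac value are assumptions made for illustration.

```python
import math

def lowess_sketch(xs, ys, frac=0.5):
    """Smooth ys against xs with tricube-weighted local linear fits.

    Single pass only: no robustness iterations, unlike a full LOWESS.
    """
    n = len(xs)
    k = max(2, min(n, math.ceil(frac * n)))  # points used in each local fit
    smoothed = []
    for x0 in xs:
        # Distance to the k-th nearest point sets the local bandwidth.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0
        # Tricube weights: near points count most, points at or beyond h count zero.
        ws = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        # Weighted least-squares line through the neighbourhood.
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed

# Month index on the x axis, a deliberately jagged made-up series on the y axis.
months = list(range(12))
jagged = [10 + 3 * (-1) ** i for i in months]
print([round(v, 2) for v in lowess_sketch(months, jagged)])
```

A larger frac pulls in more neighbours for each fit and gives a flatter line; a smaller one follows the month-to-month wiggles more closely.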
The SQL query is below.
SELECT
  EXTRACT(YEAR FROM PARSE_DATETIME('%Y%m%d%H%M%S', CAST(DATE AS STRING))) as Year,
  EXTRACT(MONTH FROM PARSE_DATETIME('%Y%m%d%H%M%S', CAST(DATE AS STRING))) as Month,
  REGEXP_EXTRACT(DocumentIdentifier, r'^(https?://[^/]+)') as Truncated_URL,
  CASE
    WHEN REGEXP_CONTAINS(Persons, r'(?i)nigel.farage') THEN 'Nigel Farage'
    WHEN REGEXP_CONTAINS(Persons, r'(?i)ed.davey') THEN 'Ed Davey'
    WHEN REGEXP_CONTAINS(Persons, r'(?i)kemi.badenoch') THEN 'Kemi Badenoch'
  END as Person_Mentioned,
  COUNT(*) as Mentions
FROM `gdelt-bq.gdeltv2.gkg_partitioned`
WHERE
  DATE >= 20240101000000 -- Adjust date range as needed
  AND DocumentIdentifier IS NOT NULL
  AND (
    REGEXP_CONTAINS(Persons, r'(?i)nigel.farage') OR
    REGEXP_CONTAINS(Persons, r'(?i)ed.davey') OR
    REGEXP_CONTAINS(Persons, r'(?i)kemi.badenoch')
  )
GROUP BY Year, Month, Truncated_URL, Person_Mentioned
ORDER BY Year DESC, Month DESC, Mentions DESC, Truncated_URL
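The query returns one row per month, site, and person, so the rows still need summing into a monthly series before plotting. The sketch below shows one way that step might look in Python, including a BBC-only cut. It assumes the rows have been loaded as dictionaries keyed by the query's column names; the BBC hostnames, the helper names, and the sample numbers are all invented for illustration and are not the data behind the charts.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Assumed BBC hostnames; adjust to whichever domains you count as the BBC.
BBC_HOSTS = ("bbc.co.uk", "bbc.com")

def is_bbc(truncated_url):
    """True if the URL's host is a BBC domain or a subdomain of one."""
    host = urlparse(truncated_url or "").hostname or ""
    return any(host == h or host.endswith("." + h) for h in BBC_HOSTS)

def monthly_totals(rows, keep=lambda url: True):
    """Sum Mentions per (person, year, month) over rows whose URL passes keep."""
    totals = defaultdict(int)
    for row in rows:
        if row["Person_Mentioned"] is None or not keep(row["Truncated_URL"]):
            continue
        key = (row["Person_Mentioned"], row["Year"], row["Month"])
        totals[key] += row["Mentions"]
    return dict(totals)

# Invented sample rows in the shape the query returns (not real GDELT counts).
rows = [
    {"Year": 2025, "Month": 1, "Truncated_URL": "https://www.bbc.co.uk",
     "Person_Mentioned": "Nigel Farage", "Mentions": 5},
    {"Year": 2025, "Month": 1, "Truncated_URL": "https://example.com",
     "Person_Mentioned": "Nigel Farage", "Mentions": 7},
    {"Year": 2025, "Month": 1, "Truncated_URL": "https://www.bbc.com",
     "Person_Mentioned": "Ed Davey", "Mentions": 2},
]

print(monthly_totals(rows))               # all outlets
print(monthly_totals(rows, keep=is_bbc))  # BBC only
```

One thing to bear in mind when reading the totals: the CASE expression assigns each article to the first name that matches, so an article mentioning both Farage and Davey is counted once, under Farage.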