Entropy measurements on news attention and news homogeneity using the GDELT v2.0 dataset
Abstract
We investigate how much attention different news outlets give to different news topics, and how reporters establish connections between news events. We used the April 2022 GDELT v2.0 dataset, which records details parsed from web articles collected at a wide scale. We employed entropy to quantify the uniformity of attention to news themes for two news media outlets, Associated Press (AP) and Xinhua, which have wide international coverage and are important news sources for the GDELT dataset. We also assessed the homogeneity of the main themes reported within a group of connected events by linking events through co-mentioning and computing the entropy of the news themes discussed. Our analysis revealed non-uniformity in general reporting, with an observed entropy level of 0.78. Xinhua consistently reported on government and health-related topics, contributing to lower entropy levels ranging from 0.69 to 0.74, while AP reported themes more similar to the general trend. The co-mentioning technique decreased the entropy of news themes discussed within connected events, with most entropy values ranging from 0 to 0.4.
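As an illustration of the entropy measure described above, the following is a minimal sketch rather than the authors' implementation. It assumes that entropy is normalized to the range [0, 1] by dividing the Shannon entropy by the logarithm of the number of distinct themes observed; the abstract does not state the normalization used. The function name `normalized_entropy` and the theme labels are hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(themes):
    """Return Shannon entropy of theme frequencies, scaled to [0, 1].

    Assumption: normalization divides by log(k), where k is the number
    of distinct themes observed. A value of 1 means attention is spread
    evenly across themes; 0 means a single theme dominates entirely.
    """
    counts = Counter(themes)
    total = sum(counts.values())
    k = len(counts)
    if k <= 1:
        # Zero or one distinct theme: no uncertainty to measure.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(k)


# Hypothetical theme labels for a small set of articles.
sample = ["GOV", "GOV", "GOV", "HEALTH", "HEALTH", "ECON"]
score = normalized_entropy(sample)
```

Under this assumption, a score near 1 corresponds to uniform attention across themes, while lower scores correspond to attention concentrated on a few themes, the pattern reported for Xinhua.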