More fun with Community Notes data
Use of X's crowdsourced fact-checking system has steadily grown over time, but so has the practice of using other X posts as sources for notes
One of the more interesting features of X (the social media platform originally known as Twitter) is the crowdsourced fact-checking system known as Community Notes. A couple of previous articles on this blog have featured analysis of Community Notes data, which is freely available for download from X as of the time of this writing. Today’s post will revisit some of that past analysis, as a few things have changed in the time since. Of particular interest, the proportion of notes where X posts (rather than external news articles) are cited as reliable sources has steadily risen.
The Community Notes dataset used in this article was downloaded on July 12th, 2024, and contains data through July 10th, 2024. As of that date, 1,009,625 notes have been created, 90,558 of which (9.0%) are currently rated helpful and are visible on the relevant posts. Another 33,587 notes (3.3%) have been rated not helpful, and the remaining 885,480 notes (87.7%) have yet to receive enough ratings to determine their status, which has been the fate of the majority of Community Notes since the feature’s inception. The volume of notes written has steadily increased over time, with between 3,000 and 4,000 new notes being submitted each day thus far in July 2024.
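For readers who want to check the arithmetic, the percentages above can be reproduced from the raw counts with a few lines of Python. This is only a sketch: the function name and the status labels used as dictionary keys are illustrative choices, not field names taken from the Community Notes download.

```python
def status_percentages(counts):
    """Return each status's share of all notes, as a percentage rounded to 0.1."""
    total = sum(counts.values())
    return {status: round(100 * n / total, 1) for status, n in counts.items()}


# Counts quoted in this article (data through July 10th, 2024).
# The keys are illustrative labels, not official status names.
counts = {
    "helpful": 90558,
    "not helpful": 33587,
    "needs more ratings": 885480,
}

percentages = status_percentages(counts)
for status, pct in percentages.items():
    print(f"{status}: {counts[status]} notes ({pct}%)")
```

The three counts sum to the 1,009,625 total, so the three percentages cover every note in the dataset.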
One of the general expectations of the Community Notes feature is that note authors will provide links to sources that back up the notes they write. As with the August 2023 analysis, the top three domains currently cited in Community Notes are x.com/twitter.com, wikipedia.org, and youtube.com. Several Japanese domains have joined the list of frequent sources, including nhk.or.jp, yahoo.co.jp, and mhlw.go.jp; in the previous analysis, the top websites cited were almost all English-language sites. The most frequently cited mainstream news source, Reuters, has fallen from fourth place to fifth, behind the FTC.
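A domain tally like this one can be approximated by pulling URLs out of the text of each note and counting hostnames. The sketch below is one plausible way to do it, not necessarily the method used for this article; the regular expression, the alias table that folds twitter.com into x.com, and the sample notes are all assumptions made for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Rough URL matcher: stops at whitespace and a few common closing characters.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")

# Assumed aliases, so that old and new X domains are counted together.
ALIASES = {"twitter.com": "x.com", "mobile.twitter.com": "x.com"}


def cited_domains(note_text):
    """Return the normalized hostname of every URL found in a note's text."""
    domains = []
    for url in URL_PATTERN.findall(note_text):
        try:
            host = urlparse(url).hostname
        except ValueError:
            continue  # skip malformed URLs
        if not host:
            continue
        if host.startswith("www."):
            host = host[4:]
        domains.append(ALIASES.get(host, host))
    return domains


def domain_counts(note_texts):
    """Count cited domains across an iterable of note texts."""
    counts = Counter()
    for text in note_texts:
        counts.update(cited_domains(text))
    return counts


# Made-up note texts, purely for demonstration.
sample_notes = [
    "See https://twitter.com/user/status/1 and https://www.reuters.com/article",
    "Source: https://x.com/a/status/2",
    "Background: https://en.wikipedia.org/wiki/X",
]
sample_counts = domain_counts(sample_notes)
print(sample_counts.most_common())
```

Note that this version only strips a leading "www." and leaves other subdomains alone, so en.wikipedia.org and ja.wikipedia.org would be counted separately. Grouping them under wikipedia.org properly requires a public suffix list, since naively keeping the last two labels would mangle domains such as nhk.or.jp.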
Although X/Twitter has long been the website most frequently cited in Community Notes, the degree to which this is the case has increased over time. A year ago, the proportion of Community Notes citations that were links to X (which still used twitter.com as its primary domain at the time) was around 11%. This has more than doubled in the intervening months, to roughly 25%. User-generated content sites in general seem to be popular sources, as Instagram, Medium, and TikTok turn up prominently in addition to the aforementioned X, YouTube, and Wikipedia.
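To see how a share like this moves over time, citations can be bucketed by the month in which the note was created. The following is a minimal sketch that assumes each citation has already been reduced to a pair of a creation timestamp in milliseconds and a normalized domain; the function name and the input shape are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone


def monthly_share(citations, target="x.com"):
    """Fraction of citations per month that point at `target`.

    `citations` is an iterable of (created_at_millis, domain) pairs.
    Months are computed in UTC and returned in chronological order.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for millis, domain in citations:
        created = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
        month = created.strftime("%Y-%m")
        totals[month] += 1
        if domain == target:
            hits[month] += 1
    return {month: hits[month] / totals[month] for month in sorted(totals)}
```

Calling `monthly_share(pairs)` returns a dictionary keyed by strings such as "2024-07", which can be plotted directly to show whether the proportion of X links is rising.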
This trend is potentially messy from a fact-checking perspective, especially if it continues. One of the reasons that the Community Notes system is effective is that it doesn’t anoint any specific organization or website as a source of truth, instead relying on crowdsourcing and a clever algorithm to sort things out. If the community fact checks placed on X posts are increasingly sourced from other X posts, Community Notes runs the risk of becoming a walled garden of sorts, where both the notes and the sources cited are written mostly by X power users (the former is already the case). Among other problems, such a system would end up excluding perspectives and sources that don’t have a massive fanbase on X.