What do Wikipedia Citation Counts Mean?

 

This is a guest post contributed by Mike Thelwall and Kayvan Kousha, members of the Statistical Cybermetrics Research Group at the University of Wolverhampton.

Some indicators are relatively easy to interpret. Citation counts, for example, are evidence of academic impact because they mean that a paper has been used in some way by other researchers to support their work. More transparently, syllabus citations (Kousha & Thelwall, 2008, in press-a), which are citations from online academic syllabi, are evidence of educational impact because they mean that works have been recommended by instructors for students to read. On the other hand, tweet citations can be created by anyone (although they seem to be mainly created by academics: Thelwall, Tsou, Weingart, Holmberg, & Haustein, 2013) and their purpose might be publicity, conversation or the dissemination of information. So what do they mean? This is an important question to answer when evaluating altmetrics (Sud & Thelwall, 2014).

Here we focus on a new altmetric, Wikipedia citations, and ask how Wikipedia citation counts should be interpreted.

What is a Wikipedia citation? It is a mention of an academic publication, such as a journal article, conference paper, monograph or book chapter, within a Wikipedia page. The encyclopedia encourages its authors to cite “reliable, third-party, published sources with a reputation for fact-checking and accuracy” and these sources might include newspapers, web pages and scholarly publications (https://en.wikipedia.org/wiki/Wikipedia:Identifying_reliable_sources).

How can Wikipedia citation counts be calculated? The simplest way to count how often an academic publication has been cited in Wikipedia is to run a site-specific search for it. Consider the classic article: Hirschhorn, Norbert (1968). “Decrease in net stool output in cholera during intestinal perfusion with glucose-containing solutions”. New Eng. J. Med. 279: 176–181. All Wikipedia pages are hosted on the website wikipedia.org, so Google or Bing can be queried for the article title as a phrase search (in fact, the first few words are enough), as long as the results are restricted to wikipedia.org with the site: advanced search command. So, an effective query would be:

“Decrease in net stool output in cholera during intestinal perfusion” site:wikipedia.org

At the time of writing, this query gave four correct results in Google. In the web version of Bing it gave 7,310 results, almost all of which were incorrect, although the four correct results were within the first five. From this simple example, it seems that Google is reliable but the results from the web version of Bing would need manual checking. It is nevertheless possible to gather Wikipedia citations automatically from web searches using the Bing API via Webometric Analyst (Kousha & Thelwall, in press-b) because, luckily, the Bing API gives reliable results for phrase search queries.
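The query construction and the manual check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by Webometric Analyst: the function names, the `max_words` cut-off and the `(url, page_text)` result format are all hypothetical, and no search engine is actually called.

```python
from urllib.parse import urlparse


def build_wikipedia_query(title, max_words=10, extra_terms=(), site="wikipedia.org"):
    """Build a site-restricted phrase query from the first words of a title.

    max_words and extra_terms are illustrative parameters: the first lets a
    long title be shortened, the second lets a term such as the author's
    last name be added when the title is short.
    """
    phrase = " ".join(title.split()[:max_words])
    return " ".join([f'"{phrase}"', *extra_terms, f"site:{site}"])


def is_wikipedia_url(url, site="wikipedia.org"):
    """True if the URL's host is the site itself or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return host == site or host.endswith("." + site)


def count_citing_pages(results, phrase, site="wikipedia.org"):
    """Count distinct Wikipedia pages whose text contains the phrase.

    results is assumed to be an iterable of (url, page_text) pairs obtained
    by some other means; this mimics the manual check of search results.
    """
    needle = " ".join(phrase.lower().split())
    citing = set()
    for url, text in results:
        haystack = " ".join(text.lower().split())
        if is_wikipedia_url(url, site) and needle in haystack:
            citing.add(url)
    return len(citing)


title = ("Decrease in net stool output in cholera during intestinal "
         "perfusion with glucose-containing solutions")
print(build_wikipedia_query(title))
# "Decrease in net stool output in cholera during intestinal perfusion" site:wikipedia.org
```

Normalising whitespace and case before matching is a design choice to make the check tolerant of formatting differences between pages; a stricter or looser match may suit other titles better.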

How should Wikipedia citation counts be interpreted? The answer might depend on who wrote the citing Wikipedia pages, who reads the pages, the purpose or topic of the pages, and what the typical reader does with the information. For the Hirschhorn paper, two of the Wikipedia pages are biographies (English and German), and two are about the diarrhoea cure that his research helped to invent (WHO-Trinklösung is a name for it in German). The last two are evidence of the health impacts of his research, and the first two are also indirect evidence of this because he has a biography page due to the health impact of his work. From the English Wikipedia page, “He was one of the inventors and developers of the life-saving method called oral rehydration therapy for adults and children suffering fluid loss from cholera and other infectious diarrheal illnesses. It is estimated that his work has saved around 50 million people suffering from dehydration.”

Clearly, however, not all Wikipedia citations reflect health impacts. The following search for a well-known book gets 1,890 hits in Google, apparently all correct (the author’s last name has been added to the query to reduce the number of false matches because the book has a short title).

“Das Kapital” Marx site:wikipedia.org

Scanning the results, there is no evidence of health impact, but the book has clearly had a huge impact on politics and economics, so its Wikipedia citation count could perhaps be classed as political, economic or societal impact.

On the other hand, what about: Annalisa Di Liddo (2005) “Transcending Comics: Crossing the Boundaries of the Medium in Alan Moore and Eddie Campbell’s Snakes and Ladders”. International Journal of Comic Art. 7 (1). 530–545. It has two hits in Google:

“Transcending Comics: Crossing the Boundaries of the Medium” site:wikipedia.org

These Wikipedia articles probably reflect the cultural impact of Di Liddo’s paper about performance art rather than its health, political or economic impact. A similar example is “Pierce, D. (2007). Forgotten faces: Why some of our cinema heritage is part of the public domain. Film History: An International Journal, 19(2), 125-143.” with over 50 Wikipedia citations but only three citations indexed by Scopus, all from books. Most of the citing Wikipedia pages are about film history, so it also seems to have had some kind of cultural impact.

Overall, perhaps Wikipedia citations could be captured with a very general term, such as societal impact? Or could a more encyclopedia-specific term be used, such as informational impact? Is there a better term? And are these descriptions accurate (in general – of course there will be exceptions) and are they meaningful enough to be helpful for those using Wikipedia citations in research evaluations? These questions do not seem to have simple answers.

 

References

Kousha, K. & Thelwall, M. (2008). Assessing the impact of disciplinary research on teaching: An automatic analysis of online syllabuses. Journal of the American Society for Information Science and Technology, 59(13), 2060-2069.

Kousha, K. & Thelwall, M. (in press-a). An automatic method for assessing the teaching impact of books from online academic syllabi. Journal of the Association for Information Science and Technology. doi:10.1002/asi.23542

Kousha, K. & Thelwall, M. (in press-b). Are Wikipedia citations important evidence of the impact of scholarly articles and books? Journal of the Association for Information Science and Technology. doi:10.1002/asi.23694

Sud, P. & Thelwall, M. (2014). Evaluating altmetrics. Scientometrics, 98(2), 1131-1143. doi:10.1007/s11192-013-1117-2

Thelwall, M., Tsou, A., Weingart, S., Holmberg, K., & Haustein, S. (2013). Tweeting links to academic articles. Cybermetrics, 17(1). http://cybermetrics.cindoc.csic.es/articles/v17i1p1.html
