3:AM – Altmetrics research session
The first presentation, given on behalf of Waqas, who was unfortunately unable to make the conference, looked at some research that had been done on social phrases having an impact in altmetrics.
The project, a collaboration between several Irish institutions, set out to create an advanced analytics platform for altmetrics: SOPHIA. The platform would capture mentions of scientific interest (news, blogs, government policies) and enable better comparison of those with current measures such as views, discussions, saves, and cites.
The team looked at data on avian flu and HPV vaccine – and quickly found that the majority of grey literature documents are not supported by scientific references (there were very few direct citations or the research was misrepresented). They then extended their methods beyond manual analysis (via text mining) to try to discover useful linguistic patterns that could be detected automatically, to help identify attention to an item: quotes, references to scientists and institutions. This was based on trigger phrases e.g. a scientist’s name, their organisation, or a phrase such as ‘according to…’.
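As a rough illustration of the trigger-phrase idea, here is a minimal Python sketch. It is not the SOPHIA implementation: apart from ‘according to’, the phrase list is a hypothetical stand-in, and the sentence splitting is deliberately naive.

```python
import re

# Hypothetical trigger phrases. Only 'according to' comes from the talk;
# the others are assumptions added for illustration.
TRIGGER_PATTERNS = [
    r"\baccording to\b",
    r"\bresearchers (?:at|from)\b",
    r"\ba study (?:by|from|published in)\b",
]
TRIGGER_RE = re.compile("|".join(TRIGGER_PATTERNS), re.IGNORECASE)


def find_trigger_sentences(text):
    """Return the sentences in `text` that contain at least one trigger phrase."""
    # Naive split: break after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if TRIGGER_RE.search(s)]


if __name__ == "__main__":
    sample = (
        "According to Dr Smith, the vaccine is safe. "
        "The weather was nice. "
        "A study by Stanford researchers found otherwise."
    )
    for sentence in find_trigger_sentences(sample):
        print(sentence)
```

A real system would also need to match scientists' names and organisations, for instance against a list of known entities, which this sketch leaves out.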
In the initial experiment, which used a demo system built in collaboration with Elsevier, amongst others, research from Stanford showed the best results and the fewest false positives. Work is ongoing to improve the accuracy of this automation, and the team is beginning to look at measuring the influence of scientists using methods such as Katz centrality and log-based weighting.
A big part of the output is the “network visualisations” and the ‘entity mention network’; a word cloud enables you to dig down into themes.
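For readers unfamiliar with the methods named here, the following Python sketch shows one way Katz centrality could be combined with a log-based edge weight. The talk did not describe the actual formulation, so the graph layout (sources pointing at the scientists they mention), the log(1 + count) weighting, and the alpha and beta values are all assumptions.

```python
import math


def katz_centrality(edges, alpha=0.1, beta=1.0, iterations=100, tol=1e-9):
    """Iteratively compute Katz centrality on a weighted directed graph.

    `edges` maps (source, target) -> weight. Each node starts from a base
    score `beta` and receives `alpha * weight * score(source)` from every
    incoming edge. Convergence needs `alpha` to be small relative to the
    graph's edge weights.
    """
    nodes = sorted({node for edge in edges for node in edge})
    if not nodes:
        return {}
    scores = {node: 0.0 for node in nodes}
    for _ in range(iterations):
        updated = {node: beta for node in nodes}
        for (source, target), weight in edges.items():
            updated[target] += alpha * weight * scores[source]
        delta = max(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if delta < tol:
            break
    return scores


if __name__ == "__main__":
    # Hypothetical mention counts: (mentioning source, scientist) -> count.
    mention_counts = {
        ("news_a", "scientist_x"): 9,
        ("blog_b", "scientist_x"): 3,
        ("blog_b", "scientist_y"): 1,
    }
    # Assumed log-based weighting: dampens the effect of very large counts.
    weighted = {edge: math.log1p(count) for edge, count in mention_counts.items()}
    for node, score in katz_centrality(weighted).items():
        print(node, round(score, 4))
```

The log weighting means that a source mentioning a scientist nine times counts for more than one mentioning them once, but not nine times more.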
The next presentation covered research conducted by Feresteh Didegah on “Co-read, co-tweet and co-citation networks”. The study focussed on exploring relationships between citations, Mendeley readers, and tweets at the network level.
Based on a set of 1.1 million articles from Web of Science (WoS), the researchers looked at the citations from WoS, Mendeley stats, and Twitter data provided by Altmetric.com.
They found that 87.9% of articles had at least one citation, 84.7% had at least one Mendeley reader, and 20.3% had tweets.
The citation map shows a dense network – influenced by publishing and peer-review process restrictions, which make those communities close-knit and often cyclical.
The co-read network (based on Mendeley data) is much broader – readers have no restrictions for adding articles to their library and therefore the mix is more varied. There were a few clusters where journals show strong connections, which align with the typical reading habits of a subject specialist.
The Twitter data showed a dense cluster of medical/biomedical journals; in fact 63.5% of co-tweeted articles are from the same journals. 71% of co-tweeted pairs were from the same subject field (compared to co-read subject categories – where just 36.8% of co-read pairs are from the same field).
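To make the "co-tweeted pairs" figures concrete, here is a small Python sketch of how such a share might be computed. It assumes that two articles are co-tweeted when the same account tweets both, and that each unique pair is counted once; the study may have defined or weighted pairs differently, and the data below is invented.

```python
from itertools import combinations


def co_occurrence_pairs(items_by_actor):
    """Return the set of unordered item pairs that share at least one actor."""
    pairs = set()
    for items in items_by_actor.values():
        # Sorting makes (a, b) and (b, a) collapse into one canonical pair.
        for a, b in combinations(sorted(set(items)), 2):
            pairs.add((a, b))
    return pairs


def same_attribute_share(pairs, attribute_of):
    """Fraction of pairs whose two items share the same attribute value."""
    if not pairs:
        return 0.0
    same = sum(1 for a, b in pairs if attribute_of[a] == attribute_of[b])
    return same / len(pairs)


if __name__ == "__main__":
    # Invented example: which articles each Twitter account tweeted.
    tweeted = {
        "user_1": ["paper_1", "paper_2"],
        "user_2": ["paper_2", "paper_3"],
        "user_3": ["paper_1", "paper_2", "paper_3"],
    }
    journal_of = {"paper_1": "Journal A", "paper_2": "Journal A", "paper_3": "Journal B"}
    pairs = co_occurrence_pairs(tweeted)
    print(f"{same_attribute_share(pairs, journal_of):.1%} of pairs share a journal")
```

The same two functions would work for co-read pairs by swapping in Mendeley libraries for Twitter accounts, and for subject fields by swapping the attribute mapping.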
The scope of the co-read and co-tweeted research outputs allowed second-order papers to be identified; that is, it allowed one to build a network map of closely related papers via authors, co-tweets, readership and subjects. In the investigative UI (a software demo), this meant that, starting from a particular author of a paper, we could discover similar authors or papers, as well as closely knit subject groups, institutions and journals. There were two clear learnings from this investigation: co-citations of papers are journal independent; however, high-impact co-citations tend to be clustered by “impact”. The reasons for this were not investigated further.
At the end of the presentation Feresteh outlined some key conclusions from the study:
- The density and clustering of the 3 groups differed a lot
- Citation and Mendeley networks are broader (Twitter is influenced by journals tweeting their own content etc)
- OA journals more visible amongst the Twitter networks, whilst high impact journals more visible in citation and readership networks
Rodrigo Costas presented next, on “identifying Twitter user communities in the context of altmetrics”. Rodrigo found that 20% of the set of recent journal papers he looked at were shared on Twitter, and estimated that 10-15% of researchers use Twitter for work.
That said, <3% of researchers’ tweets contain links to papers. Digging deeper, Rodrigo looked at two things: engagement (how much the tweet text differed from the title of the article) and exposure (how many followers the tweeter had). In some in-depth analysis of Twitter profiles they identified 3 ‘areas’ that often appear in the personal bio entries: personal (role), topics and collectives (e.g. health, education) and academic (e.g. university, phd).
From there it was possible to explore trends amongst the different groups. “Broadcasts” are people who demonstrate high exposure but low engagement. Their more common ‘terms’ highlight science and research or an organisational focus (listing their affiliation, for example), and they tend to focus on ‘news’, making announcements.
“Orators”, on the other hand, typically have low exposure but high engagement. Their terms tend to indicate scientists and students, and they are more likely to share personal preferences.
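The two dimensions could be sketched in Python as below. This is not Rodrigo's method: measuring engagement as one minus a `difflib` similarity ratio, and the follower and engagement cut-offs, are assumptions made purely for illustration. Only the two groups named in the talk are labelled; everything else falls into "other".

```python
from difflib import SequenceMatcher


def engagement(tweet_text, article_title):
    """Score from 0.0 (tweet repeats the title) to 1.0 (entirely different text)."""
    similarity = SequenceMatcher(
        None, tweet_text.lower(), article_title.lower()
    ).ratio()
    return 1.0 - similarity


def classify(followers, mean_engagement, follower_cutoff=1000, engagement_cutoff=0.5):
    """Assign a group label from exposure (followers) and mean engagement.

    Both cut-offs are hypothetical defaults, not values from the study.
    """
    high_exposure = followers >= follower_cutoff
    high_engagement = mean_engagement >= engagement_cutoff
    if high_exposure and not high_engagement:
        return "Broadcasts"
    if high_engagement and not high_exposure:
        return "Orators"
    return "other"


if __name__ == "__main__":
    title = "Co-read, co-tweet and co-citation networks"
    print(engagement(title, title))  # a tweet that only repeats the title
    print(classify(followers=5000, mean_engagement=0.1))
    print(classify(followers=50, mean_engagement=0.9))
```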
Rodrigo and his team are keen to answer the following questions: “Who tweets about science? Which user groups can be identified? To what extent are Twitter accounts automated?”. They used the following codebook to help categorise Tweeters:
They found that 68% of Twitter accounts studied belonged to individuals, and 21% were organisations (the rest undefined). Their analysis determined that 45% were not automated, 5% were mixed, and 8% were completely automated.
Rodrigo finished with the following conclusions:
Last to present in the session was Alenka from TU Delft. Alenka began with an introduction to altmetrics in Denmark and how they fit into the bigger research ecosystem there. She then went on to detail the work done in the AIDA project. The first phase of the project involved pulling 900 DOIs for research published between 2003 and 2016 by the bio tech and nano bio department at TU Delft.
Alenka and her team then combined data from Altmetric.com, Web of Science and Pure for each of those articles, and mapped how the Altmetric Attention Score compared to times cited. The resulting maps showed that the articles with the most citations were not in the subject areas with the most attention.
The process was then repeated with 280 DOIs from another department to enable some comparison across different subject areas.
The goal of the project, and any future work, is to establish how to best combine traditional bibliometric and altmetrics data in a way that will be meaningful and useful for faculty.
The project and a toolbox that documents their data collection, analysis and communication can be found at aida.tudelft.nl.