Research Information Management Archives - Digital Science
https://www.digital-science.com/tags/research-information-management/

A New ‘Research Data Mechanics’ and its Implications for Research Information Citizenship
https://www.digital-science.com/blog/2016/08/new-research-data-mechanics-implications-research-information-citizenship/
Wed, 10 Aug 2016


As someone who has worked in research administration for over a decade, I have spent a lot of time thinking about how research metadata is generated, how it joins together, and how it can be recombined to create new opportunities for researchers and institutions (and now publishers).

For me, research metadata forms the information network that connects researchers, research institutions, funders, publishers, and research service providers together. How well information flows across this community has direct implications for how well research activity can be supported and, fundamentally, for how efficiently we can move research forward.

In the latest Digital Science White Paper, I highlight six recent advances in research infrastructure and seek to recast how we think about metadata – not as a series of static records, but as objects that move between systems and organizations.

By understanding the mechanics of how this occurs, we come closer to understanding our roles in the system, not just as consumers of information, but as research information citizens with responsibilities to the information with which we interact.

I feel it is time to conceive of a new Research Data Mechanics that brings to the fore the ways in which information travels through systems and, in the process, to create a template for a more efficient research cycle.

The paper illustrates the ideas of Research Data Mechanics by examining six recent advances in research infrastructure:

  1. The increasing availability of publication information to research institutions.
  2. The transformative effect of ORCID.
  3. The disentanglement of system silos from research workflows.
  4. The connection of collaborative environments into the research ecosystem.
  5. The expanding network of research particles to cover research grants.
  6. The rise of organizational context: an increasing shift from internal to externally linked identifiers.

Daniel Hook, Managing Director, said:

“Research information management is a field that’s developing quickly, and with so many initiatives and standards emerging it is especially important to step back and consider the fundamentals. We believe that this paper heralds the coming of age of research information. Simon uses his extensive practical experience to set out a clear and accessible vision of how to think about information flow and how to use that understanding to benefit researchers and institutions alike. This paper establishes Digital Science’s philosophy for how data should flow in the research environment.”

What next – some questions for the community

One of the key ideas to come out of the paper is the idea of research information citizenship.

  • When data communication is placed on an equal footing with data consumption, how does this change our practices?
  • What are our responsibilities to the stewardship of research metadata as it moves across our systems?
  • Do research institutions have a responsibility to communicate openly how research grants have been subcontracted?
  • How can we better manage our administrative processes so that the information we manage is also the information we publish?
  • Is it acceptable anymore for any process that requests a researcher’s publications not to begin with a request for an ORCID iD?

What do you think? Let’s have the conversation using #researchdatamechanics

Research Evaluation’s Gender Problem – and Some Suggestions for Fixing It #STMchallenges
https://www.digital-science.com/blog/2016/06/research-evaluations-gender-problem-suggestions-fixing-stmchallenges/
Tue, 07 Jun 2016

Stacy Konkiel is a Research Metrics Consultant at Altmetric, a data science company that helps researchers discover the attention their work receives online. Since 2008, she has worked at the intersection of Open Science, research impact metrics, and academic library services with teams at Impactstory, Indiana University & PLOS.

CC-BY WOCinTech Chat / Flickr

Research evaluation in the sciences has a gender problem: it’s mostly based on indicators and practices that reflect an unconscious bias against women.

This bias, when embedded into research evaluation systems, can harm female researchers as they apply for grant funding, jobs, promotion, and navigate other professional advancement opportunities.

Unfortunately, this problem is spreading to other disciplines, too. As the humanities and social sciences start to favor the use of quantitative metrics like citation counts for research evaluation, the problems of bias will follow.

Though more research is being done today than ever before on these problems, solutions have been less easy to come by. The literature suggests that education, support, and the use of context-aware metrics may help mitigate the gender gap, in both the short and long term.

The most used productivity indicator penalizes women researchers

Around the world, it’s common to use “number of papers published per year” as an indicator for a researcher’s productivity, especially in the sciences. Common sense and a lot of well-written editorials explain why that practice is misguided: it incentivizes writing over the act of doing research; it has led to the practice of “salami slicing” (whereby authors split up one paper’s worth of insight over the course of several publications); and it is an invalid measure for the ultimate desired outcome: doing a lot of high-quality research.

The practice of understanding productivity simply by counting one’s number of publications per year is a flawed metric for another reason: it penalizes women researchers.

Research has shown that female researchers tend to publish less than their male counterparts in several scientific fields, especially early in their careers, and even in fields where publishing patterns are seemingly equal, men hold more of the prestigious first and last author positions (one study found that male “first authorships” outnumbered female ones by nearly 2:1). The latter point is significant because some disciplinary evaluation practices specifically take “first/last author” publications into account when determining productivity.

Perhaps the most sobering statement from the literature can be found in a recent study: “Fewer than 6% of countries represented in the Web of Science come close to achieving gender parity in terms of papers published.”

One theory for this discrepancy – that women are less productive because they bear the brunt of domestic responsibilities, including childcare – has been mostly discounted. Another theory – traditional gender roles within marriage require more domestic labor of women and therefore less time to publish – has also been ruled out.

What seems a more likely culprit is gendered differences in how research is done, along with a healthy dose of institutionalized sexism. Women tend to collaborate less than their male counterparts (especially less often internationally), have different collaboration strategies than their male counterparts, have been found to have less access to research funding, and are on the receiving end of harsher criticism from hiring panels and grant reviewers, simply for being female! All of these variables have a bearing on a woman’s ability to “be productive” by publishing, making publication counts an inherently biased metric.

Citation counts aren’t sexist, but citation practices can be

Citation counts are another popular means by which researchers are often evaluated, especially in the sciences. These indicators, much like publication counts, disadvantage female researchers. That’s because researchers cite their female colleagues less often than their male colleagues.

One of the largest ever studies on citations and gender uncovered the following fact:

“We discovered that when a woman was in any of these [prominent authorship, i.e. sole authorship, first-authorship and last-authorship] roles, a paper attracted fewer citations than in cases in which a man was in one of these roles…The gender disparity holds for national and international collaborations.”

Other studies have found that, no matter the authorship position of a female researcher, she is less likely to be cited than her male counterparts.

In fields where a researcher is judged upon the size of their h-index, this disparity has obvious consequences for women, who are more likely to have lower h-indices, due to both citation bias and publication frequency differences.
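For readers less familiar with the metric, here is a minimal sketch (my own illustration, not any official implementation) of how an h-index is computed from per-paper citation counts – and why systematically lower citation counts feed straight through into a lower value. The two citation lists are invented.

```python
# Minimal sketch: compute an h-index from per-paper citation counts.
# The h-index is the largest h such that h papers have at least h citations,
# so fewer citations per paper (as described above) directly lowers it.
def h_index(citations):
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical researchers with similar output but a citation gap:
print(h_index([30, 22, 15, 10, 8, 6, 3]))  # -> 6
print(h_index([21, 15, 10, 7, 5, 4, 2]))   # -> 5
```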

Yet citations themselves aren’t inherently biased – they’re simply an indicator of one’s influence in a discipline. Citations merely reflect our own unconscious biases. As one editorial puts it:

“[T]o say that existing measures are not gender-biased is not the same as saying that there may not be structural gender-based biases in the larger environment in which scientists send and receive messages about research findings which, in turn, affects interpretation of the meaning of these measures.”

Even with the increased attention being paid to these citation discrepancies, many institutions still use raw citation counts and h-indices in their evaluation practices.

Clearly, the indicators we use to understand academic impact are faulty. What about indicators for understanding non-academic impact, like patents and altmetrics?

Innovation indicators penalize women, too

The number of patents a researcher files – understood to be an indicator of one’s impact upon technology and the economy – is yet another research evaluation metric that doesn’t adequately represent women’s contributions.

Studies have repeatedly shown that women researchers receive patents at lower rates than their male counterparts. One study found that though women represent about a third of the workforce in disciplines where patenting is common, they only account for around 11% of all patent holders.

Much like the “productivity puzzle”, the lack of gender equity in patenting practices is likely due to a number of social phenomena. It has been suggested that female researchers’ less diverse social networks, the lack of support for underrepresented faculty from university technology transfer offices, one’s motherhood status, and the designation of patenting as an “optional” career advancement activity may all play a role in this disparity.

Though an improvement, altmetrics are still biased

Altmetrics improve upon citations in many ways: we can use these discussions of research on the social web to understand non-academic impacts; they’re quicker to accumulate; and they apply to research other than journal articles (datasets, software, presentations, and other outputs, as well).

However, altmetrics research has shown that articles penned by female authors tend to receive less attention on social media, blogs and in the news than those of male authors, making raw altmetrics counts – much like citations – a problematic means by which to evaluate researchers. It is worth noting, though, that the gender gap for altmetrics is less severe than the gender gap for citations, so altmetrics are still an improvement upon citations.

So, what can we do to improve research evaluation?

CC-BY WOCinTech Chat / Flickr

Many traditionalists would be the first to posit that we need to do away with using impact and productivity metrics altogether and simply use peer review if we wish to rid ourselves of biased forms of research evaluation. But peer review isn’t necessarily the answer.

Peer review has been shown time and time again to be subject to gender bias. Even Sweden, that Nordic bastion of gender equality, has had a problem with gender-based bias affecting peer review in its national funding council!

Instead, there are a number of other tactics that academia might implement to combat the gender bias that’s manifested in various research evaluation metrics. These upstream changes could potentially alter citation practices, offer more support for female researchers, and encourage other improvements to gender equity that would render some impact metrics more reliable.

A solid first step would be to raise awareness among academics worldwide of implicit bias with respect to gender. Implicit bias is the unconscious assumption that people of certain genders, races, etc. act in certain ways or, sometimes, that they are less capable than others. Certainly, it’s implicit bias (along with a healthy dose of the Matthew Effect) rather than overt sexism that subtly nudges researchers away from citing or sharing women-authored articles.

Luckily, being made aware of one’s own implicit bias has been reported to be the first step towards correcting it. Knowing that you don’t cite female authors as often as you do male authors may make you more likely to seek out and cite high-quality research from women.

Support programs for women in academia are another important step. To encourage female researchers to collaborate – and thus, author – more often, bibliometrics researchers have suggested that “programmes fostering international collaboration for female researchers might help to level the playing field.” Similarly, it has been suggested that technology transfer offices have a role in eliminating the patent gender gap by offering women-oriented programs and support.

Until these upstream changes take effect, what can be done to ensure that women are being evaluated fairly using impact metrics?

To be frank, research evaluation programs should incorporate gender-aware indicators to correct for current disparities. One team of researchers has suggested the use of gender-sensitive metrics to replace the h-index. This theme was echoed by a prominent altmetrics researcher, as well, who once suggested to me that Altmetric should provide “gender percentiles” based on lead authors on our details pages, alongside other “Score in Context” percentiles. (It’s something that certainly got support at Altmetric, though we have no firm plans to add that feature soon.) In theory, gender-aware percentiles could be built into citation databases like Scopus and Web of Science, too.
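As a rough sketch of the “gender percentile” idea, the snippet below ranks an article’s attention score only against articles whose lead author shares the same gender, rather than against the corpus as a whole. The cohort scores and names are hypothetical; this is not an Altmetric feature or API.

```python
# Hedged sketch of a gender-aware percentile: rank a score within a
# cohort of articles led by authors of the same gender, instead of
# against the whole corpus. All numbers below are invented.
from bisect import bisect_left

def percentile_within_cohort(score, cohort_scores):
    """Percentage of cohort scores strictly below the given score."""
    ranked = sorted(cohort_scores)
    return 100.0 * bisect_left(ranked, score) / len(ranked)

scores_women_led = [1, 2, 2, 3, 5, 8, 13, 40]   # hypothetical attention scores
scores_men_led   = [1, 3, 4, 6, 9, 15, 22, 60]  # hypothetical attention scores

article_score = 8
print(percentile_within_cohort(article_score, scores_women_led))  # 62.5
print(percentile_within_cohort(article_score, scores_men_led))    # 50.0
```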

Do you have suggestions for how we might address the structural inequities in research evaluation? Leave them in the comments below or share them with me on Twitter (@skonkiel).

Impact and Engagement: Thinking Beyond Assessment
https://www.digital-science.com/blog/2016/04/impact-engagement-thinking-beyond-assessment/
Mon, 25 Apr 2016

Australia is currently embarking on a process to define a methodology for assessing the impact and engagement of its research. This process features two research and engagement working groups, a performance and incentives working group and a technical working group, of which Jonathan Adams, Digital Science’s Chief Scientist, is a member. In light of this process, it is useful to reflect on the effect that assessment exercises have on a research institution’s internal information management practices.

Within Australia, yearly government publication reporting exercises have partly been responsible for the sector having one of the most internationally advanced approaches to publications reporting. As early adopters of advanced publications management systems (such as Symplectic Elements), Australian institutions have leveraged the information that they have collected to deliver more than just government reporting. The wide command of an institution’s publications has allowed Australian universities to deploy this information for sophisticated internal reporting, as well as to create systematic, up-to-date public profiles for all of their researchers. Publications reporting is now so embedded in Australian practice that reporting to government could be considered a byproduct of information a university collects for its own purposes.

Although the Australian Research Council’s (ARC) approach to measuring impact and engagement is yet to be defined and refined, it is not inconceivable that at least part of the evidence-based information that it requires will involve some new information that needs to be collected. Following the template that publications reporting has established, we should also ask: what additional value does this information add to an institution? In what other ways could it be incorporated into the institution? Without pre-empting what may or may not be included in the ARC’s new assessment process for impact and engagement, we can get some feel for these questions by considering the additional uses that information collected in Symplectic’s new Impact Module can provide, as well as by looking at the opportunities that systematic reporting on altmetrics (as one measure of engagement) can provide.

Exploring the multiple uses of impact data

At the end of 2015, Symplectic introduced an Impact Module to their Elements product. This module helps researchers and administrators capture a narrative of emerging evidence of research impact. The evidence is recorded in a structured way, with links to research activity including grants and publications. Created, at least in part, as an acknowledgment of the enormous effort the UK sector invested in retroactively creating research impact narratives, Symplectic Elements’ Impact Module aims to make it easier to record evidence of impact as it happens. Making impact easy to record is only one part of the equation, however, as the prospect of periodic assessment is not enough to motivate most researchers to keep their impact profile up to date. To this end, Symplectic is committed to working with the community to establish how impact narratives can be reused. Examples of reuse might include publishing appropriately reviewed impact summaries on public profile pages, supporting internal requests for stories on research, completing impact statements as part of grants reporting, and incorporating impact reporting into annual performance reviews.

Exploring multiple uses of engagement data

On the question of how a research institution can compare and benchmark its engagement profile, one approach might be to leverage the daily feed of attention data on publications that is available as part of the Altmetric Explorer for Institutions. Through the use of a publication disambiguation service, or integration with an up-to-date institutional publications collection, it is possible to create an aggregate daily attention measure that can be used to compare institutions.

Rolling Weekly Altmetric Activity Sep 15 – Mar 16

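As a minimal sketch of that kind of roll-up, assuming a hypothetical daily feed of mention counts per institution (the column names and figures below are invented and this is not the Altmetric Explorer API):

```python
# Roll a (hypothetical) daily attention feed up into weekly totals per
# institution, so that institutions can be compared on the same footing.
import pandas as pd

daily = pd.DataFrame({
    "date": pd.to_datetime([
        "2016-03-01", "2016-03-02", "2016-03-08",
        "2016-03-01", "2016-03-02", "2016-03-08",
    ]),
    "institution": ["Uni A", "Uni A", "Uni A", "Uni B", "Uni B", "Uni B"],
    "mentions": [12, 30, 7, 5, 9, 22],
})

weekly = (
    daily.groupby(["institution", pd.Grouper(key="date", freq="W")])["mentions"]
         .sum()
)
print(weekly)
```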

Investing time and resources in new ways to measure engagement requires justification, however, especially when the methodology is still being established. Again, by looking outside of institutional and government reporting, gaining command over an institution’s engagement profile may provide additional benefits that justify the investment. The same techniques used to profile engagement can also be used to identify research that is being talked about ‘right now’. Armed with this information, universities have a new ability to disseminate ‘research of interest’ communications to alumni and prospective students.

A template for considering impact and engagement

Although only lightly treated in this post, these considerations provide a template for taking a holistic approach to measuring impact and engagement. In 2016, the Australian government ceased its annual publication collection. To the best of my knowledge, there isn’t an Australian institution that has used the lack of government mandate to cease university-wide publication collection, such is the value of the information. This is perhaps a key indicator of an effective evaluation exercise. When this next Impact and Engagement exercise is eventually retired, it can be hoped that it leaves the research sector similarly enhanced.

Who’s Afraid of Preprints? Looking at the Origin and Motivation Behind arXiv for Clues as to Why It’s so Successful
https://www.digital-science.com/blog/2016/03/whos-afraid-preprints-looking-origin-motivation-behind-arxiv-clues-successful/
Fri, 04 Mar 2016

A couple of weeks ago, I explored a theme that emerged from the recent Researcher to Reader conference in London. Specifically, I asked the question as to whether we should separate out the roles of dissemination and accreditation in scholarly publishing. In a sense, the question is at the heart of the open science movement, as advocates seek to find new and faster ways of communicating research outputs. The danger is that by focusing on the dissemination aspect of scholarly communication, we run the risk of ignoring accreditation, or rather, the quality control mechanisms that enable it.

Last week, an article by Ewen Callaway and Kendall Powell in Nature News discussed the ASAPBio conference, which is dedicated to finding a way to make preprints do for biology what they’ve done for other disciplines. The article talks about bioRxiv, the life science preprint server which was founded at Cold Spring Harbor in 2013 and modelled after arXiv.

While bioRxiv has been growing steadily since its foundation, with around 200 submissions per month, it has a long way to go to catch up with arXiv, which boasts almost 9,000 per month, with a total of over a million articles so far. What accounts for this difference? Is it just the fact that bioRxiv is the new kid on the block or is there something more at work?

One key difference between the two projects is the communities that they serve. As this article by Jocelyn Kaiser in Science Magazine pointed out, critics claim that there are cultural differences between biology, for example, and physics. I’d actually go a little further and say that arXiv was designed to automate a process that was already going on in fields like high energy physics (HEP). It therefore owes its inception to a community that it didn’t have to be sold to in the way that bioRxiv needs to be. This article by Dawn Levy, Stanford University’s news service science writer, outlines the way in which HEP physicists in particular have been distributing articles among themselves prior to publication since the 1960s, in order to get feedback and to communicate more rapidly. As Heath O’Connell, HEP database manager for the Stanford Linear Accelerator Center (SLAC), said at the time:

“The physics community had a really rapid adoption of this because in a sense it was just an evolutionary process rather than a revolutionary one.”

So the arXiv wasn’t initially an open access effort; it was about dissemination but also partly about quality control. This culture is in stark contrast with modern biology where, in many cases, the results of work are kept secret for fear that somebody will steal the idea, rush out an article and get that high impact prize.

This raises the question of why the difference exists. How can somebody working in big physics get away with showing their data to all their competitors a couple of years before it is recorded in the citable version of record, when biologists can’t? Is it partly because in fields like HEP, or plasma physics, the instruments (e.g. the Joint European Torus, NIF or CERN) are unique, making it pretty obvious where the data came from? I don’t think that’s the answer, because computer scientists and even economists make use of preprints. Is it because the communities are smaller? Perhaps, but I certainly know of people who have a reputation for sharp practice and scooping in the life sciences. It doesn’t seem to hurt their careers, but their peers know who they are. Is it the excessive competition in the life sciences for money and position?

Maybe part of it is that the risk is overblown, or perhaps it’s more that when people build on work in the physical and computer sciences, it isn’t seen as a bad thing and it doesn’t harm the originator’s career. Whatever the reason, in order for preprints, or other forms of open-science to be truly successful in the life sciences, this is an issue that shouldn’t be ignored. The reasons why biologists are afraid of being scooped need to be identified and addressed either by convincing researchers that the rewards outweigh the risks, or changing the incentive structures so that they do.

Structured or Unstructured (Data): That is the Question
https://www.digital-science.com/blog/2016/01/structured-or-unstructured-data-that-is-the-question/
Wed, 13 Jan 2016

Not the best way to organise data

Over the past month or so, my colleague from Figshare, Mark Hahnel, and I have been working on an article for an upcoming special edition of Against the Grain guest edited by Andrew Wesolek, Head of Digital Scholarship at Clemson University, Dave Scherer, Assistant to the Dean at Carnegie Mellon, and Burton Callicott, Librarian at College of Charleston. The focus of the special edition is to explore the future of journal articles as the container for scholarship. That is, as the digital scholarship age unfolds, will academics continue to use articles as the primary way they communicate? Will the nature of journals change out of all recognition, or will they even be replaced? It promises to be an interesting read, so keep an eye out for it. Mark and I were tasked with writing a piece on data management and sharing. Obviously, this is a big topic, so we tried to cover as much ground as we could without being too cursory; I hope that we got the balance right.

In our article, we’ve explored the reasons why researchers are increasingly interested in data sharing, some of the barriers and challenges, and the relationship between traditional publishers and data repositories. One aspect that I found very interesting is the difference between structured and unstructured repositories and the arguments for and against their use. In the interests of full disclosure, Figshare does several things – a data management solution for institutions, and visualization, linking and data hosting solutions for journals – all of which are built on the foundation of an unstructured repository. Having written that, I’m about to argue that the choice between the two is a false dichotomy and that we need both.

What’s the difference?

Before listing the arguments, it makes sense to start by defining the terms. Put simply, structured repositories have more rules about the data that goes into them. Very often these types of repositories are intended to catalogue a specific type of data, with the aim of creating a super-data set that has been collectively gathered by many researchers in a certain field.

In many respects, this type of subject-specific, structured repository is related to the idea of industrial scale science that Timo Hannay, former Managing Director of Digital Science, has discussed in the past. Traditionally, science has existed as a sort of cottage industry, in which individual labs, headed by principal investigators, explore topics on their own terms. Today, disciplines like astronomy and the various -omics fields are pursuing a unified goal to answer larger questions than is possible through a single research project. This new model of industrial scale science inevitably requires standards for information interchange, and so it makes sense for a repository to enforce those standards. A good example of a structured repository is the NIH’s GenBank, which is part of the International Nucleotide Sequence Database Collaboration. The NIH has published an annotated example record, so you can see how carefully the data is curated.

The advantages to having a highly curated database with enforced formats and standards are obvious. Given the volume of data that collaborations such as these generate, the data must be machine readable. The better codified the data formats are, the easier it is to write a computer program to read them and therefore, the easier it is to make use of the data.

Unstructured repositories are very different. In this type of data solution, the format of data has no restrictions and is not necessarily curated. In a sense, these types of solutions consider data in different ways to their structured counterparts, arguably with a different implicit definition of what data is. There are many different definitions of data, but one idea is that it is any digital product of scholarly research. This could mean anything from a video recording of a ballet performance to a spreadsheet of numbers to a computer program.

Essentially, if structured repositories provide a place for data that is part of industrial scale scientific collaborations, unstructured repositories are where everything else goes.
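To make the contrast concrete, here is a toy sketch of the two deposit models; the schema and field names are invented for illustration and bear no relation to GenBank’s or Figshare’s actual interfaces.

```python
# Toy contrast: a structured repository enforces a fixed schema and
# rejects non-conforming records, while an unstructured one accepts any
# files plus free-form descriptive metadata.
REQUIRED_FIELDS = {"accession", "organism", "sequence"}  # invented schema

def deposit_structured(record: dict) -> dict:
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"Record rejected, missing fields: {sorted(missing)}")
    return record

def deposit_unstructured(files: list, metadata: dict) -> dict:
    # Anything goes: a video, a spreadsheet, a script, a figure...
    return {"files": files, "metadata": metadata}

deposit_structured({"accession": "X00001",
                    "organism": "E. coli",
                    "sequence": "ATGC..."})
deposit_unstructured(["ballet_performance.mp4"],
                     {"title": "Video recording of a ballet performance"})
```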

When Unstructured Becomes Structured

Data scientists and many publishers working in the field of data and data linking strongly advocate the use of structured repositories where appropriate. Nature’s Scientific Data, which Figshare has partnered with, is a good example. As Andrew Hufton, Managing Editor at Scientific Data, said to me a few months ago:

‘We would like authors to put their data in the most appropriate place for that data’.

Tellingly, however, the repository most used by authors is Figshare (~30%), with most authors using some form of unstructured repository, whether that be an institutional one or a third party (like Figshare or Dryad). This illustrates that most authors don’t have data that conforms to an existing standard.

Over time, more and more data standards (and associated structured repositories) are emerging. The Registry of Research Data Repositories and BioSharing both maintain lists of several hundred. As techniques mature and the need to create a standard becomes apparent, professional societies and other communities such as the Open Microscopy Project and the Research Data Alliance work to create standards, which then enable information and data to be stored with a consistent structure and reused more easily.

Where Will The Balance Lie?

Data sharing is still a growing area of scholarly communication. Over time, it is likely that more data types will become codified with appropriate structured repositories. On the other hand, it is the nature of academic endeavour that researchers do new things. It takes a long time for any new technique to be widely adopted as a gold standard and even longer for a standard to be agreed on. Very often, the data that researchers are gathering is of a unique type, specific to the work that they are doing. Nobody knows just how much unstructured data is generated in the countless labs and offices around the world. Much of this data sits on computers under desks or on Dropbox – undoubtedly, there’s a lot.

With funders and institutions increasingly asking for data to be available for review and reuse, it’s clear that all researchers need appropriate data sharing solutions and so both types of repositories are needed.

What’s so Bad About the Impact Factor – Part 3. What Did You All Say?
https://www.digital-science.com/blog/2015/10/whats-so-bad-about-the-impact-factor-part-3-what-did-you-all-say/
Wed, 21 Oct 2015

We asked you to join the conversation on #IFwhatswrong. Fortunately, some of you did.

Over the past couple of weeks, I’ve taken a brief look at Impact Factor (IF) and some of the criticisms that are levelled at it. As promised, I’m going to report on some of the feedback that we’ve received, particularly on the Twitter hashtag #IFwhatswrong.

The week before last, I focused on one of the statistical objections to IF, put simply – it’s an arithmetic mean, but should be a median. The mathematical argument appealed to Riccardo Sapienza, from King’s College, who tweeted about it.

As a physicist, Sapienza is familiar with statistics and how they work. I was interested when he implied that the correct metric ought to be the most frequent citation number (mode), rather than the median. I imagine that we could debate the best type of average to use, but as I argued in my first post in this series, I’m not convinced that it makes much difference. I don’t think that changing the way IF is calculated would do much good for research or researchers.

Marie McVeigh, who has had a long association with both JCR and Thomson Reuters, including working as Director of Content Selection for Web of Science, as well as being former Director of Product Research at Research Square, expressed some frustration with the question.

McVeigh makes an interesting point here. It’s remarkable that we’re still using IF in inappropriate ways despite decades of formal research and years of debate in academic and library circles about its overuse. It seems to me that the metric has become far too deeply embedded in the way that academics themselves think about the quality of research and that’s why we’re still discussing it.

Tibor Tscheke of Science Open tweeted

Tscheke’s point is key here, I believe. In fact, it was the issue that I settled on as the central issue with IF in my second post. The problem isn’t really how the evaluation is done but the way it is interpreted – as a proxy for the quality of the research contained within a journal when it’s really only a measure of the skill of the editorial team in selecting highly citable articles.

Vicky Williams contributed a number of tweets to the debate in which she set out a different objection, one which is fair to level at IF, but is also true of all citation based metrics including H-index and article based citation counting generally.

Williams raises a great question and a controversial point. If we talk about the types of metrics that might augment the IF and account for other forms of impact, a common objection is that alternative metrics don’t capture impact; they count attention. Are academic citations more worthy than other uses of research output?

Julia Shaw, senior lecturer and researcher in the Department of Law and Social Sciences at London South Bank University answered that question very directly.

I have to stress that I’m speaking entirely for myself when I say that I broadly agree with Shaw here. It seems odd to me to elevate a citation in a review article that does not contribute to the argument as being impact, while at the same time, calling a news article that sways public opinion merely attention. Whether a mention is impactful is obviously strongly related to the medium, but context is very important.

Williams went on to cite the Effects of participation in EU framework programmes for research and technological development – for researchers, institutions and private companies in Denmark report published by the Ministry of Higher Education and Science in Denmark. The report’s primary conclusion is that participation by researchers, institutions, and companies in Horizon 2020 was measurably beneficial for traditional academic impact and that companies believed it to be economically beneficial. The primary benefit for companies was to allow the funding of activities that would not otherwise have been implemented. Unfortunately, the authors are unable to show a statistically significant quantitative economic benefit. The report points out many of the challenges in measuring the economic impact of research, including the length of time it takes for it to be observable. The take-home message for me is that there is still a need to develop metrics that are both fast and predictive, but that’s a whole other blog post.

So, what do I take away from this exercise? 

Some people are sick of talking about IF and about metrics but I think part of the issue is that we’ve not solved the underlying problem of the evaluation gap. We’re still using the IF as a blunt proxy measure and have not succeeded in agreeing to change, despite the fact that most people seem to think that we need to.

IF lacks the resolution to measure what many people rely on it to measure – the quality of research. It’s a metric designed to evaluate the quality of subscription access journals for selection in a library collection. Back when that was all it was used for, it wasn’t a problem, but now that academics are under increasing pressure to write high impact articles, it has introduced a level of gamesmanship to academic authorship that I think doesn’t benefit science.

At the same time, IF is too narrow to measure impact. Research affects society in a breadth of ways. By limiting the type of impact that we measure to citations, we fail to get the full picture.

My last observation is about the use of the word impact. I don’t like the way it’s used and I’m not the only one. At the Frankfurt STM conference last week, Richard Gedye, STM’s Director of Outreach Programmes, spoke about Research4Life, which is the collective name for Hinari, Agora, OARE, and ARDI, which provide access to academic literature in the developing world. During the Q+A, Richard Padley of Semantico took to the roaming mic to thank Gedye for saying impact without saying factor afterwards.

The name Impact Factor was an excellent piece of branding. It seems to have changed the way we think about research. Many see academic citations as the only form of impact. If I could, I’d rename the IF to journal citation factor. Maybe then, we can think more freely about the benefits of research to society and the reasons why we communicate it.

So maybe that’s what’s so bad about the Impact Factor: the name.

What’s so Wrong with the Impact Factor? Part 2
https://www.digital-science.com/blog/2015/10/whats-so-wrong-with-the-impact-factor-part-2/
Tue, 13 Oct 2015

The self fulfilling prophecy: Oedipus in the arms of Phorbas.

In last week’s perspective I asked the question ‘What’s wrong with the Impact Factor? Part 1’. Anybody who’s followed the debate over the years will be familiar with many of the common objections to the metric – here’s an example of a blog post on the subject. But how valid are the common objections? Does the Impact Factor (IF) really harm science? If so, is the IF the cause or just a symptom of a bigger problem? Last week I focused on the mathematical arguments against IF. Principal among those is that IF is a mean when it should be a median. This week, I’m going to look more closely at the psychology of IF and how it alters authors’ and readers’ behavior, potentially for the worse.

Impact Factor is a self-fulfilling prophecy

Whether we’re discussing, as we did last week, the propensity for highly cited papers to gather more citations, or whether it’s the fact that papers published in high impact journals are likely to get more citations simply because of the perceived value of the journal brand, IF creates a sort of feedback loop where articles in high impact journals are perceived to be better, leading to greater citations, which raises the prestige of the journal, and so on.

It’s worth noting that in Anurag Acharya’s keynote at the ALPSP conference about a month ago, he talked about research that he had done into changing citation patterns. The article is on arXiv, here. Acharya et al. showed that the fraction of top cited articles in non-elite journals is steadily rising. Acharya’s central thesis is that this effect is due to the increasing availability of scholarly content. The fact that scholars are not entirely limited to the collections in their libraries, but are able to access information both in Open Access journals and also through scholarly sharing, means that they are no longer limited to reading (and therefore citing) articles published in core collections.

Others would argue that with a flatter search landscape through services like PubMed, Google, and arXiv, the power of the journal brand for readers (although perhaps not for authors), is steadily eroding.

It’s a journal-level metric that is misused as an article-level one

The IF was originally designed as a way to judge journals, not articles. Eugene Garfield, the scientometrician who came up with the measure, was simply trying to provide a metric to allow librarians to decide which subscription journals should be in their core collections. He never intended it to be used as a proxy measure for the quality of the articles in the journal.

You can hardly blame the IF itself for not being a good measure of research quality. Nobody said it was. Or, at least, they didn’t until recently. As Hicks et al. point out in the Leiden Manifesto, the ubiquitous use of IF as a metric for research quality only really started in the mid-90s. So, if the metric is misused, that leads us to an obvious corollary.

It’s unfair to judge researchers on the impact factor of the journals they publish in

If we’re judging researchers poorly, we’re likely to be denying grants and tenure to people who could be making more of a contribution. However, the question is: can we blame the IF itself for that?

If the impact factor only became the ubiquitous measure of research quality that it is in the last 20 years, does that mean that publishing in Cell, Nature or Science was previously not important?

We can argue whether it’s gotten more or less important to publish in high impact journals in recent decades but one senior scientist said to me recently that getting ‘a really good Nature paper’ launched their career. The reality is that even before the IF became the juggernaut that it is today, articles in high prestige journals were always seen as a measure of research quality.

Impact Factor isn’t the problem

The problem isn’t the measure itself. Sure, there are issues with it from a statistical best-practice point of view and it seems to distort the way we value research, but I think that something else is at work here. The problem is that when researchers are evaluated, very often the venue in which they publish is taken to be more important than the work itself. If we’re going to judge research and researchers fairly against one another in the future and move past IF, that has to change.

For researchers and their outputs to be judged fairly, two things have to happen. Firstly, the trend towards article level metrics and alternative metrics for evaluation has to continue and be supported by librarians, publishers, funders and scholars themselves. The study from Google that I mentioned shows erosion of the citation advantage of the journal brand. It’s happening quite slowly and arguably only in terms of readership and citation, not authorship.

The second thing is more cultural and more subtle. When I speak to academics about the fact that assessment strategies are moving towards multiple measures and a broader sense of impact and value, the point is often met with suspicion. I wrote a post a while ago about confusion around the concept of excellence in academia. I think the reason for suspicion is that reviewers on assessment panels are generally senior academics whose ideas of what constitutes good work are rooted in the age of the paper journal. If this is to change, funders, librarians and scientometricians must all do more to reach out to academics, and particularly those who sit on review panels. We need a clearer, more consistent message as to how assessment should be changing.

That’s what I think is wrong with Impact Factor, or rather how our obsession with it reflects a deeper problem. What do you think is the heart of the matter? Why are we so fixated on this overly simplistic metric? Is it really harming the advancement of knowledge? What can we do to change things? Please feel free to post a comment below. Alternatively, you can contribute to the conversation on twitter using hashtag #IFwhatswrong. Next week’s perspective will be a partly crowd sourced post built from the ideas and thoughts that everybody contributes.

 

What’s so Wrong with the Impact Factor? Part 1
https://www.digital-science.com/blog/2015/10/whats-so-wrong-with-the-impact-factor-part-1/
Tue, 06 Oct 2015

An Impact driver

Over the last year, I’ve been learning more and more about research metrics and evaluation. One of the common objections that I’ve heard voiced to the use of new metrics for evaluating research performance is that they may in part be driven by unsuccessful and disgruntled researchers who want to change the way that research is assessed in order to better suit themselves. The unspoken corollary being that the outputs and efforts that they seek credit for are less valid, or rigorous. In other words – easier.

I have to tread a little carefully here: I left academia in part because I felt I wouldn’t have the career that I wanted, given the way that research is currently assessed and grant money awarded, particularly in the US. That said, I knew and still know many competent and active researchers whose contributions are significantly underrated. The need to publish articles in journals with high impact factors sometimes punishes those who are doing work that is no less important but has a narrower audience, because it is more specialized or more challenging to understand.

You can’t get too far into a conversation about research assessment without somebody mentioning the Impact Factor (IF). I’ll spare you the potted history of the IF and I’ll leave aside the pedestrian exercise of listing the objections to it. There’s a good post here which lists them out and gives the arguments. Instead I’d like to pick apart a couple of them, just because I think that’s a more interesting thing to do.

Impact Factor is a mean and it should be a median

Most people who haven’t done a course in statistics, and many who have, would look at this objection and think that it’s just fussy and pedantic. Means and medians usually turn out the same, don’t they? If, like me, you care a little too much about numbers or have seen how using the wrong type of statistic can lead to the wrong conclusion, this objection is worth taking a closer look at. So does it matter?

Yes…. and no.

Let’s start with the ‘yes’ part. If you plot citation frequency from a journal you’ll almost certainly see that the data doesn’t look like a classic bell curve, which sciencey types call a Gaussian distribution. It’s strongly peaked at a low number and has a long tail pointing to the right. This isn’t something that would surprise a statistician, citations are independent events; they’re discrete, not continuous (you can’t have half a citation), and they’re rare (statistically speaking). Even a high IF like 35 isn’t really a big number, mathematically speaking, and most papers don’t get cited much, if at all. This is why a small number of highly cited papers tend to have a disproportionate effect on the mean value (the IF), why recruiting more review articles is such an effective tactic to raise the IF, and in extreme cases, why a single paper can have a large effect. Many of the problems concerning IF are due in part to the fact that you shouldn’t use an arithmetic mean for a non-Gaussian data set. You should probably use the median (or if you’re so inclined, figure out which distribution it should be, and use that to calculate the expected value using clever maths that involve letters rather than numbers).
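Here is a quick numerical sketch of the ‘yes’ part: draw per-article citation counts from a deliberately skewed, long-tailed distribution (a negative binomial, chosen purely for illustration, with made-up parameters rather than anything fitted to a real journal) and compare the mean with the median.

```python
# Simulate a long-tailed citation distribution and compare the arithmetic
# mean (the IF-style figure) with the median. Most papers sit below the mean.
import numpy as np

rng = np.random.default_rng(42)
citations = rng.negative_binomial(n=1, p=0.25, size=2000)  # illustrative only

print("mean (IF-like):", round(citations.mean(), 2))
print("median:", np.median(citations))
print("share of papers below the mean:", round((citations < citations.mean()).mean(), 2))
```

With these invented parameters the mean comes out around 3 while the median sits nearer 2, and more than half of the simulated papers fall below the ‘average’ – exactly the gap between an IF and the typical paper in the journal.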

Now for the ‘no’ part. Arguably, it’s not all that important because, generally speaking, the IF correlates very well with the median five-year citation rate for the journal. Perhaps that’s to be expected. In a recent conversation I had with Digital Science’s Jonathan Adams, he told me that the current thinking is that citation distributions are most likely a negative binomial, because well-cited papers tend to go on to be better cited in the future. (I guessed a Poisson distribution, at least for the two-year IF. Shows what I know.)

Therein lies the rub. Once a paper has its perceived value raised by being cited, it’s likely to get cited again. That’s related to the next objection which I will explore in part two of this blog, so stay tuned for next week’s posting. 

Gearing Up For The Conference Season – What’s On Your Mind?
https://www.digital-science.com/blog/2015/09/gearing-up-for-the-conference-season-whats-on-your-mind/
Wed, 09 Sep 2015

Gearing up
Credit: Jean-Rémy Duboc

Next week is the beginning of Freshers’ Week, I beg your pardon, ‘Welcome week’ at Edinburgh University, and the students are already pouring in. Today the ALPSP conference kicks off in London (#ALPSP15), followed by the SSP fall seminars; the Frankfurt STM conference and Book Fair are not too far away either, along with a host of other great conferences and events. In the meantime, the SSP conference planning committee is getting underway and plans for this year’s STM tech trends brainstorm will be kicking off soon. There’s a definite sense that the silly season is over and it’s time to get back to work and have some new and original discussions.

So what does this conference season have in store? I certainly can’t speak for everybody, but here are some of the things that I’d like to talk more about this season.

Research Assessment

The roles of publishers and librarians are overlapping more and more and, at the same time, the needs of researchers are evolving because funder expectations are changing. In the future, researchers are going to need more support tracking and collating the impact of their work, not just within academia, but on society in general. I think that we’re going to be getting more input from people who work in informetrics, like Diana Hicks and Paul Wouters, who co-authored the Leiden Manifesto along with three other researchers.

The question that we have to ask is how we, as a society, measure that impact in a way that is both fair to researchers and maximizes the benefit to society, and what we must do, as both publishers and librarians, to facilitate that assessment and support researchers so that they can spend more time doing their actual work. The idea of reputation management is a somewhat related topic. The desire or otherwise of researchers to manage their online personas and maximize their personal impact is something that we may be hearing more about.

Quality Control and Ethics

The debate around predatory publishing has matured in recent weeks. Even before the recent inflammatory post by Jeffrey Beall that compared SciELO to a favela, Rick Anderson raised his concerns that the label of predatory publishers is too binary and that a deeper understanding of quality control issues and ethics in scholarly communication is needed. Publishing has until recently been fairly self-policed, but at this point, diversification of business models has led to a proliferation of new entrants to the marketplace. This is good because it encourages innovation, but brings new challenges.

Donald Samulack of Editage has been giving a series of talks about the scholarly communication landscape in the developing world, particularly in China, where he outlines some of the challenges that researchers face and the conditions that make it ripe for exploitation by disreputable companies. Samulack recently announced an initiative aimed at tackling the problems of quality control, ethics and predation from an international perspective. Meanwhile, on the Scholarly Kitchen Blog, some of the liveliest discussion sections of the summer have been about this issue, here and here. I imagine that the major professional societies and associations, as well as organizations like COPE, ICMJE and even ORCID might end up playing a part in this.

Redefining Engagement

We’re coming to the end of a period in which publishing platform design and innovation were aimed at keeping users on platforms for longer. The idea of making a platform ‘sticky’ is heading out of vogue as new concepts of researcher engagement emerge.

Last year, David Burgoyne of Taylor & Francis made the point to me that the most successful website in the world built its fame on sending people elsewhere. The example of Google goes to show that it’s better to give end users what they need so that they will come back, rather than trying to maximize the time they spend on your site on each visit. In other words, if readers feel like the proverbial rat in a maze, they’re going to look for alternative places to source content.

Workflow tools are going to be key to the new strategies. As Michael Clarke pointed out in his round-up of the SSP conference, a number of companies are working on collaborative authoring tools that promise to improve the flow of information from author to author and from authors to publishers, as well as helping publishers connect with authors earlier in the process. Publishing is becoming an increasingly author-centric world, but readers remain important. Publishers still struggle to keep in contact with readers once they have downloaded the PDF and saved it to their own hard drive. The readership side of the engagement problem isn’t new, but people across the industry are still working on it because it’s important. Do we make the HTML version more attractive, to tempt people to stay on the platform, like Wiley’s Anywhere Article or Elsevier’s Article of the Future? Or do we try to find a way to connect with readers downstream by merging the online and offline experiences?

What have you been thinking about?

Those are just three things that I’ve been thinking about over the summer. I’d be very interested to hear what’s on everybody else’s mind. Please leave a comment below and let me know what you think the big story will be this conference season.

The post Gearing Up For The Conference Season – What’s On Your Mind? appeared first on Digital Science.

]]>
More Confusion: Do We All Agree on What Constitutes Authorship? https://www.digital-science.com/blog/2015/08/more-confusion-do-we-all-agree-on-what-constitutes-authorship/ Tue, 25 Aug 2015 14:38:00 +0000 https://www.digital-science.com/?p=13813 As I was scrolling through Twitter this morning, I came across a tweet from fellow Scholarly Kitchen chef and distinctive eyeglasses wearer, Phil Davis, that pointed to an article in Inside Higher Ed (IHE) on the thorny issue of academic credit and authorship: Research reveals significant share of scholarly papers have ‘guest’ or ‘ghost’ authors […]

The post More Confusion: Do We All Agree on What Constitutes Authorship? appeared first on Digital Science.

]]>
As I was scrolling through Twitter this morning, I came across a tweet from fellow Scholarly Kitchen chef and distinctive-eyeglasses wearer Phil Davis, pointing to an article in Inside Higher Ed (IHE) on the thorny issue of academic credit and authorship.

The article reports on research presented by Professor John P. Walsh of the Georgia Institute of Technology and recent graduate Sahra Jabbehdari at the annual meeting of the American Sociological Association. Walsh and Jabbehdari report an apparently high incidence of both guest authors (those who didn’t make an adequate contribution to be listed) and ghost authors (those who did make a contribution but were left off the list).

One point in the article stood out for me because it shows that authorship is one of many areas of scholarly communication where many people are concerned about ethical standards, but we don’t all agree on what those standards should be. For example, the IHE article states:

…the new research shows that 37 percent of medical papers had a guest author, with about two-thirds of those being authorships granted simply for providing data.

Image: Please do not ask for credit as refusal often offends

So the problem is that about a quarter of academic articles give authorship credit to the person who actually sat at the bench and did the work?

How shocking!
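(To spell out the arithmetic behind ‘about a quarter’, which is my calculation rather than the study’s: two-thirds of 37 percent is 0.37 × 2/3 ≈ 0.25, or roughly one medical paper in four.)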

Sarcasm aside, in my experience working in both biology and physics labs (I’m an author on 21 peer-reviewed articles, six conference proceedings and one book chapter), it was considered only right and proper to include the person who gathered the data as an author on an article, irrespective of whether they were involved in the intellectual design of the experiment. To be clear, it is considered best practice for the primary author to send the manuscript to all authors for feedback and to incorporate any changes until a consensus is reached. In other words, all authors have a responsibility to participate at some level in the writing, but providing the data is an entirely adequate basis for authorship.

The IHE article quotes the International Committee of Medical Journal Editors (ICMJE):

…“substantial contributions to the conception and design” of a research project, a key role in “drafting the article or revising it,” and a role in final approval. Merely getting funding or gathering data are not sufficient, the standards say.

This made me wonder whether this is another example of how, in the publishing industry (and sometimes among librarians), our ideas about how researchers do, or should, behave can be out of step with how the academy itself actually works. I did a bit of googling, and I think the answer is… kind of.

If we look at the actual guidelines published by the ICMJE in 2013, they state that authorship should be based on all of the following:

1. Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; AND
2. Drafting the work or revising it critically for important intellectual content; AND
3. Final approval of the version to be published; AND
4. Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

The acquisition of data is explicitly listed as a contribution that satisfies the first criterion, so providing data can be a legitimate basis for authorship, provided the other criteria are also met. On that reading, the article in IHE is misinterpreting the ICMJE. To be fair, the IHE article does say ‘simply for providing data’, which could be interpreted as referring to authors who provided data but never saw the manuscript prior to submission, but I doubt that.
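To make the distinction concrete, here is a minimal sketch (mine, not the ICMJE’s or this post’s) of the four criteria treated as the conjunction they are: gathering data satisfies the first criterion, but an author also needs to have participated in the writing, approved the final version and accepted accountability, which is exactly the lab practice described above.

from dataclasses import dataclass

@dataclass
class Contribution:
    conception_or_design: bool = False
    data_acquisition_analysis_or_interpretation: bool = False
    drafted_or_critically_revised: bool = False
    approved_final_version: bool = False
    accountable_for_the_work: bool = False

def meets_icmje_criteria(c: Contribution) -> bool:
    # Criterion 1 can be met by design work OR by acquiring/analysing data...
    criterion_1 = c.conception_or_design or c.data_acquisition_analysis_or_interpretation
    # ...but criteria 2-4 must all be met as well.
    return (criterion_1
            and c.drafted_or_critically_revised
            and c.approved_final_version
            and c.accountable_for_the_work)

# Someone who only handed over data and never saw the manuscript does not qualify:
print(meets_icmje_criteria(Contribution(data_acquisition_analysis_or_interpretation=True)))  # False
# The person at the bench who also commented on the draft, approved it and stands behind it does:
print(meets_icmje_criteria(Contribution(data_acquisition_analysis_or_interpretation=True,
                                        drafted_or_critically_revised=True,
                                        approved_final_version=True,
                                        accountable_for_the_work=True)))  # True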

On the other hand, the Faculty Council of Harvard Medical School states:

Everyone who is listed as an author should have made a substantial, direct, intellectual contribution to the work. For example (in the case of a research report) they should have contributed to the conception, design, analysis and/or interpretation of data. Honorary or guest authorship is not acceptable. Acquisition of funding and provision of technical services, patients, or materials, while they may be essential to the work, are not in themselves sufficient contributions to justify authorship.

The acquisition of data is not directly included as a reason for authorship, although arguably ‘provision of technical services’ might cover it. Meanwhile, the Office of the Provost at Yale writes that authorship should be granted to those who ‘conduct’ a component of the research. To me, that means taking the data, but again, it’s not clear.

The IHE article quotes Walsh as saying, ‘We are in an era of high-stakes evaluation’, the implication being that this creates incentives to extend the author list. This is absolutely true. To offer a personal anecdote: I remember once being persuaded to ‘be more generous’ when writing my author list and to include a senior faculty member who had made no intellectual contribution to the article but owned the piece of equipment that I was using. Singling out those who actually do the experiments as particularly unworthy of authorship seems unfair to me, and it runs the risk of addressing the problem of growing author lists by picking on the most junior members of the research community: graduate students and postdocs.

I suggest that we need to take a step back as an industry and discuss in greater detail what should and should not constitute the right to take credit for a piece of work. Digital Science began working on these ideas some time ago, when Amy Brand, who is now Director of the MIT Press, co-chaired the working committee for Project CRediT. Moving forward, the discussion needs to include funders, publishers, people working in scientometrics and, most importantly, the researchers themselves, so that we shape our system of incentives in a way that benefits the advancement of knowledge.

The post More Confusion: Do We All Agree on What Constitutes Authorship? appeared first on Digital Science.

]]>