
Commit 06c2d87

Adding productivity indicator
1 parent 7a8e5e8 commit 06c2d87


2 files changed

+164
-8
lines changed

Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
1+
---
2+
author:
3+
- name: V.A. Traag
4+
orcid: 0000-0003-3170-3879
5+
affiliations:
6+
- ref: cwts
7+
8+
affiliations:
9+
- id: cwts
10+
name: Leiden University
11+
department: Centre for Science and Technology Studies
12+
city: Leiden
13+
country: the Netherlands
14+
---
15+
16+
# Productivity {#productivity .unnumbered}
17+
18+
<div>
19+
20+
## History
21+
22+
| Version | Revision date | Revision | Author |
23+
|---------|---------------|-------------|------------|
24+
| 1.0 | 2024-12-06 | First draft | V.A. Traag |
25+
26+
</div>
27+
28+
## Description
29+
30+
In general, productivity estimates the amount of output relative to the amount of input. In the context of academia, outputs can be various objects, ranging from publications to data, code, or peer reviews. Although productivity is an aspect of interest, it should usually be considered jointly with something like quality. That is, incentives for higher productivity may simply stimulate more outputs of lower quality. There is some evidence of this type of effect [@butler_explaining_2003], although this evidence is also disputed [@van_den_besselaar_perverse_2017].
31+
32+
Output is usually only measured for a limited set of objects, with scholarly publications being the most typical example. Nonetheless, other relevant outputs should not be ignored, and limitations of productivity based on publications should be considered. Moreover, we should be aware of potential differences between productivity at the individual level and the collective level. For instance, consider a research group in which one individual is tasked with data quality assurance and code review. That individual might have a lower productivity in terms of publication outputs, yet their activities are a boon to the other researchers in the group, whose productivity might greatly increase as a result [@tiokhin_shifting_2023].
33+
34+
In addition, one aspect of productivity that is usually missing is the overall input [@abramo_farewell_2016]. That is, we typically do not know how many people are employed at a certain institution. Even if part of that becomes visible in authorships, not every employee's contribution will become visible in authorship. Hence, institutions that have, for example, more research assistants who are not acknowledged as authors may seem to have relatively few authors, while in reality many more people are active at the institution. Moreover, even if we know that a particular author is affiliated with a certain institution, we do not know the amount of time they spend at that affiliation, which is particularly challenging with multiple affiliations. Going one step further, the input could also be specified in financial terms. Unfortunately, none of this data is typically available [@waltman_elephant_2016]. Nonetheless, this is an important limitation to take into account when considering productivity.
35+
36+
### Avg. number of papers per author
37+
38+
#### Measurement
39+
40+
For a certain institution $i$ we can count how many authors $a_i$ are affiliated with institution $i$ and how many publications $n_i$ it published in a given year $y$. The ratio $\frac{n_i}{a_i}$ then gives the average number of papers per author, which is an indicator of productivity. We typically observe an increase in productivity over time, such that in more recent years, the number of papers per author is usually larger than in earlier years.
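As a minimal sketch of this full-counting ratio, the Python snippet below computes $n_i$, $a_i$ and their ratio from a small invented set of records. The record layout and the function name `papers_per_author` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def papers_per_author(records):
    """Return {institution: n_i / a_i} under full counting.

    `records` is an iterable of (publication_id, author_id, institution)
    tuples for a single year; this layout is an assumption for illustration.
    """
    pubs = defaultdict(set)     # institution -> distinct publications (n_i)
    authors = defaultdict(set)  # institution -> distinct authors (a_i)
    for pub_id, author_id, inst in records:
        pubs[inst].add(pub_id)
        authors[inst].add(author_id)
    return {inst: len(pubs[inst]) / len(authors[inst]) for inst in pubs}


# Invented toy data: three publications, three authors, two institutions.
records = [
    ("p1", "alice", "A"),
    ("p1", "bob", "A"),
    ("p2", "alice", "A"),
    ("p3", "alice", "A"),
    ("p3", "carol", "B"),
]
result = papers_per_author(records)
print(result)
```

Note that publication `p3` counts fully for both institutions here, which is exactly the behaviour that makes this indicator sensitive to collaboration.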
41+
42+
One relevant aspect in the context of counting the number of papers per author is the increase in collaboration. If the total number of publications remains the same in a given year, but more of them are co-authored, then the metric will be higher. Hence, it sometimes makes sense to use "fractional counting" for publications [@waltman2015]. This means that we can consider fractions, or weights, for all publications, based on the "fraction" of their authorship. For instance, if a publication has three authors, each has a fraction of 1/3. If two of the authors are affiliated with a single institution, say institution A, that institution will have a weight of 2/3. If, in addition, the third author has two affiliations, one with the aforementioned institution A and one with institution B, we could count that author as belonging to institution A for 1/2. This adds 1/2 × 1/3 = 1/6 to the weight of institution A, bringing the total to 5/6, while institution B receives the remaining 1/6.
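The worked example above can be reproduced with a short Python sketch. The function name `institution_weights` and the input layout (one list of institutions per author) are hypothetical, and the equal split over an author's affiliations is one possible convention rather than the only one.

```python
from collections import defaultdict
from fractions import Fraction


def institution_weights(author_affiliations):
    """Fractional weight of each institution for one publication.

    `author_affiliations` holds one list of institutions per author.
    Each author gets 1 / (number of authors), split equally over that
    author's affiliations (the equal split is an assumption).
    """
    weights = defaultdict(Fraction)
    author_share = Fraction(1, len(author_affiliations))
    for affiliations in author_affiliations:
        for inst in affiliations:
            weights[inst] += author_share / len(affiliations)
    return dict(weights)


# The example from the text: two authors at A, one author at both A and B.
weights = institution_weights([["A"], ["A"], ["A", "B"]])
print(weights)
```

Using `Fraction` keeps the arithmetic exact, so the weights of a publication sum to exactly 1 whenever every author has at least one affiliation.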
43+
44+
If we indicate with $w_{ji}$ the fraction to which publication $j$ belongs to institution $i$, we can define $n'_i = \sum_j w_{ji}$ as the number of fractionally counted publications. Similarly, if we indicate with $a_{ji}$ the fraction with which author $j$ belongs to institution $i$, we can define the fractionally counted number of authors as $a'_{i} = \sum_j a_{ji}$. The productivity can then simply be specified as $\frac{n'_i}{a'_i}$.
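A minimal sketch of this ratio in Python, assuming the publication and author fractions for one institution have already been determined. The function name `fractional_productivity` and all numbers are invented for illustration.

```python
from fractions import Fraction


def fractional_productivity(pub_weights, author_weights):
    """Return n'_i / a'_i for a single institution i.

    pub_weights:    {publication: fraction of that publication at i}
    author_weights: {author: fraction of that author at i}
    """
    n_frac = sum(pub_weights.values())     # n'_i
    a_frac = sum(author_weights.values())  # a'_i
    if a_frac == 0:
        raise ValueError("institution has no (fractional) authors")
    return n_frac / a_frac


# Invented fractions for a hypothetical institution.
pub_weights = {"p1": Fraction(5, 6), "p2": Fraction(1), "p3": Fraction(1, 2)}
author_weights = {"alice": Fraction(1), "bob": Fraction(1), "carol": Fraction(1, 2)}
productivity = fractional_productivity(pub_weights, author_weights)
print(productivity)
```

Here $n'_i = 7/3$ and $a'_i = 5/2$, so the fractionally counted productivity is $14/15$ publications per author.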
45+
46+
If input data is available, such that the total budget or the number of full-time equivalents (fte) is indicated by $f_i$, the average number of publications per currency unit or fte can be expressed as $\frac{n_i}{f_i}$.
47+
48+
## Datasources
49+
50+
### OpenAlex
51+
52+
[OpenAlex](https://openalex.org/) covers publications based on previously gathered data from Microsoft Academic Graph, but mostly relies on Crossref to index new publications. OpenAlex offers a user interface that is at the moment still under active development, an open API, and the possibility to download the entire data snapshot. The API is rate-limited, but there are options of having a premium account. Documentation for the API is available at <https://docs.openalex.org/>.
53+
54+
It is possible to retrieve the authors of a particular publication, together with their institutions and countries, from OpenAlex, for example by using a third-party package for Python called `pyalex`.
55+
56+
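``` python
import pyalex as alx

alx.config.email = "[email protected]"

# Fetch a single work by its OpenAlex ID.
w = alx.Works()["W3128349626"]

# Author information is nested in the "authorships" list of the work.
authorships = w["authorships"]
authors = [a["author"] for a in authorships]
institutions = [a["institutions"] for a in authorships]
countries = [a["countries"] for a in authorships]
```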
65+
66+
Based on this type of data, the above-mentioned metrics can be calculated. When large amounts of data need to be processed, it is recommended to download the full [data snapshot](https://docs.openalex.org/download-all-data/snapshot-data-format), and work with it directly.
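As a hedged sketch of how such a record might feed into the metrics described above, the helper below turns one work record into a list of institution IDs per author, the kind of input fractional counting needs. It assumes that each work has an `authorships` list whose items carry an `institutions` list of dicts with an `id` key, which is how we understand the OpenAlex work schema. The function name `affiliation_lists` and the example IDs are invented, and the snippet runs on a hand-made record rather than on a live API response.

```python
def affiliation_lists(work):
    """Return one list of institution IDs per author of a work record.

    Assumes the record has an `authorships` list whose items carry an
    `institutions` list of dicts with an `id` key (assumed schema).
    """
    return [
        [inst["id"] for inst in authorship.get("institutions", [])]
        for authorship in work.get("authorships", [])
    ]


# Hand-made record mimicking the assumed structure (all IDs are invented).
work = {
    "authorships": [
        {"author": {"id": "A1"}, "institutions": [{"id": "I1"}]},
        {"author": {"id": "A2"}, "institutions": [{"id": "I1"}, {"id": "I2"}]},
        {"author": {"id": "A3"}, "institutions": []},
    ]
}
lists = affiliation_lists(work)
print(lists)
```

Authors without any matched institution, such as the third author here, yield an empty list. How to treat them (drop them, or assign their share to an "unknown" category) is a choice that should be made explicitly, since it affects both $n'_i$ and $a'_i$.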
67+
68+
OpenAlex provides disambiguated authors, institutions and countries. The institutions are matched to the [Research Organization Registry (ROR)](https://ror.org/). Country information may be available even if no specific institution is available.
69+
70+
### Dimensions
71+
72+
[Dimensions](https://app.dimensions.ai/discover/publication) is a bibliometric database that takes a comprehensive approach to indexing publications. It offers limited free access through its user interface. Access to the API and to the database via Google BigQuery is available for a fee. It is also possible to apply for access to the API and/or Google BigQuery for [research purposes](https://www.dimensions.ai/request-access/). The API is documented at <https://docs.dimensions.ai/dsl>.
73+
74+
The database is closed access, and we therefore do not provide more details about API usage.
75+
76+
### Scopus
77+
78+
[Scopus](https://www.scopus.com/) is a bibliometric database with a relatively broad coverage. Its data is closed and is generally available only through a paid subscription. It does offer the possibility to apply for access for research purposes through the [ICSR Lab](https://www.elsevier.com/insights/icsr/lab). Some additional documentation of its metrics is available at <https://www.elsevier.com/products/scopus/metrics>, in particular in the Research Metrics Guidebook. Documentation for the dataset offered through the ICSR Lab is available separately.
79+
80+
The database is closed access, and we therefore do not provide more details about API usage.
81+
82+
### Web of Science
83+
84+
[Web of Science](https://webofscience.com/) is a bibliometric database that takes a more selective approach to indexing publications. Its data is closed and is only available through a paid subscription.
85+
86+
The database is closed access, and we therefore do not provide more details about API usage.

references.bib

Lines changed: 78 additions & 8 deletions
@@ -12,6 +12,20 @@ @article{aagaard_considerations_2017
1212
pages = {923--926}
1313
}
1414

15+
@article{abramo_farewell_2016,
16+
title = {A farewell to the {MNCS} and like size-independent indicators},
17+
volume = {10},
18+
issn = {1751-1577},
19+
doi = {10.1016/j.joi.2016.04.006},
20+
abstract = {The arguments presented demonstrate that the Mean Normalized Citation Score (MNCS) and other size-independent indicators based on the ratio to publications are not indicators of research performance. The article provides examples of the distortions when rankings by MNCS are compared to those based on indicators of productivity. The authors propose recommendations for the scientometric community to switch to ranking by research efficiency, instead of MNCS and other size-independent indicators.},
21+
number = {2},
22+
journal = {Journal of Informetrics},
23+
author = {Abramo, Giovanni and D'Angelo, Ciriaco Andrea},
24+
month = may,
25+
year = {2016},
26+
pages = {646--651}
27+
}
28+
1529
@article{aksnes2019,
1630
title = {Citations, Citation Indicators, and Research Quality: An Overview of Basic Concepts and Theories},
1731
author = {Aksnes, Dag W. and Langfeldt, Liv and Wouters, Paul},
@@ -224,6 +238,7 @@ @techreport{brown2016
224238
langid = {en}
225239
}
226240

241+
227242
@article{bryan2021,
228243
title = {The impact of open access mandates on invention},
229244
author = {Bryan, Kevin A. and Ozcan, Yasin},
@@ -237,7 +252,6 @@ @article{bryan2021
237252
note = {Publisher: MIT Press One Rogers Street, Cambridge, MA 02142-1209, USA journals-info {\ldots}}
238253
}
239254

240-
241255
@article{budi2022,
242256
title = {Understanding the meanings of citations using sentiment, role, and citation function classifications},
243257
author = {Budi, Indra and Yaniasih, Yaniasih},
@@ -250,6 +264,19 @@ @article{budi2022
250264
doi = {10.1007/s11192-022-04567-4}
251265
}
252266

267+
@article{butler_explaining_2003,
268+
title = {Explaining {Australia}'s increased share of {ISI} publications—the effects of a funding formula based on publication counts},
269+
volume = {32},
270+
issn = {0048-7333},
271+
doi = {10.1016/S0048-7333(02)00007-0},
272+
number = {1},
273+
journal = {Research Policy},
274+
author = {Butler, Linda},
275+
month = jan,
276+
year = {2003},
277+
pages = {143--155}
278+
}
279+
253280
@article{carlin2023,
254281
title = {Where is all the research software? An analysis of software in UK academic repositories},
255282
author = {Carlin, Domhnall and Rainer, Austen and Wilson, David},
@@ -340,6 +367,7 @@ @misc{codeof
340367
langid = {en}
341368
}
342369

370+
343371
@article{cohen2002,
344372
title = {Links and Impacts: The Influence of Public Research on Industrial R&D},
345373
author = {Cohen, Wesley M. and Nelson, Richard R. and Walsh, John P.},
@@ -355,6 +383,7 @@ @article{cohen2002
355383
langid = {en}
356384
}
357385

386+
358387
@article{colavizza2020,
359388
title = {The citation advantage of linking publications to research data},
360389
author = {Colavizza, Giovanni and Hrynaszkiewicz, Iain and Staden, Isla and Whitaker, Kirstie and McGillivray, Barbara},
@@ -371,7 +400,6 @@ @article{colavizza2020
371400
langid = {en}
372401
}
373402

374-
375403
@article{cole_societal_2024,
376404
title = {The societal impact of {Open} {Science}: a scoping review},
377405
volume = {11},
@@ -388,7 +416,6 @@ @article{cole_societal_2024
388416
pages = {240286}
389417
}
390418

391-
392419
@book{cost-ben2018,
393420
title = {Cost-benefit analysis for FAIR research data: cost of not having FAIR research data},
394421
year = {2018},
@@ -1037,6 +1064,7 @@ @book{huyer2020
10371064
langid = {eng}
10381065
}
10391066

1067+
10401068
@article{istrate,
10411069
title = {A large dataset of software mentions in the biomedical literature},
10421070
author = {Istrate, Ana-Maria and Li, Donghui and Taraborelli, Dario and Torkar, Michaela and Veytsman, Boris and Williams, Ivana},
@@ -1057,7 +1085,6 @@ @inproceedings{jackson2016
10571085
langid = {en}
10581086
}
10591087

1060-
10611088
@inproceedings{jacob2019,
10621089
title = {FAIR principles, an new opportunity to improve the data lifecycle},
10631090
author = {Jacob, Daniel},
@@ -1083,6 +1110,7 @@ @article{janssens
10831110
langid = {en}
10841111
}
10851112

1113+
10861114
@article{johnston2017,
10871115
title = {Contemporary Guidance for Stated Preference Studies},
10881116
author = {Johnston, Robert J. and Boyle, Kevin J. and Adamowicz, {Wiktor (Vic)} and Bennett, Jeff and Brouwer, Roy and Cameron, Trudy Ann and Hanemann, W. Michael and Hanley, Nick and Ryan, Mandy and Scarpa, Riccardo and Tourangeau, Roger and Vossler, Christian A.},
@@ -1107,7 +1135,6 @@ @book{jung2023
11071135
url = {https://cran.r-project.org/web/packages/scrutiny/index.html}
11081136
}
11091137

1110-
11111138
@article{keller2014,
11121139
title = {Re-use of public sector information in cultural heritage institutions},
11131140
author = {Keller, Paul and Margoni, Thomas and Rybicka, Katarzyna and Tarkowski, Alek},
@@ -1985,6 +2012,7 @@ @inbook{roberts2013
19852012
url = {https://api.taylorfrancis.com/content/chapters/edit/download?identifierName=doi&identifierValue=10.4324/9780203824696-29&type=chapterpdf}
19862013
}
19872014

2015+
19882016
@inbook{roberts2013a,
19892017
title = {Scientific literacy/science literacy},
19902018
author = {Roberts, Douglas A.},
@@ -2009,7 +2037,6 @@ @article{robinson-garcia2017
20092037
url = {https://www.sciencedirect.com/science/article/pii/S1751157717300834}
20102038
}
20112039

2012-
20132040
@article{robinson-garcia2020,
20142041
title = {Open Access uptake by universities worldwide},
20152042
author = {Robinson-Garcia, Nicolas and Costas, Rodrigo and van Leeuwen, Thed N.},
@@ -2248,6 +2275,20 @@ @article{tennant2016
22482275
url = {https://f1000research.com/articles/5-632}
22492276
}
22502277

2278+
@article{tiokhin_shifting_2023,
2279+
title = {Shifting the {Level} of {Selection} in {Science}},
2280+
issn = {1745-6916},
2281+
doi = {10.1177/17456916231182568},
2282+
abstract = {Criteria for recognizing and rewarding scientists primarily focus on individual contributions. This creates a conflict between what is best for scientists’ careers and what is best for science. In this article, we show how the theory of multilevel selection provides conceptual tools for modifying incentives to better align individual and collective interests. A core principle is the need to account for indirect effects by shifting the level at which selection operates from individuals to the groups in which individuals are embedded. This principle is used in several fields to improve collective outcomes, including animal husbandry, team sports, and professional organizations. Shifting the level of selection has the potential to ameliorate several problems in contemporary science, including accounting for scientists’ diverse contributions to knowledge generation, reducing individual-level competition, and promoting specialization and team science. We discuss the difficulties associated with shifting the level of selection and outline directions for future development in this domain.},
2283+
language = {en},
2284+
urldate = {2024-09-26},
2285+
journal = {Perspectives on Psychological Science},
2286+
author = {Tiokhin, Leo and Panchanathan, Karthik and Smaldino, Paul E. and Lakens, Daniël},
2287+
month = aug,
2288+
year = {2023},
2289+
pages = {17456916231182568}
2290+
}
2291+
22512292
@article{tomkins_reviewer_2017,
22522293
title = {Reviewer bias in single- versus double-blind peer review},
22532294
volume = {114},
@@ -2288,6 +2329,20 @@ @article{traag2021
22882329
langid = {en}
22892330
}
22902331

2332+
@article{van_den_besselaar_perverse_2017,
2333+
title = {Perverse effects of output-based research funding? {Butler}'s {Australian} case revisited},
2334+
volume = {11},
2335+
issn = {1751-1577},
2336+
doi = {10.1016/j.joi.2017.05.016},
2337+
number = {3},
2338+
journal = {Journal of Informetrics},
2339+
author = {van den Besselaar, Peter and Heyman, Ulf and Sandström, Ulf},
2340+
month = aug,
2341+
year = {2017},
2342+
note = {Publisher: Elsevier Ltd},
2343+
pages = {905--918}
2344+
}
2345+
22912346
@book{venturini2021,
22922347
title = {Controversy Mapping: A Field Guide},
22932348
author = {Venturini, Tommaso and Munk, Anders Kristian},
@@ -2312,6 +2367,20 @@ @incollection{vohland_citizen_2021
23122367
pages = {35--53}
23132368
}
23142369

2370+
@article{waltman_elephant_2016,
2371+
title = {The elephant in the room: {The} problem of quantifying productivity in evaluative scientometrics},
2372+
volume = {10},
2373+
issn = {1751-1577},
2374+
shorttitle = {The elephant in the room},
2375+
doi = {10.1016/j.joi.2015.12.008},
2376+
number = {2},
2377+
journal = {Journal of Informetrics},
2378+
author = {Waltman, Ludo and van Eck, Nees Jan and Visser, Martijn and Wouters, Paul},
2379+
month = may,
2380+
year = {2016},
2381+
pages = {671--674}
2382+
}
2383+
23152384
@article{waltman_field_2019,
23162385
title = {Field {Normalization} of {Scientometric} {Indicators}},
23172386
doi = {10.1007/978-3-030-02511-3_11},
@@ -2404,6 +2473,7 @@ @misc{whatper
24042473
langid = {en}
24052474
}
24062475

2476+
24072477
@article{wilkinson2016,
24082478
title = {The FAIR Guiding Principles for scientific data management and stewardship},
24092479
author = {Wilkinson, Mark D. and Dumontier, Michel and Aalbersberg, IJsbrand Jan and Appleton, Gabrielle and Axton, Myles and Baak, Arie and Blomberg, Niklas and Boiten, Jan-Willem and da Silva Santos, Luiz Bonino and Bourne, Philip E. and Bouwman, Jildau and Brookes, Anthony J. and Clark, Tim and Crosas, {Mercè} and Dillo, Ingrid and Dumon, Olivier and Edmunds, Scott and Evelo, Chris T. and Finkers, Richard and Gonzalez-Beltran, Alejandra and Gray, Alasdair J. G. and Groth, Paul and Goble, Carole and Grethe, Jeffrey S. and Heringa, Jaap and {{\textquoteright}t Hoen}, Peter A. C. and Hooft, Rob and Kuhn, Tobias and Kok, Ruben and Kok, Joost and Lusher, Scott J. and Martone, Maryann E. and Mons, Albert and Packer, Abel L. and Persson, Bengt and Rocca-Serra, Philippe and Roos, Marco and van Schaik, Rene and Sansone, Susanna-Assunta and Schultes, Erik and Sengstag, Thierry and Slater, Ted and Strawn, George and Swertz, Morris A. and Thompson, Mark and van der Lei, Johan and van Mulligen, Erik and Velterop, Jan and Waagmeester, Andra and Wittenburg, Peter and Wolstencroft, Katherine and Zhao, Jun and Mons, Barend},
@@ -2436,6 +2506,7 @@ @article{wilkinson2016a
24362506
langid = {en}
24372507
}
24382508

2509+
24392510
@article{wilner,
24402511
title = {Complete recovery of values in Diophantine systems (CORVIDS)},
24412512
author = {Wilner, Sean and Wood, Katherine and Simons, Daniel J.},
@@ -2453,6 +2524,7 @@ @book{wood2021
24532524
note = {original-date: 2018-01-29T16:15:29Z}
24542525
}
24552526

2527+
24562528
@article{woods2022,
24572529
title = {Incentivising research data sharing: a scoping review},
24582530
author = {Woods, Helen Buckley and Pinfield, Stephen},
@@ -2466,7 +2538,6 @@ @article{woods2022
24662538
url = {https://wellcomeopenresearch.org/articles/6-355/v2}
24672539
}
24682540

2469-
24702541
@article{wuchty2007,
24712542
title = {The Increasing Dominance of Teams in Production of Knowledge},
24722543
author = {Wuchty, Stefan and Jones, Benjamin F. and Uzzi, Brian},
@@ -2492,7 +2563,6 @@ @article{yarkoni2019
24922563
url = {https://psyarxiv.com/jqw35/}
24932564
}
24942565

2495-
24962566
@article{zahedi2017,
24972567
title = {Mendeley readership as a filtering tool to identify highly cited publications},
24982568
author = {Zahedi, Zohreh and Costas, Rodrigo and Wouters, Paul},
