TY - JOUR
T1 - Characterizing in-text citations in scientific articles
T2 - A large-scale analysis
AU - Boyack, Kevin W.
AU - van Eck, Nees Jan
AU - Colavizza, Giovanni
AU - Waltman, Ludo
N1 - Funding Information:
Kevin Boyack and Giovanni Colavizza both thank CWTS for hosting them as visiting scholars, during which time most of this work was performed. We thank Mike Patek of SciTech Strategies, Inc. for extraction and fielding of the full text from PubMed Central, and Richard Klavans and Vincent Traag for helpful discussion on our work. Giovanni Colavizza is funded by Swiss National Fund grant number P1ELP2_168489.
Publisher Copyright:
© 2017 Elsevier Ltd
PY - 2018
Y1 - 2018
AB - We report characteristics of in-text citations in over five million full text articles from two large databases – the PubMed Central Open Access subset and Elsevier journals – as functions of time, textual progression, and scientific field. The purpose of this study is to understand the characteristics of in-text citations in a detailed way prior to pursuing other studies focused on answering more substantive research questions. As such, we have analyzed in-text citations in several ways and report many findings here. Perhaps most significantly, we find that there are large field-level differences that are reflected in position within the text, citation interval (or reference age), and citation counts of references. In general, the fields of Biomedical and Health Sciences, Life and Earth Sciences, and Physical Sciences and Engineering have similar reference distributions, although they vary in their specifics. The two remaining fields, Mathematics and Computer Science and Social Science and Humanities, have different reference distributions from the other three fields and between themselves. We also show that in all fields the numbers of sentences, references, and in-text mentions per article have increased over time, and that there are field-level and temporal differences in the numbers of in-text mentions per reference. A final finding is that references mentioned only once tend to be much more highly cited than those mentioned multiple times.
KW - Citation counts
KW - Citation position analysis
KW - Field-level analysis
KW - In-text citations
KW - Reference age
U2 - 10.1016/j.joi.2017.11.005
DO - 10.1016/j.joi.2017.11.005
M3 - Journal article
AN - SCOPUS:85036452854
SN - 1751-1577
VL - 12
SP - 59
EP - 73
JO - Journal of Informetrics
JF - Journal of Informetrics
IS - 1
ER -