Mapping the influence of humanities

David Budtz Pedersen  presented a new research proposal undertaken in Denmark, Mapping the Public Influence of the Humanities, with the aim to map the influence, meaning and value of the humanties in Denmark. His blogpost on the Impact Blog about this project generated a lot of attetion already. Even in the holiday season.

What struck me however, is that the project starts with collecting data from three different sources:

  1. names and affiliations of active scientific experts in Denmark;
  2. by documenting the educational background and research profile of the population;
  3. by downloading and collecting the full corpus of reports, whitepapers, communications, policy briefs, press releases and other written results of Danish advisory groups from 2005 to 2015.

It was the third objective of Buts Pedersen’s project that grabbed my attention: collecting the full corpus of reports, whitepapers, communications, policy briefs, press releases and other written results of Danish advisory groups from 2005 to 2015. That in the country where Atira, the makers of Pure, reside (I know currently wholly owned by Elsevier). It struck a chord with me since this is exactly what should have been done already. Assuming the influence of the humanities is a scholarly debate, all universities contributing to this debate should have an ample filled current research information systems (CRIS) filled with exactly those reports, whitepapers, communications, policy briefs, press releases et cetra.

In this post I want to concentrate on the collection side, assuming that all material collected in the CRIS is available online and free for the public at large to inspect, query and preferably -but not necesarrily- free to download. Let’s look at the collection side for a moment. Most CRIS have all kind of database coupling possibilities with major (scholarly) bibliographic databases: Web of Science, Scopus, Pubmed, Worldcat, CrossRef etc. However, those reports, whitepapers, communications, policy briefs, press releases and other written results are not normally contained in these bibliographic databases. These are the so called grey literature. Not formally published. Not formally indexed. Not easily discovered. Not easily found. Not easily collected. To collect these materials we have to ask and beg researchers to dutifully add these manually in the university CRIS.  That is exactly why universities have bought into CRIS systems. Why libraries are the ideal candidate to maintain CRIS systems. The CRIS system takes away the burden of keeping track of the formal publications through coupling with the formal bibliographic databases. Librarians have knoweldge about all these couplings and search profiles required to make life easy for the researchers. That should leave some time for researchers to devote a little of their valuable time on those other more esoteric materials. Especially in the humanities, where we apparently have more of those grey literature.  A well maintained CRIS should have plentiful of these materials registered. So I was taken aback slightly that this project in Denmark, the cradle of a major CRIS supplier, needs to collect these materials from the start. They should have been registered long time ago already. That is where the value kicks in of a comprehensive, all output inclusive CRIS, resulting in a website with a comprehensive institutional bibliography.

Just a second thought. It is odd to see that two of the major providers of CRIS systems, Thomson Reuters with Converis and Elsevier with Pure are both providers of major news information sources. It is odd that neither of these CRIS products have coupling with the proprietary news databases either Reuters or LexisNexis for press clipping and mentios in the media. From a CRIS managers’ point of view strange to make this observation since we are dealing with the same companies. But the internal company structures seem to hinder these kind seemingly logical coupling of services.


Document types in databases

Currently I am reading with a great deal of interest the Arxiv preprint of the review by Ludo Waltman (2015) from CWTS on bibliometric indicators. In this post I want to provide a brief comment on his section 5.1 where he discusses the role document types in bibliometric analyses. Ludo mainly reviews and comments on the inclusion or exclusion of certain document types in bibliometric analyses, he does not touch upon the subject of discrepancies between databases. I want to argue that he could take his review a step further in this area.

Web of Science and Scopus, differ quite a bit from each other on how they assign document types. If you don’t realize that this discrepancy exists, you can draw wrong conclusions when bibliometric analyses between these databases are studied.

This blogpost is a quick illustration that this is an issue that should be adressed in a review like this. To illustrate my argument I looked to the document types assigned to Nature publications from 2014 in Web of Science and Scopus. The following tables gives an overview of the results:

Document Types Assigned to 2014 Nature Publications in Web of Science and Scopus
WoS Document type # Publications Scopus Document type # Publications
Editorial Material 833 Editorial 244
Article 828 Article 1064
Article in Press 56
News item 371
Letter 272 Letter 253
Correction 109 Erratum 97
Book Review 102
Review 34 Review 51
Biographical Item 13
Reprint 3
Note 600
Short Survey 257
Total 2565 2622

In the first place Scopus yields for the year 2014 a few more publications for Nature than Web of Science does. The difference can be explained by the articles in press that are still present in the Scopus search results. This probably still requires maintenance from Scopus and should be corrected.

More importantly WoS assigns 833 publications as “editorial material” whereas Scopus assigns only 244 publications as “editorial”. It is a well known tactic from journals such as Nature to assign articles as editorial material, since this practice artificially boosts their impact factor. I have had many arguments with professors whose invited news and views items (most often very well cited!) were not included in bibliometric analyses since they were assigned to “editorial material” category and therefore not included in the analysis.

“Letters”, “corrections” or “errata” are in the same order of size between Scopus and Web of Science. “News Items” are a category of publications in Web of Science, but not in Scopus. They are probably listed as “note” in Scopus. Some of the “short surveys” in Scopus turn up in Web of Science as “news item”. But all these categories probably don’t affect bibliometric analyses too much.

The discrepancy in “reviews” between Web of Science and Scopus however is important. And large as well. Web of Science assigns 34 articles as a “review”, whereas Scopus counts 51 “reviews” in the same journal over the same period. Reviews are included in bibliometric analyses, and since the attract relatively more citations than oridinary articles, special baselines are construed for this document type. But comparisons between these databases are foremost affected by differences in document assignation between these databases.

The differences in editorial material, articles and reviews between Web of Science and Scopus are most likely to the affect outcomes of comparisons in bibliometric analyses between these two databases. But I am not sure about the size of this effect. I would love to see some more quatitative studies in the bibliometrics arena to investigate this issue.



Waltman, Ludo (2015). A review of the literature on citation impact indicators.

The Impact Factor of Open Access journals

In the world of Open Access publishing the golden road has received a great deal of attention. At least this is what our researchers seem to remember. Of course there are other roads to open access, but I want to present the impact factors of the journals facilitating the golden road to open access. This blogpost lists all open access journals included in DOAJ and assigned an Journal Impact Factor in the JCR 2009. The reason for this, is that our researchers see publishing in open access journals as the simplest way of achieving open access to their work, but on the other hand they are required for judgement of the citation impact that they publish in journals covered by Web of Science and therefore the Journal Citation Reports (JCR).

In the past there have been studies on citation impact of the open access journals that have actually received a journal impact factor from Thomson Reuters Scientific (formerly ISI). The first was by (McVeigh 2004) followed by (Vanouplines and Beullens 2008) (in Dutch, and not openly accessible) and recently by (Giglia 2010). These consecutive studies showed an increasing number of open access journals that received an Journal Impact Factor from Thomson Reuters. McVeigh reported 239 OA journals for the JCR 2004, Vanouplines reported 295 OA journals for the JCR 2005 and Giglia reported 385 OA journals for the JCR 2008 (there are some methodological issues that make these figures not entirely comparable).

The pitfall of these studies is that although they showed interesting figures and additional analyses, none of these studies actually published the list of open access journals that received an impact factor. The sole purpose of this blogpost is to publish this actual list. The probable reason for the previous authors is that the impact factors are proprietary information from Thomson Reuters. You are not allowed to publish these figures. On the other hand most publishers, use it in all their marketing outings for their journals. So the journal impact factor is virtually information in the public domain.

To avoid any intellectual property problems with Thomson Reuters I have included the ScimagoJR and Scopus SNIP indicator for the journals rather than the Journal Impact Factor. The correlation for this set of journals between SNIP and IF was 0.94 and between SJR and IF was 0.96. In total 619 journals from DOAJ were present in the JCR 2009 report (Science and Social Science & Humanities version deduplicated). The growth in journal coverage is due to the growth in OA journals and the significant expansion of journal coverage in 2008. On the other hand looking at the journal list of Scopus indexed journals I note that they include some 1365 journals open access journal which have a ScimagoJR or SNIP.

For the current table I matched the journal list from DOAJ downloaded on December 13th 2010, with the deduplicated list of the JCR 2009 indexed journals. This journal set of 619 journals was matched against the journal list from to include the ScimagoJR 2009 and SNIP2009 as well. For each journal the subject categories indicated by DOAJ were included. The journals were sorted alphabetically on subjects and descending IF within a subject. For the following table journals with multiple subject assignments in DOAJ were included in their different categories as well. This expanded the list to 782 lines. Finally the column with impact factors was removed, showing only the ScimagoJR and SNIP for the journals. A few journals were not assigned a ScimagoJR or SNIP, but these were assigned a Journal Impact Factor. In some cases this was due to differences in journal coverage between Scopus and Web of Science, but in a few cases this appears also the problem of different ISSN assignments by the respective databases.

Download: List of open access journals that are assigned an Impact Factor in the JCR 2009 showing their respective SNIP and ScimagoJR for 2009.

Have fun with this list


Giglia, E. (2010). The Impact Factor of Open Access journals: data and trends. ELPUB 2010 International Conference on Electronic Publishing, Helsinki (Finland), 16-18 June 2010. and

McVeigh, M.E. (2004). Open Access Journals in the ISI Citation Databases: Analysis of Impact Factors and Citation Patterns A citation study from Thomson Scientific, Thomson Scientific.

Vanouplines, P. & R. Beullens (2008). De impact van open access tijdschriften. IK Intelectueel Kapitaal 7(5): 14-17. (In Dutch, Not OA available)

Possibly related posts
Another expansion of journal coverage by Thomson

Scopus is adding institute disambiguation

Today it was announced that institute disambiguation, or the affiliation identifier, will become functional in Scopus early January 2008.  At this promotional site it is demonstrated what a search for the University of Liverpool returns in options of selection the right University of Liverpool and whether or not you want to include the teaching hospitals in a subsequent search as well.

Web of Science already included a refine option with an affiliation option amongst others, but they way the results are presented for Scopus shows that Elsevier has taken a different approach to solving this problem.

It will be interesting to test both approaches in more detail when the Scopus tool is officially launched.

Scopus is speeding up it’s indexing

I knew it was coming, today I noted it for the first time that Scopus is already indexing and alerting ‘articles in press’ (or any of its variations such as ‘online first’). In one of my regular alerts I got this article from Henk Moed:

Moed, H.F. (2007) UK Research Assessment Exercises: Informed judgments on research quality or quantity? Scientometrics, pp. 1-9. Article in Press