Mapping the influence of humanities

David Budtz Pedersen presented a new research proposal undertaken in Denmark, Mapping the Public Influence of the Humanities, which aims to map the influence, meaning and value of the humanities in Denmark. His blog post on the Impact Blog about this project has already generated a lot of attention, even in the holiday season.

What struck me, however, is that the project starts by collecting data from three different sources:

  1. names and affiliations of active scientific experts in Denmark;
  2. by documenting the educational background and research profile of the population;
  3. by downloading and collecting the full corpus of reports, whitepapers, communications, policy briefs, press releases and other written results of Danish advisory groups from 2005 to 2015.

It was the third objective of Budtz Pedersen’s project that grabbed my attention: collecting the full corpus of reports, whitepapers, communications, policy briefs, press releases and other written results of Danish advisory groups from 2005 to 2015. And that in the country where Atira, the makers of Pure, reside (now wholly owned by Elsevier, I know). It struck a chord with me, since this is exactly what should have been done already. Assuming the influence of the humanities is a scholarly debate, all universities contributing to this debate should have a well-filled current research information system (CRIS) containing exactly those reports, whitepapers, communications, policy briefs, press releases et cetera.

In this post I want to concentrate on the collection side, assuming that all material collected in the CRIS is available online and free for the public at large to inspect, query and preferably, but not necessarily, free to download. Most CRIS have all kinds of database coupling possibilities with the major (scholarly) bibliographic databases: Web of Science, Scopus, PubMed, WorldCat, CrossRef etc. However, those reports, whitepapers, communications, policy briefs, press releases and other written results are not normally contained in these bibliographic databases. They are the so-called grey literature. Not formally published. Not formally indexed. Not easily discovered. Not easily found. Not easily collected.

To collect these materials we have to ask and beg researchers to dutifully add them manually to the university CRIS. That is exactly why universities have bought into CRIS systems, and why libraries are the ideal candidates to maintain them. The CRIS takes away the burden of keeping track of the formal publications through its coupling with the formal bibliographic databases. Librarians have knowledge of all these couplings and the search profiles required to make life easy for the researchers. That should leave researchers some of their valuable time to devote to those other, more esoteric materials. Especially in the humanities, where we apparently have more of this grey literature.

A well-maintained CRIS should have plenty of these materials registered. So I was slightly taken aback that this project in Denmark, the cradle of a major CRIS supplier, needs to collect these materials from scratch. They should have been registered a long time ago. That is where the value of a comprehensive CRIS that includes all output kicks in, resulting in a website with a comprehensive institutional bibliography.

Just a second thought. It is odd to see that two of the major providers of CRIS systems, Thomson Reuters with Converis and Elsevier with Pure, are both also providers of major news information sources, yet neither of these CRIS products is coupled with the proprietary news databases, Reuters or LexisNexis, for press clippings and mentions in the media. From a CRIS manager’s point of view this is a strange observation to make, since we are dealing with the same companies. But the internal company structures seem to hinder this kind of seemingly logical coupling of services.

 

Grey Literature at Wageningen UR, the Library, the Cloud(s) and Reporting

A while back I gave a presentation at the offices of SURF during a small-scale seminar on Grey Literature in the Netherlands. The occasion was the visit of Amanda Lawrence to SURF to discuss Grey Literature in the Netherlands.

I was invited to give a presentation on Grey Literature at Wageningen UR. The slides I used are shared in this Slideshare.

I always assume that slides tell their own story, but it is perhaps a good idea to provide some narrative in this blog post to explain certain parts that are less obvious. In the first slides I present the university and the research institutes at Wageningen. The student-staff ratios are so favourable because Wageningen UR comprises a university and a number of substantial research institutes that concentrate on research in the life sciences only and have no teaching obligations.

CRIS and repository

At the library we manage two closely intertwined systems for the whole organization. The first is the current research information system (CRIS), called Metis. In Metis we register all output of Wageningen UR faculty and staff. The data entry normally takes place at the chair group level. Most often the secretary of the chair group or business unit is responsible, and the library checks the quality of the data entry and maintains the various lists that facilitate data entry and quality control. Output registration in Metis is really comprehensive, since evaluations, the award of bonuses, prizes, promotions, research assessments and external peer reviews are all based on the metadata registered in Metis.

All information that is registered in the CRIS, Metis, is published in our institutional bibliography, called Staff Publications. I prefer the term institutional bibliography, since the term repository is often associated with Open Access repositories or (open access) institutional repositories only, whereas in my view the institutional bibliography is the comprehensive metadata collection for all output of the institution including, but not limited to, Open Access publications. It goes without saying that data sets are an integral part of the research output, and we are starting to register data sets in our systems as well.

The coupling of the CRIS and the institutional bibliography has existed only since 2003. Our bibliography therefore also holds a collection of 90,000 heritage metadata records of lesser quality. Of the 200,000+ items in our repository, 25% are available in Open Access. Looking at the peer reviewed journal articles registered in our Staff Publications (indicated as WaY in the graph), you can see that their number closely follows the number of articles that can be retrieved from either Scopus or Web of Science. There are differences in coverage between Web of Science and Scopus, but both databases seem to cover Wageningen UR output quite closely. Or not?

Comprehensive registration

In slide 6 I show all metadata registrations of publication output, reaching more than 12,500 items described for the publication year 2010. In that year the number of peer reviewed articles registered was only around 2,700. We registered nearly 10,000 other items of research output. In slide 7 I present an overview of the various document types registered on top of the peer reviewed publications. Most important are the “other” articles: those are articles published in trade or vocational journals. These very often have to do with the societal role the university and research institutes play. These articles are aimed at the larger public and are therefore very often available in Open Access as well. Book chapters and reports also account for very substantial numbers of publications. The reports are most often aimed at the various ministries for which the research institutes work, and are most often published as OA reports as well. With book chapters this is often not the case. On a yearly basis they are not so conspicuous, but the PhD theses are nearly all available in Open Access or, in a few cases, as delayed Open Access. The other items include presentations, brochures, lectures, patents, and interviews for newspapers, radio or TV. It is all registered. It all makes a very substantial addition to the peer reviewed publications.

Dissemination to the Cloud(s)

The institutional bibliography plays a crucial role in the dissemination of information to other parties. All metadata records are indexed in Google and, to a slightly lesser extent, in Google Scholar, but we experience problems with Google Scholar indexing the full text of our Open Access publications, since the full text files are located on a separate file system. All Dutch language publications are disseminated through Groenkennisnet.nl, a portal for education and practitioners in the green sector. Wageningen UR Staff Publications is fully OAI-PMH compatible, and data is disseminated to Narcis, the overarching repository of repositories in the Netherlands. Other repository aggregators include OAIster and BASE. The information is also harvested by the FAO, which plays a pivotal role in the dissemination of agricultural information in the world. All our PhD theses are disseminated to DART-Europe, the Electronic Theses and Dissertations (ETD) portal for Europe. With our retrospectively digitized collection of theses we are the 12th largest collection of PhD theses in Europe.
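To give an idea of what harvesting through OAI-PMH looks like for an aggregator, here is a minimal sketch in Python. It only builds a ListRecords request and parses a Dublin Core response. The endpoint `repository.example.org` and the sample record are made up for illustration and are not the actual Staff Publications address or data.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint (assumption); a real repository has its own base URL.
BASE_URL = "https://repository.example.org/oai"

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (identifier, title, type) for each record in a response."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.findall(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS)
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        doc_type = record.findtext(".//dc:type", default="", namespaces=NS)
        rows.append((identifier, title, doc_type))
    return rows


# Made-up response with a single record, so the sketch runs without a network.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example report</dc:title>
          <dc:type>report</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url(BASE_URL, from_date="2015-01-01"))
print(parse_records(SAMPLE))
```

A real harvester would also have to follow the resumption tokens that OAI-PMH uses for paging through large result sets, which this sketch leaves out.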

Open Access

The growth of Open Access publications is a steady one, although we occasionally face setbacks. Last year, for instance, we got claims from photographers whose images were used in trade journals for illustration purposes, where the IP rights for electronic dissemination had not been properly addressed. We have just passed the mark of 50,000 OA publications. When you look at all deposits of OA material in Dutch repositories, Wageningen UR stands out in depositing current material (slide 10), outperforming all of the other universities (slide 11). Looking at the document types of the recent material deposited, it is immediately apparent that Wageningen deposits relatively large numbers of reports and contributions to periodicals (the trade and vocational journals) and also deposits more conference papers as Open Access publications.

The deposit of green OA peer reviewed journal articles is not very successful. We don’t have an intuitive system in place for researchers to deposit their publications. Instead, the library systematically checks the publications to see what we are allowed to do with the publisher’s version of the article. In the first place we look at the DOAJ journal list, and actively load those articles into the repository. Secondly, we look at the Sherpa/Romeo list of publishers that allow delayed archiving of the publisher’s PDF. The third list, not truly OA, is the list of publishers that allow free-to-read access after an embargo period, to which we can link. A last resort could be to link to deposited material in PMC, but we haven’t done that yet. The first two steps lead to 23% of our peer reviewed journal articles being available in Open Access; steps 3 and 4 still need to be executed.
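The four checks described above form a simple cascade, which can be sketched in Python as follows. This is only an illustration of the order of the checks, not the library’s actual tooling. The journal and publisher names, the `Article` class and the returned labels are all hypothetical; in practice the lists would be derived from DOAJ, Sherpa/Romeo and publisher embargo policies.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    journal: str
    publisher: str
    in_pmc: bool = False  # whether a deposited copy exists in PMC


# Hypothetical lookup lists (assumptions for illustration only).
DOAJ_JOURNALS = {"Open Journal A"}
PUBLISHER_PDF_ALLOWED = {"Publisher B"}   # delayed archiving of publisher's PDF
FREE_AFTER_EMBARGO = {"Publisher C"}      # free to read after an embargo


def oa_route(article):
    """Return the first applicable route, checked in the order of the text."""
    if article.journal in DOAJ_JOURNALS:
        return "deposit_doaj"            # step 1: load into the repository
    if article.publisher in PUBLISHER_PDF_ALLOWED:
        return "deposit_publisher_pdf"   # step 2: archive publisher's PDF
    if article.publisher in FREE_AFTER_EMBARGO:
        return "link_after_embargo"      # step 3: link only, not truly OA
    if article.in_pmc:
        return "link_pmc"                # step 4: last resort
    return "none"


articles = [
    Article("Paper 1", "Open Journal A", "Publisher X"),
    Article("Paper 2", "Journal B", "Publisher B"),
    Article("Paper 3", "Journal C", "Publisher C"),
    Article("Paper 4", "Journal D", "Publisher D", in_pmc=True),
    Article("Paper 5", "Journal E", "Publisher E"),
]
for a in articles:
    print(a.title, "->", oa_route(a))
```

Because the checks run in a fixed order, an article in a DOAJ journal is always loaded into the repository, even if it would also qualify for one of the weaker linking options.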

Grey Literature

Why are we so successful in collecting the grey literature output? At the university, registration of output is ingrained in the system. We started at the university as early as 1975, and it took years before everybody complied. But faculty and staff are now quite used to doing this. Registration also leads to comprehensive reports on the publication activities of researchers and research groups. For the relatively recently introduced tenure track, the system calculates the research credits for the candidates. For staff we provide an attractive graphic overview of their publications with various bar charts and pie charts and their co-author network, but most important is a bibliometric report on the basis of articles published in journals covered in the Web of Science, benchmarked against the baselines from the Essential Science Indicators.

If all universities registered their publication output more comprehensively in their current research information systems, these outputs could then be made available through their repositories. In the example of the Dutch publication on Culicoides, we see that it concerns a report by researchers from Utrecht University, but this report is to be found neither in their OA repository (the publication is not scientific!?) nor in the catalogue of the university. If Narcis were made the official tool for reporting publication output in the Netherlands to the ministry of education in a transparent and verifiable way, publications like these would stand a chance of being collected, described and curated.

If the OA repository infrastructure in the Netherlands improves, Narcis can be turned into a link resolver service. A DOI could then be resolved not only to the publisher’s site, but also to Narcis, which points to an OA version of the same paper in the repository of one of the universities. In the case of public libraries in the Netherlands, we could configure a national link resolver that exposes OA material in addition to the Open Access material that Google Scholar already surfaces efficiently. This is important since not all repository content is discovered by Google Scholar.
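The link resolver idea can be sketched in a few lines of Python. This is a thought experiment, not an existing Narcis service: the index from DOI to repository copy, the example DOI and the repository URL are all invented for illustration. The only real element is the `https://doi.org/` prefix, which resolves a DOI to the publisher’s site.

```python
# Hypothetical index from DOI to an OA copy in a university repository,
# as an aggregator might expose it (assumption, made-up entries).
OA_INDEX = {
    "10.1234/example.1": "https://repository.example.org/record/1",
}


def resolve(doi, oa_index=OA_INDEX):
    """Return candidate links for a DOI, with an OA copy first if known."""
    doi = doi.strip().lower()  # DOIs are case-insensitive
    links = []
    oa_copy = oa_index.get(doi)
    if oa_copy:
        links.append(("open access copy", oa_copy))
    # The publisher's version is always offered through the DOI resolver.
    links.append(("publisher", "https://doi.org/" + doi))
    return links


print(resolve("10.1234/EXAMPLE.1"))
print(resolve("10.1234/example.2"))
```

The design choice here is that the reader is always offered the publisher’s link, and an OA copy is placed in front of it whenever the aggregator knows of one.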

Knowledge Economy

With regard to a new knowledge economy, an important report was published quite recently (WRR, 2013). However, the report did not mention libraries, did not mention repositories, did not mention grey literature. So there is still a world to win for comprehensive institutional repositories that collect and disseminate all the grey literature that is openly available.

References
WRR. 2013. Naar een lerende economie : Investeren in het verdienvermogen van Nederland. WRR report Vol. 90. Amsterdam: Amsterdam University Press. 440 pp. http://www.wrr.nl/publicaties/publicatie/article/naar-een-lerende-economie-1/