Article Processing Charges (APCs) of Gold Open Access journals are very often deeply hidden in journal websites. Sometimes they aren’t even stated on the journal website, e.g. “For inquiries relating to the publication fee of articles, please contact the editorial office”. The lack of good overviews hinders research into APCs across different publishers and journals. To my knowledge only the Eigenfactor APC overview provides a reasonable amount of information, but it is already getting outdated. The DOAJ used to have at least a list of free journals, but that is currently no longer available due to the restructuring of DOAJ. For this reason I have made a small start on collecting the article processing charges of some major Open Access publishers. I invite anybody to add more journals from any Open Access publisher. Most interesting, of course, is the price information for journals listed in Web of Science or Scopus. Please inform others and help to complete this list. Anybody with the link can edit the file.
Book review: Devine, J., & Egger-Sider, F. (2014). Going beyond Google again : strategies for using and teaching the Invisible Web. Chicago: Neal-Schuman, an imprint of the American Library Association. ISBN 9781555708986, 180p.
The invisible web, as we know it, dates back to at least 2001. In that year both Sherman & Price (2001) and Bergman (2001) came out with studies describing the whole issue surrounding the deep, or invisible, web for the first time. These two seminal studies each used a different term, invisible and deep, to indicate the same concept, but both showed convincingly and independently of each other that there was more information available than ordinary search engines can see.
Later on Lewandowski & Mayr (2006) showed that Bergman perhaps overstated the size of the actual problem, but it certainly remained a problem for those unaware of the whole issue. Ford & Mansourian (2006) added the concept of “cognitive invisibility”, i.e. everything beyond page 1 of the Google results. Since then very little has happened in the research on this problem in the search or information retrieval community. The notion of the “deep web” has continued to receive some interest in the computer sciences, where query expansion and data mining are studied as ways to alleviate the problems. But ground breaking scientific studies on this subject in the area of information retrieval or LIS have been scanty.
The authors of the current book Devine and Egger-Sider have been involved with the invisible web already since 2004 (Devine & Egger-Sider, 2004; Devine & Egger-Sider, 2009). Their main concern is to get the concept of the invisible web in the curriculum for information literacy. The current book documents a major survey in this area. For the purpose of getting the invisible web in the information literacy curriculum they maintain a useful website with invisible web discovery tools.
The current book is largely a repetition of their previous book (Devine & Egger-Sider, 2009). However, two major additions have been made to the notion of the invisible web: Web 2.0, or the social web, and the mobile or apps web. The first concept I was aware of, and I have used it in classes for information professionals in the Netherlands for quite a long time already. The second concept was an eye opener for me. I did realize that search on mobile devices was different, more personalized than anything else, but I had not categorized it as a part of the invisible web.
Where Devine and Egger-Sider (2014) disappoint is that the proposed solutions, curricula etc, only address the invisible as a database problem. Identify the right databases and perform your searches. Make students and scholars aware of the problem, guide them to the additional resources and the problem is solved. However, no solution whatsoever, is provided to solve the information gap due to the social web or the mobile web. On this part the book does not add anything to the version from 2009.
Another notion of the ever increasing invisible web as we know it concerns grey literature. Scholarly output in the form of peer reviewed articles or books is reasonably well covered by (web) search engines and library subscribed A&I databases, but retrieving the grey literature still remains a major problem. The whole notion of grey literature is mentioned in this book. Despite the concern about the invisible or deep web, they also fail to stress the advantages that full scale web search engines have brought. Previously we only had the indexed bibliographic information to search, whereas web search engines brought us full text search. Full text search, while not being superior, has brought us new opportunities and sometimes improved retrieval as well.
The book is not entirely up to date. The majority of the references run up to 2011; only a few references from 2012, let alone 2013, are included. Apparently the book took a long time to write and produce. But what is really lacking is a suitable accompanying website. A short list of the many URLs provided in the book would probably have been helpful to many readers. For the time being we have to make do with their older webpage, which is less comprehensive than the complete collection of sources mentioned in this edition.
Where the book completely fails is the inclusion of the darknet. Since Wikileaks and Snowden we should be aware that even more is going on in the invisible web than ever before. Devine & Egger-Sider only mention the darknet or dark web as an area they will not treat. This is slightly disappointing.
If you already have the 2009 edition of this book, there is no need to upgrade to the current version.
Bergman, M.K. (2001). White Paper: The Deep Web: Surfacing Hidden Value. The Journal of Electronic Publishing, 7(1). http://dx.doi.org/10.3998/3336451.0007.104
Devine, J., & Egger-Sider, F. (2004). Beyond Google : The invisible Web in the academic library. The Journal of Academic Librarianship, 30(4), 265-269. http://dx.doi.org/10.1016/j.acalib.2004.04.010
Devine, J., & Egger-Sider, F. (2009). Going beyond Google : the invisible web in learning and teaching. London: Facet Publishing. 156p.
Devine, J., & Egger-Sider, F. (2014). Going beyond Google again : strategies for using and teaching the Invisible Web. Chicago: Neal-Schuman, an imprint of the American Library Association. 180p.
Lewandowski, D., & Mayr, P. (2006). Exploring the academic invisible web. Library Hi Tech, 24(4), 529-539. http://dx.doi.org/10.1108/07378830610715392 OA version: http://eprints.rclis.org/9203/
Sherman, C., & Price, G. (2001). The invisible web: Discovering information sources search engines can’t see. Medford NJ, USA: Information today. 439p.
Ford, N., & Mansourian, Y. (2006). The invisible web: An empirical study of “cognitive invisibility”. Journal of Documentation, 62(5), 584-596. http://dx.doi.org/10.1108/00220410610688732
Other reviews for this book
Malone, A. (2014). Going Beyond Google Again: Strategies for Using and Teaching the Invisible Web, Jane Devine, Francine Egger-Sider. Neal-Schuman, Chicago (2014), ISBN: 978-1-55570-898-6. The Journal of Academic Librarianship, 40(3–4), 421. http://dx.doi.org/10.1016/j.acalib.2014.03.006
Mason, D. (2014). Going Beyond Google Again: Strategies for Using and Teaching the Invisible Web. Online Information Review, 38(7), 992-993. http://dx.doi.org/10.1108/OIR-10-2014-0228
Stenis, P. (2014). Going Beyond Google Again: Strategies for Using and Teaching the Invisible Web. Reference & User Services Quarterly, 53(4), 367-367. http://dx.doi.org/10.5860/rusq.53n4.367a
Sweeper, D. (2014). A Review of “Going Beyond Google Again: Strategies for Using and Teaching the Invisible Web”. Journal of Electronic Resources Librarianship, 26(2), 154-155. http://dx.doi.org/10.1080/1941126x.2014.910415
A while back I gave a presentation at the offices of SURF during a small scale seminar on Grey Literature in the Netherlands. The occasion was the visit of Amanda Lawrence to SURF to discuss Grey Literature in the Netherlands.
I was invited to give a presentation of Grey Literature at Wageningen UR. The slides I used are shared in this Slideshare.
I always assume that slides tell their story themselves, but it is perhaps a good idea to provide some narrative in this blog post to explain certain parts that are less obvious. In the first slides I present the university and the research institutes at Wageningen. The student staff ratios are so favourable because Wageningen UR comprises a university and a number of substantial research institutes that concentrate on research in the life sciences only and have no teaching obligations.
CRIS and repository
At the library we manage two systems for the whole organization that are closely intertwined. The first is the current research information system (CRIS), called Metis. In Metis we register all output of Wageningen UR faculty and staff. The data entry normally takes place at the chair group level. Most often the secretary of the chair group or business unit is responsible, and the library checks the quality of the data entry and maintains the various lists that facilitate data entry and quality control. Output registration in Metis is really comprehensive, since evaluations, awards of bonuses, prizes, promotions, research assessments and external peer reviews are all based on the metadata registered in Metis.
All information that is registered in the CRIS, Metis, is published in our institutional bibliography called Staff Publications. I prefer the term institutional bibliography since the term repository is often associated with Open Access repositories or (open access) institutional repositories only. Whereas in my view the institutional bibliography is the comprehensive metadata collection for all output of the institution including, but not limited to, Open Access publications. It goes without saying that data sets are an integral part of the research output, and we are starting to register datasets in our systems as well.
The coupling of the CRIS and the institutional bibliography has existed only since 2003. We have in our bibliography a collection of 90,000 heritage metadata records of lesser quality. Of the 200,000+ items in our repository 25% are available in Open Access. Looking at the peer reviewed journal articles registered in our Staff Publications (indicated as WaY in the graph) you can see that their number closely follows the number of articles that can be retrieved from either Scopus or Web of Science. There are differences in coverage between Web of Science and Scopus, but both databases seem to cover Wageningen UR output quite closely. Or not?
In slide 6 I show all metadata registrations of publication output, reaching more than 12,500 items described for the publication year 2010. In that year the number of peer reviewed articles registered was only around 2700. We registered nearly 10,000 other items of research output. In slide 7 I present an overview of the various document types registered on top of the peer reviewed publications. Most important are the “other” articles, those are articles published in trade or vocational journals. These very often have to do with the societal role the university and research institutes play. These articles are aimed at the larger public and are therefore very often in Open Access as well. Book chapters and reports also account for very substantial numbers of publications. The reports are most often aimed at the various ministries for which the research institutes work and are most often published as OA reports as well. With book chapters this is often not the case. On a yearly basis they are not so conspicuous, but the PhD theses are nearly all available in Open Access or, in a few cases, as delayed Open Access. The other items include presentations, brochures, lectures, patents, and interviews for newspapers, radio or TV. It is all registered. Together it makes a very substantial addition to the peer reviewed publications.
Dissemination to the Cloud(s)
The institutional bibliography plays a crucial role in the dissemination of information to other parties. All metadata records are indexed in Google and, slightly less completely, in Google Scholar, but we experience problems with Google Scholar indexing the full text of our Open Access publications, since the full text files are located on a separate file system. All Dutch language publications are disseminated through Groenkennisnet.nl, a portal for education and practitioners in the green sector. Wageningen UR Staff Publications is fully OAI-PMH compatible and data is disseminated to Narcis, the overarching repository of repositories in the Netherlands. Other repository aggregators include OAIster and BASE. The information is also harvested by the FAO, which plays a pivotal role in the dissemination of agricultural information in the world. All our PhD theses are disseminated to DART-Europe, the Electronic Theses and Dissertations (ETD) portal for Europe. With our retrospectively digitized collection of theses we are the 12th largest collection of PhD theses in Europe.
The growth of Open Access publications is a steady one, although we occasionally face setbacks. Last year for instance we got claims from photographers whose images were used in trade journals for illustration purposes, where the IP rights for electronic dissemination had not been properly addressed. We have just passed the mark of 50,000 OA publications. When you look at all deposits of OA material in Dutch repositories, Wageningen UR stands out in depositing current material (slide 10), outperforming all of the other universities (slide 11). Looking at the document types of the recent material deposited, it is immediately apparent that Wageningen deposits relatively large numbers of reports and contributions to periodicals (the trade and vocational journals) and also deposits more conference papers as Open Access publications.
The deposition of green OA peer reviewed journal articles is not very successful. We don’t have an intuitive system in place for researchers to deposit their publications. The library systematically checks the publications and sees what we are allowed to do with the publisher’s version of the article. In the first place we look at the DOAJ journal list, and actively load those articles into the repository. Secondly we look at the Sherpa/Romeo list of publishers allowing the delayed archiving of the publisher’s PDF. The third list, not truly OA, is the list of publishers allowing free to read access after an embargo period, to which we link. A last resort could be to link to deposited material in PMC, but we haven’t done that yet. The first two steps lead to 23% of our peer reviewed journal articles being available in Open Access; steps 3 and 4 still need to be executed.
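To make the OAI-PMH part a bit more concrete: a harvester such as Narcis or BASE essentially sends ListRecords requests to the repository. Below is a minimal sketch in Python of how such a request URL is built. The verb, metadataPrefix and resumptionToken arguments come from the OAI-PMH protocol itself; the base URL is a made-up placeholder and not the actual endpoint of Staff Publications.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url is a hypothetical repository endpoint. When a resumption
    token is given, the protocol requires it to be the only argument
    besides the verb.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


# Placeholder endpoint, for illustration only.
first_page = list_records_url("https://repository.example.org/oai")
next_page = list_records_url("https://repository.example.org/oai",
                             resumption_token="abc123")
```

The harvester keeps following resumption tokens until the repository returns no further token.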
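The four steps above form a simple decision cascade. The sketch below illustrates it in Python; it is not how our systems are actually implemented. The Article fields, the function name and the three lookup sets are all hypothetical, and in practice the lists would have to be derived from DOAJ, Sherpa/Romeo and the publishers’ embargo policies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Article:
    doi: str
    issn: str
    publisher: str
    pmc_id: Optional[str] = None  # set when a deposit in PMC is known


def oa_action(article, doaj_issns, pdf_archiving_publishers,
              free_after_embargo_publishers):
    """Return the first applicable step of the cascade for one article."""
    if article.issn in doaj_issns:
        return "load publisher PDF into repository"           # step 1
    if article.publisher in pdf_archiving_publishers:
        return "archive publisher PDF after embargo"          # step 2
    if article.publisher in free_after_embargo_publishers:
        return "link to free-to-read version at publisher"    # step 3
    if article.pmc_id:
        return "link to deposit in PMC"                       # step 4
    return "metadata only"


# Illustrative values only; none of these are real ISSNs or publishers.
doaj = {"1111-1111"}
pdf_ok = {"Publisher A"}
embargo_free = {"Publisher B"}

a1 = Article("10.1000/a1", "1111-1111", "Publisher C")
a2 = Article("10.1000/a2", "2222-2222", "Publisher B")
a3 = Article("10.1000/a3", "3333-3333", "Publisher C", pmc_id="PMC0000000")
```

Because the steps are checked in order, an article in a DOAJ journal is always loaded directly, even if its publisher also appears on one of the other lists.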
Why are we so successful in collecting the grey literature output? At the university, registration of output is ingrained in the system. We started at the university as early as 1975 and it took years before everybody complied. But faculty and staff are now quite used to doing this. Registration also leads to comprehensive reports on the publication activities of researchers and research groups. For the relatively recently introduced tenure track, the system calculates the research credits for the candidates. For staff we provide an attractive graphic overview of their publications with various bar charts and pie charts and their co-author network, but most important is a bibliometric report on the basis of articles published in journals covered in the Web of Science, benchmarked against the baselines from the Essential Science Indicators.
If all universities register their publication output more comprehensively in their current research information systems, these outputs can then be made available through their repositories. In the example of the publication in Dutch on Culicoides, we see that it concerns a report by researchers from Utrecht University, but this report is not to be found in their OA repository (The publication is not scientific!?) nor in the catalogue of the university. If Narcis were made the official tool for reporting publication output in the Netherlands to the ministry of education in a transparent and verifiable way, publications like these would stand a chance of being collected, described and curated.
If the OA repository infrastructure in the Netherlands improves, Narcis can be turned into a link resolver service. Using the DOI, we could resolve not only to the publisher’s site but also to Narcis, which points to an OA version of the same paper at a repository of one of the universities. In the case of public libraries in the Netherlands, we could configure a national link resolver that exposes OA material in addition to the efficient Google Scholar Open Access material. This is important since not all repository content is discovered in Google Scholar.
With regard to a new knowledge economy, an important report was published quite recently. However, the report did not mention libraries, did not mention repositories, did not mention grey literature. So there is still a world to win for comprehensive institutional repositories that collect and disseminate all the grey literature that is openly available.
“some would argue that digital libraries have very little to do with libraries as institutions or the practice of librarianship”
(Lynch, 2005). But this new book by Karen Calhoun has removed all my reservations against the term digital libraries, and built the bridge from digital library research to practical librarianship.
First of all, Calhoun has written an excellent book. Go, buy, read and learn from it. For anybody working in today’s academic library settings, a must read. Calhoun elegantly makes the connection between the digital library scientists that started in the previous century and the last decade, to the current questions we are dealing with in the academic library setting.
Calhoun describes the context around the usual pathways, from link resolvers, to the metalib solutions ending with the current options of discovery tools. But those off the shelf solutions are not too exciting.
Where I liked the book the most, and learned a lot, was in the chapters on the repository. Those are insightful chapters, although I didn’t always agree with Calhoun’s views. Calhoun and I probably agree on the fact that repositories are the most challenging areas for academic libraries to be active in. Calhoun did not address the fact that this has resulted in an enormous change in workflow. In the classical library catalogue we only dealt with monographs and journals. In repositories we are dealing with more granular items such as book chapters, contributions to proceedings, articles and posters. That is not only a change from paper to digital, but also a completely different level of metadata description. Those are changes that we are still struggling to come to grips with, as I see in everyday practice.
A shortcoming of the book is that Calhoun equates repositories with open access repositories. That is a misnomer to my mind. It is perhaps more typical of the European setting that most academic libraries get involved in current research information systems (CRIS). These CRISs form an essential part of the university digital infrastructure and feed a comprehensive institutional repository. The repository thus becomes far more than only a collection of OA items. Dear Karen, have a look at our repository. More than 200,000 items collected, of which 50,000 are available in Open Access. But more important, next to the 55,000 peer reviewed articles we have nearly 35,000 articles in professional or trade journals that boost our societal impact. We also have 27,000+ reports and nearly 18,000 abstracts and conference contributions. Institutional repositories to my mind should be more than Open Access repositories of peer reviewed journal articles alone. The institutional repository plays an important role in disseminating all kinds of “grey” literature output. Calhoun could probably learn more from the changing European landscape, where CRIS and repositories are growing towards each other and as a result a completely new library role arises, when libraries can get a role in the management of the CRIS. But that is a natural match. Or should be.
What Calhoun made me realize is that we have a unique proposition in Wageningen. Our catalogue is comprehensively indexed in Google and nearly as well in Google Scholar. The indexing of our repository goes well in Google, but we are still struggling to get its contents into Google Scholar. We have a project under way to correct this. But there is no success guaranteed, since Google Scholar is completely different from Google. No ordinary SEO expert has experience with these matters. Still, being indexed in both Google and Google Scholar is a valuable asset. With our change to WorldCat Local we have something to lose. We should tread carefully in this area.
Where I learned a lot from Calhoun is in those chapters I normally don’t care too much about: the social roles of digital libraries and digital library communities. These are areas, and literature, I normally tend to neglect, but the overview presented by Calhoun really convinced me to solicit more buy-in for our new developments. We are preparing for our first centennial (in 2018) and running a project to collect and digitize all our official academic output. Where will we present the results? In our comprehensive institutional bibliography! Of course. Not an easy task, but we are building our own, unique, digital library.
Disclaimer: I don’t have an MLIS, but I have been working with a lot of pleasure for nearly 15 years at Wageningen UR library, where I work in the area of research support.
Calhoun, K. 2014. Exploring digital libraries : Foundations, practices, prospects. Chicago: Neal-Schuman. 322p.
Lynch, C. 2005. Where do we go from here? The next decade for digital libraries. D-Lib Magazine, 11(7/8) http://www.dlib.org/dlib/july05/lynch/07lynch.html
Narcis is the overarching repository of (Open Access) repositories in the Netherlands. The website was entirely refreshed last week. It got a fresh, modern look. This new look was badly needed.
What did not change was the underlying database and the quality of the data. That is a really missed opportunity. Changing the paint where repairing the woodwork is really needed is actually a waste of time and money.
Of course Narcis can’t repair its framework without the co-operation of the underlying repositories. With at least all universities buying into better Current Research Information Systems (CRIS), this is the moment to prepare Narcis for the future.
I have pleaded on this blog before to make Narcis the comprehensive metadata aggregator for all scholarly output in the Netherlands. Not only Open Access (OA) publications. But the comprehensive university output. The numbers for the official VSNU reports on scholarly productivity should be based on Narcis, and all metadata underlying those reports should become verifiable in Narcis. This improves the transparency of reporting and transparency of the generated reports. Then, it should go without saying that meaningful reports of the status of Open Access in the Netherlands, as requested by the minister of education, should be generated on the basis of Narcis.
Narcis should seriously work on the deduplication of all information. Currently many metadata descriptions of the same publication are reported separately by different universities, leading to over-reporting of actual figures. Based on the estimates of national co-publication, an over-reporting of at least 20% is currently expected. Narcis should merge those records and offer link-outs to all repositories contributing the metadata. This deduplication can be greatly improved if they also make better use of standard identifiers such as the Digital Object Identifier (DOI). Currently the DOI is not part of the metadata exchange protocol, and this is of course a serious omission.
Narcis should take up the role of metadata exchange platform. For example, if Groningen and Wageningen share a co-publication and an OA version is available in Groningen, there should be a service that Wageningen can use to check and harvest that OA version as well, and thus safeguard the item on the basis of the Lots of Copies Keeps Stuff Safe (LOCKSS) principle. The same goes for the exchange of Digital Author Identifiers (DAI). If Utrecht has registered a DAI for one of its authors in a co-publication with Wageningen, we should be able to resolve that DAI through Narcis and complete the metadata in our systems, starting with the CRIS of course, by harvesting the DAI for the non-Wageningen authors from Narcis.
Narcis as a link resolver. It shouldn’t be too difficult to change Narcis into a link resolver to find OA versions of Toll Access articles. Exchange of the DOI would help of course, since you want to resolve at the article level and not at the journal level as is done in the current link resolvers. The benefits to the Dutch public would be great and the relevance of the individual repositories would increase.
Narcis got a new colour and letter type. It looks really nice now, but I look forward to bold steps in the direction of improving the database, making the database an essential part of the Dutch repository infrastructure and boosting the importance and relevance of the institutional repositories.
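To illustrate how much the DOI would simplify deduplication, here is a minimal sketch in Python. It assumes a hypothetical record format (a dict with title, repository and an optional doi) and does not describe how Narcis actually stores its data. Records that share a DOI are merged into one entry that keeps a link-out to every contributing repository; records without a DOI are set aside, because they would need fuzzier matching on title and authors.

```python
# Common ways a DOI is written in metadata; DOIs themselves are case-insensitive.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw):
    """Lowercase a DOI and strip a resolver or 'doi:' prefix if present."""
    doi = raw.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def merge_by_doi(records):
    """Merge records sharing a DOI; return (merged entries, records without DOI)."""
    merged = {}
    without_doi = []
    for rec in records:
        raw = rec.get("doi")
        if not raw:
            without_doi.append(rec)
            continue
        key = normalize_doi(raw)
        entry = merged.setdefault(
            key, {"doi": key, "title": rec["title"], "repositories": []}
        )
        if rec["repository"] not in entry["repositories"]:
            entry["repositories"].append(rec["repository"])
    return list(merged.values()), without_doi


# Made-up example: one co-publication reported by two universities.
records = [
    {"title": "A co-publication", "repository": "Groningen",
     "doi": "http://dx.doi.org/10.1000/XYZ123"},
    {"title": "A co-publication", "repository": "Wageningen",
     "doi": "doi:10.1000/xyz123"},
    {"title": "A report without DOI", "repository": "Utrecht"},
]
merged, without_doi = merge_by_doi(records)
```

In this example the two university records collapse into a single entry with two link-outs, so the co-publication is counted once instead of twice.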
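An article-level resolver along these lines could be sketched as follows in Python. The OA index and the repository URLs are hypothetical; the only real service used is the doi.org resolver, which serves as the fallback when no OA copy is known.

```python
def resolve(doi, oa_index):
    """Resolve a DOI to an OA copy when one is known, else to the publisher.

    oa_index is a hypothetical mapping from lowercase DOI to a list of
    repository URLs holding an Open Access version of the article.
    """
    key = doi.strip().lower()
    oa_urls = oa_index.get(key, [])
    if oa_urls:
        return {"target": oa_urls[0], "open_access": True}
    # No OA copy known: hand over to the DOI resolver (publisher site).
    return {"target": "https://doi.org/" + key, "open_access": False}


# Made-up index entry, for illustration only.
oa_index = {
    "10.1000/xyz123": ["https://repository.example.org/record/42"],
}
hit = resolve("10.1000/XYZ123", oa_index)
miss = resolve("10.1000/nothere", oa_index)
```

Because the lookup key is the DOI, the resolver works at the level of the individual article, which a journal-level knowledge base cannot do.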