The invisible web is still there, and it is probably larger than ever

Book review: Devine, J., & Egger-Sider, F. (2014). Going beyond Google again : strategies for using and teaching the Invisible Web. Chicago: Neal-Schuman, an imprint of the American Library Association. ISBN 9781555708986, 180p.

Going Beyond Google Again: Strategies for Using and Teaching the Invisible Web

The invisible web, as we know it, dates back to at least 2001. In that year Sherman & Price (2001) and Bergman (2001) each published a study that described the issues surrounding the deep, or invisible, web for the first time. These two seminal studies used different terms for the same concept, invisible and deep, but both showed convincingly, and independently of each other, that there was more information available than ordinary search engines can see.

Later on, Lewandowski & Mayr (2006) showed that Bergman had perhaps overstated the size of the actual problem, but it certainly remained a problem for those unaware of the issue. Ford & Mansourian (2006) added the concept of “cognitive invisibility”, i.e. everything beyond page 1 of the Google results. Since then very little has happened in research on this problem in the search or information retrieval community. The notion of the “deep web” has continued to receive some interest in the computer sciences, where researchers look into query expansion and data mining to alleviate the problems. But groundbreaking scientific studies on this subject in the area of information retrieval or LIS have been scarce.

The authors of the current book, Devine and Egger-Sider, have been working on the invisible web since 2004 (Devine & Egger-Sider, 2004; Devine & Egger-Sider, 2009). Their main concern is to get the concept of the invisible web into the information literacy curriculum. The current book documents a major survey in this area. To support that goal they maintain a useful website with invisible web discovery tools.

The current book is largely a repetition of their previous book (Devine & Egger-Sider, 2009). However, two major additions to the notion of the invisible web have been made: Web 2.0, or the social web, and the mobile, or apps, web. I was already aware of the first concept and have used it for quite a long time in classes for information professionals in the Netherlands. The second concept was an eye-opener for me. I did realize that search on mobile devices was different, more personalized than anything else, but I had not categorized it as part of the invisible web.

Where Devine and Egger-Sider (2014) disappoint is that the proposed solutions, curricula, etc. only address the invisible web as a database problem: identify the right databases and perform your searches. Make students and scholars aware of the problem, guide them to the additional resources, and the problem is solved. However, no solution whatsoever is provided for the information gap caused by the social web or the mobile web. On this point the book does not add anything to the 2009 version.

Another aspect of the ever-increasing invisible web as we know it concerns grey literature. Scholarly output in the form of peer-reviewed articles or books is reasonably well covered by (web) search engines and library-subscribed A&I databases, but retrieving grey literature remains a major problem. The whole notion of grey literature is mentioned in this book. Despite their concern about the invisible or deep web, the authors also fail to stress the advantages that full-scale web search engines have brought. Previously we only had indexed bibliographic information to search, whereas web search engines brought us full-text search. Full-text search, while not superior, has brought us new opportunities and sometimes improved retrieval as well.

The book is not entirely up to date. Most of the references run up to 2011; only a few from 2012, let alone 2013, are included. Apparently the book took a long time to write and produce. But what is really lacking is a suitable accompanying website. A short list of the many URLs provided in the book would probably have helped many readers. For the time being we have to make do with their older webpage, which is less comprehensive than the complete collection of sources mentioned in this edition.

Where the book completely fails is in its coverage of the darknet. Since Wikileaks and Snowden we should be aware that even more is going on in the invisible web than ever before. Devine and Egger-Sider mention the darknet, or dark web, only as an area they will not treat. This is slightly disappointing.

If you already have the 2009 edition of this book, there is no need to upgrade to the current version.

References
Bergman, M.K. (2001). White Paper: The Deep Web: Surfacing Hidden Value. The Journal of Electronic Publishing, 7(1). http://dx.doi.org/10.3998/3336451.0007.104
Devine, J., & Egger-Sider, F. (2004). Beyond Google : The invisible Web in the academic library. The Journal of Academic Librarianship, 30(4), 265-269. http://dx.doi.org/10.1016/j.acalib.2004.04.010
Devine, J., & Egger-Sider, F. (2009). Going beyond Google : the invisible web in learning and teaching. London: Facet Publishing. 156p.
Devine, J., & Egger-Sider, F. (2014). Going beyond Google again : strategies for using and teaching the Invisible Web. Chicago: Neal-Schuman, an imprint of the American Library Association. 180p.
Lewandowski, D., & Mayr, P. (2006). Exploring the academic invisible web. Library Hi Tech, 24(4), 529-539. http://dx.doi.org/10.1108/07378830610715392 OA version: http://eprints.rclis.org/9203/
Sherman, C., & Price, G. (2001). The invisible web: Discovering information sources search engines can’t see. Medford NJ, USA: Information today. 439p.
Ford, N., & Mansourian, Y. (2006). The invisible web: An empirical study of “cognitive invisibility”. Journal of Documentation, 62(5), 584-596. http://dx.doi.org/10.1108/00220410610688732

Other reviews for this book
Malone, A. (2014). Going Beyond Google Again: Strategies for Using and Teaching the Invisible Web, Jane Devine, Francine Egger-Sider. Neal-Schuman, Chicago (2014), ISBN: 978-1-55570-898-6. The Journal of Academic Librarianship, 40(3–4), 421. http://dx.doi.org/10.1016/j.acalib.2014.03.006
Mason, D. (2014). Going Beyond Google Again: Strategies for Using and Teaching the Invisible Web. Online Information Review, 38(7), 992-993. http://dx.doi.org/10.1108/OIR-10-2014-0228
Stenis, P. (2014). Going Beyond Google Again: Strategies for Using and Teaching the Invisible Web. Reference & User Services Quarterly, 53(4), 367-367. http://dx.doi.org/10.5860/rusq.53n4.367a
Sweeper, D. (2014). A Review of “Going Beyond Google Again: Strategies for Using and Teaching the Invisible Web”. Journal of Electronic Resources Librarianship, 26(2), 154-155. http://dx.doi.org/10.1080/1941126x.2014.910415

Karen Calhoun on digital libraries

Review of : Calhoun, K. 2014. Exploring digital libraries : Foundations, practices, prospects. Chicago: Neal-Schuman. 322p.

As a library practitioner I am always a bit wary of the term digital libraries. I have had sincere doubts about the role of library practitioners in digital libraries:

“some would argue that digital libraries have very little to do with libraries as institutions or the practice of librarianship”

(Lynch, 2005). But this new book by Karen Calhoun has removed all my reservations about the term digital libraries, and has built the bridge from digital library research to practical librarianship.

First of all, Calhoun has written an excellent book. Go, buy, read and learn from it. For anybody working in today’s academic library settings, it is a must-read. Calhoun elegantly connects the work of the digital library scientists, which started in the previous century and continued through the last decade, to the questions we are currently dealing with in the academic library setting.

Calhoun describes the context around the usual pathways, from link resolvers to the metalib solutions, ending with the current options of discovery tools. But those off-the-shelf solutions are not too exciting.

Where I liked the book the most, and learned a lot, was in the chapters on the repository. Those are insightful chapters, although I didn’t always agree with Calhoun’s views. Calhoun and I probably agree that repositories are the most challenging area for academic libraries to be active in. Calhoun did not address the fact that this has resulted in an enormous change in workflow. In the classical library catalogue we only dealt with monographs and journals. In repositories we are dealing with more granular items such as book chapters, contributions to proceedings, articles and posters. That is not only a change from paper to digital, but also a completely different level of metadata description. These are changes that we are still struggling to come to grips with, as I see in everyday practice.

A shortcoming of the book is that Calhoun equates repositories with open access repositories. That is a misnomer to my mind. It is perhaps more typical of the European setting that most academic libraries get involved in current research information systems (CRIS). These CRISs form an essential part of the university's digital infrastructure and feed a comprehensive institutional repository. The repository thus becomes far more than only a collection of OA items. Dear Karen, have a look at our repository. More than 200,000 items collected, of which 50,000 are available in Open Access. But more important, next to the 55,000 peer-reviewed articles we have nearly 35,000 articles in professional or trade journals that boost our societal impact. We also have 27,000+ reports and nearly 18,000 abstracts and conference contributions. Institutional repositories, to my mind, should be more than Open Access repositories of peer-reviewed journal articles alone. The institutional repository plays an important role in disseminating all kinds of “grey” literature output. Calhoun could probably learn more from the changing European landscape, where CRIS and repositories are growing towards each other and, as a result, a completely new library role arises when libraries get a role in the management of the CRIS. But that is a natural match. Or should be.

What Calhoun made me realize is that we have a unique proposition in Wageningen. Our catalogue is comprehensively indexed in Google and nearly as well in Google Scholar. The indexing of our repository goes well in Google, but we are still struggling to get its contents into Google Scholar. We have a project under way to correct this, but success is not guaranteed, since Google Scholar is completely different from Google. No ordinary SEO expert has experience with these matters. Still, being indexed in both Google and Google Scholar is a valuable asset. With our change to WorldCat Local we have something to lose. We should tread carefully in this area.

Where I learned a lot from Calhoun was in those chapters I normally don’t care too much about: the social roles of digital libraries and digital library communities. These are areas, and a literature, that I normally tend to neglect, but the overview presented by Calhoun really convinced me to solicit more buy-in for our new developments. We are preparing for our first centennial (in 2018) and running a project to collect and digitize all our official academic output. Where will we present the results? In our comprehensive institutional bibliography, of course! Not an easy task, but we are building our own, unique, digital library.

Disclaimer: I don’t have an MLIS, but I have worked for nearly 15 years, with a lot of pleasure, at Wageningen UR library, where I work in the area of research support.

References
Calhoun, K. 2014. Exploring digital libraries : Foundations, practices, prospects. Chicago: Neal-Schuman. 322p.
Lynch, C. 2005. Where do we go from here? The next decade for digital libraries. D-Lib Magazine, 11(7/8) http://www.dlib.org/dlib/july05/lynch/07lynch.html