I sincerely apologize for this boring video; a few talking heads is not the right medium for getting a message across, and this is an important message. But I couldn’t find any palatable alternative on YouTube. Has nobody tried to make an attractive short film on this subject? Anyway, here are a couple of big shots from the Dutch university world explaining the importance of Open Access. They speak Dutch, but this version has English subtitles.
Archive for the 'Open Access' Category

This morning I had to look up the citations to an article. It did not show up in WoS immediately, so I had to look around a bit to trace its exact details. I found the article as an open access article on HighWire. No problem.
However, I was struck by the extensive and confusing copyright statements at the top of the abstract. The first line has the classic copyright sign ©, which to me indicates “all rights reserved”, in this case to the CBS Fungal Biodiversity Centre. But the all-rights-reserved sign was followed immediately by their own wording of a Creative Commons license, CC 3.0 in this case.
I was a little bemused by the third clause: “You may not alter, transform, or build upon this work”. Isn’t that what science is all about? Building on previous work?
Another annoying fact is that the DOI is not working. But this is the link to the abstract, and there are plenty of similar examples to be found in “Studies in Mycology”.
The last couple of days I had the pleasure of attending the Elsevier Development Partners meeting. The exact products they are working on might be of interest to some people, but that is up to Elsevier to announce. What was really the big surprise at this meeting, which lasted 3 days, was the tone from Elsevier. It was all about open science. They clearly wanted to open up. There was a lot of talk about sharing information, making mash-ups possible and application programming interfaces (APIs). Elsevier Science wanted to move away from the double-barred information silo and become an open solution provider in the scholarly world. If Elsevier is thinking and acting in this direction, then change will become a major issue for the entire scientific publishing industry, and that is good news for libraries that want to remain a vital service in the future as well.
This change will take time. It doesn’t happen overnight. But Raphael Sidi announced the Elsevier Article API on ProgrammableWeb on his blog just the other day. So Elsevier is not only talking, they are acting on it as well.
Let other publishers follow this example!
Just a post to support the idea of an International Open Access Day. Like Bert Zeeman, I wonder which Dutch university will take the lead in the Netherlands in organizing some sort of event on this subject on October 14th.
Hagedorn and Santelli (2008) just published an interesting article on the comprehensiveness of Google’s indexing of academic repositories. This article triggers me to write up some observations I had been intending to make for quite some time. It addresses a question I got from a colleague of mine, who observed that the deep web apparently doesn’t exist anymore.
Google has started to index Flash files. Google has started to retrieve information that is hidden behind search forms on the web, i.e. to index information contained in databases. Google and OCLC exchange information on scanned books and on those contained in WorldCat. Google, so it seems, has indexed the Web comprehensively, with 1 trillion indexed web pages. Could there possibly be anything more to index?
The article by Hagedorn and Santelli shows convincingly that Google still has not indexed all the information contained in OAIster, the second largest archive of open access article information. Only Scientific Commons is more comprehensive. They tested this with the Google Research API, using the University Research Program for Google Search. They only checked whether the URL was present, so this approach reveals only part of the depth of the academic deep web. Those figures are staggering already, but reality bites even harder.
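The kind of URL-presence check described above can be sketched in a few lines. This is only an illustration, not the method or code Hagedorn and Santelli used: the function names `normalize` and `coverage` are made up here, and the list of indexed URLs is assumed to come from whatever search API is available.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Lower-case scheme and host and drop a trailing slash,
    so trivially different spellings of one URL compare equal."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}{query}"


def coverage(repository_urls, indexed_urls):
    """Return (fraction of repository URLs found in the index, missing URLs)."""
    repo = {normalize(u) for u in repository_urls}
    indexed = {normalize(u) for u in indexed_urls}
    if not repo:
        return 0.0, set()
    missing = repo - indexed
    return (len(repo) - len(missing)) / len(repo), missing


if __name__ == "__main__":
    # Hypothetical example data, not real repository URLs.
    ratio, missing = coverage(
        ["http://Repo.example/record/1/", "http://repo.example/record/2"],
        ["http://repo.example/record/1"],
    )
    print(f"{ratio:.0%} indexed, missing: {sorted(missing)}")
```

Note that a check like this only says whether a record's URL is known to the search engine; it says nothing about whether the full text behind that URL is searchable.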
A short while ago I taught a Web Search class for colleagues at the University Library in Leiden. To demonstrate what the Deep or Invisible Web actually is, I used an example from their own repository. It was a thesis on Cannabis from last year, deposited as one huge PDF of 14 MB. Using Google you can find the metadata record. With Google Scholar as well. However, if you search for a quite specific sentence from the first pages of the actual PDF file, Google does not return the sought-after thesis. You find three other PhD dissertations, two of them defended at the same university on that same day, but not the one on Cannabis.
Interestingly, you are able to find parts of the thesis in Google Scholar, e.g. chapter 2, chapter 3 etc. But those are the chapters of the thesis that have been published elsewhere in scholarly journals. Unfortunately, none of these parts in Google Scholar refers back to the original thesis that is in Open Access, and none has been posted as an OA journal article pre-print in the Leiden repository. In Google Scholar most of the material is still behind toll gates at publishers’ websites.
Is Google to blame for this incomplete indexing of repositories? Hagedorn and Santelli do indeed point the finger at Google. However, John Wilkin, a colleague of theirs, doesn’t agree. Neither did Lorcan Dempsey. And neither do I.
I have taken an interest in the new role of librarians. We are no longer solely responsible for bringing external documentary resources into the realm of our academic clientele. We also have the important task of bringing the fruits of their labour into the floodlights of the external world as well as we can, be it for academic or plain lay interest. We have to bring the information out there. Open Access plays an important role in this new task. But that task doesn’t stop at simply making it available on the Web.
Making it available is only a first, essential step. Making it rank well is a second, perhaps even more important step. So as librarians we have to become SEO experts. I have mentioned this here before, as well as on my Dutch blog.
So what to do about this example from the Leiden repository? Well, there is actually a slew of measures that should be taken. The first, of course, is to divide the complete thesis into parts at chapter level. Admittedly, some publishers give permission to republish articles, of which most theses in the natural sciences in the Netherlands consist, only when the thesis is published as a whole. On the other hand, nearly 95% of publishers allow the publication of pre-prints and peer-reviewed post-prints: the so-called RoMEO green road. So it is up to the repository managers, preferably with the consent of the PhD candidate, to split the thesis into its parts (the chapters, which are the pre-prints or post-prints of articles) and to archive the thesis at chapter level as well. The record for the thesis then contains links to the individual chapters deposited elsewhere in the repository, and those far more digestible chunks of information are more palatable to search engine spiders and crawlers.
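To make the chapter-level idea a bit more concrete, here is a minimal sketch of what the resulting records could look like. The `Record` class, its field names and the identifier scheme are all assumptions for illustration; a real repository would express the same links in its own metadata schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Record:
    identifier: str
    title: str
    kind: str                      # "thesis" or "chapter"
    file_url: str
    part_of: Optional[str] = None  # identifier of the parent thesis
    has_parts: List[str] = field(default_factory=list)


def split_thesis(thesis: Record, chapters: List[Tuple[str, str]]) -> List[Record]:
    """Create one record per chapter and link it to the thesis record.

    `chapters` is a list of (title, file_url) pairs, one per chapter file.
    The thesis record is updated in place with links to its chapters.
    """
    records = [thesis]
    for number, (title, file_url) in enumerate(chapters, start=1):
        chapter_id = f"{thesis.identifier}/chapter-{number}"
        records.append(
            Record(chapter_id, title, "chapter", file_url, part_of=thesis.identifier)
        )
        thesis.has_parts.append(chapter_id)
    return records
```

Each chapter record points back to the thesis and the thesis record lists its chapters, so a crawler that lands on either one can reach the other.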
An interesting side effect of this additional effort on the repository side is that deposit rates will increase considerably. This applies to most universities in the Netherlands, and to our collection of theses as well. Since PhD students are responsible for the lion’s share of academic research at the university, depositing the individual chapters as article pre-prints in the repository will be of major benefit to the OA performance of the university. It will require more labour on the part of repository management, but if we take this seriously it is well worth the effort.
We still have to work really hard on the visibility of the repositories, but making the information more palatable is a good start.
Reference:
Hagedorn, K. and J. Santelli (2008). Google still not indexing hidden web URLs. D-Lib Magazine 14(7/8). http://www.dlib.org/dlib/july08/hagedorn/07hagedorn.html
I have been watching the Peace Palace Library use Twitter for quite some time already. They use it as one of various means of informing their users. Apart from Twitter they use mail, chat and RSS to broadcast messages. Their use of Twitter is mainly for informing users about updates, system changes and that kind of thing. Short messages, of course.
I was therefore interested in the application at the Library of the Technical University of Hamburg-Harburg, where they have implemented Twitter as a document stream on their electronic repository, which they prefer to call a document server. To me this makes a lot of sense. Too many libraries treat their repository as just one of their ordinary databases. It sits there and that’s about it. Okay, they use OAI-PMH to make it possible to exchange information. That is important indeed.
But it shouldn’t stop there. Libraries should try their utmost to broadcast or syndicate the content of their repositories as widely as possible. They have been entrusted with the task of making the rest of the world aware of the valuable publications the researchers of their alma mater have produced. Relying on OAI-PMH alone is not sufficient to reach that goal.
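For readers who have never seen OAI-PMH in action, the exchange mentioned above boils down to an HTTP request with a `verb` parameter and an XML response. The sketch below builds a `ListRecords` request and reads identifiers and titles from a response. The base URL and the sample response are invented for illustration, and a real harvester would also have to handle resumption tokens and error responses.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

# Made-up response, trimmed to the elements used below.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL, optionally limited to recent records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


def parse_records(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iter(f"{OAI}record"):
        identifier = record.findtext(f"{OAI}header/{OAI}identifier")
        title = record.findtext(f".//{DC}title")
        pairs.append((identifier, title))
    return pairs


if __name__ == "__main__":
    print(list_records_url("https://repo.example/oai", from_date="2008-07-01"))
    print(parse_records(SAMPLE_RESPONSE))
```

The point of the sketch is that OAI-PMH serves harvesters that come and ask; it does nothing to push new content to readers, which is where feeds come in.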
RSS is an absolute necessity, if only to trickle-feed the Googles of this world with fresh information. But RSS is also an excellent tool for getting your content to appear in other places on the Web. So RSS on your repository is a prerequisite. Let me be clear about that beforehand.
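Producing such a feed is not much work. As a rough sketch, assuming the repository can hand over its latest deposits as a list of dictionaries (the keys `title`, `url` and `deposited` are my own invention), an RSS 2.0 document can be assembled with nothing but the Python standard library:

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(channel_title, channel_link, items):
    """Return an RSS 2.0 document (as a string) listing recent deposits."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = f"Latest deposits in {channel_title}"
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["url"]
        ET.SubElement(node, "guid").text = item["url"]
        # RSS expects RFC 822 style dates; format_datetime produces those.
        ET.SubElement(node, "pubDate").text = format_datetime(item["deposited"])
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical deposit, not a real record.
    feed = build_rss(
        "Example Repository",
        "https://repo.example/",
        [{
            "title": "An example thesis",
            "url": "https://repo.example/record/1",
            "deposited": datetime(2008, 7, 1, 12, 0, tzinfo=timezone.utc),
        }],
    )
    print(feed)
```

The same list of recent deposits could just as well feed a Twitter account or any other syndication channel.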
Today I was amused by the ingenious use of Twitter to syndicate updates of this repository. It is up to users to subscribe to this feed if they wish to. For my own blogs I do observe some conversion from their Twitter streams. It is not much in comparison to RSS, but if you can please some of your clients with this form of syndication and the implementation costs are next to nothing, then why not? Why not give it a try and see how it works out?
I love these small experiments.
hattip: netbib
The libraries of the three cooperating technical universities in the Netherlands have started a data repository for long term archiving of digital data sets. In their combined press release they state:
The world of technical science is to have its own data centre for digital data sets. The 3TU.Datacentre will ensure well-documented storage and long-term access to technical-science study data. This will guarantee the long-term availability of the Netherlands’ entire technical-science heritage.
The 3TU.Datacentre will provide storage of and continuing access to technical-science study data. After all, data sets often remain highly valuable even after a study has been completed. They may be reused in a new study or used to verify the original study. The long-term storage of test data also enables studies to be held over a long period.
A very good initiative, but I am missing one point. Is it open? One might expect so, but the press release does not mention it. In my opinion there is no use in having a repository when we don’t have open access to it. But perhaps it’s too obvious to mention.
Let’s hope so.
For about a year and a half now I have been teaching, at regular intervals, a course on finding scholarly information with freely available resources on the Web. The course is titled “Searching for Science”. The course material is freely available in one of my wikis. The main reason for using a wiki to present a course like this is that linking to examples on the Web works so much more smoothly than with PowerPoint.
As for today’s course, a small group attended: 4 researchers and 5 (mostly) international students. A nice mix. I really enjoyed it, and I think they did as well. Well, at least they gave me a really positive evaluation.
During the course I spend about three quarters of the morning, say a little over 2 hours, on general search tactics: search engines and their commands, Web directories and the Deep Web. In the evaluation I always get the feedback that plain Google commands and search tips earn the most Brownie points. What’s always interesting is an exercise in which we compare the coverage of scholarly search engines plus Live Academic in retrieving a known article from an OA repository in the Netherlands. I always ask the students to search with the full title of the article and then to repeat the exercise with a sentence from the discussion section of the article. As usual, Live Academic failed entirely. Google Scholar did reasonably well on both, but today Scirus and Scientific Commons only worked with the title words. These outcomes can be different again tomorrow, and they are always difficult to explain.
Meanwhile I find real gratification in pointing my students to some of the OA discussions as well, whilst covering collections of OA journals and repositories or mentioning OpenCourseWare sources.
On most occasions the participants are entirely new to some of the Science 2.0 developments. RSS? Never heard of it. So I introduce them to Bloglines, Netvibes and Google Reader, and show them something about scholarly blogs, social bookmarking for scientists or Digg.
We do actually have a course on Science 2.0 in the planning for somewhere in April. Needs still a lot of developing though. But it will be interesting.
There is too much to read and comprehend at once, but three reports on the status of European repositories have been released. In the (Dutch) press release I noted that they talked about the findability of research reports and did not mention their availability. It struck me, since I see many repositories in the Netherlands functioning as metadata repositories rather than full-text repositories. But I have to admit, I should read the reports first.
Weenink, Kasja, Leo Waaijers and Karen van Godtsenhoven (eds.), A DRIVER’s Guide to European Repositories: Five studies of important Digital Repository related issues and good Practices (AUP: Amsterdam, 2007) ISBN 9789053564110, 200p.
This DRIVER’s guide is a practical guide for repository managers and institutions that are setting up and developing a repository and additional services. It describes five essential aspects of realizing and extending repositories: the business plan, intellectual property rights, storing research data, curation of data and the long-term preservation of data. The authors have opted for workable solutions that are applicable at local and national level.
Maurits van der Graaf and Kwame van Eijndhoven, The European Repository Landscape: Inventory study into present type and level of OAI compliant Digital Repository activities in the EU (Amsterdam University Press, 2008) ISBN 9789053564103, 144p.
What is the current state of digital repositories for research output in the European Union? What should be the next steps to stimulate an infrastructure for digital repositories at a European level? To address these key questions, an inventory study into the current state of digital repositories for research output in the European Union was carried out as part of the DRIVER Project. The study produces a complete inventory of the state of digital repositories in the 27 countries of the European Union as of 2007 and provides a basis for contemplating the next steps in driving forward an interoperable infrastructure at a European level.
Muriel Foulonneau and Francis André, Investigative study of standards for Digital repositories and related services (Amsterdam University Press, 2008) ISBN 9789053564127, 112p.
This study is meant for institutional repository managers, service providers, repository software developers and, more generally, all players taking an active part in the creation of the digital repository infrastructure for e-research and e-learning. It reviews the current standards, protocols and applications in the domain of digital repositories. Special attention is paid to the interoperability of repositories, to enhance the exchange of data between them. It aims to stimulate discussion about these topics and supports initiatives for the integration and, where needed, development of new standards. The authors also take a look at the near future: which steps have to be taken now in order to comply with future demands?
What amazes me most is that I can only find a press release in Dutch. The DRIVER website hasn’t got the news yet….
Well, Peter it is up to you…..
The first issue of the Code4Lib Journal is online. It is a very interesting Open Access journal. I first noticed it on Ken Varnum’s RSS4Lib blog. Ken is on the editorial board of this journal. Don’t think it is a journal for techies only: even I, as a non-programmer, found plenty of interesting stuff to read in the inaugural issue, such as “Beyond OPAC 2.0”, on the future of the library catalog system. It is exactly one of those articles that fully addresses the focal point of their mission statement: “the intersection of libraries, technology, and the future.” If they adhere to that statement, I am sold.
The articles in this first issue of Code4Lib Journal (C4LJ) are:
- Editorial Introduction — Issue 1, by Jonathan Rochkind
- Beyond OPAC 2.0: Library Catalog as Versatile Discovery Platform, by Tito Sierra, Joseph Ryan, and Markus Wust
- Facet-based search and navigation with LCSH: Problems and opportunities, by Kelley McGrath
- The Rutgers Workflow Management System: Migrating a Digital Object Management Utility to Open Source, by Grace Agnew & Yang Yu
- Communicat: The Next Generation Catalog That Almost Was…, by Ross Singer
- Connecting the Real to the Representational: Historical Demographic Data in the Town of Pullman, 1880-1940, by Andrew H. Bullen
- BOOK REVIEW: The Success of Open Source by Steven Weber, reviewed by Eric Lease Morgan
- COLUMN: 700 Dollars and a Dream : Take a Chance on Koha, There’s Very Little to Lose, by BWS Johnson

