Archive for the 'Conferences' Category

Some observations during the bibliometrics session at the Österreichische Bibliothekartag

Although the program consistently talks about the Österreichische Bibliothekartag (singular), the whole library day actually spans four days. One would have expected at least the Österreichische Bibliothekartage (plural), but they insist on mentioning only one day. Of those four days, I was only present during part of the morning of the third day, so this is a very limited report on the Österreichische Bibliothekartag. Looking at the program, it is very comprehensive and interesting. I never thought that you could fill a complete session of five presentations with cooking books (no pun intended). It only shows that bibliometrics was a small part of the program among the many other subjects covered. I noticed a lot of presentations on e-book platforms, many digitization projects, plenty on mobile, less on Library 2.0 than you would expect (is the hype over?), and open access also had a very limited role. What struck me as interesting for conference organizers is that many commercial presentations were programmed evenly throughout the sessions. Just a sign of taking the sponsors seriously.

So much for the conference as a whole, of which I actually experienced too little. On to the bibliometrics session. The session was chaired by Juan Gorraiz, a bubbly Spaniard who has been working in Austria for years. Give him the opportunity and he will take the floor, and he would love to use all the time available and fill the slots of all the planned presentations.

The first presentation was on a piece of research that should result in a master’s thesis at some point; some preliminary results were presented in this session by Christian Gumpenberger. The focus of the research was on the acceptance of, and familiarity with, bibliometrics among Austrian researchers. The results were not really shocking: most researchers stated that they were familiar with impact factors, but for the moment there was no clue as to whether they were aware of a thing like the two-year citation window, or of the difference between citable and non-citable items that leads to the inflation of impact factors for journals like Nature and Science. Christian sketched some sunny skies for bibliometrics in Austria, but in the subsequent discussion this sunny view was criticized quite a bit. Nevertheless, I would like to have a look at this master’s thesis when it becomes available.
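To make the two impact factor subtleties concrete, here is a minimal Python sketch. The journal figures are invented for illustration and the function name is my own; only the definition itself (citations received in one year to items from the two preceding years, divided by the number of citable items from those two years) is the standard two-year impact factor.

```python
def impact_factor(citations_in_year, citable_items):
    """Two-year impact factor for a year Y.

    citations_in_year: citations received in Y by what the journal
        published in Y-1 and Y-2.
    citable_items: number of articles and reviews published in Y-1 and Y-2.
    """
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    return citations_in_year / citable_items


# Hypothetical journal (all numbers are made up for this example).
citations_to_papers = 30000        # citations to articles and reviews
citations_to_front_matter = 3000   # citations to news, editorials, letters
citable = 1000                     # only articles and reviews are counted here

# Numerator restricted to the same items that are counted in the denominator.
strict = impact_factor(citations_to_papers, citable)

# Numerator also includes citations to the non-citable front matter.
inflated = impact_factor(citations_to_papers + citations_to_front_matter, citable)

print(strict, inflated)
```

The gap between the two numbers comes purely from citations to items that are not counted in the denominator, which is why journals with a lot of cited front matter benefit most.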

The second presentation came from Italy, by Nicola De Bellis. Nicola has written an interesting book on citation analysis in which he stresses the sociological, philosophical and historical aspects of bibliometric analyses. It is always interesting to hear a presentation like this, away from the fact-finding, number-crunching approach that I normally take, and to dream away a bit on outlines of what should be done on a subject like this in an ideal world. Quite a lot, but some of it is beyond being practical. When you carry out bibliometric analyses in the library at some scale, like dealing with 18,000 papers that have collected 265,000 citations as we do in our library, you can only be practical. So there is an interesting conflict between his presentation (which will be online soon, I hope) and mine, which followed Nicola’s presentation.

I don’t want to cover all aspects of Nicola’s presentation. Go and read the book, which I am going to do as well. But at one point during his presentation I strongly disagreed with him: where he stated that only mediocre scientists have an interest in bibliometrics and that top scientists normally don’t have an interest in this topic. My experience is quite the contrary. In the first place, it was one of Wageningen’s top scientists who urged the library to take out a subscription to Web of Science back in 2001, and who made it possible with a special contribution from his top institute. He knew he was a highly cited scientist, but somehow he needed Web of Science to confirm his reputation. Later on as well, apart from the discussion with scholars in the social sciences department, it has always been the top-performing groups that invited me to give a presentation on this subject, rather than the groups that were lagging behind in the bibliometric performance indicators. To me it has always appeared that those who are leading the pack are also interested in staying ahead of the rest, and they invite the library to explain the results obtained and to help enhance their performance in the future.

The second point in Nicola’s presentation where he went far beyond the practical was his insistence that, for any publication, all citations should first be retrieved from the three general databases (Web of Science, Scopus and Google Scholar) and then be supplemented with citations from at least one citation-enriched, subject-specific database. Well, that’s a lot of work for a single publication in the first place, and it leads to deduplication errors if you’re not very careful. Secondly, it should be well known that Google Scholar, albeit attractive because of tools like Harzing’s Publish-or-Perish, is not a reliable database for citation counts at this moment (Jacsó 2008). Google Scholar still has serious problems with ordinary counting and deduplication and should therefore not be used for serious citation analyses.

The third argument against the use of multiple databases goes a bit further into the theory of bibliometrics and relies on approaches described by Waltman et al. (2011) and Leydesdorff et al. (2011). The key point is that a number of citations in itself has no meaning. It should be related to the citations of comparable documents in the same field of science. You can do that by normalizing on the mean citation rate in the field (Waltman et al. 2011) or by the perhaps more sophisticated approach sketched by Leydesdorff et al. (2011), based on the citation distribution in the field to which the paper belongs. The latter approach is very novel and has not been widely tested yet. Both approaches rely on the availability of all the citations to the publications of a certain age and document type in a certain field of science. You can expect those means or citation distributions to be available when you work with one specific database (for WoS there is plenty of experience, for Scopus it is coming with SciVal Strata, but for Google Scholar it doesn’t exist yet), but it is beyond reality when you derive citation data from three or four databases at the same time.
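The idea of normalizing on the mean citation rate in the field can be sketched in a few lines of Python. This is only an illustration in the spirit of the mean-of-ratios indicator discussed by Waltman et al. (2011), not their implementation; the record layout, the function names and all the numbers are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def field_baselines(reference_set):
    """Mean citations per (field, year, document type) within ONE database."""
    groups = defaultdict(list)
    for pub in reference_set:
        key = (pub["field"], pub["year"], pub["doc_type"])
        groups[key].append(pub["citations"])
    return {key: mean(values) for key, values in groups.items()}


def mean_normalized_citation_score(papers, baselines):
    """Average of (actual citations / expected citations) over all papers."""
    ratios = []
    for pub in papers:
        expected = baselines[(pub["field"], pub["year"], pub["doc_type"])]
        if expected == 0:
            continue  # no meaningful baseline for this group, skip the paper
        ratios.append(pub["citations"] / expected)
    return mean(ratios)


# Hypothetical reference set: every publication of that field, year and type
# in the database, which is exactly what you lack when mixing several sources.
reference_set = [
    {"field": "ecology", "year": 2008, "doc_type": "article", "citations": c}
    for c in (2, 4, 6)
] + [
    {"field": "mathematics", "year": 2008, "doc_type": "article", "citations": c}
    for c in (0, 1, 2)
]

our_papers = [
    {"field": "ecology", "year": 2008, "doc_type": "article", "citations": 8},
    {"field": "mathematics", "year": 2008, "doc_type": "article", "citations": 1},
]

baselines = field_baselines(reference_set)
score = mean_normalized_citation_score(our_papers, baselines)
print(score)
```

The baselines only make sense if they are computed from the same database that delivered the citation counts, so citation counts merged from three or four sources have nothing to be normalized against.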

But apart from the critical points I just made, I liked the presentation by De Bellis very much. For those interested in similar views on citation practice, I really recommend reading MacRoberts & MacRoberts (1996) as well.

The session closed with my presentation, which is embedded here:

Bibliometric analysis tools on top of the university’s bibliographic database, new roles and opportunities for library outreach

View more presentations from Wouter Gerritsma

After that the session ended with some discussion, but soon all 30 or so participants hurried off to the coffee.

References

De Bellis, N. (2009). Bibliometrics and citation analysis: From the Science Citation Index to cybermetrics. The Scarecrow Press, 450 p. ISBN 9780810867130
Jacsó, P. (2008). The pros and cons of computing the h-index using Google Scholar. Online Information Review, 32 (3): 437-451 http://dx.doi.org/10.1108/14684520810889718 http://www.jacso.info/PDFs/jacso-pros-and-cons-of-computing-the-h-index.pdf
Leydesdorff, L., L. Bornmann, R. Mutz & T. Opthof (2011). Turning the tables on citation analysis one more time: Principles for comparing sets of documents. Journal of the American Society for Information Science and Technology n/a-n/a http://dx.doi.org/10.1002/asi.21534 http://arxiv.org/abs/1101.3863
MacRoberts, M. H. & B. R. MacRoberts (1996). Problems of citation analysis. Scientometrics, 36(3): 435-444 http://dx.doi.org/10.1007/BF02129604
Waltman, L., N. J. van Eck, T. N. van Leeuwen, M. S. Visser & A. F. J. van Raan (2011). Towards a new crown indicator: Some theoretical considerations. Journal of Informetrics, 5(1): 37-47. http://dx.doi.org/10.1016/j.joi.2010.08.001 http://arxiv.org/abs/1003.2167

Stephen Abram’s presentation in Rotterdam


After the Ticer course Stephen Abram gave the same presentation in Rotterdam for a group of Dutch, mostly public, librarians. This session was recorded by the infamous people from DOK Delft. It is really good to have this available for all librarians. It is a must-see wake-up call.

A day at Ticer: lessons learned

It was a well-packed day at Ticer yesterday: four presenters giving five presentations. I tried to blog my first impressions live, but network difficulties (I had to install VPN and do more difficult things) prevented me from posting on the first presentation immediately.

Stephen Abram, as could be expected, urged us librarians to wake up. Once he started to speak he seemed unstoppable. Every now and then he posed a rhetorical question, but hurried on without awaiting any response. Stephen used quite a bit of exaggeration, but I think that was a valid approach. He was quite sincere in warning us that it is really five to twelve, or perhaps already four to. If we don’t want to become redundant in the future we have to focus on our users’ needs rather than our librarians’ needs and adopt Web 2.0 tools in our roster. His list of 25 technologies is a good starting point. These we should master, preferably on a mobile device.

Marshall Breeding was perhaps the biggest contrast in presentation style to Stephen Abram that you could imagine. Small and shy, but with a clear voice. He first asked us to complete his library web cats survey. The most important trend from his presentation was the increasing popularity of open source systems. He presented very clearly the different shades of openness that exist. Open source is by no means a cheaper alternative to regular integrated library systems. The place to keep an eye on in the near future is the OLE workgroup.

Birte Christensen-Dalsgaard radiated library enthusiasm in her presentation. First she broke down the fallacy that our users have library knowledge, since most library users are perhaps drive-in users, who really want to spend as little time as possible in the library or with the library systems. These systems need to be broken down and rebuilt from the ground up to offer more relevant and better information that meets users’ needs. That we have to data mine and model our users’ behaviour as closely as possible is not a problem to her. Privacy laws might be prohibitive, though. In her list of examples is Summa, of course. Towards the end she pleads for standards, standards and standards, of course: since libraries can’t go it alone, standards will help them to cooperate more fully.

The remainder of the afternoon was reserved for Herbert van de Sompel. His were perhaps the most interesting presentations of the day, but they were about applications that are only on the horizon of practical digital libraries today. It is good, however, that somebody from the digital library research world came to share some of their research with library practitioners. It was Clifford Lynch who once wrote: “‘Digital libraries’: this oxymoronic phrase has attracted dreamers and engineers, visionaries and entrepreneurs, a diversity of social scientists, lawyers, scientists and technicians. And even, ironically, librarians – though some would argue that digital libraries have very little to do with libraries as institutions or the practice of librarianship.” Van de Sompel’s point was really to watch the pages of his research group, since a lot will be coming out in the coming months.

This year the timekeeping and discussion rounds were less strict than last year. A bit of a pity, since I really enjoyed those last year. Sylvia van Peteghem did a beautiful round-up of the presentations at the end of the day, though.

Lynch, C. (2005). Where do we go from here? The next decade for digital libraries. D-Lib Magazine 11(7/8). http://www.dlib.org/dlib/july05/lynch/07lynch.html.

Stephen Abram at Ticer: Twenty five technologies to watch and how

Stephen Abram had the honour to kick off the second day at Ticer. During the introduction he successfully put his finger on the areas where (academic) libraries are failing when they don’t cooperate and don’t provide services that are geared towards the needs of users.

An important point he makes is the classical opposition of librarians, who are text-based learners, to graphical user interfaces. Libraries are equipped for documentary information, whereas the whole world is changing towards a multimedia information world. Libraries are on most occasions not yet equipped or prepared for this change in information formats. Just as they are shy of graphical user interfaces, they are also shy of multimedia.

The point he makes in his extensive introduction is that libraries should interoperate on a global basis, and immerse people in content. All because

“The world is going to change with or without you….
Get ready”

He goes on to explain the importance of Generation Y, the younger generation who can multitask, cooperate and are trained in problem solving rather than in learning facts. Those are our future users, with completely different needs. “Who is archiving computer games?” he asks the audience. Simulations are the most important way of teaching in the military and defense industries. YouTube movies and podcasts for research and learning are on many occasions much more effective for learning than textbooks. “Whose study collections include podcasts or vodcasts?” he challenges his audience.

A prediction from Stephen is that by 2020 an iPod-like device will contain all content ever created, i.e. the complete Web in your pocket. The future is mobile and we had better prepare ourselves for this fact. The real question that we should be discussing, therefore, is what a Web 2.0 or Library 2.0 application should look like in a mobile environment.

Only after about 90 minutes does he get down to his 25 technologies that will transform academic libraries in the near future:

  1. Mobile
  2. Presence management – Twitter
  3. Tagging – Delicious
  4. Scrapbooking – Zotero, Connotea
  5. Personal Homepages
  6. Microblogging – Twitter (again)
  7. Social content – Wikipedia, Knol
  8. Public Social Networking – Orkut, Facebook, MySpace
  9. Private Social Networking – Plaxo, LinkedIn, Ning
  10. Social Network Integration – f8, opensocial
  11. e-Books and devices
  12. eLearning – Blackboard, Sakai, AngelLearning
  13. XML
  14. Cloud Software – Yahoo, Google, Bebo
  15. RSS groups and readers – Bloglines, Google Reader
  16. iTunes, MP3
  17. Podcasts & Screencasts
  18. Streaming Media
  19. SEO and GIS
  20. Federated Search
  21. Custom Search
  22. Next Generation content
  23. DRM
  24. up to you
  25. Humans as the Competitive Edge

The intended PowerPoint, which actually differs from the one presented, can be found at Stephen’s Lighthouse.

Herbert van de Sompel at Ticer: OAI object reuse and exchange: Support

Van de Sompel describes his project simply as doing Web 2.0 type of things with scholarly communication, with additional stuff that adds to the value chain of scholarly communication. It is geared towards the machine-readable web.

The ORE project brings together URIs, RDF and vocabularies. It has everything to do with the Semantic Web. The beta version of ORE was published in June 2008. The best part of that document is the primer, which helps you to understand what the project is really about. The primer, though, will be completely rewritten by the end of September to make it less technical.

More info at:
Van de Sompel, H. and C. Lagoze (2007). Interoperability for the Discovery, Use, and Re-Use of Units of Scholarly Communication. CTWatch Quarterly 3(3): 32-40. http://www.ctwatch.org/quarterly/articles/2007/08/interoperability-for-the-discovery-use-and-re-use-of-units-of-scholarly-communication/

Birte Christensen-Dalsgaard at Ticer: Intelligent / Next generation / Dynamic catalogue

Birte starts her presentation with the vision that libraries can develop intelligent systems that are able to follow you, know your different profiles and know where you are. She is not shy of data mining to achieve this objective.

Federated versus Integrated search
In Christensen-Dalsgaard’s definition, federated search is something that Metalib does, i.e. searching different information silos simultaneously and merging the results on a single screen. Federated search was a nice solution, but the ranking is lousy.
With integrated search all content is harvested and indexed within a single system and can be searched by users with any kind of tool. With integrated search you are, in theory, able to rank much better. However, it will not come easy. You have to balance the relatively “thin” metadata of catalogue records against full-text information. Where will the catalogue record of a journal like Nature end up, when “nature” is also a very important term in the life sciences? It reminded me of an article by Tamar Sadeh (2006), which uses a different definition than the one used by Birte.

Federated search is typically associated with:
• Database approach
• Queries
• Based on Z39.50 protocol
• Structured
• “Exact” match

Integrated search is typically associated with:
• Search engine approach
• Natural language
• Large Volume
• Statistical approach
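The contrast between the two approaches can be illustrated with a small Python sketch. It is a toy under stated assumptions: the silos are plain in-memory lists instead of remote Z39.50 targets, the scoring is a naive term count, and all names and records are invented for this example. It is not how Metalib or Summa actually work.

```python
def make_silo(records):
    """Return a search function over one silo (stand-in for a remote target)."""
    def search(query):
        q = query.lower()
        return [title for title in records if q in title.lower()]
    return search


def federated_search(query, silos):
    """Send the query to every silo and simply merge the result lists."""
    merged = []
    for name, search in silos.items():
        for title in search(query):
            merged.append((name, title))
    return merged  # grouped by silo, no common relevance ranking


def integrated_search(query, index):
    """Score every harvested record with one function and rank across sources."""
    terms = query.lower().split()
    scored = []
    for source, title in index:
        words = title.lower().split()
        score = sum(words.count(term) for term in terms)
        if score:
            scored.append((score, source, title))
    scored.sort(key=lambda row: -row[0])  # stable sort, highest score first
    return [(source, title) for _, source, title in scored]


# Hypothetical content: thin catalogue records and richer article records.
catalogue = ["Nature", "Nature conservation handbook"]
articles = ["Human nature and the nature of science", "Soil science"]

silos = {"catalogue": make_silo(catalogue), "articles": make_silo(articles)}
index = [("catalogue", t) for t in catalogue] + [("articles", t) for t in articles]

federated = federated_search("nature", silos)
integrated = integrated_search("nature", index)
print(federated)
print(integrated)
```

In this toy the integrated ranking pushes the journal record for Nature below a text-rich article, which is the balancing problem between thin metadata and full text in miniature.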

In Denmark they have carried out a data mining experiment with library lending data to develop a recommender system. To their own amazement the privacy policy police did not object, but wherever you try to data mine and model data on users, privacy problems might crop up.
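How lending data can feed a recommender is easy to sketch. The following Python example uses simple co-occurrence counting ("borrowers of this title also borrowed ..."). It is my own illustration with invented loan records, not the method used in the Danish experiment, of which no details were given.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(loans):
    """loans: iterable of (borrower_id, title) pairs.

    Returns a mapping title -> Counter of titles borrowed by the same people.
    """
    by_borrower = defaultdict(set)
    for borrower, title in loans:
        by_borrower[borrower].add(title)

    co = defaultdict(Counter)
    for titles in by_borrower.values():
        for a, b in combinations(sorted(titles), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(title, co, n=3):
    """Titles most often borrowed together with the given title."""
    return [other for other, _ in co[title].most_common(n)]


# Hypothetical lending records.
loans = [
    ("u1", "Soil physics"), ("u1", "Plant nutrition"),
    ("u2", "Soil physics"), ("u2", "Plant nutrition"), ("u2", "Hydrology"),
    ("u3", "Soil physics"), ("u3", "Forest ecology"),
]

co = build_cooccurrence(loans)
print(recommend("Soil physics", co, n=1))
```

Once the co-occurrence counts are built, the borrower identifiers are no longer needed, which may be one way to keep such a system on the right side of the privacy rules.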

An interesting point she argues is that we need different search systems for different research questions. A common search is a known-item lookup, which is completely different from an explorative search on a new subject. Perhaps we need different search engines for these questions, and should not expect one system to handle such very different questions.

Once we realize that we actually need different search engines, we need to develop the library system with a modular approach.

Towards the end she gets back to the paradigm of Robin Murray: Synthesize, Specialize, Mobilize.

Reference:
Sadeh, T. (2006). Google Scholar versus metasearch systems. High Energy Physics Libraries Webzine(12). http://library.cern.ch/HEPLW/12/papers/1/

Christensen-Dalsgaard, B. (2008) The Intelligent catalogue. http://www.tilburguniversity.nl/services/lis/ticer/08carte/publicat/christensendalsgaard.pdf

Marshall Breeding at Ticer: Library automation for the next generation


One of the disruptions in the Integrated Library System (ILS) market in the USA is that many libraries are shifting towards open source (OS) ILS. Most of the decisions in favor of adopting OS systems are religious decisions, taken without a proper evaluation of the pros and cons of OS. At the end of the day the costs of OS and closed systems are probably equal.

Breeding noted that investment in open source ILS in North America was about 10% of the market last year and will be about 25% this year. The installed base of OS ILS is about 2 to 3%.

As examples of OS ILS he mentions:

• Koha – commercial support from LibLime
• Evergreen – commercial support from Equinox
• OPALS – commercial support from Media Flex
• NewGenLib – open source ILS for the developing world

 

Next he goes on to explain the different shades of green that can make a system open source. In many cases an open API layer allows libraries to configure and manipulate the system to their liking. Breeding pleads for the development of a universal API that can be applied across different ILS. He talks about the Berkeley Accords.

Rethinking the ILS

The traditional ILS model is not suitable for hybrid libraries where print and digital come together. The classical ILS focuses on Cataloging + Circulation + OPAC + Serials + Acquisitions, whereas nowadays integration includes link resolvers, full text, federated search and electronic resource management. However, the foundations of the ILS were carved in stone in 1965 and still stand today. We should be pushing the standards constantly. The influence that Google has had on our users is that they expect to do full-text searches. Libraries are still worrying about metadata; users want the data.

The next generation ILS should be based on a service-oriented architecture, which consists of many small, granular modules that complete the tasks.

Towards the end he mentions the Open Library Environment (OLE) project, sponsored by the Andrew W. Mellon Foundation, in which they are rethinking the next generation of library systems.

Ticer: Digital Libraries à la Carte 2008

It took some hassle, but I finally have a Wi-Fi connection in the lecture room at Ticer. Stephen Abram has finished his presentation, which was scheduled for 60 minutes but took some 90+ minutes. I will blog some of his presentation later, but in the meantime some of his planned presentation can be found at his blog. All the time of his presentation was well spent. Right now I am listening to Marshall Breeding on library systems.

Kjell Tjensvoll: The e-only library Helsebiblioteket.no

In Norway they have built a national digital health library. And what is really special about it: everybody in Norway, I mean everybody with internet access, is allowed to read, browse and download all medical journals. It is based on the national contracts for higher education, with a small additional fee to cover the national access.

Think about it. If all higher education institutions already cover the main costs and there is not much of an additional market to be expected, why not? If in the Netherlands, for instance, a publisher of scientific journals already has contracts with the universities and research universities, then there is not much of a market left, so why not open up access to the IP range of the whole country?

It takes some courage to develop and implement such a model.

I was very much impressed by the fact that they managed to do this. As an information junkie I keep dwelling on this idea.

Kjell also showed the implementation of the portal to host all these journals and databases and it was interesting to see that they used the federated search and clustering engine of Vivisimo.

If only I had a Norwegian proxy server at my disposal.

ELAG 2008: creating a new services infrastructure for the European Library

Theo van Veen explains that work on the new services infrastructure for the European Library started about two years ago. Web users have been accustomed to Google, Del.icio.us and Flickr types of applications for quite some time already. That means that libraries have to lower their barriers as well, to encourage users to use the library systems.

Library systems have to perceive, interpret and respond to user requests. He illustrates these new services on a demo machine, where he translates the abstract of a record as an option from a menu that appears on mouseover. He actually lists about 15 different types of services that are available in this way. One option is a follow-on service that triggers a speech service, which reads out the translated abstract. All responses can generate new requests. His demonstrator is full of nice little tricks.

Users can make service descriptions themselves, but the system should also learn from the users’ interactions with it. That will add some intelligence to the portal.

At the end he comes to the legal issues. TEL can’t be held responsible for what users are doing, so they try to work with trusted partners.

The portal is in a test phase and is available at http://dev.theeuropeanlibrary.org/vga/SRUportal/