Archive for the 'Information Retrieval' Category

PubMed sucks, or the user is broken

Anna Kushnir runs a blog on a high-profile platform over at Nature Publishing. Last Saturday she complained about the user friendliness of PubMed.

I have spent an absurd amount of time on PubMed recently and can say in no uncertain terms that it is making my dissertation writing way more painful than it needs to be. I can hold a paper in my hands, search for two authors’ last names and have PubMed come up with nothing.

PubMed, however, is probably the most widely used bibliographic database in the world, certainly in the world of medicine. Many libraries run special classes to teach the intricacies of PubMed. We librarians have to admit that searching PubMed is not easy. It is certainly not intuitive. Once you have found what you were searching for, it is complicated to get the information into another programme such as Reference Manager or EndNote. If you succeed in that, you get abbreviated journal titles, authors with at most two initials, and so on.

How surprising, then, was the reaction of Dean Giustini. Well, his reaction is perhaps typical of librarians in general: we go out and teach the user a few tricks. We teach and teach. The database is not broken! It’s the user we need to mend.

I thought Dean would know better than this. Of course he is right that this complaint about PubMed is an excellent teaching moment. But I would rather stress the message from Anna Kushnir, which is that searching PubMed is not intuitive. Far from it. Even if you had classes in searching PubMed some years ago, that knowledge is now obsolete. That is to PubMed's credit: they innovate and improve. But if we think that refresher courses in searching PubMed should be high on the lists of doctors, surgeons and medical researchers, we are on the wrong track. They simply don’t have time for these courses. It is a rat race just to keep informed about the progress in their own specialities. Why would they need courses from full-time MLIS professionals just to search a bibliographic database?

We have to go out there and listen to our users. Anna Kushnir is one of them. Her message is plain and simple: searching PubMed, however good we think it already might be, should become more intuitive. I think we should, and can, do a lot better at building more intuitive search engines.

I see the post from Anna more as a challenge for our profession than as a teaching moment.

Full feeds versus partial feeds

Full feeds versus partial feeds is an old debate. Have a look at the 2.7 million Google hits for this simple query. Most of the debate, however, concentrates on the presumed effects on visitors to the actual blog and on (missed?) advertising revenue.

This afternoon I had an interesting discussion with a representative from a library organization about the findability and accessibility of scientific information. My point of view was that blogging about science and scientific articles would at least increase the findability of these articles. However, this is only true when the feeds of the blog are full feeds. Nowadays, the discovery of very new, young or even premature information on the web should be complemented with searches on blog search engines and news search engines. These search engines are, on most occasions, not exactly what their name suggests. In most instances they are RSS feed search engines, i.e. they only index RSS feeds.

The consequences are simple. When a blog uses partial feeds, only the headline is indexed by blog search engines. Have a look, for instance, at the Technorati results for the IAALD blog, or at those from Google Blog Search or Ask blog search. These are the top three blog search engines at the moment. The discoverability of content from the IAALD blog with these search engines is miserable, even though the blog has some excellent content.
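To make the point concrete, here is a minimal sketch of what a feed-only indexer gets to see. The two feeds are made up for illustration, and the function names indexable_text and is_findable are hypothetical; this is not how Technorati, Google Blog Search or Ask actually work internally, only an illustration of the principle that text missing from the feed cannot be matched.

```python
import xml.etree.ElementTree as ET

# Qualified name of the widely used content:encoded RSS extension element.
CONTENT_ENCODED = "{http://purl.org/rss/1.0/modules/content/}encoded"

# Two made-up feeds for the same hypothetical blog post.
PARTIAL_FEED = """<rss version="2.0"><channel><title>Example blog</title>
<item><title>New rice varieties</title></item>
</channel></rss>"""

FULL_FEED = """<rss version="2.0"
 xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel><title>Example blog</title>
<item><title>New rice varieties</title>
<content:encoded>Field trials show improved drought tolerance.</content:encoded>
</item></channel></rss>"""


def indexable_text(feed_xml):
    """Return, per feed item, the lower-cased text present in the feed."""
    root = ET.fromstring(feed_xml)
    texts = []
    for item in root.iter("item"):
        parts = [
            item.findtext("title", default=""),
            item.findtext("description", default=""),
            item.findtext(CONTENT_ENCODED, default=""),
        ]
        texts.append(" ".join(p for p in parts if p).lower())
    return texts


def is_findable(feed_xml, term):
    """True if the term occurs in the text of any item in the feed."""
    return any(term.lower() in text for text in indexable_text(feed_xml))


for name, feed in [("partial", PARTIAL_FEED), ("full", FULL_FEED)]:
    print(name, "feed, search for 'drought':", is_findable(feed, "drought"))
```

In this toy example a search on a word from the headline succeeds for both feeds, but a search on a word from the body of the post only succeeds for the full feed.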

So far, the discussion of full feeds versus partial feeds has concentrated on the arguments of pro-bloggers who are worried about their advertising revenue. For scientists, the argument of discoverability is far more important, and they should always opt for full feeds to syndicate their content as widely as possible.

It sounds strange, but a lot of people have not yet realized this.

Searching for Science

For a while now, say a year and a half or so, I have been teaching at regular intervals a course on finding scholarly information with freely available resources on the Web. The course is titled “Searching for Science”. The course material is freely available in one of my wikis. The main reason for using a wiki to present a course like this is that linking to examples on the Web works so much more smoothly than it does from a PowerPoint.

As for today's course, a small group attended: 4 researchers and 5 (mostly) international students. A nice mix. I really enjoyed it, and I think they did as well. Well, at least they gave me a really positive evaluation.

During the course I spend about three quarters of the morning, say a little over 2 hours, on general search tactics: search engines and their commands, Web directories and the Deep Web. In the evaluation I always get the feedback that some plain Google commands and search tips earn the most Brownie points. What’s always interesting is an exercise where we compare the coverage of scholarly search engines plus Live Academic in retrieving a known article from an OA repository in the Netherlands. I always ask the students to do the search with the full title of the article and then repeat the exercise with a sentence from the discussion section of the article. As usual, Live Academic failed entirely. Google Scholar did reasonably well on both, but today Scirus and Scientific Commons only worked with the title words. These outcomes can be different again tomorrow, and they are always difficult to explain.

Meanwhile, I find real gratification in pointing my students to some of the OA discussions as well, while covering collections of OA journals, repositories or Open Course Ware sources.

On most occasions the participants are entirely new to some of the Science 2.0 developments. RSS? Never heard of it. So I introduce them to Bloglines, Netvibes and Google Reader, and show them something about scholarly blogs, social bookmarking for scientists or Digg.

We do actually have a course on Science 2.0 in the planning for some time in April. It still needs a lot of development, though. But it will be interesting.