Archive for the 'Scientometrics' Category

Elsevier’s TopCited just launched

Where Thomson Scientific has for quite some years offered the free website ISIHighlyCited, Elsevier has launched today (?) a competing product called TopCited. Although the two are not the same, it is clear that the competition is inspiring both companies to come up with new products in each other’s niches. The databases are effectively a lure to get researchers interested in the products behind them. TopCited gives an overview of the subject-specific top 20 cited articles from the past 3, 4 or 5 years of publication. The underlying database for the citation data is Scopus, of course.
I just discovered it; some quick impressions:

  • A time frame of at most 5 years is a bit brief. I would love to see a 10-year frame as well.
  • I suspect they have some difficulty determining the research field of articles published in multidisciplinary journals such as Nature and Science. Such articles seem to be missing from the rankings, although I glimpsed a few. Too few, is my impression.

Later on I will look at this new site more carefully, and will attempt a comparison with the competing Thomson databases.

What’s in a name

In courses on citation analysis for research evaluation I always give researchers a stern warning not to change their names. That is especially important nowadays, since it has become fashionable to publish on a first-name basis. First names occasionally differ from given names and can therefore lead to confusion when evaluators perform a citation analysis for whatever purpose. The situation is always a trifle more complicated for female researchers. Young aspiring scientists start publishing under their own name. Later on in their career some of them opt to publish under their husband’s name. Not to mention what happens after a divorce.

Since citation analysis is seemingly easy to perform, with more and more databases offering simple citation lookup options, researchers should be aware of the consequences of their often sloppy, or at least inconsistent, habits in writing their own names in scholarly articles.

In today’s newspaper (NRC 20080305) there was a very interesting article reporting on research carried out at the University of Tilburg. The researchers investigated how a woman’s change of name after marriage influences her social career. Three different experiments were performed, and all three showed unequivocally that changing names after marriage had a negative effect on a woman’s social career.

So far so good. But what amazed me most was that 83% of the female students of Tilburg University (going for their MSc) who took part in these experiments planned to change their names after marriage. This is apparently about the national average. Of the male students, 81% expected their future wives to adopt their names.

I was under the impression that years and years of women’s lib would have solved this problem long ago. How wrong I was.

Hattip: GS

New version of CiteSeer available

CiteSeer was the first citation-enhanced bibliographic database to provide freely available citation data for the scientific literature. It was therefore the first serious competitor for the kings of citation data, ISI/Thomson Scientific. CiteSeer covered the literature of computer and information science. Started in 1997 at the NEC Research Institute in Princeton, New Jersey, it has come a long way. Since its inception, the original CiteSeer grew to index over 750,000 documents and served over 1.5 million requests daily, pushing the limits of the system’s capabilities.

The next-generation CiteSeer, CiteSeerX, is now available for search.

My first impression is a really nice intuitive layout, and a fast search performance. I will keep pointing students to this free resource during my classes on citation analysis.

Impact factors and SCImago JR compared

In December I promised to look in more detail at the newly launched SCImago Country & Journal Rank database. SCImago has attracted some attention in the blogosphere outside Spain since December and got some serious attention from Declan Butler in a news item in Nature (subscription required).

It is too early for thorough in-depth investigations of this new database, but the better blog reactions were at Information Research (and a second time again) and the BioMed Central Blog. Both had a self-interested reason: to see where their journals stood in this new database. We have to wait a bit longer for reviews in the scholarly literature, I’m afraid.

Meanwhile I have looked into this database a bit more closely, and in this blog post I report some of my findings. My main reason for doing so is that it allows us, librarians, to evaluate the rankings of a larger set of journals in a quantitative way. Impact factors have played a role in decisions on journal subscriptions and cancellations, albeit not as the sole criterion. How the SJR compares to the impact factor is my main question.

SJR is “an indicator that expresses the number of connections that a journal receives through the citation of its documents divided between the total of documents published in the year selected by the publication, weighted according to the amount of incoming and outgoing connections of the sources.” In essence, the SJR is a PageRank-type indicator in which citations from highly ranked journals increase the ranking of the journal.
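To make the PageRank analogy a bit more concrete, here is a minimal sketch of a PageRank-style iteration on a tiny citation network. It is not SCImago's actual SJR calculation (which, according to the definition quoted above, also divides by the number of documents published); the function name, the damping factor of 0.85 and the four journals with their citation counts are all made up for illustration.

```python
def prestige_scores(citations, damping=0.85, iterations=100):
    """PageRank-style prestige scores (an illustration, not the real SJR).

    citations[a][b] is the number of citations from journal a to journal b.
    Every journal must appear as a key, even if it cites nothing.
    """
    journals = sorted(citations)
    n = len(journals)
    score = {j: 1.0 / n for j in journals}
    for _ in range(iterations):
        # Every journal starts each round with a small baseline share.
        new = {j: (1.0 - damping) / n for j in journals}
        for source in journals:
            outgoing = citations[source]
            total = sum(outgoing.values())
            if total == 0:
                # A journal that cites nothing spreads its score evenly.
                for j in journals:
                    new[j] += damping * score[source] / n
                continue
            # Otherwise its score is divided over the journals it cites,
            # in proportion to the number of citations.
            for target, count in outgoing.items():
                new[target] += damping * score[source] * count / total
        score = new
    return score


# Hypothetical network: P and Q both receive exactly 10 citations,
# but P is cited by the well-cited journal "Top" and Q by the
# uncited journal "Minor".
citations = {
    "Top": {"P": 10},
    "Minor": {"Q": 10},
    "P": {"Top": 10},
    "Q": {"Top": 10},
}

scores = prestige_scores(citations)
for journal in sorted(scores, key=scores.get, reverse=True):
    print(journal, round(scores[journal], 3))
```

In this toy network P ends up with a much higher score than Q, although a plain citation count (and hence an impact-factor-like measure) would treat them the same.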

To gain more understanding of the SJR, I have looked at the journals in the subject category ‘Library and Information Science’. This category includes some 98 journals. It is important to note that SCImago JR has a much more refined subject categorization than Scopus itself offers, although I speculate that this subject categorization may well be somewhere under the hood in Scopus as well. The corresponding category in JCR is ‘Information Science & Library Science’, which contains 53 journals.

It is really easy to transfer the data from SCImago JR to Excel, whereas with JCR it always takes a few more clicks (making a marked list and using the print export) to get the data into Excel. It is interesting to note that in the web environment SCImago uses a European number notation, with commas indicating the fraction and dots indicating the thousands. On transfer to Excel this is corrected automatically. A minor shortcoming of SCImago is that ISSNs are lacking from the exported data. In JCR the full journal titles are not exported.
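For anyone processing the numbers outside Excel, the European notation has to be converted by hand. A minimal sketch in Python, assuming the values arrive as plain strings with dots for thousands and a comma for the fraction (the function name is my own, not part of any SCImago export):

```python
def parse_european_number(text):
    """Turn a string such as '1.234,5' or '0,038' into a float.

    Assumes dots are thousands separators and the comma is the decimal mark.
    """
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return float(cleaned)


print(parse_european_number("0,038"))
print(parse_european_number("1.234,5"))
```

Note that this blindly trusts the notation: a value already written the English way, such as "0.038", would be misread, so it should only be applied to data known to use the European convention.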

The journals from JCR were matched manually against the journals from SCImago, since a shared field was missing. Only a few journals from JCR were not found directly among the downloaded journals from SCImago. The journals ‘Journal of the American Medical Informatics Association’, ‘Information and Management’ and ‘Journal of Scholarly Publishing’ were included in journal categories other than ‘Library and Information Science’. Furthermore, the journal ‘International Journal of Geographical Information Science’ was included twice in the list of Library and Information Science journals, at rank 5 and again at rank 33. In the processing the entry at rank 33 was dropped from the list. In the JCR the Journal of Government Information is still included, although from 2005 onwards it was already incorporated into Government Information Quarterly (the calculation of the IF in JCR 2006 is indeed based on only a single year of data). Two other journals, Online and Econtent, included in JCR and in Scopus, were not to be found in SCImago. This is not really a great loss, since these are trade journals rather than peer-reviewed scholarly journals, but this applies to some other journals included in the table as well, e.g. The Scientist and Library Journal. In the end 50 journals from SCImago and JCR in the LIS field could be matched. The full list of journals included in this little study is linked as a Google Document.

Looking at the table it is apparent that the maximum value of the SJR is an order of magnitude smaller than that of the impact factor. At the lower end of the scale impact factors become zero, whereas the lowest value of SJR in this set of journals is 0.038.
In Figure 1, I have plotted the IF against the SJR. There seems to be a strong relationship between SJR and IF, although there are three outliers from an apparent linear relationship. Interestingly, these three outliers are LIS journals on medical librarianship: Journal of the American Medical Informatics Association (JAMIA), Journal of Health Communication and Journal of the Medical Library Association. MIS Quarterly is not regarded as an outlier, since it clearly lies on the trendline underlying the other data points.

Figure 1

I think the three outliers really illustrate the point that SJR is more of a PageRank-type indicator. The three medically oriented journals receive relatively many citations from highly ranked medical journals. Checking this for JAMIA in Scopus, we find citations from journals such as Pediatrics (SJR = 0.528), Annals of Internal Medicine (SJR = 1.127) or BMC Bioinformatics (SJR = 0.957). The journals adhering to the trendline for LIS journals receive far fewer of these kinds of “external” citations.

Excluding the three medical journals we get a very good regression between the two parameters with an R² of 0.86. In Figure 2 the regression line is added based on the remaining 47 journals.

Figure 2
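For readers who want to repeat this kind of regression on their own journal list, the following is a minimal sketch of an ordinary least-squares fit with its R², written in plain Python. The data points below are made up purely to show the mechanics; they are not the SJR and IF values from this study, and the function name is my own.

```python
def linear_fit(xs, ys):
    """Least-squares line y = intercept + slope * x, plus R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical values, NOT the data from the table: SJR on the x-axis,
# impact factor on the y-axis, lying exactly on a straight line.
sjr_values = [0.04, 0.05, 0.08, 0.10]
impact_factors = [0.4, 0.5, 0.8, 1.0]

slope, intercept, r_squared = linear_fit(sjr_values, impact_factors)
print(slope, intercept, r_squared)
```

Running the same function once with and once without suspected outliers is a quick way to see how much a handful of journals pulls the fit.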

I thought this was a really cool result, illustrating the difference between SJR and IF quite clearly. In a subsequent post I will look a bit more into the correlations between the various parameters.

Another bibliometrics presentation


Tomorrow I will give a brief presentation on the outcomes of a citation analysis exercise we did for a chair group at our university a while back. I share this presentation since it contains some tips on publishing which some might find useful.

Citation analysis for research evaluation

Tomorrow I am giving a course on citation analysis for research evaluation. This PowerPoint is the mainstay for the morning, but the course is open to any suggestions. It differs only in small details from the course given at the start of this year. The most exciting change came from SCImago, which I only discovered yesterday but which has already been included in the exercises.

Top science countries

In-cites has just published an overview of the top-ranking countries for science over all fields. The report is based on the Essential Science Indicators; it therefore covers the past 10 years of publication and citation data collected from the journals covered by Thomson ISI.

The first two tables rank the countries by accumulated citations and published papers. Most interesting, however, is the third table, in which the countries are ranked on citations per paper (cpp). After omitting the smallest countries from that list we see that Switzerland (14.32 cpp) leads the list, followed by the USA, Denmark and the Netherlands, just in 4th position with 12.85 cpp. Scotland, Sweden, England, Finland, Canada and Belgium complete the top 10.

THES rankings, manipulation or optimization?

From the university newspaper of Groningen we get some interesting insights into the way Groningen University has optimized its data for submission to the THES rankings. Although the rankings are deemed not to be important, the rector nevertheless wanted Groningen University to score better in the THES-QS rankings. For the rector, the first listing in the top 200 of the THES rankings, at 173 to be exact, was a good reason to celebrate with his subordinates.

What did they do? They concentrated on the question of the most favourable number of students. The number of PhD students was a number they could play with. In the Netherlands PhD students are most often employed as faculty, although by international standards they are students as well. They also contemplated the position of the researchers in the university hospital. Including them would increase the number of staff considerably and thus lower the student/faculty ratio, but on the other hand it could have an important effect on the number of citations per research staff member as well. An increase in staff numbers lowers the citations per staff member, which is detrimental to the overall performance. But what if they could guarantee that citations to hospital staff were included in the citation counts as well?

So in Groningen they worked through some scenarios for the number of students, number of staff, student/staff ratio and citations/staff ratio to arrive at the best combination to enhance their performance. I really do wonder whether the contact between Groningen and QS (the consultants establishing the rankings) also led to an improved search for citations, by including the university hospital in the university results. It is known from research by CWTS that searches for papers from all parts of a university are notoriously difficult, especially when it comes to including the papers produced by staff from the teaching hospitals. In Groningen they have the feeling that what they did in their contacts with QS helped. Well, at least it resulted in a nice picture on their university profile page.
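The trade-off in such scenarios is simple arithmetic, and a small sketch may help to see it. All numbers below are invented for illustration; they are not Groningen's figures, and the sketch does not model how THES-QS actually weights or normalizes these ratios.

```python
def ranking_inputs(students, staff, citations):
    """Return the two ratios discussed above for one scenario."""
    return {
        "students_per_staff": students / staff,
        "citations_per_staff": citations / staff,
    }


# Hypothetical scenario 1: university hospital researchers not counted.
base = ranking_inputs(students=20000, staff=2000, citations=100000)

# Hypothetical scenario 2: 1000 hospital researchers counted as staff,
# but their papers are not found in the citation search.
staff_only = ranking_inputs(students=20000, staff=3000, citations=100000)

# Hypothetical scenario 3: hospital researchers counted as staff AND
# the citations to their papers are included as well.
staff_and_citations = ranking_inputs(students=20000, staff=3000, citations=190000)

for name, scenario in [
    ("base", base),
    ("staff only", staff_only),
    ("staff and citations", staff_and_citations),
]:
    print(name, scenario)
```

In the second scenario the student/staff ratio improves while the citations per staff member drop; only in the third scenario do both ratios move in the favourable direction, which is why it matters whether the hospital papers are found.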

Optimization or manipulation? It is only a thin line. If only you could make sure that all staff of your university used the proper name of the institution in the author affiliation, the university would gain a lot.

Vrije Universiteit in the top 400

In my previous post on the THES university rankings 2007, I wrote that I suspected a mix-up of the names of the University of Amsterdam and the Free University of Amsterdam. However, this appears not to be the case. In the recently released top 400 the Vrije Universiteit Amsterdam is ranked 304 in the list of top universities. We can only guess where Tilburg University is listed.

hattip: university ranking watch

Dutch universities in the THES university rankings 2007

The THES university rankings 2007 are now officially released. The ranking of Dutch universities is as follows, with the overall rank in the THES top 200 and the previous ranking in brackets:

  1. University of Amsterdam 48 (69)
  2. Delft 63 (86)
  3. Leiden 84 (90)
  4. Utrecht 89 (95)
  5. Maastricht 111 (172)
  6. Eindhoven 130 (67)
  7. Wageningen 148 (97)
  8. Erasmus 163 (92)
  9. Groningen 173 (232)
  10. Twente 185 (115)
  11. Radboud 195 (137)

All in all, 11 out of 13 regular Dutch universities are listed in the top 200. In the case of the missing Free University I really wonder to what extent there might be a mix-up of the two universities based in Amsterdam. For Tilburg it is rather unfortunate that they didn’t make it onto the list, but since Tilburg is mainly a humanities and social sciences university it can be explained. As soon as I get more details, I will post more.