Impact factors and Scimago JR compared

In December I promised to look in more detail at the newly launched SCImago Journal & Country Rank database. SCImago has attracted some attention in the blogosphere outside Spain since December and received serious attention from Declan Butler in a news item in Nature (subscription required).

It is too early for thorough, in-depth investigations of this new database, but the better blog reactions were at Information Research (and a second time again) and the BioMed Central Blog. Both had a self-interested reason to see where their journals stood in this new database. We will have to wait a bit longer for reviews in the scholarly literature, I’m afraid.

Meanwhile I have looked into this database a bit more closely, and in this blog post I report some of my findings. My main reason for doing so is that it allows us, as librarians, to evaluate the rankings of a larger set of journals in a quantitative way. Impact factors have played a role in decisions on journal subscriptions and cancellations, albeit not as the sole criterion. My main question is: how does the SJR compare to the impact factor?

SJR is “an indicator that expresses the number of connections that a journal receives through the citation of its documents divided between the total of documents published in the year selected by the publication, weighted according to the amount of incoming and outgoing connections of the sources.” In essence, the SJR is a PageRank-type indicator in which citations from highly ranked journals increase the ranking of the journal.
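To make the PageRank idea concrete, here is a minimal sketch in Python. It is not the actual SJR calculation, whose details are not given here; it is a plain PageRank iteration on a made-up citation network. The function name `pagerank_scores`, the damping value of 0.85 and the toy journals H, L, X, Y, A and B are all assumptions for illustration only.

```python
def pagerank_scores(citations, damping=0.85, iterations=100):
    """Plain PageRank on a citation network (illustrative, not the real SJR).

    citations maps each citing journal to {cited_journal: citation_count}.
    """
    journals = sorted(
        set(citations) | {j for cited in citations.values() for j in cited}
    )
    n = len(journals)
    scores = {j: 1.0 / n for j in journals}
    for _ in range(iterations):
        # Every journal starts each round with the same small base score.
        new_scores = {j: (1.0 - damping) / n for j in journals}
        for source in journals:
            outgoing = citations.get(source, {})
            total = sum(outgoing.values())
            if total == 0:
                # A journal that cites nothing spreads its score evenly.
                for j in journals:
                    new_scores[j] += damping * scores[source] / n
                continue
            # Otherwise its score is passed on in proportion to its citations.
            for target, count in outgoing.items():
                new_scores[target] += damping * scores[source] * count / total
        scores = new_scores
    return scores


# Hypothetical toy network: A and B both receive 10 citations,
# but A is cited by the well-cited journal H and B by the uncited journal L.
toy_citations = {
    "X": {"H": 30},
    "Y": {"H": 30},
    "H": {"A": 10},
    "L": {"B": 10},
    "A": {},
    "B": {},
}
toy_scores = pagerank_scores(toy_citations)
```

In this toy network A and B have identical citation counts, so a count-based measure cannot separate them, while the PageRank-type score ranks A above B because its citations come from a better-connected source.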

To gain more understanding of SJR I have looked at the journals in the subject category ‘Library and Information Science’. This category includes some 98 journals. It is important to note that SCImago JR has a much more refined subject categorization than the one included in Scopus itself, although I speculate that this subject categorization is possibly somewhere under the hood in Scopus as well. The corresponding category in JCR is ‘Information Science & Library Science’, which contains 53 journals.

It is really easy to transfer the data from SCImago JR to Excel, whereas with JCR it always takes a few more clicks (making a marked list and using the print export) to get the data into Excel. It is interesting to note that in the web environment SCImago uses a European number notation, with a comma marking the decimal fraction and a dot marking the thousands. On transfer to Excel this is corrected automatically. A minor drawback of SCImago is that ISSNs are lacking from the exported data. In JCR the full journal titles are not exported.
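For anyone processing the exported figures outside Excel, the number notation has to be converted by hand. A minimal sketch in Python, assuming the values arrive as text with a dot for thousands and a comma for the decimal fraction; the function name `parse_european_number` is my own and not part of any SCImago tool:

```python
def parse_european_number(text):
    """Convert a string such as '1.234,56' or '0,038' to a float.

    Assumes dots are thousands separators and the comma is the decimal mark.
    """
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return float(cleaned)


# Example: the lowest SJR value in this set, written in European notation.
lowest_sjr = parse_european_number("0,038")
```

Note that this would misread a value that is already in English notation, so it should only be applied to data known to come from the web pages.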

The journals from JCR were matched manually against the journals from SCImago, since a shared field was missing. Only a few journals from JCR were not found directly among the journals downloaded from SCImago. The journals ‘Journal of the American Medical Informatics Association’, ‘Information and Management’ and ‘Journal of Scholarly Publishing’ were included in journal categories other than ‘Library and Information Science’. Furthermore, the journal ‘International Journal of Geographical Information Science’ was included twice in the list of Library and Information Science journals, at rank 5 and again at rank 33. In the processing the entry at rank 33 was dropped from the list. In JCR the Journal of Government Information is still included, although from 2005 it was already incorporated in Government Information Quarterly (the calculation of the IF in JCR 2006 is indeed based on only a single year of data). Two other journals, Online and Econtent, which are included in both JCR and Scopus, were not to be found in SCImago. This is not a great loss, since these are trade journals rather than peer-reviewed scholarly journals, but the same applies to some other journals included in the table as well, e.g. The Scientist and Library Journal. In the end 50 journals from SCImago and JCR in the LIS field could be matched. The full list of journals included in this little study is linked as a Google Document.
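The matching for this post was done by hand. Purely as an illustration of how part of that work could be automated when full titles are available on both sides, here is a small Python sketch that matches on normalized titles and leaves the rest for manual checking. The helper names `normalize_title` and `match_titles` are hypothetical, and the approach would not cope with the abbreviated titles that JCR exports.

```python
import re


def normalize_title(title):
    """Lowercase a title, spell out '&', drop punctuation and the word 'the'."""
    title = title.lower().replace("&", " and ")
    title = re.sub(r"[^a-z0-9 ]+", " ", title)
    words = [word for word in title.split() if word != "the"]
    return " ".join(words)


def match_titles(jcr_titles, scimago_titles):
    """Return (matched, unmatched): exact matches on the normalized title.

    matched maps each JCR title to its SCImago counterpart;
    unmatched lists the JCR titles that still need a manual look.
    """
    lookup = {normalize_title(title): title for title in scimago_titles}
    matched, unmatched = {}, []
    for title in jcr_titles:
        key = normalize_title(title)
        if key in lookup:
            matched[title] = lookup[key]
        else:
            unmatched.append(title)
    return matched, unmatched
```

Anything that ends up in the unmatched list (a journal filed under another category, or a trade journal missing from one source) still has to be resolved by eye.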

Looking at the table it is apparent that the maximum value of SJR is an order of magnitude smaller than that of the impact factors. At the lower end of the scale impact factors become zero, whereas the lowest value of SJR in this set of journals is 0.038.
In Figure 1, I have plotted the IF against the SJR. There seems to be a strong relationship between SJR and IF, although there are three outliers from an apparent linear relationship. Interestingly, these three outliers are LIS journals in the medical and health field: Journal of the American Medical Informatics Association (JAMIA), Journal of Health Communication and Journal of the Medical Library Association. MIS Quarterly is not regarded as an outlier, since it clearly lies on the trendline underlying the other data points.

Figure 1

I think the three outliers really illustrate the point that SJR is more of a PageRank-type indicator. The three medically oriented journals receive relatively many citations from highly ranked medical journals. Checking this for JAMIA in Scopus, we find citations from journals such as Pediatrics (SJR = 0.528), Annals of Internal Medicine (SJR = 1.127) or BMC Bioinformatics (SJR = 0.957). The journals adhering to the trendline for LIS journals receive far fewer of these kinds of “external” citations.

Excluding the three medical journals we get a very good regression between the two parameters with an R² of 0.86. In Figure 2 the regression line is added based on the remaining 47 journals.
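For readers who want to repeat this kind of fit on their own journal list, the calculation can be sketched in a few lines of Python. This is an ordinary least-squares fit of IF on SJR with a set of journals left out, which I assume is comparable to what a spreadsheet trendline does; the function names and the journal values below are made-up placeholders, not the data from this study.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


def fit_without(rows, excluded):
    """Fit IF against SJR, skipping the journals named in excluded.

    rows is a list of (journal, sjr, impact_factor) tuples.
    """
    kept = [(sjr, jif) for journal, sjr, jif in rows if journal not in excluded]
    return linear_fit([sjr for sjr, _ in kept], [jif for _, jif in kept])


# Placeholder rows (journal, SJR, IF); the numbers are invented.
example_rows = [
    ("Journal A", 0.04, 0.3),
    ("Journal B", 0.05, 0.9),
    ("Journal C", 0.07, 1.6),
    ("Journal D", 0.09, 2.8),
    ("Medical journal M", 0.30, 3.0),
]
slope, intercept, r_squared = fit_without(example_rows, {"Medical journal M"})
```

Running the same fit with and without the excluded journals shows how strongly a few points far from the trendline affect R².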

Figure 2

I think this is a really cool result, illustrating the difference between SJR and IF quite clearly. In a subsequent post I will look a bit more into the correlations between the various parameters.