A few weeks ago a tweet on the subject of academic search engine optimization caught my eye.
— Matteo Cavalleri (@physicsteo) January 10, 2014
The nicely styled PDF referred to in the tweet was from Wiley, which has been quite active in this area. Somewhere in my bookmark list I have the link to their webpage on optimizing your research articles for search engines (SEO), tucked away in their author services section, and a link to the article “Search engine optimization and your journal article: Do you want the bad news first?” on their Exchange blog. Wiley is not the only publisher dealing with this subject: here is an example on academic search engine optimization from Elsevier and another example from Sage. I bet there are other examples from publishers to be found.
The major advice is to use the right keywords. Use these keywords in your title, and repeat them throughout your abstract. Contextually repeated, as they say. Mention some synonyms for those keywords as well, and please do make use of the keyword fields in the article. They recommend using Google Trends or Google AdWords to find the right keywords, but in my opinion that is ill-advised for academic search engine optimization. When selecting keywords for academic search engine optimization it is better to use keyword systems, ontologies or thesauri from your subject area, because experienced researchers will use this terminology to search for their information as well. So in the biomedical area the obvious choice is to consult the MeSH Browser, but when you are in agriculture or ecology the CAB Thesaurus is the first choice for selecting the appropriate keywords. The Wiley SEO tips end with the advice to be consistent with your own name (and affiliation; your lab deserves to be named properly as well), and not to forget to cite your previous work.
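The keyword advice is easy to check before submission. Below is a minimal sketch in Python, assuming you have your title, abstract and chosen keywords as plain strings. The function name keyword_report and the sample title, abstract and keywords are made up for illustration; they do not come from the Wiley tips.

```python
import re

def keyword_report(title, abstract, keywords):
    """For each keyword, report whether it occurs in the title and how
    often it occurs in the abstract (case-insensitive, whole phrase)."""
    report = {}
    for kw in keywords:
        pattern = re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        report[kw] = {
            "in_title": bool(pattern.search(title)),
            "abstract_count": len(pattern.findall(abstract)),
        }
    return report

# Hypothetical example data, only to show the shape of the result.
title = "Soil erosion under conservation tillage"
abstract = ("We measured soil erosion on twelve plots. "
            "Conservation tillage reduced soil erosion in all but one plot.")

for kw, result in keyword_report(title, abstract, ["soil erosion", "no-till"]).items():
    print(kw, result)
```

A keyword that is missing from the title, or that never appears in the abstract, is a candidate for rewording. The counts say nothing about whether the repetition reads naturally, so that judgement stays with the author.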
The role of the editors in Academic Search Engine Optimization
In their short PDF the Wiley team also recommends using headings: “Headings for the various sections of your article tip off search engines to the structure and content of your article. Incorporate your keywords and phrases in these headings wherever it’s appropriate.” A nice suggestion, but in practice this is hardly ever in the hands of the individual author. Scholarly articles tend to have a rather fixed structure, the IMRAD structure (Introduction, Methods, Results and Discussion) being the most common. In such a case the author has no room to add headings at the right positions in the paper. But research by Hamrick et al. (2010) showed that papers with callouts tend to have a higher number of citations. A “callout” is a phrase or sentence from the paper, perhaps paraphrased, that is displayed prominently in a larger font. The journal they investigated had abandoned the practice of using callouts, but after their article the practice was reinstated. A decision like that is an editorial decision. It is recommended that all journals help their readers with pointers in the form of callouts, and benefit from the effects these can have on academic search engine optimization as well. My favourite Wiley journal, JASIST, certainly doesn’t make systematic use of callouts.
The other topic on which the editorial board has an important say is the layout of the reference lists in their journals. I have pleaded many times before for fewer reference list specifications. It looks like the first task the editorial board of a newly established journal embarks upon is to formulate yet another exotic variation on the many different styles specifying the layout of the reference list. The point, however, is that these definitions hardly make use of the possibilities of academic search engine optimization, or of search engine optimization whatsoever: most often they forget to include linking options in the reference list altogether. Older instructions to authors have not caught up with the present time yet. In the HTML versions of scholarly articles, links are included as part of the journal platform software, but in the PDF versions the URLs are often forgotten. Where DOIs are linkable on the webpage, in the PDF version they are most often presented in the form doi:10.1002/asi/etc. The APA style and many others even explicitly stipulate referencing a DOI as doi:, which goes against the advice of the DOI governing body. The result of these bad practices is that DOIs included in the PDF versions of reference lists don’t link, which is a complete and utter waste of an SEO opportunity. So academic search engine optimization is badly broken in this area.
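Turning a doi: statement into a link is a small transformation. Here is a minimal sketch in Python, assuming the reference is available as plain text. The function name linkify_dois and the regular expression are my own illustration, not part of any citation style. The resolver prefix https://doi.org/ is the form in current use, while the reference at the end of this post still uses the older http://dx.doi.org/ form; pass a different resolver if you need one. DOIs containing characters that need URL encoding are not handled.

```python
import re

# "doi:" followed by a DOI: the "10." prefix, a registrant code, a slash, a suffix.
DOI_PATTERN = re.compile(r"\bdoi:\s*(10\.\d{4,9}/\S+)", re.IGNORECASE)

def linkify_dois(reference, resolver="https://doi.org/"):
    """Replace every 'doi:10.xxxx/yyy' statement with a resolvable URL."""
    def to_url(match):
        raw = match.group(1)
        doi = raw.rstrip(".,;")      # sentence punctuation is not part of the DOI
        trailing = raw[len(doi):]    # put the punctuation back after the link
        return resolver + doi + trailing
    return DOI_PATTERN.sub(to_url, reference)

print(linkify_dois("Interfaces, 40(6): 454-464. doi:10.1287/inte.1100.0527."))
# Interfaces, 40(6): 454-464. https://doi.org/10.1287/inte.1100.0527.
```

A full URL in the reference list is what most PDF viewers and search engines recognize as a link, which is the whole point of the exercise.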
The role of publishers in Academic Search Engine Optimization
Publishers have their role in supporting the editorial boards in resolving the two previously mentioned items. But they should also have a careful look at the PDF files they produce. At this moment the Google Webmaster blog has only a few pointers on PDF optimization. To mention a few interesting ones: links should be included in the PDF (this means, again, DOIs as links rather than doi: statements), since they are treated as ordinary links. And the last point is important as well: “How can I influence the title shown in search results for my PDF document?” The title attribute in the PDF is used! And so is the anchor text. On publishers’ sites this is most often “PDF”. If only they would use the title as anchor text on their websites, it would work to their advantage. Although it is not mentioned in the Google webmaster blogpost, probably because it is too obvious, naming the file after the title would certainly help the SEO for the PDF, and it would help all those scientists who download PDF files for their research to sort out which file is about what. Was 123456.pdf about genetics or genomes, or was that 234567.pdf? Clear titles would help both researchers and search engines to work out what it is all about.
And whilst publishers are on the subject of PDF optimization, they might as well complete the other attributes of PDF files, such as authors, keywords and summary. Even if these are not used now, another search engine might make use of those attributes another day. You might as well be prepared. Researchers using reference management tools can also benefit from those metadata attributes. Ross Mounce has some interesting blogposts about researchers’ need for good metadata in PDFs. In theory this takes little effort, since all that metadata is in the databases already. So make use of it to optimize your PDFs for academic search engine optimization, or as a service to your most loyal users, who have so far put up with a load of bad PDFs.
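Naming the file after the title can be automated. Here is a minimal sketch in Python, assuming the article title is available as a string. The function name title_to_filename, the hyphen separator and the 100-character limit are my own choices for illustration, not a publisher convention.

```python
import re
import unicodedata

def title_to_filename(title, max_length=100):
    """Turn an article title into a readable, URL-friendly PDF file name."""
    # Strip accents so the name survives file systems and URLs unchanged.
    ascii_title = (unicodedata.normalize("NFKD", title)
                   .encode("ascii", "ignore")
                   .decode("ascii"))
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", ascii_title).strip("-").lower()
    return slug[:max_length].rstrip("-") + ".pdf"

print(title_to_filename("Genetics & Genomes: A Review"))
# genetics-genomes-a-review.pdf
```

A name like this answers the 123456.pdf question at a glance, both in a download folder and in a URL.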
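To make the metadata point concrete, here is a minimal sketch in Python that builds the entries of a PDF document information dictionary from a database record. The keys /Title, /Author, /Subject and /Keywords are the standard entries of that dictionary. The function name pdf_info_dict, the separators used to join authors and keywords, and the placeholder keywords and summary are assumptions for illustration. Writing the dictionary into a file needs a PDF library; pypdf, for instance, accepts a dictionary of this shape through PdfWriter.add_metadata, but that step is left out here to keep the example self-contained.

```python
def pdf_info_dict(title, authors, keywords, summary):
    """Map a publisher's database record onto PDF document information entries."""
    return {
        "/Title": title,
        "/Author": "; ".join(authors),     # separator is an assumption
        "/Keywords": ", ".join(keywords),  # separator is an assumption
        "/Subject": summary,
    }

# Title and authors come from the reference in this post;
# keywords and summary are placeholders, not the real ones.
info = pdf_info_dict(
    title=("Assessing what distinguishes highly cited from less-cited "
           "papers published in Interfaces"),
    authors=["T. A. Hamrick", "R. D. Fricker", "G. G. Brown"],
    keywords=["placeholder keyword one", "placeholder keyword two"],
    summary="Placeholder summary text.",
)
for key, value in info.items():
    print(key, "=", value)
```

Reference managers that read embedded metadata can pick up exactly these fields, so the same few lines serve both search engines and readers.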
Hamrick, T. A., R. D. Fricker, and G. G. Brown. 2010. Assessing what distinguishes highly cited from less-cited papers published in Interfaces. Interfaces, 40(6): 454-464. http://dx.doi.org/10.1287/inte.1100.0527. OA version: http://faculty.nps.edu/tahamric/docs/citations%20paper.pdf