Thursday 16 July 2015

Some thoughts about subjects

[I wrote an internal memo, which I would like to share on this platform, although some of its statements were published in earlier blogs]

Libraries today operate in a time of tremendous change. The familiar financial foundation of every library has been removed and replaced by a much weaker one. The search expertise of library users increasingly reflects the search methods they use in Google or Bing. Libraries associated with universities and other research facilities, in particular, strongly present themselves as participants in research, as suppliers and managers of data. And last but certainly not least, the type of collection offered by libraries is rapidly changing, from paper to electronic files made available in every conceivable form. This brings technical difficulties and a legal landscape that could be described as a world of quicksand.

In this hectic world of budget cuts, it is of the utmost importance for libraries to present themselves and their collections clearly to their current and prospective users. There are many ways to do so: clear websites, simple but solid library software, being topical and up to date, being where your users are (Facebook, Twitter), connecting with users through newsletters and alerting systems, etc. Less obvious is bringing parts of the collection into the limelight, including the 'old-fashioned' parts: books and journals. The library of the Peace Palace is one of the libraries that try to draw attention to specific parts of their collections. On a regular basis, specific components called research guides, containing current and relevant bibliographical data, are placed in the foreground.

Libraries are also adding subject headings to the standard metadata of their documents, thus enriching the collections they manage. With this extra metadata, users are able to locate relevant information in a more specific way. Unfortunately, this effort is not fully used by library patrons. Only a very small percentage of OPAC queries use the subject indices, and the users who do hardly ever combine different subject headings. So how do we increase the 'return on this investment'? I think the supposed disinterest of our users can be attributed to ignorance; most of them simply don't know that subject headings exist, or at least don't know what can be done with them. An example shows what I mean. In our search log I found two different users who both searched for the single word 'genocide'. Both switched the search index from 'title words' to 'all words', so both knew how to use the index system of our library software, but neither of them bothered to search with the 'subject headings' index, which of course gives a more reliable outcome.

You can try to change this behaviour by simple instruction and/or by showing how our subject headings appear in the results of a search. That means not showing how users embed subject headings in their searches (as I said, this is hardly ever done) but showing which subject headings appear in a set of results generated by more common search types. I decided to try the latter, so in explaining why subject headings are important, I actually start from the end of this route, not the beginning. The most informative and still compact way of presenting this kind of data is an interactive map.

The software that makes this possible is Gephi, an open source and therefore freely available program. Gephi is usually used to visualize strong or weak relations between persons or websites, but I thought it should work for subject headings too. Simply imagine that there is a strong relation between the subject headings in the metadata of a single document, and a weaker relation between the same subject headings when they belong to different documents.
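One possible way to turn this idea into numbers is sketched below in Python. It is only an illustration under my own assumptions, not the procedure actually used for the map: the weight of a relation is taken to be the number of documents in which two subject headings appear together, and the function name and the sample records are made up for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(records):
    """Count, for each pair of subject headings, how many documents contain both.

    Assumption for this sketch: one record is the list of subject
    headings attached to one document.
    """
    weights = Counter()
    for headings in records:
        # Sort and de-duplicate so (a, b) and (b, a) are counted as one undirected pair.
        for pair in combinations(sorted(set(headings)), 2):
            weights[pair] += 1
    return weights


# Hypothetical sample data: three documents with their subject headings.
records = [
    ["Genocide", "Human Rights", "International Criminal Law"],
    ["Genocide", "Human Rights"],
    ["European Union", "Human Rights"],
]
weights = cooccurrence_weights(records)
```

With these sample records, 'Genocide' and 'Human Rights' share two documents and therefore get a heavier relation than any other pair.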

Collecting a larger set of results for use in Gephi has a double benefit. Not only are the subject headings that are more or less strongly related to one another shown, but the different subject areas, large and small, can be visualized as well. I decided to collect all titles viewed in our OPAC in June 2015. Almost all titles had subject headings, and these subject headings were stored in a file that Gephi can process. All in all I collected 2900 different subject headings (nodes) and 72500 different relations (edges) between these subject headings, each with its own weight. (This is not the place to explain the intricacies of Gephi, but if you really want to know, please search for 'Gephi' on the Internet. There is a lot of information available.)
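For readers who wonder what such a file might look like: Gephi can import an edge list from a CSV file with the columns Source, Target, Weight and Type. The Python sketch below writes one. The function name and the sample weights are assumptions made for the example, and the file used for the actual map may have been built differently.

```python
import csv
import io


def write_edge_list(weights, handle):
    """Write weighted pairs of subject headings as a CSV edge list."""
    writer = csv.writer(handle, lineterminator="\n")
    # Column names that Gephi's spreadsheet import recognizes for edges.
    writer.writerow(["Source", "Target", "Weight", "Type"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight, "Undirected"])


# Hypothetical sample: two relations between subject headings with their weights.
weights = {
    ("Genocide", "Human Rights"): 2,
    ("European Union", "Human Rights"): 1,
}

# An in-memory buffer keeps the example self-contained; a file opened
# with open("edges.csv", "w", newline="") works the same way.
buffer = io.StringIO()
write_edge_list(weights, buffer)
edge_csv = buffer.getvalue()
```

Each subject heading that appears in the Source or Target column becomes a node, and each row becomes one weighted edge.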

Creating maps with Gephi is one thing, but making them available on the internet is another. Luckily, Gephi allows users to create plugins, which can be used to add different layouts and statistical or relational models. It is also possible to create plugins that export the maps and their building blocks. The Oxford Internet Institute (University of Oxford), together with JISC, created such a plugin, which exports the relevant data and scripts using just JavaScript. So every browser that runs JavaScript is able to present clickable maps, with no browser extensions needed.

In short, after clicking the link below, you will see smaller clusters of subject headings indicating interest in more specific subject areas like 'Environment' or 'Nato and Ethics', but also some huge clusters referring to more general subjects like 'Human Rights' or 'European Union'. It is possible to zoom in and out using the little zoom toolbar below the map, or to select one cluster for more detailed inspection using the Group Selector (to the left). Clicking an occurrence in the map gives a lot of information about the chosen subject heading, such as detailed statistical information about its strength or weight and the other subject headings with which it was combined (popup to the right). This shows which subject headings were combined to describe the contents of different but related publications, and it gives a hint for starting a search that combines subject headings with the restrict[] option in our OPAC.

Please visit to see and use the map which gives an overview of the data mentioned above. Do you need more information or have questions? Contact Aad Janson at a.janson at ppl dot nl