Wednesday, 29 October 2014

Subjects by keywords.

Every library buys books and journals, or access to their electronic versions, as well as files, databases, etc. Each library focuses on a particular field of interest or, more often, on several fields. The Peace Palace Library acquires 'documents' in the field of international law. Of course, a large number of subfields can be recognized within this vast subject: international criminal law, human rights, diplomacy and so on.

For the users of the library, it is important to know which of these areas are covered, and therefore whether it is worthwhile to use the library when they are working on such a subject. One of the tools the library uses is a system of regular 'alerts'. Scholars, students and other interested parties can be informed once a week about the most recent acquisitions. There are around 1,000 subscribers to this service. They receive a weekly overview based on a single criterion. The consequence of using a single general criterion is, of course, that the result may be huge. Topics like 'European Union' or 'Public International Law' in particular may contain a great many bibliographical references.

But what if the acquisitions of one month could be presented not on the basis of a single general criterion, but through the combinations of keywords assigned to each title? What if the relationships between those keywords could be visualized on the website instead of emailed to a subscriber? In an attempt to make that possible I created a clickable map, which can be found at http://www.ppl.nl/september.

To create this map I used the visualization tool 'Gephi', which is especially strong in showing the links between the building blocks of a map. For this purpose I treat the keywords as building blocks. At the web address mentioned above, the acquisitions from the month of September are presented, not as boring title lists, but as assigned keywords and the relationships between those keywords. It is still all about numbers, as the strength of a relationship is determined by the number of times a keyword pair occurs.
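To make the counting concrete, here is a minimal sketch of how keyword pairs could be tallied into weighted edges. It is not the script used for the map: the function names and the sample titles are made up for illustration, and the Source/Target/Weight header is an assumption based on the column names Gephi recognizes when importing an edge table from a spreadsheet.

```python
import csv
import io
from collections import Counter
from itertools import combinations


def keyword_pair_weights(titles):
    """Count how often each unordered keyword pair is assigned to the same title."""
    weights = Counter()
    for keywords in titles:
        # De-duplicate and sort so ('A', 'B') and ('B', 'A') count as one pair.
        for pair in combinations(sorted(set(keywords)), 2):
            weights[pair] += 1
    return weights


def edge_list_csv(weights):
    """Render the pair counts as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Hypothetical sample: the keywords assigned to three acquired titles.
titles = [
    ["History", "Military History", "Russian Empire"],
    ["History", "Military History"],
    ["Human Rights", "History"],
]
weights = keyword_pair_weights(titles)
csv_text = edge_list_csv(weights)
```

In this sample the pair 'History' and 'Military History' occurs on two titles, so it gets the heaviest edge; every other pair occurs once.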

Of course, in a batch of thousands of titles, some areas of the map will show up as large blobs of tightly connected keywords. These blobs reflect the core business of the library. At the edges of the map, smaller subareas appear. In other words, the large areas of the map show acquisitions that are always extremely important to the library. They can be recognized by the largest circles in the map, surrounded by a large number of closely packed smaller circles.

Subtopics are located outside the center of the map and can be considered subjects that are farther away from the core business. For example, the area in the upper right is characterized by the keyword 'History', especially the history of the First World War. Typical related keywords are 'Military History', 'Massacres', 'Ethnic Minorities', 'Russian Empire', etc. In the bottom left there is an area that relates to commerce, trade, international commercial arbitration, etc.

It must be stressed that the subareas differ each month. It all depends on what is going on in the world. Prominent in the last few months of this year is, of course, the commemoration of the start of the First World War. But other current events can also lead to a temporary increase in attention, such as transboundary environmental pollution, disease outbreaks, cyber warfare, sporting events, etc.

Clicking on a circle opens an 'Information Pane' with additional information on that keyword, such as how many times it occurs and which related keywords (those used in combination with the clicked keyword) accompany it. So with a few clicks it is easy to get a good impression of different topics and topic areas.

Finally, you are invited to explore the map by hovering over the various items and viewing the information in the right pane. Are you looking for a specific topic? You can search using the Search box in the left panel, or zoom in on a specific group or cluster with the 'Group Selector'.

Wednesday, 22 October 2014

MOOC: learning and instruction: Tableau and library use.

On October 20, 2014 a new MOOC, "Data, Analytics and Learning" (hashtag #dalmooc), started. I am taking part in this course. In addition to a new approach to the process of learning itself, the course also offers the usual presentation, with video, text, references to relevant literature and assignments.

One of the first tools we were asked to study is Tableau, a tool to examine, analyze and present data in a visually appealing and very informative manner. Luckily, participants in the course get a code that can be used to install the full desktop version of the software, at least for the duration of the course (until January 2015). To be clear, Tableau is not free software.

After Googling for instructions, manuals and the like, I realized that none of the example files shown or used in Tableau instruction videos related to libraries or, more specifically, to the OPACs used in libraries. As I work in the library of the Peace Palace, I decided to collect some library data and use it in Tableau, just as an exercise.

Every time our link resolver is used, some data is stored in a database. We use a MongoDB database for this purpose. At the time of writing we have collected a little more than half a million of these documents. Among other things, we store a time stamp, the country of the user based on IP address, general subject information and short bibliographic information. Although I know it is possible to connect Tableau to a MongoDB server using a special ODBC connector, I will use an Excel file in Tableau, to keep things simple, to generate some equally simple graphics.

The file contains just the country, the general subject in coded format (i.e. a number) and the day number, and is limited to link resolver use in 2014. Even so, we still have some 370,000 rows!
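As an illustration of how such a file could be prepared, here is a small sketch that reduces stored link resolver records to the three columns described above. The field names (`timestamp`, `country`, `subject_code`) and the sample records are hypothetical, not our actual database schema, and I assume here that 'day number' means the day of the year.

```python
from datetime import datetime


def to_rows(records, year=2014):
    """Reduce records to (country, subject code, day number) tuples for one year."""
    rows = []
    for record in records:
        timestamp = record["timestamp"]  # hypothetical field name
        if timestamp.year != year:
            continue  # keep only link resolver use in the requested year
        # Assumption: 'day number' is the day of the year (1 January = 1).
        day_number = timestamp.timetuple().tm_yday
        rows.append((record["country"], record["subject_code"], day_number))
    return rows


# Hypothetical sample records; the real documents hold more fields.
records = [
    {"timestamp": datetime(2014, 1, 1, 9, 30), "country": "NL", "subject_code": 42},
    {"timestamp": datetime(2014, 10, 20, 14, 5), "country": "DE", "subject_code": 60},
    {"timestamp": datetime(2013, 12, 31, 23, 59), "country": "US", "subject_code": 60},
]
rows = to_rows(records)
```

The record from 2013 is dropped, and the remaining rows can be written to a spreadsheet or CSV file for Tableau.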

The most popular subject, not surprisingly, is 'European Union (42)'; the second most popular is 'Human Rights (60)', and far less popular is 'Mutual Cooperation in Criminal Matters (147)'.

Let us focus on 'Human Rights' and see which countries are the most interested in this subject, based on the number of clicks per country.
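Tableau does this counting for you, but the underlying calculation is simple. As a sketch, assuming rows of (country, subject code, day number) and made-up sample data, the clicks per country for one subject could be tallied like this (the function name is my own; code 60 is 'Human Rights' and 42 is 'European Union'):

```python
from collections import Counter


def clicks_per_country(rows, subject_code):
    """Count rows per country for one subject code, most clicks first."""
    counts = Counter(
        country for country, code, _day in rows if code == subject_code
    )
    return counts.most_common()


# Hypothetical sample rows: (country, subject code, day number).
rows = [
    ("NL", 60, 1),
    ("NL", 60, 2),
    ("DE", 60, 2),
    ("NL", 42, 3),
    ("US", 60, 5),
    ("DE", 42, 7),
]
human_rights = clicks_per_country(rows, 60)
```

Each row stands for one click, so counting rows per country gives the number of clicks per country.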

For contrast, here is the subject 'History'. Please remember that the sizes of the dots are relative to the subject; they do not represent actual numbers:

This is all very interesting, because it gives us an indication of what our users look for and where they come from. It would also be interesting to see whether the library staff take the interests of the patrons into account when acquiring documents for the collection. That is a subject for another blog.

To conclude: with Tableau it is easy to understand what is important or interesting to the students and scholars who use our library OPAC and/or link resolver. And I have only scratched the surface of this software...

Tuesday, 21 October 2014