|Thesauri and interoperability with other vocabularies |
Part 1: Thesauri for information retrieval
|First published||part1: 15 August 2011|
part2: 15 March 2013
|Latest version||part1: First Edition|
part2: First Edition
|Organization||International Organization for Standardization (ISO)|
|Committee||ISO/TC 46/SC 9|
|Series||01.140.20 (INFORMATION SCIENCES)|
|Base standards||ISO 2788, ISO 5964, BS 8723|
|Related standards||ANSI/NISO Z39.19|
ISO 25964 is the international standard for thesauri, published in two parts as follows:
ISO 25964 Information and documentation - Thesauri and interoperability with other vocabularies Part 1: Thesauri for information retrieval [published August 2011] Part 2: Interoperability with other vocabularies [published March 2013]
It was issued by ISO, the International Organization for Standardization, and its official website  is maintained by its secretariat in NISO, the USA National Information Standards Organization. Each part of the standard can be purchased separately from ISO or from any of its national member bodies (such as ANSI, BSI, AFNOR, DIN, etc.). Some parts of it are available free of charge from the official website.
The first international standard for thesauri was ISO 2788, Guidelines for the establishment and development of monolingual thesauri, originally published in 1974 and updated in 1986. In 1985 it was joined by the complementary standard ISO 5964, Guidelines for the establishment and development of multilingual thesauri. Over the years ISO 2788 and ISO 5964 were adopted as national standards in several countries, for example Canada, France and UK. In the UK they were given alias numbers BS 5723 and BS 6723 respectively. And it was in the UK around the turn of the century that work began to revise them for the networking needs of the new millennium. This resulted during 2005 - 2008 in publication of the 5-part British Standard BS 8723, as follows:
BS 8723 Structured vocabularies for information retrieval - Guide Part 1: Definitions, symbols and abbreviations Part 2: Thesauri Part 3: Vocabularies other than thesauri Part 4: Interoperability between vocabularies Part 5: Exchange formats and protocols for interoperability
Even before the last part of BS 8723 was published, work began to adopt and adapt it as an international standard to replace ISO 2788 and ISO 5964. The project was led by a Working Group of ISO's Technical Committee 46 (Information and documentation) Subcommittee 9 (Identification and description) known as “ISO TC46/SC9/WG8 Structured Vocabularies”.
ISO 2788 and ISO 5964 were withdrawn in 2011, when they were replaced by the first part of ISO 25964. The second part of ISO 25964 was issued in March 2013, completing the project.
ISO 25964 is for thesauri intended to support information retrieval, and specifically to guide the choice of terms used in indexing, tagging and search queries.
The primary objective is thus summarised in the introduction to the standard as:
"If both the indexer and the searcher are guided to choose the same term for the same concept, then relevant documents will be retrieved."
Whereas most of the applications envisaged for ISO 2788 and ISO 5964 were databases in a single domain, often in-house or for paper-based systems, ISO 25964 provides additional guidance for the new context of networked applications, including the Semantic Web. A thesaurus is one among several types of controlled vocabulary used in this context.
A thesaurus compliant with ISO 25964-1 (as Part 1 is known) lists all the concepts available for indexing in a given context, and labels each of them with a preferred term, as well as any synonyms that apply. Relationships between the concepts and between the terms are shown, making it easy to navigate around the field while building up a search query. The main types of relationship include:
In multilingual thesauri equivalence also applies between corresponding terms in different natural languages. Establishing correspondence is not always easy, and the standard provides recommendations for handling the difficulties that commonly arise.
ISO 25964-1 explains how to build a monolingual or a multilingual thesaurus, how to display it, and how to manage its development. There is a data model to use for handling thesaurus data (especially when exchanging data between systems) and an XML schema for encoding the data. Both the model and the schema can be accessed 24/7, free of charge, on the official website hosted by NISO. The standard also sets out the features you should look for when choosing software to manage the thesaurus.
ISO 25964-2 deals with the challenges of using one thesaurus in combination with another, and/or with some other type of controlled vocabulary or knowledge organization system (KOS). The types covered include classification schemes, taxonomies, subject heading schemes, ontologies, name authority lists, terminologies and synonym rings. Within a single organization it is common to find several different such KOSs used in contexts such as the records management system, the library catalogue, the corporate intranet, the research lab, etc. To help users with the challenge of running a single search across all the available collections, ISO 25964-2 provides guidance on mapping between the terms and concepts of one thesaurus and those of the other KOSs. Where mapping is not a sensible option, the standard recommends other forms of complementary vocabulary use.
Similarly on the Internet there is an opportunity to make a simultaneous search of repositories and databases that have been indexed with different KOSs, on an even wider scale. Interoperability between the different networks, platforms, software applications, and languages (both natural and artificial) is reliant on the adoption of numerous protocols and standards. ISO 25964-2 is the one to address interoperability between structured vocabularies, especially where a thesaurus is involved.
Since Part 1 of ISO 25964 was published it has been adopted by the national standards bodies in a number of countries. For example, The British Standards Institution (BSI) in the UK has adopted it and labelled it unchanged as BS ISO 25964-1. At the time of writing similar consideration is under way for Part 2. The American standard ANSI/NISO Z39.19 - Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies covers some of the same ground as ISO 25964-1. It deals with monolingual lists, synonym rings and taxonomies as well as thesauri, but does not provide a data model, nor address multilingual vocabularies or other aspects of interoperability, such as mapping between KOSs. Where the two standards overlap, they are broadly compatible with each other. NISO is actively involved in both standards, having participated in the work of developing ISO 25964 as well as running its secretariat. The W3C Recommendation SKOS (Simple Knowledge Organization System) has a close relationship with ISO 25964 in the context of the Semantic Web. SKOS applies to all sorts of “simple KOSs” that can be found on the Web, including thesauri and others. Whereas ISO 25964-1 advises on the selection and fitting together of concepts, terms and relationships to make a good thesaurus, SKOS addresses the next step - porting the thesaurus to the Web. And whereas ISO 25964-2 recommends the sort of mappings that can be established between one KOS and another, SKOS presents a way of expressing the mappings when published to the Web.