Thesauri and Vocabulary Control - Bibliography

Publications on Thesaurus Construction and Use
with references to facet analysis, taxonomies, ontologies, topic maps, and related issues
An edited collection of resources originally compiled by Leonard D. Will and Sheena E. Will and presented on
The collection was updated by the TaxoBank editorial staff in October 2013. While many of the original resources are no longer available, the remaining items have historical interest and lasting value.
This is a list of printed and electronic publications about the principles of constructing and using information retrieval thesauri. At the end are a few references on the following:
·         facet analysis
·         search interfaces that provide for thesaurus use by combining terms from multiple facets, or multiple characteristics of division within a facet
·         lists of thesauri
·         taxonomies, the currently fashionable term for classification schemes
·         ontologies, a development of thesauri, supporting a greater number of relationship types between concepts, intended to be useful to software implementing aspects of the semantic web
·         topic maps, a way of structuring such an ontology, with the addition of links between concepts and information resources.  

Thesaurus structure and use

About Thesauri. Jessica L. Milstead. Brookfield, CT: The JELEM Company, 2000.
A brief introduction to what a thesaurus is, why an organization may need one, and the process of thesaurus construction.
After the Dot-Bomb: Getting Web Information Retrieval Right This Time. Marcia J. Bates. First Monday 7, no. 7 (July 2002).
In the excitement of the "dot-com" rush of the 1990's, many Web sites were developed that provided information retrieval capabilities poorly or sub-optimally. Suggestions are made for improvements in the design of Web information retrieval in seven areas. Classifications, ontologies, indexing vocabularies, statistical properties of databases (including the Bradford Distribution), and staff indexing support systems are all discussed. - [Author's abstract]

Art & Architecture Thesaurus. Toni Petersen, Director. 2nd ed. New York, Oxford: Oxford University Press, 1994.
ASIS&T Thesaurus of Information Science, Technology, and Librarianship, 3d ed. Alice Redmond-Neal and Marjorie M. K. Hlava, eds. Medford, New Jersey: Information Today, 2005. Price: $49.95, $39.95 ASIS&T members; Book with CD-ROM $79.95, $63.95 ASIS&T members. Fourth Edition is available online here.
An essential resource for indexers, researchers, scholars, students, and practitioners in the field. An optional CD-ROM includes the complete contents of the print thesaurus along with Data Harmony's Thesaurus Master software. - [Publisher's abstract]
The thesaurus includes over 1650 preferred terms, 1194 nonpreferred terms, and 16 branches. Scope notes and definitions of ambiguous terms are given. Contains an alphabetical listing, hierarchical listing, and rotated (KWIC) listing.
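For readers unfamiliar with a rotated listing, the following is a minimal sketch of the general idea: every multiword term is filed under each of its significant words, so a term can be found from any of them. The sample terms, the stopword list, and the function name `rotated_listing` are illustrative assumptions, not taken from the ASIS&T thesaurus.

```python
# Hypothetical stopword list; a real thesaurus display would define its own.
STOPWORDS = frozenset({"of", "and", "the", "for", "in"})


def rotated_listing(terms, stopwords=STOPWORDS):
    """Return (keyword, term) pairs, one per significant word, sorted by keyword."""
    entries = []
    for term in terms:
        for word in term.split():
            key = word.lower()
            if key not in stopwords:
                entries.append((key, term))
    return sorted(entries)


# Illustrative terms only.
sample_terms = ["information retrieval", "retrieval effectiveness", "thesauri"]
listing = rotated_listing(sample_terms)

for keyword, term in listing:
    # The keyword column shows the word under which the full term is filed.
    print(f"{keyword:<15} {term}")
```

In this sketch "information retrieval" appears twice, once under "information" and once under "retrieval", which is the property that distinguishes a rotated listing from a plain alphabetical one.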
Australian Pictorial Thesaurus (APT). Sydney: State Library of New South Wales, 2000.
"The purpose of the Australian Pictorial Thesaurus is to provide Australian terms for indexing Australian pictorial collections and a controlled vocabulary for searching across image databases on the Internet. It is intended that the APT will become a national standard for describing pictorial materials. All non-abstract topic terms are arranged hierarchically within five narrower terms [categories or tables: events & activities; ideas & concepts; manufactured objects; natural objects; people; places & structures]. Terms for abstract ideas and concepts are arranged following the Dewey Classification Scheme in a separate table [not available on the Web site].
The Australian Pictorial Thesaurus was developed through a joint project sponsored by the National Library of Australia, Australian Museums Online (AMOL) and led by the Council of Australian State Libraries (CASL). CASL is the peak body representing the eight State and Territory Libraries throughout Australia. AMOL is an initiative of the Cultural Ministers Council and its Heritage Collections Council. CASL, AMOL and the National Library are now sponsoring the ongoing management and development of the APT." - [Extracted from Web pages]
Although constructed for the indexing of pictorial material, this is a well-structured general thesaurus of 15,000 terms arranged hierarchically which may be of use for any collection of objects and images. The "about APT" page gives sound and useful advice not only on the use of this thesaurus but on indexing using any thesaurus.

British Standards Institution - Structured Vocabularies for Information Retrieval
The British Standards Institution established an array of standards pertaining to thesauri and other structured vocabularies, with the International Organization for Standardization (ISO) adopting them as ISO standards. The ISO versions have been adopted in several countries.
BS 8723-1:2005 (together with BS 8723-2:2005, BS 8723-4:2007, and DD 8723-5:2008) replaced the earlier British thesaurus standards that corresponded to ISO 2788 and ISO 5964. All of these were subsequently replaced by ISO 25964, Part 1: Thesauri for information retrieval, and Part 2: Interoperability with other vocabularies.
Classification Criteria for Historical Astronomical Instrumentation. F. Bònoli, M. Calisi, and P. Ranfagni. Atti del XVI Congresso Nazionale di Storia della Fisica e dell'Astronomia, Centro Volta, Villa Olmo, Como, 24 - 25 Maggio 1996 / a cura di Pasquale Tucci.
A discussion of the problems of variant names for historical astronomical instruments and a suggested outline classification by function.
Controlled Vocabularies: Recommended Reading. [Ottawa]: Library and Archives Canada. 

Criteria for the Evaluation of Thesaurus Software. Jochen Ganzmann. International Classification, 17(1990): 148-157.

Essential Classification. Vanda Broughton. London: Facet Publishing, 2004.
This paperback was published in October 2004 and represents the most recent addition to classification text books. The title may be a little misleading as the book also covers subject analysis and word based approaches such as lists of subject headings and thesauri. Thus, the book serves as a rather comprehensive introduction to the whole field of knowledge organization in libraries - with good practical overviews, summaries, exercises, a glossary, an index and a bibliography with suggestions for further reading. - [Extract from review by T. Henriksen.]

Essential Thesaurus Construction. Vanda Broughton. London: Facet Publishing, 2006.
This practical text examines the criteria relevant to the selection of a subject management system, describes the characteristics of some common types of subject tool, and takes the novice step-by-step through the process of creating a system for a specialist environment. The methodology employed is a standard technique for the building of a thesaurus that incidentally creates a compatible classification or taxonomy, both of which may be used in a variety of ways for document or information management. - [Publishers' Web site]

Exploiting LCSH, LCC, and DDC to Retrieve Networked Resources: Issues and Challenges. Lois Mai Chan. Washington, D.C.: Library of Congress, December 2000.

Extracting Value from Automated Classification Tools: the Role of Manual Involvement and Controlled Vocabularies. Kat Hagedorn. Ann Arbor, Michigan: Argus Associates, March 2001.
Automated classification tools can't solve today's large-scale web and intranet indexing challenges alone. Neither can humans. But solutions that integrate human expertise with software products such as Interwoven's Metatagger and Autonomy's Categorizer can provide real value and savings. After a brief introduction to automated classification, this white paper discusses the benefits and limitations of manual, automated, and hybrid approaches. It explores the opportunities for leveraging controlled vocabularies and thesauri to produce more effective indexing solutions. – [Author's summary]

Faceted Classification and Logical Division in Information Retrieval. Jack Mills. Library Trends, 52(Winter 2004): 541-570.
The main object of the paper is to demonstrate in detail the role of classification in information retrieval (IR) and the design of classificatory structures by the application of logical division to all forms of the content of records, subject and imaginative. The natural product of such division is a faceted classification. The latter is seen not as a particular kind of library classification but the only viable form enabling the locating and relating of information to be optimally predictable. A detailed exposition of the practical steps in facet analysis is given, drawing on the experience of the new Bliss Classification (BC2). The continued existence of the library as a highly organized information store is assumed. But, it is argued, it must acknowledge the relevance of the revolution in library classification that has taken place. It considers also how alphabetically arranged subject indexes may utilize controlled use of categorical (generically inclusive) and syntactic relations to produce similarly predictable locating and relating systems for IR. – [Author's abstract]

From Data to Knowledge Through Concept-Oriented Terminologies: Experience with the Medical Entities Dictionary. James J. Cimino. Journal of the American Medical Informatics Association, 7(2000): 288-297.
Knowledge representation involves enumeration of conceptual symbols and arrangement of these symbols into some meaningful structure. Medical knowledge representation has traditionally focused more on the structure than the symbols. Several significant efforts are under way, at local, national, and international levels, to address the representation of the symbols though the creation of high-quality terminologies that are themselves knowledge based. This paper reviews these efforts, including the Medical Entities Dictionary (MED) in use at Columbia University and the New York Presbyterian Hospital. A decade's experience with the MED is summarized to serve as a proof-of-concept that knowledge-based terminologies can support the use of coded patient data for a variety of knowledge-based activities, including the improved understanding of patient data, the access of information sources relevant to specific patient care problems, the application of expert systems directly to the care of patients, and the discovery of new medical knowledge. The terminological knowledge in the MED has also been used successfully to support clinical application development and maintenance, including that of the MED itself. On the basis of this experience, current efforts to create standard knowledge-based terminologies appear to be justified. - [Author's abstract]
GenThes: A General Thesaurus Browser for Web-Based Catalogue Systems: A Step Towards Component Based Catalogue Systems. Ralf Nikolai, Ralf Kramer, Marc Steinhaus, Paolo Plini, and Bruno Felluga.  
Thesauri have been proven means to identify documents in libraries for centuries. In this paper, we show how this approach can be combined with most recent Internet technologies. The Java-based general thesaurus browser GenThes is able to handle several heterogeneous, multilingual thesauri. With the General European Multilingual Environmental Thesaurus (GEMET), GenThes is currently being used with several environmental catalogue systems. Feedback from users reveal that this approach greatly facilitates search and retrieval as compared to free-text only search. Performance which is crucial in Web applications as well has been improved by reducing the transfer volume of data and code. The software architecture of GenThes supports easy configuration and adaption to the individual needs of different systems that it is connected to. Ongoing work as well addresses to use GenThes as a query expansion module for multilingual search in distributed document collections. - [Authors' abstract]
Guidelines for Forming Language Equivalents: A Model Based on the Art & Architecture Thesaurus. International Terminology Working Group, sponsored by The Getty Information Institute. This publication is a product of the Getty Information Institute. After the dissolution of the Getty Information Institute in June 1999, CHIN was asked to host the publication on the CHIN Web site.
Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies. Bethesda, Maryland: NISO Press, 2010.
NISO Z39.19 – 2005 (R2010) was developed by the National Information Standards Organization and reaffirmed May 13, 2010 by the American National Standards Institute.
"Presents guidelines and conventions for the contents, display, construction, testing, maintenance, and management of monolingual controlled vocabularies. It focuses on controlled vocabularies that are used for the representation of content objects in knowledge organization systems including lists, synonym rings, taxonomies, and thesauri." - [NISO standards]. - Care should be taken in interpreting this standard, because it uses some concepts with a wider meaning than the conventional one (as incorporated in the British Standard BS 8723, for example). Especially confusing is the use of the word "facet" in Z39.19 to refer to fields of a catalogue record (or "metadata elements") of a document rather than keeping the meaning to facets of a subject. It gets it right in the glossary. - [LDW]
I Say What I Mean, but Do I Mean What I Say? Paul Miller. Ariadne 23(22 March 2000).
In this paper, I'll take a look at some of the issues surrounding the use of controlled terminology, report on the recent MODELS 11 workshop which attempted to tackle some of them, and outline some of the recommendations for future work which arose from that workshop and a similar one held by the North American National Information Standards Organization (NISO) at the end of 1999. - [Introduction]

Indexing Languages and Thesauri: Construction and Maintenance. Dagobert Soergel. Los Angeles, CA: Melville, 1974. (Wiley Information Science Series)

Indexing Resources on the WWW: Database Indexing, Controlled Vocabularies and Thesauri. Mary Sue Stephenson. Vancouver: University of British Columbia, School of Library, Archival and Information Studies, 2002.
A bibliography of Web sources arranged under the following headings: constructing and using thesauri and term lists; links to online thesauri, term lists and classification schemes; online classification schemes; database indexing guidelines and discussions; relational indexing; automatic indexing; information architecture.
ISO 25964 – The International Standard for Thesauri and Interoperability with Other Vocabularies. Geneva: International Organization for Standardization, 2011 and 2013.
The international standard for thesauri is published in two parts: ISO 25964-1 Thesauri for information retrieval [published August 2011] and ISO 25964-2 Interoperability with other vocabularies [published March 2013]. These standards replace the British Standard BS 8723. Information available here.

Investigating Automated Assistance and Implicit Feedback for Searching Systems. Bernard J. Jansen, Michael D. McNeese, School of Information Sciences and Technology, The Pennsylvania State University. Proceedings of the American Society for Information Science and Technology, 41(2005): 280-286. Available online.
Information retrieval systems offering personalized automated assistance have the potential to improve the searching process. There has been much work in this area for several years on a variety of systems. However, there has been little empirical evaluation of automated assistance to determine if it is of real benefit for searchers. We report the results of empirical evaluation to investigate how searchers use implicit feedback and automated assistance during the searching process. Results from the empirical evaluation indicate that searchers typically use multiple implicit feedback actions, usually bookmark - copy. The most commonly utilized automated assistance was for query refinement, notably the use of the thesaurus. We discuss the implications for Web systems and future research. - [Author's abstract]
Integrating Thesaurus Relationships into Search and Browse in an Online Photograph Collection. Michelle Dalmau et al. Library Hi Tech, 23(2005): 425-452.
Design/methodology/approach – Surveys controlled vocabulary structures and their utility for catalogers and end-users. Reviews research literature and usability findings that informed the specifications for integration of the controlled vocabulary structure into search and browse functionality. Discusses database functions facilitating query expansion using a controlled vocabulary structure, and web application handling of user queries and results display. Concludes with a discussion of open-source alternatives and reuse of database and application components in other environments.
Findings – Affirms that structured forms of browse and search can be successfully integrated into digital collections to significantly improve the user's discovery experience. Establishes ways in which the technologies used in implementing enhanced search and browse functionality can be abstracted to work in other digital collection environments. - [Author's abstract]
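As a rough illustration of query expansion with a controlled vocabulary, the sketch below adds a term's entry terms (UF) and all of its narrower terms (NT) to a search. It is not the implementation described in the paper; the tiny thesaurus, the record layout, and the function name `expand_query` are assumptions made for the example.

```python
# Hypothetical thesaurus fragment: each preferred term lists its
# non-preferred entry terms (UF) and its narrower terms (NT).
THESAURUS = {
    "boats": {"UF": ["watercraft"], "NT": ["canoes", "sailboats"]},
    "canoes": {"UF": [], "NT": []},
    "sailboats": {"UF": ["sailing boats"], "NT": []},
}


def expand_query(term, thesaurus):
    """Return the term, its entry terms, and all narrower terms (depth-first)."""
    expanded = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current in expanded:
            continue  # guard against cycles or repeated terms
        expanded.append(current)
        record = thesaurus.get(current, {})
        # Entry terms are added as alternative search strings.
        expanded.extend(t for t in record.get("UF", []) if t not in expanded)
        # Narrower terms are visited in their listed order.
        stack.extend(reversed(record.get("NT", [])))
    return expanded


print(expand_query("boats", THESAURUS))
```

A term that is absent from the vocabulary is returned unchanged, so a search falls back to the user's own wording when no controlled term matches.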

Metadata: Cataloging by Any Other Name; Metadata Projects and Standards; Sources of Metadata Information on the Web. Jessica Milstead and Susan Feldman. Online (January 1999).

Networked Knowledge Organization Systems/Services (NKOS). A group who are discussing the data and functional model for enabling Knowledge Organization Systems such as thesauri, classification systems, and gazetteers as distributed, interactive services on the Internet.
The archive of the NKOS electronic discussion list is available here.
NISO Z39.19: Standard for Structure and Organization of Information Retrieval Thesauri. Jessica L. Milstead. Paper presented at the Taxonomic Authority Files Workshop, Washington, D.C., June 23, 1998.
Online Construction of Alphabetic Classaurus: A Vocabulary Control and Indexing Tool. F. J. Devadason. Information Processing and Management, 21(1985): 11-26.
Classaurus is a faceted hierarchic scheme of terms with vocabulary control features. It is a system of terms having separate hierarchic schedules of the Elementary Categories: Discipline, Entity, Property, and Action, together with their respective Species/Types, Parts and Special Modifiers. Also there are separate schedules for the Common Modifiers: Form, Time, Environment, and Place. Each of the terms in these hierarchic schedules is enriched with synonyms, quasi synonyms etc. The hierarchic schedules constituting the systematic part is supplemented by an alphabetical index of chain entries. Classaurus is used in the formulation of subject headings in general, and in particular, subject headings according to the Postulate based Permuted Subject Indexing (POPSI) language. For the construction of classaurus the POPSI language itself provides guidelines. A set of programs have been developed to construct a classaurus using as input, subject headings formulated according to POPSI language which are enriched with certain codes to denote the different Elementary Categories, their Species, Parts, Special Modifiers and other Common Modifiers of different kinds. The resulting classaurus has hierarchic schedules but terms in an array are arranged only alphabetically. The hierarchic schedules constitute the Systematic part of the classaurus. The system generates an alphabetic Index Part to the Systematic Part, in which for each term its broader terms are kept to its right hand side successively along with a code to denote the schedule to which the term belongs. To find out the position of a term in the Systematic Part, the whole entry for the term in the Alphabetic Part is taken and the sequence of the terms in it is reversed. Using the code for the schedule in the entry, the appropriate hierarchic schedule is selected. 
The schedule is then searched using the broader terms successively as keys until the term in question is reached, wherein all the hierarchically related terms could be found, including synonyms, quasi-synonyms etc. Both the Systematic Part and the Alphabetical Index Part are printed out for manual reference and also kept as direct access files for on-line access and on-the-spot updating and building up of the classaurus while inputting new subject headings formulated for this purpose. - [Abstract from publishers' web site; full text available for purchase online]
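The lookup procedure described in this abstract can be sketched roughly as follows: an index entry holds the term followed by its successively broader terms plus a schedule code; the chain is reversed, the schedule is selected by its code, and the schedule is descended using the broader terms as keys. The data structures, the schedule code "E", the sample terms, and the function name `locate` are assumptions for illustration, and synonym handling is omitted.

```python
# Hypothetical systematic part: one nested hierarchy per schedule code.
SCHEDULES = {
    "E": {"plants": {"grasses": {}, "trees": {"oak": {}, "pine": {}}}},
}

# Hypothetical alphabetic index: term -> (term followed by its successively
# broader terms, code of the schedule the term belongs to).
INDEX = {
    "oak": (["oak", "trees", "plants"], "E"),
    "trees": (["trees", "plants"], "E"),
}


def locate(term, index, schedules):
    """Return the broadest-first path to the term and its narrower terms."""
    chain, code = index[term]
    path = list(reversed(chain))   # reverse the entry: broadest term first
    node = schedules[code]         # select the schedule by its code
    for key in path:               # descend using each broader term as a key
        node = node[key]
    return path, sorted(node)


print(locate("oak", INDEX, SCHEDULES))
print(locate("trees", INDEX, SCHEDULES))
```

Terms within an array are returned alphabetically here, matching the abstract's remark that terms in an array are arranged only alphabetically.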
Organizing Information: Metadata and Controlled Vocabularies. Ray R. Larson. Berkeley, CA: University of California, Berkeley. School of Information Management and Systems [SIMS], 1998. PowerPoint presentation given to SIMS Affiliates Meeting, December 1998.
An overview of the field, including definitions; origins and uses of controlled vocabularies for information retrieval; metadata; types of indexing languages, thesauri and classification systems; process of design and development of thesauri. - [From introductory slide]

Powering Search: The Role of Thesauri in New Information Environments. Ali Shiri. Medford, NJ: Information Today, 2012. (ASIS&T monograph series).

Practical Taxonomies: Hard-Won Wisdom for Creating a Workable Knowledge Classification System. Sarah L. Roberts-Witt.  
Discussion of the importance of controlled vocabularies and classification systems in a business information environment, and some pointers to work on developing these.

The Rise of Ontologies or the Reinvention of Classification. Dagobert Soergel. Journal of the American Society for Information Science, 50(1999): 1119-1120.
Classifications/ontologies, thesauri, and dictionaries serve many functions, which are summarized in this note. As a result of this multiplicity of functions, classifications - often called ontologies - are developed in many communities of research and practice. Unfortunately, there is little communication and mutual learning; thus, efforts are fragmented, resulting in considerable reinvention and less than optimal products. - [Author's abstract]
Software Support for Thesaurus Construction and Display. Dagobert Soergel. In: Proceedings of the 5th ASIS SIG/CR Classification Research Workshop. Held at the 57th ASIS Annual Meeting, Oct. 16-20, 1994, Alexandria, VA. Silver Spring, MD: American Society for Information Science, Special Interest Group / Classification Research, 10(1994): 157-184. (Advances in Classification Research, v. 5)
Dr Soergel has written many other papers on thesauri and related topics. A list of these is given in his CV.
Systems of Knowledge Organization for Digital Libraries: Beyond Traditional Authority Files. Gail Hodge. Washington, D.C.: Council on Library and Information Resources, 2000. (CLIR publication, pub 91).
This report examines the use of knowledge organization systems - schemes for organizing information and facilitating knowledge management - in a digital environment.
Knowledge organization systems serve as bridges between a user's information needs and the material in a collection. Examples of such systems include term lists, such as dictionaries; classification schemes, such as ... [the Dewey Decimal Classification]; and relationship lists, such as thesauri. These and other types of knowledge organization systems, which vary in complexity, structure, and function, can improve the organization of digital libraries and facilitate access to their content. - [Publisher's abstract, amended.]
The useful Taxonomy of knowledge organization sources/systems from this report has been separately adopted as a draft

A Taxonomy Primer. Amy J. Warner. Ann Arbor, Michigan: Lexonomy, 2002.
Despite its title, this is a brief introduction to the principles of constructing thesauri and other forms of controlled vocabulary.

Thesauri and Controlled Vocabularies. National Library of Canada, 2001.
Includes a section on terminology used in the federal government of Canada as well as links to thesauri, classification schemes, controlled vocabularies, geographic names, subject clusters and taxonomies. There are also compendia compiled by experts in the field, construction and management tools, related research and projects, as well as standards. - [Message from Fay Hjartarson to DIGLIB: Digital Libraries Research mailing list, 2001-10-12]

The Thesaurus: Review, Renaissance, and Revision. Sandra K. Roe and Alan R. Thomas, eds. New York, London: Haworth, 2004. Co-published simultaneously as Cataloging and Classification Quarterly, 37:3/4(2004).
A collection of twelve papers by thesaurus specialists, with many references and an overall index, as follows:
·         Introduction, by Sandra K. Roe and Alan R. Thomas
·         The thesaurus: a historical viewpoint, with a look to the future, by Jean Aitchison and Stella Dextre Clarke
·         Teach yourself thesaurus: exercises, readings, resources, by Alan R. Thomas
·         A practical exercise in building a thesaurus, by James R. Shearer
·         Thesaurus construction: key issues and selected readings, by Marianne Lykke Nielsen
·         Thesaurus consultancy, by Leonard Will
·         Thesaurus evaluation, by Leslie Ann Owens and Pauline Atherton Cochrane
·         User comprehension and searching with information retrieval thesauri, by Jane Greenberg
·         Distributed thesaurus web services, by Eric H. Johnson
·         Tools of the trade: vocabulary management software, by Melissa A. Riesland
·         Multilingual subject access: the linking approach of MACS, by Patrice Landry
·         An interview with Dr. Amy J. Warner (June 2003), by Alan R. Thomas and Sandra K. Roe
Thesaurus Construction: [an introductory tutorial]. Tim Craven. London, Ontario: University of Western Ontario, Graduate School of Library and Information Science.
Step by step guidance on constructing a thesaurus, including collecting terms, deciding on the forms of terms, and choosing the appropriate relationships with which to link terms. Contains a glossary and quizzes.
Thesaurus Construction and Maintenance Guidelines. RBMS Bibliographic Standards Committee. Chicago: American Library Association, Association of College and Research Libraries, Rare Books and Manuscripts Section, 1998 (RBMS manual; 13).
Guidance on the choice of options and local decisions in applying NISO Z39.19 [erroneously referred to as Z39.13] to the development of existing thesauri and the construction of additional controlled vocabularies for rare books and manuscripts.

Thesaurus Construction and Use: a Practical Manual. Jean Aitchison, Alan Gilchrist, David Bawden. 4th ed. London: Aslib, 2000.
Comment on 3rd ed: The latest edition of the standard handbook for anyone setting out to construct a thesaurus. Based on the British, ISO and ANSI standards, it also includes an introduction to the principles of faceted classification which is helpful when constructing a thesaurus linked to such a scheme. Contains a full and up-to-date list of references, though references to Internet resources are few.
Thesaurus-Enhanced Search Interfaces. Ali Asghar Shiri, Crawford Revie, and Gobinda Chowdhury. Journal of Information Science, 28:2(2002): 111-122.
User interfaces to information retrieval systems play a major role in assisting users to search, browse and retrieve information relevant to their needs. This paper provides a review of a category of information retrieval interfaces that are enhanced by incorporating standard thesauri as part of their searching and browsing facilities. A brief account of the rationale behind the integration of thesauri as search aids in such interfaces is provided, based on research literature related to information searching behaviour, information retrieval interface evaluation, search term selection and query expansion. Two categories of search interfaces enhanced with thesauri are examined: those associated with research-based programmes and commercial web-based interfaces to bibliographic databases. Six commercial web-based databases are compared in terms of their thesaurus interface features. It is concluded that, although the number of thesaurus-enhanced interfaces is growing, few studies have focused on user interaction with these interfaces or fully explored the ways in which they can assist users in the search process. - [Authors' abstract.]
Thesaurus for Graphic Materials. TGM I: Subject Terms. TGM II: Genre and Physical Characteristic Terms. Compiled and edited by Prints and Photographs Division, Library of Congress. Washington, D.C.: Library of Congress, Cataloging Distribution Service, 1995.
This is the revised edition of two works: LC thesaurus for graphic materials, compiled by Elisabeth Betz Parker. LC, 1987. Descriptive terms for graphic materials, compiled and edited by Helena Zinkham and Elisabeth Betz Parker. LC, 1986.

Thesaurus Guide: Analytical Directory of Selected Vocabularies for Information Retrieval, 2nd ed. 1992. Prepared by EUROBrokers for the Commission of the European Communities. Luxembourg: European Communities, 1993. (EUR/92/14006); (Rapports EUR 14006). 
Describes over 600 structured vocabularies that have appeared in at least one of the official languages of the European Communities, including those published in the USA and Canada. This publication was accessible as a database on the European Union's ECHO service but it was withdrawn in early 1998 because it had not been updated for some years.
Toward Human-Computer Information Retrieval. Gary Marchionini. Bulletin of the American Society for Information Science and Technology. 32:5(2006): 20-22.
Encourages the development of systems that interact more with users, giving users control and assistance in search formulation and refinement and helping them to make full use of system capabilities.
Understanding Metadata. Bethesda, MD: NISO, 2004. 
Understanding metadata is a revision and expansion of Metadata made simpler: a guide for libraries published by NISO Press in 2001.

Use of Thesauri in the Full-Text Environment. Jessica L. Milstead. Indian Head, MD: The JELEM Company, 1998. Based on a paper presented at the 34th Clinic on Library Applications of Data Processing.
Utilizing Faceted Structures for Information Systems Design. Uta Priss, Elin Jacob. Proceedings of the 62nd Annual Meeting of ASIS (1999): 203-212.
A brief investigation of the structure of the websites of three schools of library and information science by seeing how easy it is to find answers to five typical questions. This is followed by a discussion of the advantages of a faceted thesaurus structure, from a mathematical point of view. Unfortunately the terminology used differs from that used in the library and information science community, so that though the ideas presented are valid there may be some confusion in interpreting them and relating them to discussions elsewhere. Other publications by Dr Uta Priss are listed on her Web site. - [LDW]

Vocabulary as a Central Concept in Library and Information Science. Michael Buckland. Preprint of paper published in Digital libraries: interdisciplinary concepts, challenges, and opportunities. Proceedings of the Third International Conference on Conceptions of Library and Information Science (CoLIS3), Dubrovnik, Croatia, 23-26 May 1999, edited by T. Aparac et al. Zagreb: Lokve, (1999): 3-12.
The nature and role of vocabulary in information systems is examined. "Vocabulary" commonly refers to the stylized adaptation of natural language to form indexes and thesauri. Much of bibliographic access, filtering, and information retrieval can be viewed as matching or translating across vocabularies. Multiple vocabularies are simultaneously present. A simple query in an online catalog normally involves at least five distinct vocabularies: those of the authors; the cataloger; the syndetic structure; the searcher; and the formulated query.
Vocabulary can be defined as the range (or repertoire) of values in any field of bibliographic description and, in a more extended sense, the range of types in a set at any level (word, field, collection, and library). Digital libraries can be represented by a simple recursive model composed of sets ("collections") and two kinds of operation on the sets.
Vocabulary problems are central to the economics of digital libraries because unfamiliar vocabulary reduces search effectiveness. Issues of identity are central to Library and Information Science because of the indexical role of vocabulary. Vocabulary is a central component in digital libraries. Problems inherent in vocabulary help explain the nature and history of conceptions of Library and Information Science. - [Author's abstract]

Vocabulary Control for Information Retrieval. F. W. Lancaster. 2d ed. Arlington, VA: Information Resources Press, 1986.

Vocabulary Links: Thesaurus Design for Information Systems. Seminar by Dr. Bella Hass Weinberg; synthesis by Fred Brown. KeyWords, American Society of Indexers, November/December 1998.
A seminar presented on April 3, 1998, at the Association of the Bar of the City of New York.
Web Services for Controlled Vocabularies. Diane Vizine-Goetz, Andrew Houghton, Eric Childress. Bulletin of the American Society for Information Science and Technology, 32:5(2006): 9-12.
"... we present an approach for using Web services to interact with controlled vocabularies. Services are implemented within a service-oriented architecture (SOA) framework. SOA is an approach to distributed computing where services are loosely coupled and discoverable on the network. A set of experimental services for controlled vocabularies is provided through the Microsoft Office (MS) Research task pane (a small window or sidebar that opens up next to Internet Explorer (IE) and other Microsoft Office applications). The research task pane is a built-in feature of IE when MS Office 2003 is loaded. The research pane enables a user to take advantage of a number of research and reference services accessible over the Internet. Web browsers, such as Mozilla Firefox and Opera, also provide sidebars which could be used to deliver similar, loosely-coupled Web services." - [From authors' introduction]

Workshop on Electronic Thesauri: Planning for a Standard. Report by Jessica Milstead. Bethesda, MD: National Information Standards Organization, 1999.
NISO (The National Information Standards Organization), APA (The American Psychological Association); ASI (The American Society of Indexers); and ALCTS (Association for Library Collections and Technical Services) sponsored an invitational workshop on November 4-5, 1999 in Washington, DC to investigate the desirability and feasibility of developing a standard for electronic thesauri. The review of ANSI/NISO Z39.19-1993(R1998) standard (Guidelines for the Construction, Format, and Management of Monolingual Thesauri) recommended such an investigation. - [first paragraph of report]

Workshop on the Compilation, Maintenance, and Dissemination of Taxonomic Authority Files (TAF). June 22-23, 1998; Washington, D.C. Sponsored by The National Science Foundation; The American Library Association, Association for Library Collections and Technical Services, Cataloging and Classification Section, Task Force on a Forum for Natural History Cataloging Issues; The United States Geological Survey, Biological Resources Division. San Francisco: California Academy of Sciences, 2001.
The compilation, maintenance, and dissemination of taxonomic authority files has much in common with authority control as conceived and practiced in the library cataloging and information retrieval communities. The purpose of this workshop is to provide a forum in which members of the systematics and library communities can describe their respective domains, and identify the concepts, practices, and technologies that could be shared to promote consistency in the cataloging, indexing, and retrieval of biological information. The two-day workshop consisted of 18 presentations by invited speakers, questions and comments for speakers, and discussion at the end of each session. - Index page
I. Examples of taxonomic authority compilation projects
II. A framework for authority control: from shared vocabularies to the cooperative cataloging of common data objects
III. Authorities, gazetteers, and thesauri
IV. Data structures for taxonomic names and classifications
V. Technologies for accessing and replicating authorities
WWW -- Wealth, Weariness or Waste: controlled vocabulary and thesauri in support of online information access. David Batty. D-Lib Magazine, November 1998.
This article offers some thoughts on the problems of access to information in a machine-sensible environment, and the potential of modern library techniques to help in solving them. It explains how authors and publishers can make information more accessible by providing indexing information that uses controlled vocabulary, terms from a thesaurus, or other linguistic assistance to searchers and readers. - [Author's abstract.]

Zthes: a Z39.50 Profile for Thesaurus Navigation. Mike Taylor. Ver. 0.4. Washington, D.C.: Library of Congress, Z39.50 International Standard Maintenance Agency, 15 Nov 2000.
This document describes an abstract model for representing and searching thesauri - semantic hierarchies of terms as described in ISO 2788 - and specifies how this model may be implemented using the Z39.50 protocol. It also suggests how the model may be implemented using other protocols and formats. ... As an example, an appendix defines an XML DTD for thesaurus terms based on the model, and includes an example XML document using that DTD. This profile does not mandate any relationship between a thesaurus and a database. The model is that terms from any thesaurus database may be used to search any other database (called a target database). - [Introduction]
Facet analysis
All About Facets and Controlled Vocabularies. Karl Fast, Fred Leise, and Mike Steckel. Boxes and Arrows, 9 December 2002.
This is the first in a series of articles . . . We intend to explain both facets and the more general concept of controlled vocabularies. We want to make the subject accessible to those who don't have advanced degrees in library and information science. Furthermore, we want to show how these concepts can be applied to solve information architecture problems for the Web and other digital information environments.
The concept of faceted classification is decades old, and controlled vocabularies go back even further. Consequently a great deal has already been written about the subject. But these writings are not always helpful to the practicing Information Architect. Some are too simple, others too academic. Most are hard to find, and many were written decades before this Web thing happened. - [Introduction]
The Essential Elements of Faceted Thesauri. Louise Spiteri. Cataloging & Classification Quarterly, 28:4(1999): 31-52.
The goal of this study is to evaluate, compare, and contrast how facet analysis is used to construct the systematic or faceted displays of a selection of information retrieval thesauri. More specifically, the study seeks to examine which principles of facet analysis are used in the thesauri, and the extent to which different thesauri apply these principles in the same way.
A measuring instrument was designed for the purpose of evaluating the structure of faceted thesauri. This instrument was applied to fourteen faceted information retrieval thesauri. The study reveals that the thesauri do not share a common definition of what constitutes a facet. In some cases, the thesauri apply both enumerative-style classification and facet analysis to arrange their indexing terms. A number of the facets used in the thesauri are not homogeneous or mutually exclusive. The principle of synthesis is used in only 50% of the thesauri, and no one citation order is used consistently by the thesauri. - Author's abstract

Facet Analysis: Using Faceted Classification Techniques to Organize Site Content and Structure. Louise Gruenberg. PowerPoint presentation at the ASIS&T 2002 Information Architecture Summit, Baltimore, MD, March 15-17, 2002.
An introduction to the principles of faceted classification and the construction of faceted schemes.

Faceted Access: A Review of the Literature. Amanda Maple. Middleton, WI: Music Library Association, Bibliographic Control Committee, 1995. (BCC95/WG FAM/2)
A paper presented during the Music Library Association Annual Meeting, 10 February 1995, at the open meeting sponsored by the Working Group on Faceted Access to Music.

Faceted Classification: A Guide to Construction and Use of Special Schemes, 1st ed. Prepared by B. C. Vickery for the Classification Research Group. Reprinted with additional material. London: Aslib, 1968.

Faucet Facets: A Few Best Practices for Designing Multifaceted Navigation Systems. Jeff Veen. San Francisco: Adaptive Path, 2002.
"So often we assume that Web sites should be hierarchically organized. We talk about a "home page" that offers "top-level navigation" so that users can "drill down" to the content. It's as if we're programmed to think top down. But what about information that isn't as easily structured this way? Sometimes, content has many attributes that have different importance to different users. A hierarchy assumes everyone approaches these attributes the same way, but that's often not the case." - Introduction.
How to Make a Faceted Classification and Put It on the Web. William Denton. Toronto: University of Toronto, November 2003.
"I have given a seven-step model for the creation of a faceted classification, and five questions to ask and four principles to follow when building a facet-based web interface. Faceted systems are very powerful, and their increasing popularity on the web is no surprise. They will only become more common, so it is important to design and deploy them well. All the benefits of a faceted classification can be fully realized on the web, giving users power that they have not had with simpler web-based systems or with faceted systems on paper." - Author's conclusion.

Putting Facets on the Web: An Annotated Bibliography. William Denton. Toronto: University of Toronto, October 2003.
Includes items categorized as "recommended", "background", "not relevant", "example Web sites" and "mailing lists".

Ranganathan and the Net: Using Facet Analysis to Search and Organise the World Wide Web. David Ellis and Ana Vasconcelos. Aslib Proceedings, 51:1(January 1999): 3-10.

A Simplified Model for Facet Analysis: Ranganathan 101. Louise Spiteri. Canadian Journal of Information and Library Science, 23(April-July 1998): 1-30.

The Use of Facet Analysis in Information Retrieval Thesauri: An Examination of Selected Guidelines for Thesaurus Construction. Louise F. Spiteri. Cataloging & Classification Quarterly, 25:1(1997): 21-37.
Facet analysis has been used in the construction of faceted thesauri since the publication of the Information Retrieval Thesaurus of Education Terms in 1968. In spite of the growth of the number of faceted thesauri since then, there appears to be little consensus among thesaurus designers regarding how the principles of facet analysis are to be used in thesauri. An examination of various national and international guidelines for thesaurus construction reveals that they emphasize primarily the construction of alphabetical thesauri, but provide little guidance in the use of facet analysis in thesauri. - [Journal abstract]
Search interfaces that illustrate use of thesaurus facets
There is an important distinction, not always recognized, between the following two approaches. The second of these is better called searching by parameters or characteristics rather than by facets.
1.    Combining multiple facets of a compound subject, such as people, organizations, abstract concepts, places, objects, materials and so on.
2.    Combining multiple characteristics of division of concepts within a single facet, e.g. within a materials facet there may be a concept of wines, subdivided into several arrays, not mutually exclusive, each headed by a node label such as <wines by color>, <wines by sweetness>, <wines by origin>, <wines by price> and so on.
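
The two approaches listed above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the records, facet names, terms, and the `search` function are all hypothetical and are not taken from any of the interfaces listed in this section.

```python
# Hypothetical indexed records: each maps a facet name to the set of terms assigned.
RECORDS = {
    "doc1": {"materials": {"red wines", "sweet wines"}, "places": {"France"}},
    "doc2": {"materials": {"white wines", "dry wines"}, "places": {"France"}},
    "doc3": {"materials": {"red wines", "dry wines"}, "places": {"Italy"}},
}


def search(records, required):
    """Return the ids of records that carry every required term in each named facet.

    `required` maps a facet name to a set of terms; all terms must be present (AND).
    """
    hits = []
    for doc_id, facets in records.items():
        if all(terms <= facets.get(facet, set()) for facet, terms in required.items()):
            hits.append(doc_id)
    return sorted(hits)


# Approach 1: combine terms from different facets (materials AND places).
across_facets = search(RECORDS, {"materials": {"red wines"}, "places": {"France"}})

# Approach 2: combine characteristics of division within a single facet
# (<wines by color> AND <wines by sweetness>, both inside the materials facet).
within_facet = search(RECORDS, {"materials": {"red wines", "dry wines"}})
```

In this sketch both approaches use the same matching logic; the difference lies only in whether the combined terms come from separate facets or from separate arrays within one facet.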

Aduna software. Amersfoort, The Netherlands, Administrator Nederland B.V.
A slide presentation of the Aduna AutoFocus "open source facet-driven enterprise search solution" is available online.

FLAMENCO Search Interface Project. Berkeley: University of California, 2002.
The Flamenco search interface framework has the primary design goal of allowing users to move through large information spaces in a flexible manner without feeling lost. A key property of the interface is the explicit exposure of category metadata, to guide the user toward possible choices, and to organize the results of keyword searches. The interface uses hierarchical faceted metadata in a manner that allows users to both refine and expand the current query, while maintaining a consistent representation of the collection's structure. This use of metadata is integrated with free-text search, allowing the user to follow links, then add search terms, then follow more links, without interrupting the interaction flow.
FLAMENCO stands for FLexible information Access using MEtadata in Novel COmbinations.

FACET Project: Thesaurus Based Access to Multimedia Collections - Faceted Retrieval Tools. School of Computing, University of Glamorgan.
The original work is unavailable as of 10/2013, but the project is described in a 2004 paper.
This project investigates the ways in which access to a thesaurus can be used by searchers to formulate searches of an indexed collection, and ways in which search terms can be expanded automatically to include additional terms more or less closely linked to the initial term by relationships in the thesaurus structure. A Web-based demonstration of the query expansion system is available.
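
The idea of automatic term expansion described above can be sketched in a few lines of Python. This is a hedged illustration, not the FACET project's method: the small thesaurus, the relationship codes used as dictionary keys, and the `expand` function are assumptions made for the example.

```python
from collections import deque

# Hypothetical thesaurus: term -> {relationship type: [linked terms]}.
# BT = broader term, NT = narrower term, RT = related term.
THESAURUS = {
    "wines": {"NT": ["red wines", "white wines"], "RT": ["vineyards"]},
    "red wines": {"BT": ["wines"], "NT": ["claret"]},
    "white wines": {"BT": ["wines"]},
    "claret": {"BT": ["red wines"]},
    "vineyards": {"RT": ["wines"]},
}


def expand(term, relations=("NT",), max_steps=1):
    """Return the term plus all terms reachable within `max_steps` links
    of the given relationship types (breadth-first traversal)."""
    seen = {term}
    queue = deque([(term, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_steps:
            continue  # do not follow links beyond the step limit
        for rel in relations:
            for linked in THESAURUS.get(current, {}).get(rel, []):
                if linked not in seen:
                    seen.add(linked)
                    queue.append((linked, depth + 1))
    return sorted(seen)


narrow_only = expand("wines")                         # one step along NT links
wider = expand("wines", relations=("NT", "RT"), max_steps=2)
```

Limiting the relationship types and the number of steps is one simple way to control how "closely linked" the added terms are; a real system might instead weight each relationship type differently.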
Faceted Metadata Search and Browse. Search Tools Consulting, 2003.
Examples and discussion of faceted search interfaces given under “background topics”.

FacetMap. Travis Wilson. Austin, TX: Travis Wilson, [200?].
A simple on-line demonstration of how three facets from a thesaurus can be combined by a searcher to successively refine a query. A visualization of facets is also shown.
International Children's Digital Library: a joint project of The University of Maryland and The Internet Archive.
The International Children's Digital Library (ICDL) is a 5-year research project to develop innovative software and a collection of books that specifically address the needs of children as readers. The search interface allows selection from 13 facets, alone or in combination, with multiple selections from the same facet when required. Needs Java, a broadband connection and a reasonably fast machine with at least 256k memory.
Social Care Online. London: Social Care Institute for Excellence (SCIE).
An interesting example of a site that displays thesaurus hierarchies and allows multiple selections. A few aspects could be improved, for example: (1) the numbers of postings are not updated dynamically as terms are selected; (2) terms can be pasted into a search box (limited to AND operators) only from the "simple search" interface and not the "advanced search"; (3) related term (RT) links are shown even when there are no related terms to display; (4) it is not possible to display two separate hierarchies simultaneously (though the help file indicates that it is). The site shows good possibilities, though, and if improvements such as those noted above could be implemented it would be one of the best.
Lists of thesauri
Networked Knowledge Organization Systems (NKOS) have drafted a registry of attributes that may be used when describing thesauri, classifications and other types of knowledge organization system.
The following lists can be used to find many general and specialist thesauri.
The University of Toronto Library maintains a Subject Analysis Systems (SAS) Collection which is the North American Clearinghouse for classification schemes, subject heading lists, and thesauri, and the major world collection in the English language, including multilingual thesauri containing English language sections. These are included in the library catalog, and records may be retrieved by doing a subject search for "subject headings". Searching for "classification" also retrieves many relevant items.

Classification Schemes and Thesauri On-line. Anne Betz.
Annotated lists of thesauri (by subject and language), classification schemes and thesaurus management software. "The thesaurus collection is based on the results of the interdisciplinary seminar 'Terminology documentation and multilingual thesauri' held in summer 1998 and in which participated students from both the Department of LIS and the Department of Languages" of the Fachhochschule Köln. - Last update: 11/12/98.

Controlling Your Language: Links to Metadata Vocabularies. TASI (Technical Advisory Service for Images).
Introduction to the concepts of thesauri, subject headings, word lists, classifications and name authority lists, and links to more than 70 vocabulary sources.

RBMS Thesauri for Use in Rare Book and Special Collections Cataloging. RBMS Bibliographic Standards Committee. Chicago: American Library Association, Association of College and Research Libraries, Rare Books and Manuscripts Section.
Thesauri titles are: Binding terms; Printing & publishing evidence; Genre terms; Provenance evidence; Paper terms; Type evidence

MARC Code List: PART IV: Term, Name, Title Sources. Library of Congress.
Subject/index sources contains a list of works which are sources of subject headings or index terms, along with the code assigned to each source work. The purpose of this list is to allow the source of the heading or term in a field of a MARC record to be designated by a code in that field.

Taxonomy Warehouse
This free service aims to provide a comprehensive directory of taxonomies, thesauri, classification schemes and other authority files from around the world, plus information about taxonomy references, resources and events.
Taxonomies
The Accidental Taxonomist. Heather Hedden. Medford, NJ: Information Today, 2010.

The Accidental Taxonomist is the most comprehensive guide available to the art and science of building information taxonomies. Heather Hedden – one of today’s leading writers, instructors, and consultants on indexing and taxonomy topics – walks readers through the process, displaying her trademark ability to present highly technical information in straightforward, comprehensible English. Drawing on numerous real-world examples, Hedden explains how to create terms and relationships, select taxonomy management software, design taxonomies for human versus automated indexing, manage enterprise taxonomy projects, and adapt taxonomies to various user interfaces. The result is a practical and essential guide for information professionals who need to effectively create or manage taxonomies, controlled vocabularies, and thesauri. - [Publisher's information]

Where's My Stuff?: Taxonomy and Lexicon as Keys to Access. Mary Chitty. Newton Upper Falls, MA: Cambridge Healthcare Institute, 2002.
Outline of a talk given to the [USA] Special Libraries Association, 10th June 2002. Includes links to a bibliography on taxonomies and related topics.

Taxonomy Software to the Rescue. Paola Maio. Online Journalism Review, 2001-10-12
Brief review of automatic classification / categorization software, with links to the Web sites of the following suppliers: Autonomy, EoExchange, Inxight, Mohomine, Quiver, Semio, Verity.

Value of Organised Knowledge. Jack Bryar. CMS Watch, 21 January 2002.
General information about taxonomies as used in business applications, with reference to the use of XML tags.

Ontologies
Ontology Building: A Survey of Editing Tools. Michael Denny. 2002-11-06.
A textual summary of what ontology editing tools are, with a tabular summary of the main features of 52 different tools.

A Survey on Ontology Tools. Amsterdam: OntoWeb Consortium, 2002-05-31. (Ontology-based information exchange for knowledge management and electronic commerce: IST-project 2000-29243; Deliverable 1.3).
This deliverable presents a survey of the most relevant ontology tools and semantic web technology available in our community. This survey is divided into several sections, which group different kinds of ontology tools, namely: ontology development, ontology merge, ontology evaluation, ontology-based annotation and ontology storage and querying. - [Executive summary].
Topic maps
Metadata? Thesauri? Taxonomies? Topic Maps!: Making Sense of It All. Lars Marius Garshol. Journal of Information Science, 30:4(August 2004): 378-391. Also available online.
To be faced with a document collection and not to be able to find the information you know exists somewhere within it is a problem as old as the existence of document collections. Information Architecture is the discipline dealing with the modern version of this problem: how to organize web sites so that users actually can find what they are looking for.
Information architects have so far applied known and well-tried tools from library science to solve this problem, and now topic maps are sailing up as another potential tool for information architects. This raises the question of how topic maps compare with the traditional solutions, and that is the question this paper attempts to address.
The paper argues that topic maps go beyond the traditional solutions in the sense that it provides a framework within which they can be represented as they are, but also extended in ways which significantly improve information retrieval. - [Author's abstract].

The TAO of Topic Maps: Finding the Way in the Age of Infoglut. Steve Pepper. Oslo: Ontopia.
Topic maps are a new ISO standard for describing knowledge structures and associating them with information resources. As such they constitute an enabling technology for knowledge management. Dubbed "the GPS of the information universe", topic maps are also destined to provide powerful new ways of navigating large and interconnected corpora.
While it is possible to represent immensely complex structures using topic maps, the basic concepts of the model - Topics, Associations, and Occurrences (TAO) - are easily grasped. This paper provides a non-technical introduction to these and other concepts (the IFS and BUTS of topic maps), relating them to things that are familiar to all of us from the realms of publishing and information management, and attempting to convey some idea of the uses to which topic maps will be put in the future. - [Author's abstract].
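
The three basic concepts named in the abstract above (Topics, Associations, and Occurrences) can be pictured with a minimal Python sketch. This is an assumption-laden illustration only: the class names, fields, sample data, example URL, and helper function are hypothetical and do not reproduce the syntax or data model of the topic maps standard.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A subject of interest, with occurrences linking it to information resources."""
    name: str
    occurrences: list = field(default_factory=list)  # resource addresses (hypothetical)


@dataclass
class Association:
    """A typed relationship between two or more topics, identified here by name."""
    kind: str
    members: tuple


# Hypothetical sample data.
composer = Topic("Puccini", occurrences=["http://example.org/puccini-biography"])
opera = Topic("Tosca")
composed_by = Association("composed by", ("Tosca", "Puccini"))


def topics_associated_with(name, associations):
    """Return the names of topics linked to `name` by any association."""
    return sorted(
        {m for a in associations if name in a.members for m in a.members if m != name}
    )


linked = topics_associated_with("Puccini", [composed_by])
```

Here associations play a role comparable to thesaurus relationships between concepts, while occurrences add the links from concepts to information resources.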

Topic Maps - A Standard for Information Organisation. Kal Ahmed. Oxford: Techquila.
Several papers on this site include A practical introduction to topic maps and Topic map patterns for information architecture. The second of these discusses how a variety of standard information structures from the domain of information science can be represented in topic maps. Also includes a description of PSI (Published Subject Identifier) sets for thesauri, site maps and faceted classification schemes.