Chapter 5. Computerised Information Retrieval

5.1 Introduction

Today computers provide us with powerful tools for information handling - for collection, organisation, classification, retrieval and distribution. Computers have been used since the late 1960s for the storage of large databases such as library catalogues and bibliographic references. Development of optical storage media such as CD-ROM has given us the possibility of storing large quantities of text, graphics, pictures, and sound at a low cost. These new optical memories can function as distributed stores for encyclopaedias, databases, books etc. This has stimulated the development of local information systems. This chapter will cover three aspects of computerised information retrieval:

library catalogues
online databases
databases on CD-ROM

5.2 Types of databases

There are a number of types of databases:

Library catalogues - catalogues covering the holdings (books, reports, journals, conference proceedings, etc.) of one or more libraries.
Bibliographic databases containing bibliographic references, with or without abstracts.
Reference databases (in addition to the library catalogues and bibliographic databases mentioned above), for example, current research projects, handbooks, encyclopaedias, product suppliers, etc.
Factual databases or data banks containing information, often in numerical form, which can be used directly, e.g. chemical structures, tables, terminology.
Full-text databases which contain the complete version of the text of given publications.

5.3 Computerised library catalogues

Computerised library catalogues were first introduced during the late 1960s. The online catalogue, known as the Online Public Access Catalogue, or OPAC, has gradually become more user friendly with the use of menus and simple commands. Access for users is now often in the form of a Web (World Wide Web) interface. Computerised library catalogues usually form an integral part of an automated library system, which includes circulation routines as well as acquisition processing. Computerised library catalogues contain the details of books, conference publications, reports, periodical titles, etc. Note: OPACs do not, as a rule, contain details of individual journal articles.

The computerised library catalogues allow you to:

check whether a certain book or journal is held by the library
see which books are available on a specific subject
see whether a book is currently available or out on loan.

In addition to the automated individual library catalogues, there are union catalogues, which show the holdings of a number of libraries and indicate where a given item is available. One example of a union catalogue is MELVYL - the catalogue of the nine campuses of the University of California. Another is LIBRIS (LIBRary Information System), the union catalogue of the Swedish academic and research libraries.

Today you are able to access many library catalogues by means of telnet or the World Wide Web.

5.4 From Document to Online Database

The enormous growth of published information in the 1960s resulted in long delays between the appearance of the primary publications and the appearance of the secondary indexes and abstracts which referred to them. In order to speed up the production of the secondary publications, computers began to be used in the printing process. The bibliographic reference material was stored in the form of a structured database in the computer memory. This stored information could then be used either for printing the abstracts and indexes, or for direct information retrieval via a terminal (see Figure 12). In addition, the databases can now be stored on optical media such as CD-ROM, which are available for information retrieval (see Section 5.9).

Figure 12. From Document to Secondary Publication or Database

5.5 Access to databases

Information from the primary sources has been collected together and organised under subject headings and authors in reference databases. These can be accessed in a number of ways:

by searching online in a database mounted on the host computer of a commercial information retrieval service (IRS). This requires a password.
by means of a searchable CD-ROM (compact disc) database
from a database with a WWW interface, either mounted locally or available from a remote server

Online information retrieval from databases is the acquisition of information from a distant computer via a terminal or PC, involving an interactive dialogue between enquirer and computer. The computer handles a number of databases stored in electronic form, consisting of references to journal articles, conference papers, reports, books etc, which the Information Retrieval Service (IRS) or 'host' makes available to interested parties, such as university libraries, on a commercial basis.

CD-ROMs and WWW interfaces have been designed for end-users. They are relatively user-friendly and the search software is more or less self-explanatory. Today, CD-ROMs are often mounted on a server, so in practice the user will not notice any difference between using an online database and a CD-ROM.

Computer-based information retrieval involves an interactive dialogue between the enquirer and database. The computer matches any input search terms against its files, and then displays any resulting matches. These can then be printed out or downloaded by the searcher. Searches can be carried out directly by end-users or by information specialists acting as intermediaries.
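The matching step described above can be sketched with a simple word index. This is only a minimal illustration under stated assumptions: the records are invented, only title words are indexed, and the names `build_index` and `match` are hypothetical. Real retrieval systems index many more fields and use far more elaborate files.

```python
# Hypothetical records; the titles are invented for illustration only.
RECORDS = [
    {"id": 1, "title": "Optical storage media for library catalogues"},
    {"id": 2, "title": "Online retrieval of bibliographic references"},
    {"id": 3, "title": "Storage of bibliographic databases on CD-ROM"},
]


def build_index(records):
    """Map each lower-cased title word to the set of record ids containing it."""
    index = {}
    for record in records:
        for word in record["title"].lower().split():
            index.setdefault(word, set()).add(record["id"])
    return index


def match(index, term):
    """Return the sorted ids of records whose title contains the search term."""
    return sorted(index.get(term.lower(), set()))


INDEX = build_index(RECORDS)
print(match(INDEX, "bibliographic"))  # [2, 3]
print(match(INDEX, "storage"))        # [1, 3]
```

The number of ids returned for a term corresponds to the count of references that a host system reports back for each term keyed in.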

Information is stored in the form of a structured database on a host computer and is available online to users by means of communication networks such as the Internet. Computerised information retrieval or online searching is carried out in the form of a dialogue in real time between the user at his/her computer terminal or personal computer (PC) and the various databases stored on a host computer. The various groups involved in computerised information retrieval are the database producers, the host vendors or system operators, institutions providing terminals or PCs, intermediaries and end-users.

5.6 Examples of databases

Examples of databases which are useful in physics and electrical engineering are:

INSPEC
PASCAL
SciSearch
CORDIS
CA File
COMPENDEX
ENERGY SCIENCE & TECHNOLOGY
Engineering and Industrial Software
INIS
NTIS
World Translations Index - WTI
Grants

5.7 The Computerised Interactive Online Search for Information

This section will describe the processes involved in an online search. In interactive computerised information retrieval, you work at a terminal or PC, in direct contact (via communication networks) with a vast amount of information stored in a central computer memory. You conduct a dialogue with the central computer, by entering enquiries via the keyboard and receiving replies on the printer and screen of the terminal or PC.

The basic search process is similar to that for manual searching, namely definition and analysis of the search question or topic, the identification of suitable search terms, the design of the search strategy, choice of appropriate databases, and the interactive online dialogue. You carry out the search in a number of stages:

Formulate the question so that it really covers your information need.
Find appropriate keywords (by means of dictionaries and thesauri) for the search concepts.
Key in the appropriate password to a suitable information system host computer, through either the academic network (remote access) or a public communication network. This is called logging on and gives access to the information retrieval system. (Each user or group has a specific password to facilitate invoicing of charges.)
The system replies with information about the databases available and you select and key in the base chosen for the search.
You then key in the terms which cover the search question.
An interactive dialogue then ensues between you (the user) and the system, in which the computer replies with information on how many references the database contains for each term keyed in.
You develop a search strategy in which terms are grouped in parameters and linked, according to Boolean logic, with AND, OR, or NOT operators (see Figure 13). This search strategy is keyed in.
You can develop the search further: broaden it by adding related terms and synonyms (linked with OR), or narrow it by combining several parameters (linked with AND).
The system responds with the number of references fulfilling the prescribed conditions.
You ask for a display of sample references.
If the sample references are relevant, you can order a print-out of the total available references of this type, either directly "on-line" via the terminal or "off-line" with a print-out at the computer centre and subsequent delivery by post. (Off-line printing is used if there are many references, in order to reduce costs.) Alternatively, you can download the references to your computer disk.
You terminate the search by the log-off command and the connection to the host computer is broken.

Figure 13. AND, OR and NOT Logic illustrated by a Venn Diagram
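The Boolean operators map directly onto set operations, which is what the Venn diagram depicts. The sketch below assumes that each search term has already retrieved a set of record numbers; the terms and numbers are invented for illustration and `postings` is a hypothetical helper.

```python
# Hypothetical result sets: the record numbers retrieved for each search term.
POSTINGS = {
    "laser": {1, 2, 3, 5},
    "maser": {3, 4},
    "semiconductor": {2, 3, 6},
    "review": {3, 5},
}


def postings(term):
    """Return the set of record numbers for a term (empty if the term is unknown)."""
    return POSTINGS.get(term, set())


# OR broadens the search: records containing either of two related terms.
s1 = postings("laser") | postings("maser")
# AND narrows the search: records that also contain a second concept.
s2 = s1 & postings("semiconductor")
# NOT excludes: drop the records that contain an unwanted term.
s3 = s2 - postings("review")

for label, result in [("S1", s1), ("S2", s2), ("S3", s3)]:
    print(label, len(result), sorted(result))
# S1 5 [1, 2, 3, 4, 5]
# S2 2 [2, 3]
# S3 1 [2]
```

Each printed line mirrors the system's reply in the dialogue: a set label, the number of references fulfilling the conditions, and here also the matching record numbers.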

Computerised information retrieval is an interactive dialogue, and the results that you obtain depend very much on the careful preparation of your search. Examples of search strategies within various subject areas are given in the Manual of Online Search Strategies, by Armstrong & Large, 1992,[24] and in Online Searching in Science and Technology, 1991.[25]

5.7 The advantages of Computerised Information Retrieval

Computerised information retrieval has several advantages over manual methods for literature searching:

You save time.
Information stored in a database is more current than in the corresponding printed publication.
You can search for information in several subject areas during the same search, for example, information on environmental pollution can be searched for in databases covering biology, engineering and chemistry.
You can carry out a more detailed search with the help of the computer than by manual methods. In the printed publications, it is usually only possible to search under subject headings or authors, whereas the computerised system permits many more search entries such as institution, title of journal, classification code, keywords or descriptors and words included in the title/abstract. Every unit of information which has been stored in the computer is potentially searchable.
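To illustrate the last point, the sketch below treats every stored field as a possible search entry. The records, field names and the `search` function are hypothetical, and the simple substring matching used here is an assumption, not how any particular host system works.

```python
# Hypothetical bibliographic records; all values are invented for illustration.
RECORDS = [
    {"id": 1, "author": "Author A", "institution": "Example University",
     "journal": "Journal of Examples", "title": "Air pollution from power plants"},
    {"id": 2, "author": "Author B", "institution": "Sample Institute",
     "journal": "Journal of Examples", "title": "Water pollution monitoring"},
    {"id": 3, "author": "Author A", "institution": "Example University",
     "journal": "Sample Letters", "title": "Semiconductor lasers"},
]


def search(records, **criteria):
    """Return records in which every named field contains its search text."""
    hits = []
    for record in records:
        if all(
            text.lower() in str(record.get(field, "")).lower()
            for field, text in criteria.items()
        ):
            hits.append(record)
    return hits


# Search by journal title, an entry point rarely offered by printed indexes.
print([r["id"] for r in search(RECORDS, journal="journal of examples")])  # [1, 2]
# Combine an institution with a word from the title.
print([r["id"] for r in search(RECORDS, institution="example university",
                               title="pollution")])                        # [1]
```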

5.8 Computerised information searching

Three types of skill are necessary for carrying out interactive computerised information searches:

subject knowledge;
skill in using the PC;
knowledge of the information system to be used - the appropriate commands, type and quantity of information available, database structure, etc.

The following factors have contributed to easier online searching for end-users:

In recent years considerable effort has been spent on making computerised information systems more "user friendly", that is, simpler to use. A number of system operators have developed such systems, typically in the form of Web-based interfaces.

There are still difficulties for the end-user. The various databases, which are produced by different organisations, are not standardised as to structure, layout or indexing terms. Another problem is that, if searching is carried out only infrequently, the various commands may feel unfamiliar. One way to overcome this is to make use of "refresher training". The Into Info demonstrations and exercises will be designed to help you in your searching.