Chapter 5. Computerised Information Retrieval
5.1 Introduction
Today computers provide us with powerful tools for information handling
- for collection, organisation, classification, retrieval and distribution.
Computers have been used since the late 1960s for the storage of large
databases such as library catalogues and bibliographic references. Development
of optical storage media such as CD-ROM has given us the possibility of
storing large quantities of text, graphics, pictures, and sound at a low
cost. These new optical memories can function as distributed stores for
encyclopaedias, databases, books etc. This has stimulated the development
of local information systems. This chapter will cover three aspects of
computerised information retrieval:
-
library catalogues
-
online databases
-
databases on CD-ROM
5.2 Types of databases
There are a number of types of databases:
-
Library catalogues - catalogues covering the holdings (books, reports,
journals, conference proceedings, etc.) of one or more libraries.
-
Bibliographic databases containing bibliographic references, with
or without abstracts.
-
Reference databases (in addition to the library catalogues and
bibliographic databases described above), for example, current research
projects, handbooks, encyclopaedias, product suppliers, etc.
-
Factual databases or data banks containing information, often in
numerical form, which can be used directly, e.g. chemical structures, tables,
terminology.
-
Full-text databases which contain the complete version of the text
of given publications.
5.3 Computerised library catalogues
Computerised library catalogues were first introduced during the late 1960s.
The online catalogue, known as the Online Public Access Catalogue, or OPAC,
has gradually become more user friendly with the use of menus and simple
commands. Access for users is now often in the form of a Web (World Wide
Web) interface. Computerised library catalogues usually form an integral
part of an automated library system, which includes circulation routines
as well as acquisition processing. Computerised library catalogues contain
the details of books, conference publications, reports, periodical titles,
etc. Note: OPACs do not, as a rule, contain details of individual journal
articles.
The computerised library catalogues allow you to:
-
check to see if a certain book or journal is available at the library
-
see which books are available on a specific subject
-
see whether or not a book is currently available or out on loan.
In addition to the automated individual library catalogues, there are union
catalogues, which show the holdings of a number of libraries and indicate
where a given item is available. One example of a union catalogue is MELVYL
- the catalogue of the nine campuses of the University of California. Another
is LIBRIS (LIBRary Information System), the union catalogue of the Swedish
academic and research libraries.
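The idea of a union catalogue can be illustrated with a minimal sketch. It assumes each library's catalogue is a simple mapping from title to loan status; the library names, titles and the function `union_lookup` are hypothetical and do not describe how MELVYL or LIBRIS are actually implemented.

```python
# Hypothetical holdings: each library's catalogue maps a title to its loan status.
catalogues = {
    "Library A": {
        "Handbook of Optics": "available",
        "Principles of Electrical Engineering": "on loan",
    },
    "Library B": {
        "Principles of Electrical Engineering": "available",
    },
}


def union_lookup(title, catalogues):
    """Return {library: status} for every library that holds the title."""
    return {
        library: holdings[title]
        for library, holdings in catalogues.items()
        if title in holdings
    }


# Shows where the item is held and whether it is currently out on loan.
print(union_lookup("Principles of Electrical Engineering", catalogues))
```

A title held by no library simply yields an empty result, which corresponds to "not available at any participating library".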
Today you are able to access many library catalogues by means
of telnet or the World Wide Web.
5.4 From Document to Online Database
The enormous growth of published information in the 1960s resulted in long
time delays between the primary publications and the appearance of the
secondary indexes and abstracts which referred to them. In order to speed
up the production of the secondary publications, computers began to be
used in the printing process. The bibliographic reference material was
stored in the form of a structured database in the computer memory. This
stored information could then be used either for printing the abstracts
and indexes, or for direct information retrieval via a terminal (see Figure
12). In addition, the information databases can now be stored in optical
memories, such as CD-ROM, which are available for information retrieval
(see Section 5.9).
Figure 12. From Document to Secondary Publication or Database
5.5 Access to databases
Information from the primary sources has been collected together and organised
under subject headings and authors in reference databases. These
can be accessed in a number of ways:
-
searching online from a database mounted on a host computer from
a commercial information retrieval service (IRS). This requires
a password.
-
by means of a searchable database on compact disc (CD-ROM)
-
from a database with WWW interface mounted either locally or available
from a remote server
Online information retrieval from databases is the acquisition of
information from a distant computer via a terminal or PC, involving an
interactive dialogue between enquirer and computer. The computer handles
a number of databases stored in electronic form, consisting of references
to journal articles, conference papers, reports, books etc, which the Information
Retrieval Service (IRS) or 'host' makes available to interested parties,
such as university libraries, on a commercial basis.
CD-ROMs and WWW interfaces have been designed for end-users.
They are relatively user-friendly and the search software is (more-or-less)
self-explanatory. Today, CD-ROMs are often mounted on a server, so in practice
the user will not notice any difference between using an online
database and a CD-ROM.
Computer-based information retrieval involves an interactive dialogue
between the enquirer and database. The computer matches any input search
terms against its files, and then displays any resulting matches. These
can then be printed out or downloaded by the searcher. Searches can be
carried out directly by end-users or by information specialists acting
as intermediaries.
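How a computer might match search terms against its files can be sketched with an inverted index, a common technique in which each word points to the records that contain it. This is an illustrative assumption, not a description of any particular host system; the sample records and the functions `build_index` and `search` are hypothetical.

```python
from collections import defaultdict

# Hypothetical bibliographic records: record number -> title text.
records = {
    1: "Laser diodes for optical communication",
    2: "Optical storage media and CD-ROM",
    3: "Noise in semiconductor laser diodes",
}


def build_index(records):
    """Map each lower-cased word to the set of record numbers containing it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for word in text.lower().split():
            index[word].add(rec_id)
    return index


def search(term, index):
    """Return the sorted record numbers that match a single search term."""
    return sorted(index.get(term.lower(), set()))


index = build_index(records)
for term in ("laser", "optical", "cd-rom"):
    hits = search(term, index)
    print(f"{term}: {len(hits)} reference(s) -> {hits}")
```

The system reports the number of matching references first, and the matching records can then be displayed, printed or downloaded.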
Information is stored in the form of a structured database on a host
computer and is available online to users by means of communication networks
such as the Internet. Computerised information retrieval or online searching
is carried out in the form of a dialogue in real time between the
user at his/her computer terminal or personal computer (PC) and the various
databases stored on a host computer. The various groups involved in computerised
information retrieval are the database producers, the host vendors
or system operators, institutions providing terminals or PCs,
intermediaries and end-users.
5.6 Examples of databases
Examples of databases which are useful in physics and electrical engineering
are:
-
INSPEC
-
PASCAL
-
SciSearch
-
CORDIS
-
CA File
-
COMPENDEX
-
ENERGY SCIENCE & TECHNOLOGY
-
Engineering and Industrial Software
-
INIS
-
NTIS
-
World Translations Index - WTI
-
Grants
5.7 The Computerised Interactive Online Search for Information
This section will describe the processes involved in an online search.
In interactive computerised information retrieval, you work at a
terminal or PC, in direct contact (via communication networks) with a vast
amount of information stored in a central computer memory. You conduct
a dialogue with the central computer, by entering enquiries via the keyboard
and receiving replies on the printer and screen of the terminal or PC.
The basic search process is similar to that for manual searching, namely
definition and analysis of the search question or topic, the identification
of suitable search terms, the design of the search strategy,
choice of appropriate databases, and the interactive online dialogue.
You carry out the search in a number of stages:
-
Formulate the question so that it really covers your information need.
-
Find appropriate keywords (by means of dictionaries and thesauri) for the
search concepts.
-
Key in the appropriate password to a suitable information system host computer,
through either the academic network (remote access) or through a public
communication network. This is called logging on and gives access
to the information retrieval system. (Each user/group has a specific password
to facilitate invoicing of charges).
-
The system replies with information about the databases available and you
select and key in the base chosen for the search.
-
You then key in the terms which cover the search question.
-
An interactive dialogue then ensues between you (the user) and the
system, in which the computer replies with information on how many references
the database contains for each term keyed in.
-
You develop a search strategy in which terms are grouped in parameters
and linked, according to Boolean logic, with AND, OR, or NOT operators
(see Figure 13). This search strategy is keyed in.
-
You can develop the search further: broaden it by adding related terms and
synonyms with OR, or narrow it by combining several parameters with AND.
-
The system responds with the number of references fulfilling the prescribed
conditions.
-
You ask for a display of sample references.
-
If the sample references are relevant, you can order a print-out of the
total available references of this type, either directly "on-line" via
the terminal or "off-line" with a print-out at the computer centre and
subsequent delivery by post. (Off-line printing is used if there are many
references, in order to reduce costs.) Alternatively, you can download
the references to your computer disk.
-
You terminate the search by the log-off command and the connection
to the host computer is broken.
Figure 13. AND, OR and NOT Logic illustrated by a Venn Diagram
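The Boolean operators correspond directly to set operations on the lists of references retrieved for each term. The following is a minimal sketch, assuming hypothetical search terms and record numbers; real host systems use their own command languages for the same logic.

```python
# Hypothetical postings: record numbers retrieved for each search term.
postings = {
    "laser": {1, 3, 5, 8},
    "maser": {2, 5},
    "semiconductor": {3, 5, 7, 8},
    "review": {8},
}

# OR broadens the search: records containing either of two related terms.
s1 = postings["laser"] | postings["maser"]

# AND narrows the search: records that must also contain a second concept.
s2 = s1 & postings["semiconductor"]

# NOT excludes: drop records indexed with an unwanted term.
s3 = s2 - postings["review"]

for label, result in (("S1 laser OR maser", s1),
                      ("S2 S1 AND semiconductor", s2),
                      ("S3 S2 NOT review", s3)):
    print(f"{label}: {len(result)} reference(s)")
```

Each step reports only the number of references, as in the dialogue described above; the references themselves are displayed once the final set looks manageable.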
Computerised information retrieval is an interactive dialogue, and the
results that you obtain are very much dependent on the careful preparation
of your search. Examples of search strategies within various subject areas
are given in Manual of Online Search Strategies, by Armstrong &
Large, 1992,[24] and in Online Searching in Science and Technology,
1991.[25]
5.7 The advantages of Computerised Information Retrieval
Computerised information retrieval has several advantages over manual methods
for literature searching:
-
You save time.
-
Information stored in a database is more current than in the corresponding
printed publication.
-
You can search for information in several subject areas during the same
search, for example, information on environmental pollution can be searched
for in databases covering biology, engineering and chemistry.
-
You can carry out a more detailed search with the help of the computer
than by manual methods. In the printed publications, it is usually only
possible to search under subject headings or authors, whereas the computerised
system permits many more search entries such as institution, title of journal,
classification code, keywords or descriptors and words included in the
title/abstract. Every unit of information which has been stored in the
computer is potentially searchable.
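The point that every stored unit of information is potentially searchable can be sketched by treating each reference as a record with named fields. The field names, sample records and the function `field_search` below are hypothetical and are chosen only to mirror the search entries listed above.

```python
# Hypothetical references, each stored as a record with named fields.
records = [
    {"id": 1, "author": "Author, A.", "title": "Thin film solar cells",
     "journal": "Journal of Example Physics", "institution": "Example University",
     "descriptors": ["solar cells", "thin films"]},
    {"id": 2, "author": "Writer, B.", "title": "Power electronics for wind turbines",
     "journal": "Example Engineering Letters", "institution": "Sample Institute",
     "descriptors": ["power electronics", "wind power"]},
    {"id": 3, "author": "Author, A.", "title": "Degradation of solar modules",
     "journal": "Example Engineering Letters", "institution": "Example University",
     "descriptors": ["solar cells", "reliability"]},
]


def field_search(records, field, term):
    """Return ids of records whose given field contains the term (case-insensitive)."""
    term = term.lower()
    hits = []
    for rec in records:
        value = rec.get(field, "")
        # A field may hold one string or a list of strings (e.g. descriptors).
        values = value if isinstance(value, list) else [value]
        if any(term in v.lower() for v in values):
            hits.append(rec["id"])
    return hits


print(field_search(records, "institution", "example university"))
print(field_search(records, "journal", "engineering letters"))
print(field_search(records, "descriptors", "solar cells"))
```

A printed index usually offers only the author and subject entries, whereas here any field can serve as a search entry without additional indexing work by the searcher.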
5.8 Computerised information searching
Three types of skill are necessary for carrying out interactive computerised
information searches:
-
subject knowledge;
-
skill in using the PC;
-
knowledge of the information system to be used - the appropriate commands,
type and quantity of information available, database structure, etc.
The following factors have contributed to easier online searching for end-users:
In recent years considerable effort has been spent on making computerised
information systems more "user friendly" or simpler to use. User friendly
systems have been developed by a number of systems operators. These take
the form of Web-based, user-friendly interfaces.
There are still difficulties for the end-user. The various databases,
which are produced by different organisations, are not standardised as
to structure and lay-out, or indexing terms. Another problem is that, if
searching is carried out only at infrequent intervals, the various
commands may feel unfamiliar. One way to overcome this is to make use
of "refresher training". The Into Info demonstrations and exercises will
be designed to help you in your searching.