My Content
Indexing System and Techniques : Assigned-Pre-Coordinate; Derived-Title-based; Vocabulary Control
What is Index
Pre-coordinate Indexing System
Post-Coordinate indexing
Assigned Index
Indexing Techniques
       Introduction
Derivative Indexing and Assignment Indexing
Pre-coordinate Indexing Systems (5 Print)
         Cutter's Contribution
         Kaiser's Contribution
         Chain Indexing. 1938
                 Definition and Use
                 Steps in chain Indexing
         PRECIS (Preserved Context Index System)
                 What is PRECIS
                 Objectives of PRECIS
                 Features of PRECIS
                Principles of PRECIS
                Syntax and semantic
                Role Operator
                Codes
                Input String
                Entry Structure of PRECIS
                Formats of PRECIS Index
                Index Entries
       POPSI      1979
               Major Working Concepts of POPSI
               Features of POPSI
               Steps in POPSI
       COMPASS ( Computer Aided Subject System)
Post- Coordinate Indexing
Pre-coordinate Indexing VS Post-Coordinate Indexing
        Term Entry System and Item Entry System
        Uniterm Indexing
        Keyword Indexing 1950
                 Key word in context Indexing (KWIC)
                          Annotation
               Advantages and Disadvantages of KWIC
               Variations of Keyword Indexing
               Other Versions
        Computerised Indexing
               Advantages and Disadvantages of computerised Indexing
               Disadvantages
               Categories of Computerised indexing Systems
Vocabulary Control
Meaning and Need
Objectives
Requirements
Entry Vocabulary and Index Vocabulary (non-preferred terms together) (Terms on their own)
Vocabulary Control tool
                                                                                                               

 Indexing Systems and Techniques: Assigned - Pre-Coordinate; Post-Coordinate; Derived - Title-based; Vocabulary Control




 * Science citation index 1965 

  What is Index                                                                                      

 An index is a guide, list or sign, or a number used to measure change. A list of employee names, addresses and phone numbers is one example of an index; a stock market index, which is based on a standard set at a particular time, is another. The word "index" is derived from the Latin word 'indicare', meaning 'to point out' or 'to show'.

 An index is a list of all the names, subjects and ideas in a piece of written work, designed to help readers quickly find where they are discussed in the text. Usually found at the end of the text, an index doesn't just list the content (that's what a table of contents is for); it analyses it.

J.E.L. Farradane-1950 

• Relational Indexing 

 1963 Coates's subject  

1991 COMPASS 


 Pre-coordinate Indexing System                                                         

Cutter's Contribution (Rules for a Printed Dictionary Catalogue, 1876)

 Kaiser's contribution (Systematic Indexing 1911) 

 Chain Indexing 1938

 PRECIS (D. Austin 1971/1974) 

 POPSI (Ganesh Bhattacharyya 1964)

 Thesaurus: P. M. Roget


 Post-Coordinate indexing                                                                    

Pre-coordinate versus Post-coordinate Indexing

Term Entry and item entry system

Uniterm Indexing -1953 

 Keyword Indexing. 

 Computerised Indexing. 

 Indexing Internet Resources/Web Indexing  

(Citation Indexing Eugene Garfield) 

 (SLIC Indexing J. R. Sharp)

 Automatic Indexing Herbert Ohlman 



 


 Assigned Index                                                                                     


 Assigned indexing (also known as assignment indexing or concept indexing) is an indexing method in which a human indexer selects one or more subject headings or descriptors from a controlled vocabulary (e.g. subject headings lists, thesauri or classification schemes) to represent the subject matter of a work. Because the index terms come from the controlled vocabulary, there is no need for them to appear in the title or text of the document indexed.

 In indexing, if terms are selected from the title or the text of a document and used without any alteration as index terms, this is referred to as natural language indexing or derived indexing. Derived indexing relies solely on information that is manifest in the document, without attempting to add to it from the indexer's own knowledge or other sources. In doing so we have to face the problems of natural language. If, however, the selected terms are translated or encoded into authorised terms with the help of a prescribed list (e.g. Library of Congress Subject Headings, Decimal Classification), the indexing language becomes controlled or artificial. This process is called assigned indexing.

 If we are to use a list of words to help us in our searching, we increase the chances of achieving successful matches if we use the same list of words to encode the documents ourselves rather than rely on the authors' choice of words. In other words, we devise an indexing language and use it for both encoding operations: input and question. Such systems are referred to as assigned indexing systems. Assigned indexing involves an intellectual process. Subject heading schemes, thesauri and classification schemes are the popular forms of assigned indexing.


 Indexing Techniques                                        

 • Introduction 

 The subject approach to information has been an area of intense study and research in the organisation of information, resulting in new theories and in the design of corresponding new indexing techniques based on these theories. Indexing technique actually originated from what is known as the 'back-of-the-book index'. Its objective is to show exactly where in the text of a document a particular concept (denoted by a term) is mentioned, referred to, defined or discussed. The back-of-the-book index may take the form of either a specific index or a relative index. A specific index presents the broad topics in the form of one-to-one entries, whereas a relative index displays each concept in its different contexts. The best example of such an index is the Relative Index of the Dewey Decimal Classification. But a relative index is usually unique to the text to which it points and is quite difficult to maintain on a large scale. Subsequently, we have seen the development of pre-coordinate indexing models.

 Till about the early fifties of the last century, the pre-coordinate indexing models were the only ones that had been developed. In the subsequent decades, the post-coordinate indexing models were designed and developed, and the physical apparatus for these index files also changed from the conventional index cards to different formats of post-coordinate indexing models. With the advent of the computer, keyword index models such as KWIC, KWOC and KWAC were introduced, and post-coordinate indexing also became more amenable to computer manipulation. Most of the bibliographic databases today have indexes based on post-coordinate indexing principles. The following sections of this unit present the major developments in subject indexing techniques for organising the index file.


  Derivative Indexing and Assignment Indexing

 Indexing can be either "derived indexing" or "assigned indexing" (also called "assignment indexing"). Derived indexing is a method of indexing in which a human indexer or a computer extracts from the title and/or text of a document one or more words or phrases to represent the subject(s) of the work, for use as headings under which entries are made. It is also known as extractive indexing.

 In derivative indexing, the terms used to represent the content of the document are derived directly from the document itself. No attempt is made to use the indexer's own knowledge of the subject or other guides; only the information that is manifest in the document is used. Index terms are derived directly from the title or text of the document. It requires the least intellectual effort on the part of the indexer. Mechanical devices and computers are used in abundance to carry the burden of index preparation as well as the task of matching the index terms with the questions users put to the system. Examples of derivative indexing are keyword indexing, citation indexing, automatic indexing, etc.

 Assigned indexing is also known as "concept indexing" because it involves identifying the concept(s) associated with the content of each document. It is a method of indexing in which a human indexer selects one or more subject headings or descriptors from a controlled vocabulary to represent the subject of a work. The index terms selected to represent the content need not appear in the title or text of the document indexed. Here, an indexing language is designed and used for both indexing and searching. Some notable examples of assignment indexing are chain indexing, PRECIS, POPSI, classification schemes, etc.
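
 The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the stop-word list, the tiny controlled vocabulary and the function names are assumptions made for the example, not part of any published indexing system.

```python
import re

# Hypothetical stop-word list and controlled vocabulary, for illustration only.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}
CONTROLLED_VOCABULARY = {
    "motor cars": "Automobiles",
    "cars": "Automobiles",
    "repair": "Maintenance and repair",
}

def derived_index_terms(title):
    """Derived indexing: keep the significant words exactly as the title gives them."""
    words = re.findall(r"[a-z]+", title.lower())
    return [word for word in words if word not in STOP_WORDS]

def assigned_index_terms(concepts):
    """Assigned indexing: translate each identified concept into its authorised heading."""
    headings = []
    for concept in concepts:
        heading = CONTROLLED_VOCABULARY.get(concept.lower())
        if heading and heading not in headings:
            headings.append(heading)
    return headings

title = "The repair of motor cars"
print(derived_index_terms(title))                      # words taken from the title itself
print(assigned_index_terms(["motor cars", "repair"]))  # headings taken from the list
```

 Note that the assigned headings ("Automobiles", "Maintenance and repair") need not occur anywhere in the title, whereas the derived terms can only ever be words the author happened to use.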


1. Pre-coordinate Indexing Systems (5 Print)                                     

 Basically, all indexing systems are, by nature, coordinate indexing systems. The purpose of using a combination or coordination of component terms is to describe the contents of documents more precisely. Many subjects can be expressed in a single term, e.g. library, cataloguing, management; others are expressed as a combination of these, e.g. cataloguing in libraries, management of libraries. When the indexer assigns subject headings representing such compounds and arranges entries in a series of classes according to the subject content of the document, the resulting system is referred to as a pre-coordinate indexing system. Here, the indexer coordinates the component terms representing compounds at the input stage, i.e. at the time of indexing, in anticipation of the users' approach. Most classification schemes allow a measure of coordination, either by including compound subjects or by providing facilities for creating them out of simple elements. The same is true of ready-made lists of subject headings such as the Sears List of Subject Headings and the Library of Congress Subject Headings. Such classification schemes and lists of subject headings can therefore be regarded as pre-coordinate indexing languages. Almost all pre-coordinate indexing models take an a priori approach in choosing their semantic paradigms, such as categorisation of concepts, role indicators and relational operators. Pre-coordinate indexing involves coordinating (combining, pulling together) concepts, followed by an act of synthesis to build the index entries. In such a system, the most important aspect is to determine the order of significance by following the syntactical rules of the given indexing language.


 1.1 Cutter's Contribution   

 Charles Ammi Cutter first set forth the rules for alphabetical subject headings in a systematic way in his Rules for a Printed Dictionary Catalogue in 1876. He used the term 'Subject Cataloguing' instead of 'Subject Indexing'. It was Cutter who gave the idea of specific subject entry. He also tried to systematise the rules for compound subject headings, which consisted of more than one term as a phrase, or in which the subject heading was composed of the name of a subject and the name of a locality, and so on. Cutter's concept of specific subject heading was quite different from what we mean today. He had in mind a set of stock subject names, and every book had to be accommodated under the most restricted subject that contained the subject of the book. For compound subject headings, Cutter laid down the rule that the order of the component terms should place first the one that is decidedly more significant. But Cutter could not prescribe how one should decide which term is more significant. The question of significance varies from user to user. The decision in respect of "significance" was left to the judgement of the individual indexer, which was subjective. Some rules/guidelines furnished by Cutter in his Rules for a Printed Dictionary Catalogue are mentioned below:

 A. Person versus Country: Entry will be under the person in the case of a single/personal biography, and under the country in the case of history, events, etc. For example, the biography of Sachin Tendulkar would be entered under his name, whereas the biography of Indira Gandhi would be entered under her name and also under the history of India.

 B. Country versus Event: An event is entered under the event if its name is a proper noun, e.g. Jallianwala Bagh. Entry is under the country if it is a common noun, e.g. 'Freedom fighters in India' is entered under India.

 C. Subject versus Country: In scientific subjects, entry will be under the subject qualified by place, e.g. Oceanography India.

 In subject areas like History, Government and Commerce, entry is to be prepared under the name of the place qualified by the subject, e.g. India Moghul Period.

 For Humanities, Literature, Arts, etc., the adjectival form of subject heading was suggested, e.g. Indian painting, Moghul architecture, etc.

 D. Between overlapping subjects, entry will be according to the importance of the subjects. It is to be pointed out here that the decision regarding "importance" was left to the judgement of the indexer.

 E. When there is a choice between different names, Cutter prescribed the following:

 i. Language - If there are two languages, one of which is English, entry will be under the English name. Here the rules appear to be biased towards the English language.

 ii. Synonyms - Entry will be under one word, with references from the others.

 iii. Antonyms - Entry will be under one word, with references from the others.

 F. In the case of compound subject headings, Cutter prescribed the following rules:

 i. A noun preceded by an adjective, e.g. Organic chemistry, Ancient history, etc.

 ii. A noun preceded by another noun used as an adjective, e.g. War prisoners, Flower fertilisation, Death penalty, etc.

 iii. Use the direct form in the case of a noun connected to another noun with a preposition, e.g. Patients with heart disease, Fertilisation of flowers, Penalty of death, etc.

 iv. Use the direct form in the case of a phrase or sentence used as the name of a subject, e.g. Medicine as profession.


 1.2 Kaiser's Contribution          

Kaiser started from the point where Cutter left off. In 1911, Julius Otto Kaiser (1868-1927), a special librarian and indexer of technical literature, developed a method of subject indexing known as systematic indexing. Kaiser defined indexing as "the process by which our information is collected and made accessible" and claimed that it constitutes "the main work of organising information". The definition highlights a cardinal tenet of his theory of systematic indexing: within the framework of the business library to which he was attached, the primary aim should be to index not 'documents' but 'information', which Kaiser took to be the various 'facts and opinions' (i.e. informational units) encoded in the texts of documents. He viewed systematic indexing as a two-step procedure. The first step was to analyse a subject so as to sort the constituent concepts associated with the content of the given document into two fundamental categories of facets: "Concretes" and "Processes". "Concrete" refers to things, places and abstract terms not signifying any action or process, e.g. Gold, India, Physics. "Process" refers to the mode of treatment of the subject by the author (e.g. evaluation of an IR system, critical analysis of a drama), an action or process described in the document (e.g. indexing of web documents), or an adjective related to the concrete as a component of the subject (e.g. strength of a metal).

 Kaiser also laid down the rule that if a subject deals with a place, a double entry (concrete - place - process and place - concrete - process) is to be made. For example, the index entries for 'Manufacturing of Petrochemicals in West Bengal' would be PETROCHEMICALS - West Bengal - Manufacturing; WEST BENGAL - Petrochemicals - Manufacturing. The second step was to synthesize the constituent concepts into indexing statements formulated according to strict rules of citation order: concrete is to be followed by process.

 Thus, according to Kaiser's systematic indexing, all indexing terms should be divided into the fundamental categories "Concretes", "Countries" and "Processes", which are then to be synthesized into indexing "statements" formulated according to strict rules of citation order. Some examples of subject headings according to Kaiser's systematic indexing are furnished below:

 Documents                            Categories                      Subject Headings

 Indexing of films                    Concrete - Process              Films - Indexing
 Strike in India                      Country - Process               India - Strike
 Libraries in Nepal                   Concrete - Country              Libraries - Nepal
 Manufacturing of copper in Assam     Concrete - Country - Process    Copper - Assam - Manufacturing


 Kaiser's systematic indexing did not make any provision for entry under the 'Process' term, and as a result it failed to satisfy users who approach the index by the process term. The concept of "time" was also left out by Kaiser. It is to be pointed out here that Kaiser was perhaps the first person to give the idea of categorisation in subject indexing.
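
 Kaiser's citation order and double-entry rule can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and its keyword arguments are hypothetical, the facets are assumed to have been identified by the indexer already, and the double entry is generated whenever a concrete and a place occur together.

```python
def kaiser_statements(concrete=None, place=None, process=None):
    """Build indexing statements in the order concrete - place - process.

    Any facet may be absent. When both a concrete and a place are present,
    a second statement led by the place is added (the double-entry rule).
    """
    first = " - ".join(facet for facet in (concrete, place, process) if facet)
    statements = [first]
    if concrete and place:
        second = " - ".join(facet for facet in (place, concrete, process) if facet)
        statements.append(second)
    return statements

print(kaiser_statements(concrete="Films", process="Indexing"))
print(kaiser_statements(place="India", process="Strike"))
print(kaiser_statements(concrete="Petrochemicals", place="West Bengal",
                        process="Manufacturing"))
```

 No statement is ever led by the process term, which mirrors the limitation noted above: a user approaching by "Manufacturing" or "Indexing" finds no entry.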


 1.3 Chain Indexing. 1938   

Ranganathan's facet analysis of subjects provides a kind of representation of subjects by transforming the multidimensional relations of a subject into a linear representation. Ranganathan is credited with the invention of chain indexing, set out in his Theory of Library Catalogue (1938): an economical system of providing access to the terms in classification schedules without replicating the hierarchical structure of the classification in the alphabetical index. Ranganathan devised the chain indexing technique as a complementary and supplementary tool to classification schemes. However, owing to its efficiency and economy, the technique can also be used effectively to derive alphabetical subject indexes for any indexing or abstracting service. It was first used by Madras University Library in 1936.


 Definition and Use    

 The concept of chain is the foundation of chain indexing. A chain is deemed to be a structural manifestation of a subject. The term 'structure' in this context refers to the parts constituting a subject and their mutual interrelationship. It is a modulated sequence of sub-classes or isolate ideas.

Since a chain expresses the modulated sequence of sub-classes most effectively in a notational scheme of classification, this method takes the class number of the document concerned as the base for deriving subject headings, not only for the specific subject entry but also for the subject reference entries. The nature and structure of the classification scheme used to classify the subject of the document control the structure of the subject headings drawn according to the chain procedure. The concept of "chain" becomes operative only once the concept of "link" in the structure of the subject is conceded. A chain should comprise a link of every order that lies between the first link and the last link of the chain. The different types of links in the chain indexing system are discussed below:

 • Sought Links (SL): Sought links denote the concepts (at any given stage of the chain) that the user is likely to use as access points.

 • Unsought Links (USL): Unsought links denote those concepts that are not likely to be used as access points by the user.

 • False Links (FL): False links are those that do not really represent any valid concept; mostly these are connecting symbols or indicator digits.

 • Missing Links (ML): Missing links represent those concepts that are not available in the preferred classification scheme. Whenever there is such a need, the indexer inserts them by means of verbal extension at the gap in the chain corresponding to the missing isolate.


 Steps in chain Indexing   

 The following steps are to be followed in chain indexing for deriving different types of subject headings. 

 1. Construction of the Class Number of the Subject of the Document

 Classify the subject of the document by following a preferred classification scheme. A class number constructed according to a scheme for notational classification will form the basis for applying the rules of the chain procedure for deriving subject headings.

 2. Representation of the Class Number in the Form of a Chain

 Represent the class number in the form of a chain in which each link consists of two parts: the class number and its verbal translation in the standard term or phrase used in the preferred classification scheme.

 3. Determination of Links

 Determine the different kinds of links: Sought Links (SL), Unsought Links (USL), False Links (FL) and Missing Links (ML).

 4. Preparation of Specific Subject Heading

 Derive the specific subject heading for the specific subject entry by starting from the last sought link and moving upwards, taking the necessary and sufficient sought links in a reverse rendering process. If the subject includes a space isolate, a time isolate or a form isolate, break the chain into different parts at the points denoting space, time and form in the class number. In such a situation, the specific subject heading is to be derived from the last SL of the first part in a reverse rendering process, and then from the second part, third part, etc., if any, in the same way. The components of the derived subject heading are then to be arranged in the sequence of their derivation from each part of the chain of the class number.

 5. Preparation of Subject Reference Headings

 Derive a subject reference heading for the subject reference from each of the upper sought links. This process continues until all the terms of the upper sought links are exhausted and indexed.

 6. Preparation of Subject Reference Entries

 Prepare subject reference entries, or 'see also' references, from each subject reference heading to its specific subject heading. When a subject heading starts from a last sought link denoting space, time or form, prepare "see" references instead of "see also" references from the subject reference heading to the specific subject heading.

 7. Preparation of Cross References, if any

 Prepare cross references (i.e. "see" references) for each alternative and synonymous term or heading used in the specific subject heading as well as in the subject reference headings.

 8. Alphabetisation

 Merge the specific subject entries, subject references (i.e. 'see also' references) and 'see' references, and arrange them in a single alphabetical sequence.
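
 The core of steps 2 to 8 can be sketched in Python. The sketch is deliberately simplified and rests on several assumptions: the notation in the sample chain is invented and belongs to no real classification scheme, every sought link is taken into the heading (no "necessary and sufficient" selection is attempted), missing links and the breaking of the chain at space, time and form isolates are ignored, and 'see' references for synonyms are left out. The names Link, reverse_rendering and chain_index are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Link:
    class_number: str  # notation reached at this stage of the chain
    term: str          # verbal translation of that notation
    kind: str          # "SL", "USL", "FL" or "ML"

def reverse_rendering(links):
    """Join the sought terms from the last link upwards to the first."""
    sought_terms = [link.term for link in links if link.kind == "SL"]
    return ", ".join(reversed(sought_terms))

def chain_index(chain):
    """Return the specific subject entry and 'see also' references, alphabetised."""
    sought_positions = [i for i, link in enumerate(chain) if link.kind == "SL"]
    if not sought_positions:
        return []
    # Step 4: specific subject heading, starting from the last sought link.
    specific = reverse_rendering(chain[: sought_positions[-1] + 1])
    entries = [specific]
    # Steps 5 and 6: one reference from each upper sought link.
    for position in sought_positions[:-1]:
        heading = reverse_rendering(chain[: position + 1])
        entries.append(f"{heading} see also {specific}")
    # Step 8: a single alphabetical sequence.
    return sorted(entries, key=str.lower)

# Invented notation, for illustration only.
chain = [
    Link("L", "Library science", "SL"),
    Link("L3", "Academic libraries", "SL"),
    Link("L34", "University libraries", "SL"),
    Link("L34:", "", "FL"),                      # connecting symbol
    Link("L34:5", "Technical services", "USL"),
    Link("L34:55", "Cataloguing", "SL"),
]

for entry in chain_index(chain):
    print(entry)
```

 Only one of the resulting entries is co-extensive with the subject of the document; the others are references built from progressively shorter parts of the chain.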


1.4 PRECIS (Preserved Context Index System) 

 PRECIS is generally considered one of the most ambitious late twentieth-century attempts to create an indexing system from scratch. A major breakthrough in the field of subject indexing was achieved by the Classification Research Group (CRG), London, and Derek Austin came out with a new method of subject indexing called the PREserved Context Index System (PRECIS) in the early 1970s for the British National Bibliography (BNB). When the BNB began in 1950, it used the chain indexing system, and continued to do so for about 20 years. However, chain indexing was not, for various reasons, ideal for a computerised system, and in 1971, when BNB had developed the MARC system in the United Kingdom and was also engaged in using computers for the production of BNB itself, chain indexing was replaced by PRECIS.

 • PRECIS was developed by Derek Austin in 1971.

 • PRECIS was replaced by COMPASS in 1990. 


 What is PRECIS ?  

PRECIS is a system of subject indexing in which the initial string of terms, organised according to the scheme of role operators, is manipulated by computer in such a way that each sought term in the string functions as the approach term while the full context of the document is preserved. Entries are restructured at every step in such a way that the user can determine from the format of the entry which terms set the approach term into its context and which terms are context-dependent on the approach term.


 Objectives of PRECIS 

a. The computer, not the indexer, should produce all index entries. The indexer's responsibility is to prepare the input strings to give necessary instruction to the computers to generate index entries according to definite formats. 

 b. Each of the sought terms should find an index entry, and each entry should express the complete thought content (full context) of the document, unlike the chain procedure, where only one entry is specific (i.e. fully co-extensive with the subject of the document) and the others are cross references describing only one aspect of the thought content of the document.

 c. Each entry should be expressive.

 d. The system should be based on a single set of logical rules to make it consistent. 

 e. The system should be based on the concept of open-ended vocabulary, which means that terms can be admitted into the index at any time, as soon as they have been encountered in the literature. 

 f. The system must have sufficient references between semantically related terms. 


 Features of PRECIS   

 • It is more amenable to automatic manipulation than indexing based on notational classifications.

 • The permuted entries read naturally, which is achieved by the prescribed order of the role operators.

 • The terms are linked to a machine-held thesaurus, thereby providing possible "see" and "see also" references.

 • PRECIS can be adapted to other languages.

 • The indexer determines the meaning of the terms, codes the roles and identifies the lead terms, whereas the computer takes care of the permutations.

 • Its subject formulation is completely independent of classification and is therefore not geared exclusively to any of the classification numbers assigned in the MARC record.

 • Context is preserved: it presents the full subject statement at every point of index entry, by gradual inversion of the concept string, thus overcoming the problem of the disappearing chain.


 Principles of PRECIS   

 Two Principles are followed in PRECIS 

 a. Principle of Context Dependency: The context-dependency principle may be seen as a combination of context and dependency. When this principle is followed in a PRECIS input string, each term sets the next term into its wider context. In other words, the meaning of each term in the string depends upon the meaning of its preceding term, and taken together they all represent a single context. Each term is hence dependent, directly or indirectly, on all the terms that precede it.

 b. Principle of One-to-One Relationship: When the terms are organised according to the principle of context dependency, they form a one-to-one related sequence, each term in the string being directly related to the next term.


Syntax and semantic   

The syntax of PRECIS is based on the role operators, codes and logical rules, which act as instructions to the computer. The semantics of PRECIS is handled by linking the terms to a machine-held thesaurus, thereby providing possible "see" and "see also" references.


 Role Operator 

 Role operators consist of a set of alphanumeric notations that specify the grammatical role or function of each indexed term and regulate the order of terms in the input string. Role operators and their associated rules also serve as computer instructions for determining the format, typography and punctuation associated with each index entry. There are two kinds of role operators: primary operators and secondary operators. Primary operators control the sequence of terms in the input string and determine the format of index entries. Any secondary operator is always to be preceded by the primary operator to which it relates.


 Codes  

Use of codes in the string brings expressiveness to the resulting index entries. There are three types of codes: primary, secondary and typographic codes.


 Input String 

 An input string is a set of terms arranged according to the role operators, which act as instructions to the computer for generating index entries.


 Entry Structure of PRECIS  

The entry structure of PRECIS is a two-dimensional display rather than the one-dimensional display we have been accustomed to. Instead of putting everything on one line, so that the only relationship that can be shown is that of following or preceding, PRECIS uses a two-line, three-part entry structure as follows:

                             Lead                                   Qualifier             

                                                 Display 

 Lead is occupied by the approach term, which is the filing word and is offered as the user's access point in the index. 


 The Qualifier position is occupied by the terms that set the lead into its wider context (i.e. specific to general). Together, the Lead and the Qualifier correspond to the Heading. Terms in the heading are set down in narrower-to-wider context order. When the first term of the input string appears in the Lead position, the Qualifier position is usually kept blank.

 The Display position is occupied by the additional qualifying terms of the PRECIS string, which rely upon the heading for their context. When the last term of the input string appears in the Lead position, the Display position becomes empty.
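
 The way terms move through the Lead, Qualifier and Display positions can be sketched as follows. This is a simplified illustration, not the PRECIS software: it assumes every term in the string is marked as a lead, ignores role operators and codes altogether, and uses a hypothetical input string; the function names shunt and format_entry are likewise assumptions.

```python
def shunt(terms):
    """Return one (lead, qualifier, display) triple per term of the input string."""
    entries = []
    for position, term in enumerate(terms):
        qualifier = list(reversed(terms[:position]))  # narrower-to-wider context
        display = terms[position + 1:]                # terms that depend on the heading
        entries.append((term, qualifier, display))
    return entries

def format_entry(lead, qualifier, display):
    """Lay an entry out on two lines: heading first, display indented below it."""
    lines = [". ".join([lead] + qualifier)]
    if display:
        lines.append("    " + ". ".join(display))
    return "\n".join(lines)

# Hypothetical input string, already in context-dependent order.
terms = ["India", "Cotton industries", "Personnel", "Training"]

for lead, qualifier, display in shunt(terms):
    print(format_entry(lead, qualifier, display))
    print()
```

 The first entry has an empty Qualifier and the last has an empty Display, as described above, while every entry still carries all four terms and so preserves the full context.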


 Formats of PRECIS Index 

 Index entries in PRECIS are basically generated in three formats : standard format, inverted format and predicate transformation. 

 a. Standard Format: Index entries in the standard format are generated when a term coded by any of the primary operators (0), (1) or (2), or one of its dependent elements, appears in the lead. The process of generating index entries in the standard format has already been demonstrated under section 11.3.4.7 of this unit.

 b. Inverted Format: Index entries in the inverted format are generated whenever a term coded by an operator in the range from (4) to (6), or one of its dependent elements, appears in the lead. The rule for generating index entries in this format is that when a term coded (4), (5) or (6), or any of its dependent elements, appears in the lead, the whole input string is read from top to bottom and written in the display. However, if the term appearing in the lead is the last term of the input string, it is dropped from the display.

 Example: A report on the feminist viewpoint on marriage. 

 Input string:

 (2) ✓ marriage
 (4) ✓ feminist viewpoint
 (6) ✓ reports

 Index entries:

 Marriage
 - Feminist viewpoint - Reports

 Feminist viewpoint
 - Marriage - Feminist viewpoint - Reports

 Reports
 - Marriage - Feminist viewpoint
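
 The inverted-format rule stated above can be expressed as a short function. This is a sketch under assumptions: the input string is modelled as a list of (operator, term) pairs, dependent elements are not handled, and the name inverted_entry is hypothetical.

```python
def inverted_entry(input_string, lead_position):
    """Return (lead, display) for a lead coded (4), (5) or (6).

    The whole string is read top to bottom into the display; the lead is
    dropped from the display only when it is the last term of the string.
    """
    operator, lead = input_string[lead_position]
    if operator not in (4, 5, 6):
        raise ValueError("inverted format applies only to operators (4) to (6)")
    display = [term for _, term in input_string]
    if lead_position == len(input_string) - 1:
        display = display[:-1]
    return lead, display

input_string = [(2, "marriage"), (4, "feminist viewpoint"), (6, "reports")]

for position in (1, 2):
    lead, display = inverted_entry(input_string, position)
    print(lead.capitalize())
    print("- " + " - ".join(term.capitalize() for term in display))
```

 Run on the marriage example, this reproduces the entries under "Feminist viewpoint" and "Reports"; the entry under "Marriage" is coded (2) and therefore follows the standard format instead.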


c. Predicate Transformation 

 When an entry is generated under a term coded (3) that immediately follows a term coded either by (2) or (3)on (t) -  each of which introduces an action of one kind on another - the predicate transformation takes places. An input string of this kind is shown below :  Example: Planning of Libraries by architect input string :  

(1) ✓ libraries  

(2) ✓ planning $v by $w of 

 (3) ✓ architects  

Index Entries :

Libraries

               Planning by architects

Planning, Libraries

               By architects

Architects

               Planning of libraries
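
As a rough illustration of the predicate transformation, the sketch below builds the three entries for a string of exactly the shape used in the example: a key system coded (1), an action coded (2) carrying a $v and a $w connective, and an agent coded (3). The function names are hypothetical, and the fixed three-term shape is an assumption; real PRECIS strings are more varied.

```python
from typing import List, Tuple


def capitalise(text: str) -> str:
    """Upper-case the first letter only, leaving the rest untouched."""
    return text[:1].upper() + text[1:]


def predicate_entries(key_system: str, action: str, agent: str,
                      v: str, w: str) -> List[Tuple[str, str, str]]:
    """Return (lead, qualifier, display) triples for a (1)/(2)/(3) string."""
    return [
        # Lead = key system: the action and agent are read downwards with $v.
        (capitalise(key_system), "", capitalise(f"{action} {v} {agent}")),
        # Lead = action: the key system qualifies it, the agent follows with $v.
        (capitalise(action), capitalise(key_system), capitalise(f"{v} {agent}")),
        # Lead = agent: the predicate is read upwards with $w.
        (capitalise(agent), "", capitalise(f"{action} {w} {key_system}")),
    ]


entries = predicate_entries("libraries", "planning", "architects", v="by", w="of")
for lead, qualifier, display in entries:
    print(f"{lead}, {qualifier}" if qualifier else lead)
    print("    " + display)
```

The third entry is the transformed one: under the agent "Architects", the display reads "Planning of libraries" instead of repeating the string top to bottom.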


 1.5 POPSI      1979                                    

All pre-coordinate indexing models are entirely based on the method of facet analysis. Ranganathan pointed out in a paper entitled "Subject heading and facet analysis" (Journal of Documentation, 20(3), 1964, p. 109-119) that facet analysis does not depend entirely on the notational scheme of a classification. The rules of chain procedure, he said, can be so framed as to implement any kind of decision about the sought first heading and the other successive headings in conformity with the principle of local variation. Since then, continuous research along this new line of thinking went on at the Documentation Research and Training Centre (DRTC), Bangalore, and a number of papers on Postulate-based Permuted Subject Indexing (POPSI), based on Ranganathan's General Theory of Library Classification, came out. Dr. Ganesh Bhattacharyya first explained the fundamentals of subject indexing languages with an extensive theoretical background, which ultimately led to the development of a newer version of POPSI forming part of his General Theory of Subject Indexing Languages (GT-SIL). Bhattacharyya developed POPSI through logical interpretation of the deep structure of subject indexing languages (SIL).

POPSI drew attention to the helpfulness of adopting a suitable device for ensuring an optimally effective organising classification through the alphabetisation of verbal subject-propositions. It prescribes the use of apparatus words such as prepositions, conjunctions, participles, etc., as and when necessary to communicate the exact meaning of subject propositions. These words are put in parentheses and are ignored in alphabetisation. Since a POPSI index consists entirely of verbal entries, filing them in one alphabetical sequence in a unipartite index is easy.
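
The filing rule for apparatus words can be illustrated with a small sort key. This is a sketch only: the function name `filing_key` and the three sample entries are hypothetical, and actual POPSI entries also carry notation that is not modelled here.

```python
import re


def filing_key(entry: str) -> str:
    """Drop parenthesised apparatus words, then normalise case and spacing."""
    without_apparatus = re.sub(r"\([^)]*\)", " ", entry)
    return " ".join(without_apparatus.lower().split())


# Hypothetical entries; the words in parentheses are apparatus words.
sample_entries = [
    "Libraries (in) India",
    "Libraries (and) Archives",
    "Libraries Buildings",
]

for entry in sorted(sample_entries, key=filing_key):
    print(entry)
```

With this key, "Libraries Buildings" files before "Libraries (in) India" because "(in)" is skipped; a plain character-by-character sort would place the parenthesised entry first.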


 Major Working Concepts of POPSI   

 i. Deep Structure of Subject Indexing Language (DS-SIL) 

DS-SIL is the logical abstraction of the surface structures of outstanding SILs, such as those of Cutter, Dewey, Kaiser and Ranganathan. According to the general theory of SIL, the structure of a specific SIL is assumed to be a surface structure of the deep structure of SIL. The DS-SIL has been presented diagrammatically as follows :

[Figure: Deep structure of subject indexing language (DS-SIL)]

It appears from the above diagram that any specific subject may belong to any one of the following elementary categories (D,E,A,P) and modifier. 

 ii. Elementary Categories and Modifier 

a. Discipline (=D) refers to an elementary category that includes the conventional fields of study, or any aggregate of such fields, or artificially created fields analogous to those mentioned above, e.g. Physics, Biotechnology, Ocean Science, Library and Information Science, etc.

 b. Entity (=E) refers to an elementary category that includes manifestations having perceptual correlates, or only conceptual existence, as contrasted with their  properties, and actions performed by them or on them, e.g. Energy, light, Plants, Animals, Place, Time, Environment etc. 

c. Action (=A) refers to an elementary category that includes manifestations denoting the concept of "doing". An action may manifest itself as Self Action or External Action. For example, Function, Migration, etc. are Self Actions; and Treatment, Selection, Organisation, Evaluation, etc. are External Actions.

d. Property (=P) refers to an elementary category that includes manifestations denoting the concept of "attribute", qualitative or quantitative, e.g. Property, Effect, Power, Capability, Efficiency, Utility, Form, etc.

e. Modifier (=M) refers to a qualifier used to modify any one of the elementary categories D, E, A and P. It decreases the extension and increases the intension of the qualified manifestation without disturbing its conceptual wholeness. A modifier can modify any one of the elementary categories, as well as a combination of two or more of them. Modifiers are of two types :

• Common Modifiers : They refer to Space (e.g. Libraries in India), Time (e.g. Libraries in India, 19th century), Environment (e.g. Desert Birds) and Form (e.g. Encyclopedia of Physics). Common modifiers have the property of modifying a combination of two or more elementary categories.

• Special Modifiers : A special modifier is used to modify only one of the elementary categories. It may be discipline-based, entity-based, property-based or action-based. Special modifiers can be grouped into two types :

i. Those that require a phrase or auxiliary words to be inserted between the terms, thus forming a complex phrase, e.g. cataloguing using computers, and

 ii. Those that do not require auxiliary  words or phrase to be inserted in between the terms, but automatically form an acceptable compound term denoting species/ Type e.g. 'chemical' in 'chemical treatment'. 


 iii. Organising classification and Associative classification 

According to the general theory of SIL, classification is a combination of both organising classification and associative classification. In other words, an indexing system is a combination of both. The tasks involved in creating an organising classification are the categorisation of concepts and their organisation in hierarchies. In organising classification, compound subjects are based on genus-species, whole-part, and other inter-facet relationships. Here, classification is used to distinguish and rank each subject from all other subjects with reference to its coordinate-superordinate-subordinate-collateral (COSSCO) relationships. The result of organising classification is always a hierarchy. In associative classification, a subject is distinguished from other subjects on the basis of how it is associated with them, without reference to its COSSCO relationships. The result of associative classification is always a relative index.

 iv. Base and core 

In the context of constructing compound subject headings, when the purpose is to bring together all or a major portion of the information relating to a particular manifestation or manifestations of a particular elementary category, that manifestation / category is the base. In the case of a complex subject, any one of the subjects can be chosen as the base subject depending upon the purpose in hand. For example, for a document on 'Eye cancer', 'Eye' is the base subject in an eye hospital library, and 'Cancer' is the base subject for a cancer research centre.

When the purpose is to bring together, within a recognised base, all or a major portion of the information pertaining to one or more elementary categories, the category or categories concerned form the core of that base. The core lies within the base, and which one will be the base or the core depends on the collection or purpose of the library. For example, in DDC, 'Medicine' is the base, and the "Human body" and its "organs" constitute the core of the base.


Features of POPSI 

 From the operational point of view, the salient features of POPSI may be grouped under three components : analysis, synthesis, and permutation.  

The work of "Analysis" and "Synthesis" is primarily based on the postulates associated with the deep structure of SILs for generating organising classification. The task of analysis and synthesis is largely guided by the POPSI table. The work of "Permutation" is based on cyclic permutation of each term of approach, either individually or in association with other terms, for generating the associative classification effect in alphabetical arrangement.


 Steps in POPSI   

The main steps in applying POPSI are as follows :

 i. Content Analysis : Involves identification of different component ideas associated with the content of the document with reference to their elementary categories and modifier. 

 ii. Formalisation : Involves preparing the formalized expression of subject statement obtained on the basis of the results of the step 1 (content analysis) according to the rules of syntax. 

iii. Standardisation : Involves deciding the standard term for each term in the formalized expression of the subject statement, especially for terms having synonyms, if any. This step calls for the use of a classaurus.

iv. Modulation : Involves augmenting the standardized subject proposition by interpolating and extrapolating, as the case may be, the successive superordinates of each manifestation by using standard terms with indication of their synonyms, if any. [Note: A classaurus is the tool that guides the operations in steps iii and iv with assurance of consistency in practice.]

v. Entry for Organising Classification : Involves the preparation of entries for organising classification by inserting appropriate notations for elementary categories, subdivisions and modifiers from the POPSI table. The modulated subject proposition, with appropriate notations from the POPSI table, sorted alphanumerically, will produce the organising classification effect by juxtaposition of entries in alphabetical sequence.

vi. Approach-term Selection : Consists of deciding the approach terms for generating the associative classification effect and of controlling synonyms. The selection of approach terms may vary from one library to another depending upon the requirements of users.

vii. Preparation of Entries for Associative Classification : Involves the preparation of the entries under each approach term by cyclic permutation of sought terms for generating the associative classification effect in alphabetical arrangement.

viii. Alphabetisation : Involves arranging all the entries in alphabetical sequence.
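
The last three steps (approach-term selection, cyclic permutation and alphabetisation) can be sketched as follows. This is a deliberately simplified illustration: the function name `associative_entries` and the sample proposition are hypothetical, and the notations from the POPSI table and the actual POPSI entry layout are not modelled.

```python
from typing import List, Set


def associative_entries(proposition: List[str],
                        approach_terms: Set[str]) -> List[List[str]]:
    """Cycle the proposition so that each approach term leads one entry."""
    entries = []
    for i, term in enumerate(proposition):
        if term in approach_terms:
            # Cyclic permutation: rotate the term to the front, keep the order.
            entries.append(proposition[i:] + proposition[:i])
    # Alphabetisation: arrange all entries in one alphabetical sequence.
    return sorted(entries, key=lambda entry: [t.lower() for t in entry])


# Hypothetical modulated proposition and approach terms.
proposition = ["Medicine", "Eye", "Cancer", "Treatment"]
for entry in associative_entries(proposition, {"Eye", "Cancer"}):
    print(", ".join(entry))
```

Choosing a different set of approach terms for a different library changes only the second argument, which mirrors the point that approach-term selection depends on user requirements.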


 COMPASS ( Computer Aided Subject System)   

The British Library adopted COMPASS in January 1991. COMPASS is a simplified restructuring of PRECIS.


 Post- Coordinate Indexing                                                                  

All indexing systems follow the process of concept coordination to describe the contents of documents more precisely. We have seen that in pre-coordinate indexing, component concepts are coordinated according to the order of significance or citation order by following the syntactical rules of the given indexing language. But the rigidity of the citation order is unsatisfactory for meeting the varied approaches of all users. The provision for multiple entries in pre-coordinate indexing, by rotating or cycling the component terms, covers only a fraction of the total number of possible permutations, so a large portion of the probable approach points is left uncovered. Consequently, the searcher has no choice but to follow the rigid citation order specified by the given indexing language.

These problems, stemming from the pre-coordination of terms and the rigidity of citation order, triggered the development of alternative indexing techniques in which the component ideas of a subject are kept separate, uncoordinated by the indexer. Here, concepts/terms are coordinated at the time of searching (i.e. at the output stage) by the user. A greater degree of search manipulation is available in a post-coordinate indexing system, since the search terms can be coordinated in almost any combination to retrieve records of information about the documents as required by the users. Indexing systems based on this principle are called post-coordinate indexing or simply coordinate indexing systems; systems such as Uniterm, optical coincidence cards, etc. were developed along these lines. Among the different types of post-coordinate indexing systems, the Uniterm System developed by Mortimer Taube is considered the most popular post-coordinate indexing model.


 Pre-coordinate Indexing VS Post-Coordinate Indexing                    

Subjects of documents are not simple. There are compound and complex subjects dealing with multiple concepts. When there is more than one concept in a document, the order in which we cite the concepts and their relationship to one another become important. Both pre- and post-coordinate indexing systems are, by nature, coordinate indexing, but the coordination is done at two different stages. The following table sets out the points of difference between the two systems.

Pre-coordinate indexing system vs post-coordinate indexing system

1. Pre-coordinate : Coordination of component terms is carried out at the time of indexing (i.e. at the input stage) in anticipation of the user's approach.
   Post-coordinate : Component concepts (denoted by the terms) of a subject are kept separate, uncoordinated by the indexer, and the user does the coordination of concepts at the time of searching (i.e. at the output stage).

2. Pre-coordinate : The most important aspect of this indexing system is to determine the order of significance by following the syntactical rules of the given indexing language.
   Post-coordinate : Rigidity of the significance order is absent in this system.

3. Pre-coordinate : It is non-manipulative. The searcher has no choice but to try to predict the citation order specified by the indexer.
   Post-coordinate : It is manipulative. The searcher has wide options for free manipulation of the classes at the time of searching in order to achieve whatever logical operations are required.

4. Pre-coordinate : Both the indexer and the searcher are required to understand the mechanism of the system: the indexer for arriving at the most preferred citation order, and the searcher for formulating an appropriate search strategy in order to achieve the highest possible degree of matching of concepts.
   Post-coordinate : This indexing system does not require the indexer and searcher to understand the mechanism of the system. However, the operational aspects need to be understood by them.


                       


Term Entry System and Item Entry System

In a Term Entry System, we prepare entries for a document under each of the appropriate subject headings, and file these entries alphabetically. Here, items are posted on the term (i.e. Item on Term System). In this type of post-coordinate indexing, a document receives as many entries as it has appropriate headings. Searching of two files (Term Profile and Document Profile) is required in this system. Uniterm and Peek-a-boo are examples.

It is possible to take the opposite approach and make a single entry for each item, using a physical form which permits access to the entry from all appropriate headings. A system which works in this way is called an Item Entry System. Here, terms are posted on the item (i.e. Term on Item System). In this type of post-coordinate indexing, a single entry is made for each item. An item entry system involves the searching of one file only. The edge-notched card is an example of an item entry system.
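
The difference between the two systems can be pictured as two ways of storing the same information: one record per item listing its headings, or one record per heading listing its items. The sketch below, with hypothetical document numbers and terms, converts the first form into the second.

```python
from collections import defaultdict
from typing import Dict, List

# Item entry form: a single record per document, listing its headings.
# The document numbers and terms are invented for illustration.
item_entries: Dict[int, List[str]] = {
    101: ["libraries", "automation"],
    102: ["libraries", "cataloguing"],
    103: ["cataloguing", "automation"],
}


def to_term_entries(items: Dict[int, List[str]]) -> Dict[str, List[int]]:
    """Build the term entry form: one record per heading, listing documents."""
    terms: Dict[str, List[int]] = defaultdict(list)
    for doc_id in sorted(items):
        for term in items[doc_id]:
            terms[term].append(doc_id)
    return dict(terms)


for term, doc_ids in sorted(to_term_entries(item_entries).items()):
    print(term, doc_ids)
```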


 Uniterm Indexing  

The Uniterm indexing system was devised by Mortimer Taube in 1953 to organise a collection of documents at the Armed Services Technical Information Agency (ASTIA) of the Atomic Energy Commission, Washington. Uniterm is a post-coordinate indexing system based on the term entry principle. Here, each component term (uniterm) is independent of all other terms and serves as a unique, autonomous access point to all relevant items in the collection.
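
Coordination at the search stage can be sketched with one "card" per uniterm holding the numbers of the documents indexed under it; a search then keeps only the numbers common to all requested cards. The card contents and the function name `coordinate` below are hypothetical.

```python
from typing import Dict, List, Set

# One card per uniterm; each card lists document numbers (invented here).
uniterm_cards: Dict[str, Set[int]] = {
    "cancer": {12, 47, 614},
    "hospitals": {47, 98, 614},
    "chemotherapy": {12, 614, 730},
}


def coordinate(cards: Dict[str, Set[int]], *terms: str) -> List[int]:
    """Return the documents that appear on every requested uniterm card."""
    postings = [cards.get(term, set()) for term in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


print(coordinate(uniterm_cards, "cancer", "hospitals"))
print(coordinate(uniterm_cards, "cancer", "hospitals", "chemotherapy"))
```

Because the terms are stored independently, the searcher is free to combine them in any order and number, which is the flexibility that a fixed citation order does not offer.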


 Keyword Indexing 1950 

Keyword indexing is based on the use of natural language terminology for generating index entries. The term 'keyword' refers to a significant or memorable word (also called a "catchword") that serves as a key in denoting the subject. It is taken mainly from the title of the document (hence the name title index) and sometimes from the abstract or text of the document. Common words like articles (a, an, the) and conjunctions (and, or, but) are not treated as keywords because it is inefficient to do so. This system is also known as natural or free indexing language. The concept of keyword indexing is not new: it existed in the nineteenth century as "catchword indexing". With the introduction of computers in information retrieval in the 1950s, Hans Peter Luhn, an IBM engineer, presented a computer-produced index in 1958 that became known as KWIC (Key Word In Context) indexing.


 Key word in context Indexing (KWIC) 

The production of a KWIC index by H.P. Luhn is the earliest example of an automatic index produced using computers to perform the repetitive tasks associated with subject indexing. It was a great step forward in the techniques of automatic indexing. Utilizing the capabilities of computers, the KWIC method speedily, and with a minimum of intellectual effort, produces indexes derived solely from the titles of the documents to be analysed. All significant words or keywords of titles / title-like phrases are alphabetised mechanically and then printed out in turn, following a format which emphasises the selected word. The computer uses a "stop-word list" in order to ignore all syntactical words such as articles, prepositions, etc. and selects the remaining words in the title as indexing words. The remaining words of the title are arranged to stand in the context of their original appearance. The use of a stop-word list reduces the volume of the index.

The result of the machine manipulation is an index of key terms printed in alphabetical order, together with the text immediately surrounding them: the keyword occupies the middle position while the rest of the title is printed on either side. The alphabetical filing is done on the basis of the keyword, which is printed in bold letters. Chemical Titles (of Chemical Abstracts Service) and B.A.S.I.C. (Biological Abstracts Subjects in Context) are faithful adaptations of Luhn's KWIC indexing.

Let us consider the title "Chemical treatment of cancer in the hospitals of Chennai" to demonstrate the index entries generated according to KWIC indexing.

Cancer in the hospitals of Chennai / Chemical treatment of 614

Chemical treatment of cancer in the hospitals of Chennai 614

Chennai / Chemical treatment of cancer in the hospitals of 614

Hospitals of Chennai / Chemical treatment of cancer in the 614


 Annotation 

a. The title in the above KWIC index has been rotated in such a way that each keyword serves as the approach term and comes to the beginning by rotation, followed by the rest of the title;

b. The last word and the first word of the title are separated in an entry by a symbol, say a stroke [/] (sometimes an asterisk * is used). In some computer-produced KWIC indexes, keywords are positioned in the middle of the entry;

c. Keywords are printed in bold typeface to give prominence to the approach term;

 d. Identification / Location code 614 is given at the right end of each entry, and 

 e. Entries are arranged alphabetically by keyword. 
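
The rotation described in the annotation can be sketched in Python. This is only an approximation of the printed format: the stop-word list is a small hypothetical one, bold type is not reproduced, and because "treatment" is not on that list the sketch yields a fifth entry under "Treatment" in addition to the four shown above.

```python
from typing import List

# A small, hypothetical stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "and", "or", "but", "of", "in", "on", "for", "to", "by"}


def kwic_entries(title: str, location: str) -> List[str]:
    """Rotate the title so that each keyword in turn leads an entry."""
    words = title.split()
    entries = []
    for i, word in enumerate(words):
        if word.lower() in STOP_WORDS:
            continue  # stop words never become approach terms
        rotated = words[i:]
        if i > 0:
            # A stroke separates the last word of the title from the first.
            rotated = rotated + ["/"] + words[:i]
        line = " ".join(rotated)
        entries.append(f"{line[0].upper()}{line[1:]}  {location}")
    # Entries are arranged alphabetically by keyword.
    return sorted(entries, key=str.lower)


title = "Chemical treatment of cancer in the hospitals of Chennai"
for entry in kwic_entries(title, "614"):
    print(entry)
```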


 Advantages and Disadvantages of KWIC 

 KWIC method offers the following advantages :  

i. The detection of formulaic expressions and repeated word patterns; 

 ii. Simplicity; 

 iii. Speed 

 iv. maximisation of computer use and 

 v. minimisation of the indexers role. 

 The most common type of complaint against the KWIC Indexing  method is the lack of terminology control as it is entirely dependent upon title /abstract/ text of the document. Apart from this, KWIC has basically two problems :  

i. KWIC shows sentences which contain distant dependency as different context and, 

ii. KWIC also shows sentences which have different word order as different context. 

The computer's inability to resolve these problems leads to redundancy, scatter of references throughout the index, haphazard groupings and retrieval losses, because the user is forced to guess at the terminology the author actually used. The disadvantages of a KWIC-type index can be summarized as follows :

- Large number of index entries under a given keyword, which provokes difficulties in searching.

- Lack of significant words in titles (therefore the title and the abstract are often used together as a source for indexing, to increase the depth and range of indexing).

- No cross references, which make it difficult to find synonyms, spelling variants and inflections in the index.

- Relatively high computing time, due to superfluous non-significant index entries;

- No combination of keywords.

- Lack of consistency in the indexing terms, because different authors can use different form of words to communicate the same idea, or give different meanings to the same word or phrase.

- References are not grouped under a convenient heading.

- Redundancy of the index. 


 Variations of Keyword Indexing 

A number of varieties of keyword index appear in the literature; they differ only in format, while the indexing techniques and principles remain more or less the same. Some important variations of keyword indexing are listed below :

i. Key word out of context (KWOC)  

ii. Key - word augmented-in - context index (KWAC) 


 Other Versions 

In addition to these variations, a number of other versions of the keyword index are available. They too differ only in format, while the indexing techniques and principles are more or less the same. They are :

i. Key-word-with-context Index (KWWC) 

 ii. Key-Term Alphabetical Index (KEYTALPHA) 

 iii. Word and Author Index (WADEX) 

iv. Key-Letter-In-Context Index (KLIC)



Computerised Indexing

Computerised indexing can be defined as the process whereby a computer is used to process a natural language text that is already in machine-readable form, so that indexing terms are allocated to its content without direct human intervention.

 The features of computerised indexing are :  

i. Computerised indexing starts with words. 

 ii. Word association prompts the linking of target words in a search statement. 

iii. Computers scan texts and create an 'inverted file', which associates the words in the file with their positions in the texts.

 iv. Matches words in a search statement against 'inverted files' to identify texts that have words in common. 

 v. Computer algorithms are used to carry out the above operations. 

 vi. Humans do the programming and set the parameters for indexing. 

 vii.Computation techniques used include word frequency and keyword analysis. 

viii. Computerised indexing cannot replace human or manual indexing; rather, the two are complementary.
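
Features iii, iv and vii can be illustrated with a small inverted file that records, for each word, the documents and positions where it occurs, and a search routine that matches the words of a search statement against it. The sample texts, the stop-word list and the function names are hypothetical; this is a sketch of the idea, not of any particular retrieval system.

```python
import re
from collections import defaultdict
from typing import Dict, List

# A small, hypothetical stop-word list.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "by", "in"}


def tokenize(text: str) -> List[str]:
    """Split a text into lower-case alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_inverted_file(texts: Dict[int, str]) -> Dict[str, Dict[int, List[int]]]:
    """Map each word to the documents and word positions where it occurs."""
    inverted: Dict[str, Dict[int, List[int]]] = defaultdict(lambda: defaultdict(list))
    for doc_id, text in texts.items():
        for position, word in enumerate(tokenize(text)):
            if word not in STOP_WORDS:
                inverted[word][doc_id].append(position)
    return inverted


def search(inverted: Dict[str, Dict[int, List[int]]], statement: str) -> List[int]:
    """Rank documents by how many words of the search statement they contain."""
    hits: Dict[int, int] = defaultdict(int)
    for word in tokenize(statement):
        if word in STOP_WORDS:
            continue
        for doc_id in inverted.get(word, {}):
            hits[doc_id] += 1
    return sorted(hits, key=lambda doc_id: (-hits[doc_id], doc_id))


texts = {
    1: "Indexing of chemical literature by computer",
    2: "Computer indexing and human indexing compared",
    3: "History of the card catalogue",
}
inverted_file = build_inverted_file(texts)
print(dict(inverted_file["indexing"]))   # positions double as a frequency count
print(search(inverted_file, "computer indexing"))
```

The length of each position list gives the word frequency in that document, which is one of the computation techniques mentioned in feature vii.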


 Advantages and Disadvantages of computerised Indexing 

 Advantages of computerised indexing are as follow :  

i. it is as effective as human indexing. 

 ii. It is cost effective compared to expensive human indexing. 

 iii. Maintains consistency in indexing. 

 iv. Indexing time is reduced. 

 v. Help searchers find information quickly. 

 vi. Can be applied to large volumes of texts where human indexing becomes impossible  (e.g. indexing web pages) 

 vii. Retrieval effectiveness can be achieved. 


 Disadvantages   

 i. Not flexible. 

 ii. Not precise when looking at unique materials. 

iii. Not able to adapt to new technology.

iv. Not able to do the conceptual analysis of the content of the document.

v. A term that occurs several times in a document is not always a significant term.


 Categories of Computerised indexing Systems 

 Category - 1 : Online Database Indexing. 

 Category - 2 : Optical Disk based Database Indexing. 

 Category - 3 : OPAC based Indexing. 

 Category - 4 : Indexing Internet Resources. 


 Vocabulary Control                

You have learnt that subject cataloguing and subject indexing are the processes used for describing the subject matter of documents. Subject cataloguing / indexing involves assigning terms to represent what the document is about. The complete set of index terms in a subject indexing system may be referred to as the vocabulary or index language of that system. In most libraries, documents are arranged on shelves, while the index is likely to comprise entries within a card catalogue representing those documents and providing access to them under the index terms selected to represent their subject matter. In other situations, the index will be in machine-readable form (on magnetic tape or disk), on microfilm, or in printed book form. In most cases the vocabulary used will be a controlled vocabulary, i.e. a limited set of terms that must be used by indexers and searchers. The controlled vocabulary is most often used to standardize descriptors or subject headings representing the contents of documents or the subject interest profiles of users (i.e. search strategies and user profiles used in SDI). In principle, however, such a tool can be applied in any situation in which the standardization of terminology is needed. In this section, we shall study some of its important features.


 Meaning and Need 

The term vocabulary control refers to the use of a limited set of terms that must be used to index documents, and to search for these documents, in a particular system. A controlled vocabulary may be defined as a list of terms showing their relationships and used to represent the specific subject of a document. A certain degree of structure is introduced in a controlled vocabulary so that terms whose meanings are related are brought together or linked in some way. An uncontrolled vocabulary is an unlimited set of terms drawn from natural language and used for describing the contents of documents. A natural language is 'natural' in the sense that it grows freely, without any control whatsoever. Therefore, it is hardly possible to keep a natural language clear of synonyms and homonyms. If we use natural language for subject indexing, subject matter may be described by any words or phrases, without limitation, such as those occurring in the documents themselves. However, certain problems in searching arise when no control is imposed on the vocabulary, because a natural language contains a large number of synonyms, quasi-synonyms, homonyms, acronyms, ambiguous terms, etc. Hence, if vocabulary control is not exercised, different indexers, or the same indexer on different occasions, might use different terms for the same concept when indexing documents dealing with the same subject, and searchers might use yet another set of terms for the same subject at the time of searching. This, in turn, would result in a 'mis-match' and thus affect information retrieval. Thus, the need exists for vocabulary control. In short, the need for vocabulary control arises to overcome the following problems :

a. occurrence of imprecisely defined words. 

 b. rapidly changing terminology. 

 c. numerous synonyms for a term, and 

 d. problem of homographs. 


 Objectives 

 There are basically two objectives for having a controlled vocabulary :  

a. to promote the consistent representation of the subject matter of documents by  indexers and searchers, thereby avoiding the dispersion of related documents,  through control of synonymous and nearly synonymous expression and by distinguishing among homographs; and 

b. to facilitate the conduct of a comprehensive search, by bringing together in some way the terms that are most closely related semantically.

The first of these objectives is achieved by controlling the terminology in various ways. First, the form of the term is controlled, whether this involves grammatical form, spelling, singular and plural form, abbreviations or compound forms of terms. Second, a choice is made between two or more synonyms, near-synonyms and quasi-synonyms. Third, homographs are distinguished. The control of synonyms is achieved simply by choosing one of the possible alternatives as the 'preferred term' and referring to this term (by using "see" or "use" references) from the variants under which certain users may be likely to approach. The synonym selected as the preferred term should be the one under which the majority of users are likely to look first. Sometimes 'quasi-synonyms' are treated in the same way as synonyms (i.e. one is chosen and a reference is made from the other). The term 'quasi-synonym' is not very precise. Many authors consider quasi-synonyms to be antonyms that represent opposite extremes on a continuum of values. An example is the pair of words "roughness" and "smoothness": "roughness" may be regarded as merely the absence of smoothness, and vice versa.

The controlled vocabulary also distinguishes among homographs (i.e. words with identical spelling but different meanings), usually by means of a parenthetical qualifier or scope note. Thus CRANE (Bird) tells us that the term is to be used exclusively for a type of bird and not for lifting equipment or in any other possible context.

 By controlling synonyms, near synonyms and quasi - synonyms and by distinguishing among homographs, the vocabulary control device avoids the dispersion of like subject matter and the collocation of unlike subject matter. In this way, it helps to achieve the objectives of consistent representation of subject matter in indexing and searching. 
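
The control of synonyms and homographs can be pictured as a lookup performed before a term is used for indexing or searching. The sketch below is purely illustrative: the term lists are invented (only the CRANE and roughness/smoothness examples come from the text), and the function name `control` is hypothetical.

```python
from typing import Dict, List, Set

# "Use" references: non-preferred term -> preferred term.
USE_REFERENCES: Dict[str, str] = {
    "smoothness": "roughness",   # quasi-synonym treated like a synonym
    "automobiles": "cars",       # invented synonym pair
    "motor cars": "cars",
}

# Preferred terms; homographs carry a parenthetical qualifier.
PREFERRED_TERMS: Set[str] = {
    "roughness",
    "cars",
    "crane (bird)",
    "crane (lifting equipment)",
}


def control(term: str) -> List[str]:
    """Return the preferred term(s) an indexer or searcher should use."""
    key = " ".join(term.lower().split())
    if key in PREFERRED_TERMS:
        return [key]
    if key in USE_REFERENCES:
        return [USE_REFERENCES[key]]
    # An unqualified homograph: offer every qualified form to choose from.
    return sorted(p for p in PREFERRED_TERMS if p.startswith(key + " ("))


print(control("Automobiles"))
print(control("crane"))
```

Because indexer and searcher pass through the same lookup, both arrive at the same preferred term, which is what prevents the dispersion of like subject matter.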

The second objective of vocabulary control is to link together terms that are semantically related in order to facilitate the conduct of comprehensive searches. The vocabulary links terms that are hierarchically related (in a formal genus-species relationship), and it also reveals semantic relationships across hierarchies.


 Requirements  

The important requirements of a controlled vocabulary are as follows :  

a. It should have 'warrant' derived from the terminology of the literature and the information needs of the actual or potential users. That is to say, a term is justified (warranted) if (i) literature on the subject is known to exist, and (ii) requests for information on the subject (denoted by the term in question) are likely to be made fairly frequently.

b. It must be sufficiently specific to allow the conduct of the great majority of searches at an acceptable level of precision. This implies that the level of specificity will vary over the vocabulary, some subject areas being developed in greater detail than others. A vocabulary developed by the National Library of Medicine would need only a few general terms in Mathematics, while one developed by the American Mathematical Society would need only a few general terms of a medical nature.

c. It should be sufficiently pre-coordinated to avoid most problems of false coordination and incorrect term relationships. One way of achieving this, and also of economizing on the absolute size of the vocabulary, is through the use of subheadings.

 d. It should promote consistency in indexing and searching by the control of synonyms, near synonyms and quasi-synonyms. 

e. It should reduce terminological ambiguity through the separation of homographs and through the definition of terms whose meaning or scope would otherwise be unclear.

f. It should assist the indexer and searcher in the selection of the most appropriate terms needed to represent a particular subject concept through its hierarchical and cross-reference structure.



Entry Vocabulary (preferred and non-preferred terms together) and Index Vocabulary (preferred terms on their own)

Terms linked by equivalence relationships are rather different from those in the other two groups (hierarchical and associative relationships). Here we select one preferred term and use it in our index, whereas the other two kinds of relationship occur between terms which are both used in the index. In our indexing language we will, therefore, have both the preferred terms, which are used for indexing, and the non-preferred terms, which are not. The preferred terms on their own form the 'index vocabulary', while the preferred and the non-preferred terms together form the 'entry vocabulary'. The entry vocabulary is very important. There will be many occasions when we decide, for one reason or another, not to use a particular term but to use one already in the index vocabulary instead. The terms in the entry vocabulary should reflect not only literary warrant but also enquiry warrant: in other words, not only the terms found in the literature, but also those used by readers looking for information. We must be aware of the terms used by the users of our information retrieval system as well as those used by the authors whose works we are indexing.


 Vocabulary Control Tools 

 i. Subject Heading List : A subject heading has been defined as a word or group of words indicating a subject under which all materials dealing with the same theme are entered in a catalogue or bibliography, or are arranged in a file. 

 List of Subject Headings - General Principles : The general principles that guide the indexer in the choice and rendering of subject headings from a standard list of subject headings are discussed in the following sub-sections. 

 a. Specific and Direct Entry : This principle states that "a document be assigned directly under the most specific subject heading that accurately and precisely represents its subject content". 

b. Common Usage : This principle states that "the word used to express a subject must represent common usage." 

 c. Uniformity : This principle states that "one uniform term must be selected from several synonyms and this term must be applied consistently to all documents on the topic. The heading chosen must also be unambiguous". 

 d. Consistent and Current Terminology : This principle states that "a list of subject headings may incorporate current terminology." In such a situation a subject authority file is to be maintained. Once a heading is changed, every record that was linked to the old heading can be linked to the new heading, and this decision is recorded in the subject authority file. 
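
 The re-linking described under principle (d) can be sketched as follows. This is a minimal illustration, assuming a simple in-memory structure; the class name `SubjectAuthorityFile`, its methods and the sample headings are hypothetical and do not describe any particular library system.

```python
class SubjectAuthorityFile:
    """Toy authority file: tracks headings, linked records and heading changes."""

    def __init__(self):
        self.records_by_heading = {}  # current heading -> set of record ids
        self.superseded = {}          # old heading -> heading that replaced it

    def current(self, heading):
        """Follow recorded changes until the heading now in use is reached."""
        seen = set()
        while heading in self.superseded and heading not in seen:
            seen.add(heading)
            heading = self.superseded[heading]
        return heading

    def assign(self, record_id, heading):
        """Link a record to the current form of the given heading."""
        heading = self.current(heading)
        self.records_by_heading.setdefault(heading, set()).add(record_id)

    def change_heading(self, old, new):
        """Move every record from the old heading to the new one and log the decision."""
        moved = self.records_by_heading.pop(old, set())
        self.records_by_heading.setdefault(new, set()).update(moved)
        self.superseded[old] = new
```

 After a call such as `change_heading("Aeroplanes", "Airplanes")`, records previously filed under the old heading are found under the new one, and later assignments made with the old heading are redirected as well.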


 ii. Thesaurus : ODLIS (Online Dictionary for Library and Information Science) defines a thesaurus as "an alphabetically arranged lexicon of terms comprising the specialized vocabulary of an academic discipline or field of study showing the logical and semantic relations among terms, particularly a list of subject headings or descriptors used as preferred terms in indexing the literature of the field". 

 A thesaurus is an organized list of terms from a specialized vocabulary, arranged to facilitate the selection of index terms as well as search terms. 

 The oldest living example of a thesaurus is "Roget's Thesaurus", which was developed by Peter Mark Roget in 1852 to provide alternate terms for a given concept and is divided into two parts, viz. classified and alphabetical. 

 Helen Brownson first used the term "thesaurus" in the information retrieval context in 1957, during the Dorking Conference on Classification. 

 A thesaurus differs from a conventional authority list such as the Sears List in that the terms are not necessarily used alone but may be coordinated with other terms. The relationships between the terms are clearly defined by use of the following standard abbreviations : 

i. SN : Scope Note 

 ii. UF : Used For 

 iii. BT : Broader Term 

 iv. RT : Related Term 

 v. SA : See also 


The alphabetical listing of index terms in a thesaurus consists of the following types of terms : 

a. Descriptor : a term that can be used as an index term to describe concepts contained in a document. It is also known as a "preferred term". 

 b. Non-descriptor : a term that cannot be used as an index term but appears in the thesaurus to expand the entry words of the indexing language. It is also known as a "non-preferred term". 


 Structure of thesaurus : The internal form of individual entries and the arrangement of the various entries in relation to one another constitute the structure of a thesaurus. Cross references make explicit the way in which entries relate to each other in a network of concepts. Each entry in a thesaurus consists of a group of terms that are related to the entry term in different ways. The different terms in the entry are displayed under the standard relationship abbreviations (SN, UF, BT, RT, SA). 
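
 As an illustration only, one thesaurus entry could be modelled and printed as below. The layout, the class name `ThesaurusEntry` and the sample terms are assumptions made for this sketch; real thesauri differ in the exact display they use.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    """One descriptor with its scope note and related terms (hypothetical layout)."""
    term: str
    scope_note: str = ""
    used_for: list = field(default_factory=list)  # UF: non-preferred terms
    broader: list = field(default_factory=list)   # BT: broader terms
    related: list = field(default_factory=list)   # RT: related terms

    def display(self):
        """Render the entry with the standard relationship abbreviations."""
        lines = [self.term]
        if self.scope_note:
            lines.append(f"  SN {self.scope_note}")
        groups = (("UF", self.used_for), ("BT", self.broader), ("RT", self.related))
        for label, terms in groups:
            for related_term in terms:
                lines.append(f"  {label} {related_term}")
        return "\n".join(lines)
```

 For example, `ThesaurusEntry("Cars", used_for=["Automobiles"], broader=["Motor vehicles"])` would be displayed with "Cars" as the entry term, followed by one indented UF line and one indented BT line.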


 iii. Thesaurofacet : The concept of "Thesaurofacet" was developed by Jean Aitchison. Thesaurofacet is basically a faceted classification integrated with a thesaurus. It consists of two sections : a. a faceted classification scheme, and b. an alphabetical thesaurus. Here, the thesaurus replaces the alphabetical subject index, which normally follows the schedules in a conventional faceted classification. Terms appear twice, once in the schedule and once in the alphabetical thesaurus, the link between the two locations being the notation or class number. It can be used in both pre-coordinate and post-coordinate indexing systems. 
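
 The link between the two sections through the notation can be sketched as follows. The notations, terms and the function `broader_terms` are invented for illustration, and the sketch assumes a purely hierarchical notation in which each shorter prefix denotes a broader class; the actual Thesaurofacet notation is not reproduced here.

```python
# Hypothetical faceted schedule: notation (class number) -> term.
SCHEDULE = {
    "K": "Transport",
    "KB": "Road transport",
    "KBC": "Cars",
}

# Alphabetical thesaurus section: term -> notation, the link back to the schedule.
ALPHABETICAL = {term: notation for notation, term in SCHEDULE.items()}


def broader_terms(term):
    """Find broader terms by shortening the notation of the given term."""
    notation = ALPHABETICAL[term]
    return [
        SCHEDULE[notation[:length]]
        for length in range(len(notation) - 1, 0, -1)
        if notation[:length] in SCHEDULE
    ]
```

 Looking up "Cars" in the alphabetical section gives the notation "KBC", which in turn locates the term in the schedule alongside its broader classes.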


 iv. Classaurus : Classaurus, developed by Ganesh Bhattacharyya at DRTC, incorporates features of both a faceted classification scheme and a conventional alphabetical thesaurus. It is an elementary category-based (faceted) systematic scheme of hierarchical classification in the verbal plane, incorporating all the necessary and sufficient features of a conventional information retrieval thesaurus.











 Questions 

1. Vocabulary in a database is controlled by 
i. Thesaurus files                           ii. Import files 
iii. Standard files                           iv. Authority files 
Codes : 
A. (i) and (ii) are correct                  B. (iii) and (iv) are correct 
C. (i) and (iii) are correct                 D. (ii) and (iii) are correct
Ans: 

2. Science Citation Index is published by 
A. Thomson Reuters                  B. H.W. Wilson 
C. Whitaker                               D. R.R. Bowker
Ans: 

3. The idea of Thesaurofacet was developed by 
A. G. Bhattacharya                 B. S. R. Ranganathan 
C. Jean Aitchison                    D. Derek Austin
Ans: 

4. Which system does the B.N.B. use at present to derive subject index entries? 
A. POPSI                        B. PRECIS 
C. Chain Indexing          D. KWIC
Ans: 

5. The word ‘Vocabulary’ originates from the word ‘Vocabularium’ in 
A. Greek       B. Latin 
C. French      D. None of these
Ans: 

6. Indexing system in which the coordination of terms is done at the search stage was first introduced by 
A. S.R. Ranganathan               B. Derek Austin 
C. Mortimer Taube                   D. H.P. Luhn
Ans: 

7. Which of the following models is not based on the analysis of the subject? 
A. Citation Indexing                B. PRECIS 
C. Chain Indexing                   D. Uniterm Indexing
Ans: 

8. The concept of ‘Stopword’ list is relevant in the context of 
A. Uniterm Indexing               B. Citation Indexing 
C. Chain Indexing                   D. Keyword Indexing
Ans: 

9. “Web of Science” is 
A. A Citation Index                     B. A Bibliography 
C. An Abstracting Service          D. All of the above
Ans: 

10. The Credit for the invention of computer based citation indexing goes to 
A. F. W. Lancaster                 B. H. P. Luhn 
C. E. Garfield                        D. C. A. Mooers
Ans: 

11. Read the following example and indicate the name of the indexing system used : ‘Remuneration of teachers in French universities.’ 
The index headings are set out in two lines each, as follows : 
France 
    Universities | Teachers | Remuneration 
Universities | France 
    Teachers | Remuneration 
A. Chain procedure             B. POPSI 
C. PRECIS                         D. KWIC
Ans: 

12. Match the following : 
          List – I                         List – II 
a. Chain indexing              i. Derek Austin 
b. Relational indexing      ii. S R Ranganathan 
c. Subject Indexing          iii. J.E.L. Farradane 
d. PRECIS                        iv. E. J. Coates
Codes: 
    a b c d 
A. i ii iv iii 
B. ii iii iv i 
C. i ii iii iv 
D. ii iv i iii
Ans: 

13. Assertion (A): An indexing language is an artificial language as it uses controlled vocabulary. 
Reason (R): It provides different relationships between terms. 
A. Both (A) and (R) are false 
B. Both (A) and (R) are true 
C. (A) is true but (R) is false 
D. (A) is false but (R) is true
Ans: 

14. Assertion (A): An indexing language is much more than a list of index terms that are acceptable to users. 
Reason (R): An indexing language helps users discriminate between terms and reduces ambiguity in the language. 
Codes : 
A. Both (A) and (R) are true and (R) is not the correct explanation 
B. Both (A) and (R) are true and (R) is the correct explanation 
C. (A) is false, but (R) is true 
D. (A) is true, but (R) is false
Ans: 

15. Assertion (A): An indexing language is an artificial language and it uses controlled vocabulary. 
Reason (R): Controlled vocabulary provides relation between and among terms. 
Codes : 
A. Both (A) and (R) are true 
B. (A) is true, but (R) is false 
C. (A) is false, but (R) is true 
D. Both (A) and (R) are false
Ans: 

16. Arrange the following Indexing Systems in the order of their origin : 
i. POPSI 
ii. PRECIS 
iii. Chain Indexing 
iv. KWIC 
Codes : 
A. (iii) (ii) (i) (iv) 
B. (ii) (i) (iii) (iv) 
C. (i) (ii) (iv) (iii) 
D. (iv) (i) (ii) (iii)
Ans: 

17. Who invented ‘POPSI’? 
A. A. Neelameghan               B. S. R. Ranganathan 
C. Ganesh Bhattacharya        D. None of these
Ans: 

18. POPSI is a 
A. Key-word based indexing system                         B. Post-coordinate indexing system 
C. Pre-coordinate indexing system                            D. Citation indexing system
Ans: 

19. What is PRECIS? 
A. Post-coordinate indexing system                     B. Pre-coordinate indexing system 
C. Preserved context indexing system                  D. None of these
Ans: 

20. What is the full expansion of POPSI? 
A. Postulates of permuted subject Indexing 
B. Postulate based permuted subject Indexing
C. Postulate operated subject Indexing 
D. None of these
Ans: 

21. KWIC indexing technique is based on 
A. citation                     B. title 
C. abstract                    D. full text
Ans: 

22. Which is not the part of SCI? 
A. Citation Index                         B. Source Index 
C. Permuterm Subject Index       D. Alphabetical Author Index
Ans: 

23. Whose name is related with the Science Citation Index? 
A. Eugene Garfield                    B. J. E. L. Farradane 
C. J. Kaiser                                 D. Frank Shepard
Ans: 

24. Who developed the chain indexing? 
A. D. J. Foskett                         B. E. J. Coates 
C. S. R. Ranganathan                D. D. Austin
Ans: 

25. What is the alternative name of chain procedure? 
A. Class Index Entries           B. Sought link 
C. Chain Indexing                 D. None of the above
Ans: 

26. Dr. Ranganathan designed which of the following methods for deriving subject headings? 
A. List of subject headings        B. Chain procedure 
C. POPSI                                   D. PRECIS
Ans:

27. What is pre-coordinate indexing? 
A. It coordinates the terms before searching 
B. It coordinates the terms at the time of preparing index 
C. It does not coordinate 
D. It coordinates the terms after searching
Ans: 

28. What are the two main kinds of methods of indexing?
A. Pre-Coordinate and Post-coordinate Indexing             B. Chain and Uniterm Indexing 
C. PRECIS and POPSI Indexing                                      D. Primary and Secondary Indexing
Ans: 

29. The index entries and references are brought together in PRECIS indexing by 
A. sequence                  B. alphabetical 
C. letter by letter          D. word by word
Ans: 

30. Which of the following is a post-coordinate indexing system? 
A. PRECIS                B. SLIC 
C. Peek-a-boo            D. POPSI
Ans: 

31.  The role operators in PRECIS are arranged by 
A. by postulates                                              B. autonomy of indexer 
C. schedules of a classification scheme         D. system provided by concerned author
Ans: 

32. M. Taube propounded which type of indexing in 1950? 
A. Chain indexing                          B. Post-coordinate indexing 
C. Pre-coordinate indexing            D. Uniterm indexing
Ans: 

33. Who developed Uniterm indexing system?
A. Calvin Mooers                  B. Allen Kent  
C. H. P. Luhn                         D. M. Taube
Ans: 

34. The interposed operators in PRECIS are represented by 
A. numerals                          B. Greek symbols 
C. Roman smalls                  D. Roman symbols
Ans: 

35. Who developed the concept of coordinate indexing? 
A. E. J. Coates                 B. C. A. Cutter 
C. M. Taube                     D. S. R. Ranganathan
Ans: 

36. Whose name is related with PRECIS indexing? 
A. M. Taube                     B. Derek Austin 
C. Ranganathan               D. E. J. Coates
Ans: 

37.  The KWIC indexing method is based on 
A. title               B. full text 
C. abstracts        D. citation
Ans: 

38. Who developed KWIC indexing system? 
A. H. P. Luhn                             B. Allen Kent 
C. Calvin Mooers                       D. M. Taube
Ans: 

39.  Who developed chain indexing? 
A. H. P. Luhn                  B. E. J. Coates 
C. D. Austin                    D. S. R. Ranganathan
Ans: 

40. The role operator zero (0) is meant for what in PRECIS? 
A. Action              B. Agent of transitive action 
C. Location           D. Agent of intransitive action
Ans: 

41. Which of the following are the types of Thesauri? 
A. Controlled and Free language Thesauri           B. Real and artifactual Thesauri 
C. Subject and Language Thesauri                       D. Primary and Secondary Thesauri
Ans: 

42. The indexes and indexing services fall under 
A. computer service                           B. selective dissemination of service 
C. general library service                   D. current awareness service
Ans: 

43. In which order is the list of words arranged in a thesaurus? 
A. Alphabetical              B. Numerical 
C. Classified                  D. Dictionary wise
Ans: 

44. Why is it essential to control the terms in the indexing language? 
A. For controlling the synonymous terms             B. For representing the relation in the concepts 
C. For both the above                                            D. For none of the above
Ans: 

45. Which one is the third step in POPSI? 
A. Display of component term           B. Display of subject index entries 
C. Selection of approach term            D. Display of material
Ans: 

46. Which name is given to controlled vocabulary?
A. Classification               B. Dictionary 
C. Thesaurus                     D. Vocabulary control 
Ans: 

47. Which thesauri allow only one term to denote a concept for the purpose of indexing and searching? 
A. Primary Thesauri                            B. Language Thesauri 
C. Controlled Thesauri                        D. Free Language Thesauri
Ans: 

48. In which of the following categories are thesauri counted? 
A. Controlled indexing languages            B. Basic indexing languages 
C. Natural indexing languages                  D. None of the above
Ans: 

49. Match the following : 
        List - I                              List - II 
a. Key Word Indexing      i. J. R. Sharp 
b. Citation Indexing         ii. H. P. Luhn 
c. Uniterm Indexing         iii. E. Garfield 
d. SLIC Indexing              iv. M. Taube 
Codes : 
    (a) (b) (c) (d) 
A. (ii) (iii) (iv) (i) 
B. (iii) (iv) (i) (ii) 
C. (iv) (i) (iii) (ii) 
D. (i) (iii) (ii) (iv)
Ans: 

50. Match the following : 
         List – I                                                                       List – II 
a. Asian Recorder                                    i. Location of a specific volume of a journal 
b. Union Catalogue of Scientific Serials       ii. Articles on Green Revolution 
c. Books-in-Print                                     iii. Obituary of Dev Anand 
d. Social Science Index                            iv. Availability of books 
Codes : 
      (a) (b) (c) (d) 
A. (iv) (iii) (i) (ii) 
B. (i) (iv) (ii) (iii) 
C. (iii) (i) (iv) (ii) 
D. (ii) (iv) (iii) (i)
Ans: 

51. A type of indexing where terms are coordinated prior to searching 
A. Post coordinate indexing                 B. Pre coordinate indexing 
C. Uniterm indexing                            D. None of the above
Ans: 

52. Cranfield research studies were carried out to study 
A. Catalogue                   B. Classification 
C. Indexing                     D. Abstracting
Ans: 

53.  Eugene Garfield is primarily associated with 
A. Coordinate Indexing                   B. PRECIS 
C. Citation Indexing                        D. POPSI
Ans: 

54. Match the following 
             List - 1                           List - 2 
I. POPSI                             a. P M Roget 
II. Citation Indexing          b. H P Luhn 
III. Key word Indexing       c. Eugene Garfield 
IV. Thesaurus                     d. G. Bhattacharyya 
Codes : 
A. I - b, II - d, III - a, IV - c 
B. I - c, II - d, III - a, IV - b 
C. I - d, II - c, III - b, IV - a 
D. I - a, II - b, III - d, IV - c
Ans: 

55. The following are used as tools for vocabulary control in indexing 
1. Dictionary                             2. Thesaurus 
3. List of Subject Headings      4. ISBD 
A. 1 and 2 are correct            B. 1 and 3 are correct 
C. 2 and 3 are correct            D. 2 and 4 are correct
Ans: 

56.           List - I                           List - II 
   (Types of Indexing)                   (Author) 
a. Chain Indexing                     1. Derek Austin
b. Uniterm Indexing                 2. Eugene Garfield 
c. PRECIS                                3. S R Ranganathan 
d. Citation Indexing                  4. Mortimer Taube
Code : 
      (a) (b) (c) (d) 
A. 3 4 1 2 
B. 3 1 2 4 
C. 1 2 3 4 
D. 2 3 4 1 
Ans: 

57.            List - I                                                 List - II 
            (Publications)                                        (Publishers) 
a. Library Trends                                               1. NISSAT 
b. Information Today and Tomorrow                 2. Institute for Scientific Information (ISI) 
c. Science Citation Index                                   3. University Microfilm International (UMI) 
d. Dissertations Abstracts International             4. University of Illinois 
Code : 
    (a) (b) (c) (d) 
A. 4 1 2 3 
B. 1 2 3 4 
C. 2 3 4 1 
D. 3 4 1 2
Ans: 

58. Which one of the following is not correctly matched? 
A. Books in Print - Trade Bibliography 
B. Cumulative Book Index - Books published in English language 
C. British Books in Print - R.R. Bowker  
D. National Bibliography - INB 
Ans: 

59. If two citations are cited together, it is known as 
A. Double citation              B. Twin citation 
C. Co-citation                     D. Controlled citation
Ans: 

60. Match the following 
      List-I                  List-II 
a. Arrangement      i. Accuracy, Reliability 
b. Authority           ii. Clear & Legible type faces 
c. Index                 iii. Alphabetical or Classified 
d. Typography       iv. ‘see’ & ‘see also’ references 
Codes : 
    (a) (b) (c) (d) 
A. (iii) (i) (ii) (iv)
B. (i) (ii) (iii) (iv) 
C. (iii) (ii) (i) (iv) 
D. (ii) (iii) (iv) (i)
Ans: 

61. Pre-coordinate indexing system is followed in 
i. Chain Indexing            ii. POPSI 
iii. UNITERM                iv. PRECIS 
A. (i) and (iv) are correct 
B. (i) (ii) and (iii) are correct 
C. (i) (ii) and (iv) are correct 
D. (ii) and (iv) are correct
Ans: 

62. Match the following 
         List-I                            List- II 
a. Classaurus                      i. Jean Aitchison 
b. Relative Index                 ii. J. E. L. Farradane 
c. Thesaurofacet                 iii. S. R. Ranganathan 
d. Relational Indexing         iv. G. Bhattacharyya 
                                     v. Melvil Dewey 
Code : 
      (a) (b) (c) (d)  
A. (iii) (i) (ii) (v) 
B. (iv) (v) (i) (ii) 
C. (ii) (v) (i) (iii) 
D. (iii) (ii) (i) (v)
Ans: 

63. Match the following 
                List- I                        List-II 
(Publication)                (Institution/ System) 
a. AGRINDEX                     i. INSPEC 
b. ATOMINDEX                  ii. National Library of Medicine 
c. Physics Abstracts             iii. AGRIS 
d. INDEX MEDICUS          iv. IMS 
Code : 
     (a) (b) (c) (d) 
A. (iv) (i) (iii) (ii) 
B. (i) (iii) (iv) (ii) 
C. (iii) (iv) (i) (ii) 
D. (i) (ii) (iii) (iv)
Ans: 

64. Match the following 
                List - I                    List - II 
a. Subject indexing          i. P. M. Roget 
b. Keyword indexing        ii. H. P. Luhn 
c. Automated indexing     iii. H. Ohlman 
d. SLIC indexing              iv. M.E. Sears  
                                          v. S.L. MeNold 
Codes : 
    (a) (b) (c) (d) 
A. (iii) (ii) (iv) (ii) 
B. (iv) (i) (ii) (iii) 
C. (ii) (i) (v) (iv) 
D. (v) (iii) (iv) (ii)
Ans: 

65.