The Architecture of the Common Indexing Protocol (CIP)
RFC 2651

Document Type RFC - Proposed Standard (August 1999; No errata)
Last updated 2013-03-02
Stream IETF
Network Working Group                                           J. Allen
Request for Comments: 2651                                WebTV Networks
Category: Standards Track                                    M. Mealling
                                                 Network Solutions, Inc.
                                                             August 1999

         The Architecture of the Common Indexing Protocol (CIP)

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (1999).  All Rights Reserved.

Abstract

   The Common Indexing Protocol (CIP) is used to pass indexing
   information from server to server in order to facilitate query
   routing. Query routing is the process of redirecting and replicating
   queries through a distributed database system towards servers holding
   the desired results. This document describes the CIP framework,
   including its architecture and the protocol specifics of exchanging
   indices.

1. Introduction

1.1. History and Motivation

   The Common Indexing Protocol (CIP) is an evolution and refinement of
   distributed indexing concepts first introduced in the Whois++
   Directory Service [RFC1913, RFC1914]. While indexing proved useful in
   that system to promote query routing, the centroid index object which
   is passed among Whois++ servers is specifically designed for
   template-based databases searchable by token-based matching.  With
   alternative index objects, the index-passing technology will prove
   useful to many more application domains, not simply Directory
   Services and those applications which can be cast into the form of
   template collections.

Allen & Mealling            Standards Track                     [Page 1]
RFC 2651                  The CIP Architecture               August 1999

   The indexing part of Whois++ is integrated with the data access
   protocol. The goal in designing CIP is to extract the indexing
   portion of Whois++, while abstracting the index objects to apply more
   broadly to information retrieval. In addition, another kind of
   technology reuse has been undertaken by converting the ad-hoc data
   representations used by Whois++ into structures based on the MIME
   specification for structured Internet mail.

   Whois++ used a version number field in centroid objects to facilitate
   future growth. The initial version was "1". Version 1 of CIP (then
   embedded in Whois++, and not referred to separately as CIP) had
   support for only ISO-8859-1 characters, and for only the centroid
   index object type.

   Version 2 of the Whois++ centroid was used in the Digger software by
   Bunyip Information Systems to notify recipients that the centroid
   carried extra character set information. Digger's centroids can carry
   UTF-8 encoded 16-bit Unicode characters, or ISO-8859-1 characters,
   determined by a field in the headers.

   This specification is for CIP version 3.  Version 3 is a major
   overhaul to the protocol.  However, by using a short negotiation
   sequence, CIP version 3 servers can interoperate with earlier servers
   in an index-passing mesh.

   For unclear terms the reader is referred to the glossary in Appendix
   A.

1.2. CIP's place in the Information Retrieval world

   CIP facilitates query routing. CIP is a protocol used between servers
   in a network to pass hints which make data access by clients at a
   later date more efficient. Query routing is the act of redirecting
   and replicating queries through a distributed database system towards
   the servers holding the actual results via reference to indexing
   information.

   CIP is a "backend" protocol -- it is implemented in and "spoken" only
   among network servers. These same servers must also speak some kind
   of data access protocol to communicate with clients. During query
   resolution in the native protocol implementation, the server will
   refer to the indexing information collected by the CIP implementation
   for guidance on how to route the query.

   Data access protocols used with CIP must have some provision for
   control information in the form of a referral. The syntax and
   semantics of these referrals are outside the scope of this
   specification.
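
   The division of labor described above can be illustrated with a
   minimal sketch. It is not part of CIP: the class names, the
   token-set form of the index, and the server names are all
   hypothetical, chosen only to show a server consulting index
   information received from its peers and answering a client with
   local results plus referrals.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Referral:
    """Hypothetical stand-in for a referral in the data access protocol."""
    server: str


@dataclass
class RoutingServer:
    """Toy server that answers locally and refers clients to peers."""
    name: str
    # Records this server holds itself: token -> matching records.
    local_records: dict[str, list[str]] = field(default_factory=dict)
    # Index hints received from peers: peer name -> tokens it can answer.
    # (Assumed shape; real index objects are defined elsewhere.)
    peer_index: dict[str, set[str]] = field(default_factory=dict)

    def receive_index(self, peer: str, tokens: set[str]) -> None:
        """Store indexing information passed from another server."""
        self.peer_index[peer] = set(tokens)

    def query(self, token: str) -> tuple[list[str], list[Referral]]:
        """Return local hits plus referrals to peers whose index matches."""
        hits = list(self.local_records.get(token, []))
        referrals = [
            Referral(peer)
            for peer, tokens in sorted(self.peer_index.items())
            if token in tokens
        ]
        return hits, referrals


# Index passing happens ahead of time, server to server.
root = RoutingServer("root")
root.local_records["allen"] = ["record-1"]
root.receive_index("server-a", {"allen", "mealling"})
root.receive_index("server-b", {"smith"})

# Later, a client query is resolved with guidance from the stored index.
hits, referrals = root.query("allen")
```

   In this sketch the index is consulted only at query time; how the
   referral is encoded and returned to the client is left to the data
   access protocol, as stated above.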

2. Related Documents

   This document is one of three documents. This document describes the
   fundamental concepts and framework of CIP.

   The document "MIME Object Definitions for the Common Indexing
   Protocol" [CIP-MIME] describes the MIME objects that make up the