Internet-Draft                                       H. Alvestrand
draft-alvestrand-directory-defs-00.txt                 EDB Maxware
Target Category: Informational                        October 1999
                                               Expires: April 2000







Definitions for talking about directories



Status of this Memo

     The file name of this memo is draft-alvestrand-directory-defs-
     00.txt

     This document is an Internet-Draft and is in full conformance with
     all provisions of Section 10 of RFC2026.

     Internet-Drafts are working documents of the Internet Engineering
     Task Force (IETF), its areas, and its working groups.  Note that
     other groups may also distribute working documents as Internet-
     Drafts.

     Internet-Drafts are draft documents valid for a maximum of six
     months and may be updated, replaced, or obsoleted by other
     documents at any time.  It is inappropriate to use Internet-
     Drafts as reference material or to cite them other than as "work
     in progress."

     The list of current Internet-Drafts can be accessed at
     http://www.ietf.org/ietf/1id-abstracts.txt

     The list of Internet-Draft Shadow Directories can be accessed at
     http://www.ietf.org/shadow.html.


Abstract

When discussing systems for making information accessible through the
Internet in standardized ways, it may be useful if the people
discussing have a common understanding of the terms they use.

This document is not intended to be either comprehensive or definitive,
but is intended to give some aid in mutual comprehension when
discussing information access methods to be incorporated into Internet
Standards-Track documents.


1. Introduction and basic terms

We suggest using the following terms for the remainder of this
document:

- Information: Something for which one can imagine multiple worlds
  where the item in question has different values. The fact of which
  particular value is true for this world is information.

  This definition is extremely abstract, and intentionally so, but on a
  philosophical level, it's closely related to Shannon's signal-theory
  definition of "information".

- Datastore: Amount of information that is accessible through one or
  more access methods.

- User: Entity that may (try to) access information in a datastore.
  Note that no assumption is made that the user is animal, vegetable or
  mineral.

- Access method: Well-defined series of operations that will cause
  information known to a datastore to also be known to the user.

- Site: Entity that hosts all or part of a datastore, and makes it
  available through one or more access methods. A site may in various
  contexts be a machine, a datacenter, a network of datacenters, or a
  single device.




2. Dimensions of classification


2.1 Uniqueness and scope

Some information systems are global, in the sense that only one can
sensibly exist in the world.

Others are inherently local, in that each locality, site or even box
will run its own information store, independent of all others.



The following terms are suggested:

- Global datastore: A datastore of which only one can exist in the
  world. The world itself is a prime example; the public telephone
  system's number assignments are another.

- Local datastore: A class of datastore of which multiple instances can
  exist, each with information relevant to that particular datastore,
  with no need for coordination between them. ((( better term needed
  )))

- Centralized datastore: A datastore where all access to data has to
  pass through some single point of control (site).

- Distributed datastore: A datastore that is not centralized.

- Replicated datastore: A distributed datastore where all sites have
  the same information.

- Cooperative datastore: A distributed datastore where not all sites
  have all the information, but where mechanisms exist to get the
  information to the requester, even when it is not available at the
  site originally asked.
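
As a rough illustration of the last four terms, the sketch below
classifies a datastore from the set of items each site holds. The
function name, the site names and the forwards_requests flag are
assumptions made for this example only, and treating "exactly one site"
as "centralized" is a simplification of the single point of control.

```python
from typing import Dict, Set


def classify_distribution(holdings: Dict[str, Set[str]],
                          forwards_requests: bool = False) -> str:
    """Classify a datastore by how its items are spread over sites.

    holdings maps a (hypothetical) site name to the set of items that
    site holds. forwards_requests says whether a site that lacks an
    item has some mechanism to obtain it for the requester.
    """
    if not holdings:
        raise ValueError("a datastore needs at least one site")
    if len(holdings) == 1:
        # All access passes through a single point of control.
        return "centralized"
    all_items = set().union(*holdings.values())
    if all(items == all_items for items in holdings.values()):
        # Every site has the same information.
        return "replicated"
    if forwards_requests:
        # Not every site has everything, but requests still get served.
        return "cooperative"
    # Distributed, but neither replicated nor cooperative.
    return "distributed"
```

For instance, two sites that both hold the item set {"x"} would be
classified as replicated, while sites holding {"x"} and {"y"} are
cooperative only if requests can be passed on.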



2.2 Search, Lookup, Query and Notify

A central consideration when describing datastores is the types of
methods they offer for finding information.

The chief classifications are:

- Lookup datastores require the user to know or guess some exact value,
  sometimes called a "lookup key" and sometimes a "name", before asking
  for information. They usually return a single piece of information
  as a response.

- Search datastores require the user to know some approximate value of
  some information. They usually return zero, one or more responses
  that match the information supplied according to some algorithm.
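
The difference between the two can be sketched as follows. The entries
and the choice of a case-insensitive prefix match as the search
algorithm are assumptions invented for this example; a real search
datastore may match in any way it likes.

```python
from typing import Dict, List, Optional

# Hypothetical data, invented for this example.
ENTRIES: Dict[str, str] = {
    "alice": "alice@example.org",
    "alicia": "alicia@example.org",
    "bob": "bob@example.org",
}


def lookup(key: str) -> Optional[str]:
    """Lookup: the exact key must be known; at most one answer."""
    return ENTRIES.get(key)


def search(fragment: str) -> List[str]:
    """Search: an approximate value; zero, one or more answers.

    The matching algorithm here is a case-insensitive prefix match.
    """
    wanted = fragment.lower()
    return sorted(value for key, value in ENTRIES.items()
                  if key.lower().startswith(wanted))
```

Here lookup("ali") finds nothing because "ali" is not an exact key,
while search("ali") returns both entries whose key starts with it.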

An orthogonal dimension has to do with time:

- Query datastores will answer a request with a response, and once that
  is over with, will do nothing more.

- Notify datastores will get a request from a user to have information
  returned at some later time, for instance when it becomes available
  or current, and will respond at that time with a notification that
  information is available.

- Subscription datastores are like notify datastores, but will transfer
  the actual information when available.
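
The three behaviours in time can be sketched with a toy in-memory
datastore. The class and method names are hypothetical and the callback
mechanism is an assumption of this example; it only shows what each
kind of datastore hands back, and when.

```python
from typing import Callable, Dict, List, Optional


class TimedDatastore:
    """Toy datastore showing query, notify and subscription behaviour."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._notify: Dict[str, List[Callable[[str], None]]] = {}
        self._subscribe: Dict[str, List[Callable[[str, str], None]]] = {}

    def query(self, key: str) -> Optional[str]:
        # Query: answer once, then do nothing more.
        return self._data.get(key)

    def notify_me(self, key: str, callback: Callable[[str], None]) -> None:
        # Notify: the callback later learns only that `key` is available.
        self._notify.setdefault(key, []).append(callback)

    def subscribe(self, key: str,
                  callback: Callable[[str, str], None]) -> None:
        # Subscription: the callback later receives the information itself.
        self._subscribe.setdefault(key, []).append(callback)

    def publish(self, key: str, value: str) -> None:
        # New information arrives; tell everyone who asked about it.
        self._data[key] = value
        for callback in self._notify.get(key, []):
            callback(key)
        for callback in self._subscribe.get(key, []):
            callback(key, value)
```

A notified user still has to come back with a query to get the value; a
subscribed user does not.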



2.3 Consistency models

Consistency (or the lack thereof) is a property of distributed
datastores; for this particular discussion, we ignore the subject of
semantically inconsistent data (such as an assertion that a man is
blind and has a valid driver's license), and focus on the problem of
consistency where inconsistency is defined as having the same request,
using the same credentials, be answered with different data at
different sites.

Distributed datastores may have:

- Strict consistency, where the problem above never arises

- Strict internal consistency, where the replies always reflect a
  consistent picture of the total datastore, but some sites may reflect
  an earlier version of the datastore than others

- Loose, converging consistency, where different parts of the datastore
  may be updated at different times as seen from a single site, but the
  process is designed in such a way that if one stops making changes to
  the datastore, all sites will sooner or later present the same
  information

- Inconsistency, where no guarantee can be made whatsoever

One interesting variant is subset consistency, where the system is
consistent (according to one of the definitions above), but not all
questions will be answered at all sites; possibly because different
sites have different policies on what they make available (NetNews), or
because different sites only need different subsets of the "whole
picture" (BGP).
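
The definition of inconsistency used here, and the subset variant, can
be expressed as two small checks. This is only a sketch: each site is
modelled as a mapping from request to answer, credentials are left
out, and the function names are assumptions of this example.

```python
from typing import Dict

# Each site maps a request to its answer; a missing request means the
# site does not answer that question at all.
Site = Dict[str, str]


def is_consistent(sites: Dict[str, Site], request: str) -> bool:
    """True if every site answers `request`, and all with the same data."""
    answers = [site.get(request) for site in sites.values()]
    return None not in answers and len(set(answers)) == 1


def is_subset_consistent(sites: Dict[str, Site], request: str) -> bool:
    """True if the sites that do answer `request` all agree."""
    answers = {site[request] for site in sites.values() if request in site}
    return len(answers) <= 1
```

A set of sites where one site simply has no answer fails the first
check but passes the second; two sites giving different answers fail
both.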


2.4 Security models

It's harder to describe security models in a few sentences than other
properties of information systems.

Some thoughts, though:

On trust in information: Why do we trust a piece of information to be
correct?

- Because it's in the datastore (and therefore must have been
  authorized).
  This is perimeter (or Eggshell) integrity.

- Because it contains internal integrity checks, usually involving
  digital signatures by verifiable identities
  This is item integrity; the granularity of the integrity and the
  ability to do integrity checks on the relationships between objects
  is extremely important and extremely hard to get right, as is
  establishing the root of the trust chain.

- Because it fits other available information, and causes the right
  things to happen when I use it.
  This is hopeful integrity.

Which integrity model to choose is a matter of evaluating the cost of
implementing the integrity, the cost of having the integrity break on
you, and the impact on the cost of doing business.
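
To make item integrity a little more concrete, the sketch below
attaches a check value to each item so that the item can be verified on
its own, wherever it was fetched from. Note the substitution: it uses
an HMAC with a shared key from the Python standard library as a
stand-in, which is not a digital signature by a verifiable identity.
The function names and the key are assumptions of this example.

```python
import hashlib
import hmac


def seal(item: bytes, key: bytes) -> bytes:
    """Compute an integrity tag that travels with the item."""
    return hmac.new(key, item, hashlib.sha256).digest()


def verify(item: bytes, tag: bytes, key: bytes) -> bool:
    """Check the item against its tag, independent of the datastore."""
    return hmac.compare_digest(seal(item, key), tag)
```

With perimeter integrity there would be no tag at all; the user would
trust the item simply because the datastore returned it.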

On access to information, the usual categories apply:

- Open access: Anyone can get the information.

- Access because of what you are: Limited to "same network",
  "physically present" or "resolvable DNS name"

- Access because of who you are (in theory): username/password,
  certificates, ...
  These are then backed up by a layer specifying what the identity you
  have proven yourself to be has access to.

- Access because of what you have: hardware tokens, smartcards,
  certificates, capability keys, ...
  In this case, access is given to all who can present that credential,
  without caring about their identity.

The most common approaches are identity-based and open access.
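
The four access categories can be sketched as four checks on an
incoming request. The Request fields, the function names and the use of
plain-text passwords are all assumptions made to keep the example
short; "what you are" is reduced to the "same network" case.

```python
import hmac
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class Request:
    """What a (hypothetical) site knows about an incoming request."""
    source_network: str = ""
    username: Optional[str] = None
    password: Optional[str] = None
    credential: Optional[str] = None


def open_access(req: Request) -> bool:
    # Open access: anyone can get the information.
    return True


def access_by_what_you_are(req: Request, allowed_networks: Set[str]) -> bool:
    # Here "what you are" is reduced to "same network".
    return req.source_network in allowed_networks


def access_by_who_you_are(req: Request, passwords: Dict[str, str],
                          permitted_users: Set[str]) -> bool:
    # Step 1: prove the identity. Plain-text passwords keep the sketch
    # short; a real system would not store them this way.
    if req.username is None or req.password is None:
        return False
    expected = passwords.get(req.username)
    if expected is None:
        return False
    proven = hmac.compare_digest(expected.encode(), req.password.encode())
    # Step 2: check what the proven identity has access to.
    return proven and req.username in permitted_users


def access_by_what_you_have(req: Request,
                            valid_credentials: Set[str]) -> bool:
    # Anyone presenting a valid credential gets in; identity is ignored.
    return req.credential in valid_credentials
```

The identity-based check is the only one with two steps: proving who
you are, then looking up what that identity may access.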


2.5 Update models

Two words about update models:



- Read-only datastores have no standard means of changing the
  information in them. Changes are usually made through some interface
  other than the standard one.

- Read-mostly datastores are designed based on a theory that reads will
  greatly outnumber updates; this may, for instance, be reflected in
  relatively slow consistency-updating protocols.

- Read-write datastores assume that the updates and the read operations
  are of the same order of magnitude.


3. Classification of some real systems


3.1 The Domain Name System

The DNS is a global lookup datastore with loose, converging consistency
and query capability only.

It is either strictly read-only or read-mostly (with Dynamic DNS), has
an open access model, and mainly perimeter integrity (some would say
hopeful integrity). DNSSEC aims to give it item integrity.

If one opens up the box and looks at the relationship between primary
and secondary nameservers, that can be seen as a limited form of notify
capability, but this is not available to end-users of the total system.
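
A classification like the one above can be written down as a small
record with one field per dimension from section 2. The record type,
its field names and the summary method are assumptions of this sketch;
only the DNS values are taken from the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Classification:
    scope: str               # section 2.1: "global" or "local"
    finding: str             # section 2.2: "lookup" or "search"
    timing: str              # section 2.2: "query", "notify", ...
    consistency: str         # section 2.3
    integrity: str           # section 2.4, trust in information
    access: str              # section 2.4, access to information
    update: Tuple[str, ...]  # section 2.5; more than one may apply

    def summary(self) -> str:
        return (f"{self.scope} {self.finding} datastore with "
                f"{self.consistency} consistency and "
                f"{self.timing} capability")


# Values transcribed from the description of the DNS in this section.
DNS = Classification(
    scope="global", finding="lookup", timing="query",
    consistency="loose, converging", integrity="perimeter",
    access="open", update=("read-only", "read-mostly"),
)
```

Filling in the same record for two systems makes it easy to see along
which dimensions they differ.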


3.2 The (imagined) X.500 Global Directory

X.500 was intended to be a global search datastore with loose,
converging consistency.

It was intended to be read-mostly, perimeter secure and query-capable.


3.3 The Global BGP Routing Information Database

The Global or top-level BGP routing information database is a global
read-write datastore with loose, converging subset consistency (not all
routes are carried everywhere) and very limited integrity control,
mostly intended to be perimeter integrity based on "access control
based on what you are".


3.4 The NetNews system

NetNews is a global read-write datastore with loose (non-converging)
subset consistency (not all sites carry all articles, and article
retention times differ). Between sites it offers subscription
capability; to users it offers both search and lookup functionality.


3.5 SNMP MIBs

An SNMP agent can be thought of as a local, centralized datastore
offering lookup functionality.

With SNMPv3, it offers all kinds of access models, but "access because
of what you have" seems to be the most popular.


3.6 The MBONE

MBONE can be thought of as a highly transient, read-write datastore
with subscription capability.





4. Security Considerations

Security is a very relevant question when considering information
access systems.

Some issues to consider are:

- Controlled access to information

- Controlled rights to update information

- Protection of the information path from provider to consumer

- With personal information, privacy issues

- Interactions between multiple ways to access the same information


5. Character set considerations



6. Acknowledgements


7. Author's Address

Harald Tveit Alvestrand
EDB Maxware
Pirsenteret
N-7462 TRONDHEIM
NORWAY

EMail: Harald.Alvestrand@maxware.no

Phone: +47 73 54 57 97
