IETF media feature registration WG                        Graham Klyne
Internet draft                               Content Technologies Ltd.
                                                            5 May 1998
                                              Expires: 5 November 1998


             An algebra for describing media feature sets
              <draft-ietf-conneg-feature-algebra-01.txt>

Status of this memo

  This document is an Internet-Draft.  Internet-Drafts are working
  documents of the Internet Engineering Task Force (IETF), its areas,
  and its working groups.  Note that other groups may also distribute
  working documents as Internet-Drafts.

  Internet-Drafts are draft documents valid for a maximum of six
  months and may be updated, replaced, or obsoleted by other
  documents at any time.  It is inappropriate to use Internet-Drafts
  as reference material or to cite them other than as ``work in
  progress''.

  To view the entire list of current Internet-Drafts, please check
  the "1id-abstracts.txt" listing contained in the Internet-Drafts
  Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
  (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au
  (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu
  (US West Coast).

  Copyright (C) 1998, The Internet Society

Abstract

  A number of Internet application protocols have a need to provide
  content negotiation for the resources with which they interact [1].
  A framework for such negotiation is described in [2].  Part of this
  framework is a way to describe the range of media features which
  can be handled by the sender, recipient or document transmission
  format of a message.  A format for a vocabulary of individual media
  features and procedures for registering media features are
  presented in [3].

  This document describes an algebra which can be used to define
  feature sets which are formed from combinations and relations
  involving individual media features.  Such feature sets are used to
  describe the media feature handling capabilities of message
  senders, recipients and file formats.









Klyne                                                         [Page 1]


Internet draft                                              5 May 1998
An algebra for describing media feature sets


Table of contents

1. Introduction.............................................2
  1.1 Structure of this document ...........................3
  1.2 Discussion of this document ..........................4
  1.3 Amendment history ....................................4
  1.4 Unfinished business ..................................4
2. Terminology and definitions..............................4
3. Media feature values.....................................5
  3.1 Complexity of feature algebra ........................5
  3.2 Sufficiency of simple types ..........................6
     3.2.1 Unstructured data types..........................6
     3.2.2 Cartesian product................................6
     3.2.3 Discriminated union..............................7
     3.2.4 Array............................................7
     3.2.5 Powerset.........................................8
     3.2.6 Sequence.........................................8
4. Feature set predicates...................................8
  4.1 An algebra for data file format selection ............9
     4.1.1 Describing file format features..................10
       4.1.1.1 Feature ranges                               10
       4.1.1.2 Feature combinations                         11
          (A) Additional predicates                         11
          (B) Meta-features as groupings of other features  12
     4.1.2 Content, sender and recipient capabilities.......12
  4.2 Conclusion and proposal ..............................12
5. Indicating preferences...................................13
  5.1 Combining preferences ................................13
  5.2 Representing preferences .............................14
6. Feature set representation...............................15
  6.1 Text string representation ...........................16
  6.2 ASN.1 representation .................................17
7. Security considerations..................................18
8. Copyright................................................19
9. Acknowledgements.........................................19
10. References..............................................19
11. Author's address........................................21



1. Introduction

  A number of Internet application protocols have a need to provide
  content negotiation for the resources with which they interact [1].
  A framework for such negotiation is described in [2].  A part of
  this framework is a way to describe the range of media features
  which can be handled by the sender, recipient or document
  transmission format of a message.

  Descriptions of media feature capabilities need to be based upon
  some underlying vocabulary of individual media features. A format
  for such a vocabulary and procedures for registering media features
  are presented in [3].

  This document defines an algebra which can be used to describe
  feature sets which are formed from combinations and relations
  involving individual media features.  Such feature sets are used to
  describe the media handling capabilities of message senders,
  recipients and file formats.

  The feature set algebra is built around the principle of using
  feature set predicates as mathematical relations which define
  constraints on feature handling capabilities.  The idea is that the
  same form of feature set expression can be used to describe sender,
  receiver and file format capabilities.  This has been loosely
  modelled on the way that the Prolog programming language uses Horn
  Clauses to describe a set of result values.

  In developing the algebra, examples are given using notation drawn
  from the C and Prolog programming languages.  A syntax for
  expressing feature predicates is suggested, based on LDAP search
  filters.

1.1 Structure of this document

  The main part of this draft addresses the following areas:

  Section 2 introduces and references some terms which are used with
  special meaning.

  Section 3 discusses constraints on the data types allowed for
  individual media feature values.

  Section 4 introduces and describes the algebra used to construct
  feature set descriptions with expressions containing media
  features.  The first part of this section contains a development of
  the ideas, and the second part contains the conclusions and
  proposed algebra.

  Section 5 introduces and describes extensions to the algebra for
  indicating preferences between different feature sets.

  Section 6 contains a description of recommended representations for
  describing feature sets based on the previously-described algebra.

1.2 Discussion of this document

  Discussion of this document should take place on the content
  negotiation and media feature registration mailing list hosted by
  the Internet Mail Consortium (IMC).

  Please send comments regarding this document to:

      ietf-medfree@imc.org

  To subscribe to this list, send a message with the body 'subscribe'
  to "ietf-medfree-request@imc.org".

  To see what has gone on before you subscribed, please see the
  mailing list archive at:

      http://www.imc.org/ietf-medfree/

1.3 Amendment history

  00a       11-Mar-1998
            Document initially created.

  01a       05-May-1998
            Mainly-editorial revision of sections describing the
            feature types and algebra.  Added section on indicating
            preferences.  Added section describing feature predicate
            syntax.  Added to security considerations (based on fax
            negotiation scenarios draft).

1.4 Unfinished business

  .  Array values: are they needed? (section 3.2.4)

  .  Use of unknown data types for feature values (section 6)

  .  Is a test for presence of a feature required? (section 6)

  .  Should ASN.1 representation be pursued?  If so, should it be
     aligned with LDAP filter representation? (section 6.2)


2. Terminology and definitions

  Feature Collection
            is a collection of different media features and
            associated values.  This might be viewed as describing a
            specific rendering of a specific instance of a document
            or resource by a specific recipient.

  Feature Set
            is a set of zero, one or more feature collections.

  Feature set predicate
            A function of an arbitrary feature collection value which
            returns a Boolean result.  A TRUE result is taken to mean
            that the corresponding feature collection belongs to some
            set of media feature handling capabilities defined by the
            predicate.
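
  As an informal illustration of these three terms (not part of the
  proposal), a feature collection can be pictured as a mapping from
  feature names to values, a feature set as a group of such mappings,
  and a feature set predicate as a Boolean function of one mapping.
  The Python type names, the predicate 'small_colour_display' and the
  sample values are assumptions made for this sketch; the feature
  tags are those summarized in section 4.

```python
from typing import Callable, Dict, List, Union

# Assumed representation: feature values are Booleans, enumerated
# values (strings here) or numbers, as section 3 proposes.
FeatureValue = Union[bool, str, int, float]
FeatureCollection = Dict[str, FeatureValue]
FeatureSetPredicate = Callable[[FeatureCollection], bool]


def small_colour_display(fc: FeatureCollection) -> bool:
    """Hypothetical predicate: TRUE if the collection is in the set."""
    return fc["pix-x"] <= 640 and fc["pix-y"] <= 480 and fc["color"] <= 8


def select(pred: FeatureSetPredicate,
           candidates: List[FeatureCollection]) -> List[FeatureCollection]:
    """The feature set picked out of some candidates by a predicate."""
    return [fc for fc in candidates if pred(fc)]


candidates = [
    {"pix-x": 640, "pix-y": 480, "color": 8},
    {"pix-x": 1024, "pix-y": 768, "color": 8},
]
feature_set = select(small_colour_display, candidates)
```

  Only the first candidate satisfies the predicate, so the selected
  feature set contains that one feature collection.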

  Other terms used in this draft are defined in [2].


3. Media feature values

  This document assumes that individual media feature values are
  simple atomic values:

  .  Boolean values

  .  Enumerated values

  .  Numeric values

  More complex media feature values might be accommodated, but they
  would (a) be undesirable because they would complicate the algebra,
  and (b) are not necessary.

  These statements are justified in the following sub-sections.

3.1 Complexity of feature algebra

  Statement (a) above is justified as follows: predicates constructed
  as expressions containing media feature values must ultimately
  resolve to a logical combination of feature value tests.

  A full range of simple tests for all of the data types listed above
  can be performed based on just two fundamental operations: equality
  and less-than.  All other meaningful tests can be constructed as
  predicates incorporating these two basic tests.

  For example:
     ( a != b )  iff  !( a == b )
     ( a <= b )  iff  !( b < a )
     ( a > b  )  iff   ( b < a )
     ( a >= b )  iff  !( a < b )

  If additional (composite) data types are introduced, then
  additional operators must be introduced to test their component
  parts:  the addition of just one further comparison operator
  increases the number of such operators by 50%.
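
  The reduction to two fundamental operations can be checked
  mechanically.  The following Python sketch builds the four derived
  tests from equality and less-than alone and compares them with the
  language's built-in operators over a small sample.  The function
  names are assumptions made for illustration, and the identities
  rely on the values being totally ordered.

```python
def eq(a, b):
    return a == b


def lt(a, b):
    return a < b


# Derived tests, built only from eq and lt.
def ne(a, b):
    return not eq(a, b)


def le(a, b):
    return not lt(b, a)


def gt(a, b):
    return lt(b, a)


def ge(a, b):
    return not lt(a, b)


def derived_tests_agree(values):
    """Compare each derived test with the built-in operator."""
    values = list(values)
    return all(
        ne(a, b) == (a != b) and le(a, b) == (a <= b)
        and gt(a, b) == (a > b) and ge(a, b) == (a >= b)
        for a in values for b in values
    )


agree = derived_tests_agree(range(-3, 4))
```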

3.2 Sufficiency of simple types

  To justify statement (b), let us first review the range of
  composite data types that might reasonably be considered.

  In 1972, a paper "Notes on data structuring" by C. A. R. Hoare was
  published in the book "Structured Programming" [4].  This was an
  early formalization of data types used in programming languages,
  and its content has formed a sufficient basis for describing the
  data types in almost every programming language which has been
  developed.  This gives good grounds to believe that the type
  framework is also sufficient for media features.

  The data types covered by Hoare's paper are:

  .  Unstructured data types: (integer, real, enumeration, ordered
     enumeration, subranges).

  .  Cartesian product (e.g. C 'struct').

  .  Discriminated union (e.g. C 'union').

  .  Array.

  .  Powerset (e.g. Pascal 'SET OF').

  .  Sequence (e.g. C string, Pascal 'FILE OF').

  To demonstrate sufficiency of simple types for media features we
  must show that the feature-set defining properties of these
  composite types can be captured using predicates on the simple
  types described previously.

3.2.1 Unstructured data types

  The unstructured data types noted correspond closely to, and can be
  represented by, the proposed simple value types for media features.

3.2.2 Cartesian product

  A cartesian product value (e.g. resolution=[x,y]) is easily
  captured as a collection of two or more separately named media
  features (e.g. x-resolution=x, y-resolution=y).

3.2.3 Discriminated union

  A discriminated union value is an either/or type of choice.  For
  example, a given workstation might be able to display 16K colours
  at 1024x768 resolution, OR 256 colours at 1280x1024 resolution.

  These possibilities are captured by a logical-OR of predicates:
     ( ( x-resolution <= 1024 ) &&
       ( y-resolution <= 768  ) &&
       ( colours <= 16384     ) ) ||
     ( ( x-resolution <= 1280 ) &&
       ( y-resolution <= 1024 ) &&
       ( colours <= 256       ) )
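
  The same logical-OR can be exercised directly.  The Python sketch
  below is illustrative only: the dictionary representation and the
  function name 'display_ok' are assumptions, while the feature names
  and limits are those of the expression above.

```python
def display_ok(fc):
    """Logical-OR of the two alternative display modes."""
    mode_a = (fc["x-resolution"] <= 1024 and
              fc["y-resolution"] <= 768 and
              fc["colours"] <= 16384)
    mode_b = (fc["x-resolution"] <= 1280 and
              fc["y-resolution"] <= 1024 and
              fc["colours"] <= 256)
    return mode_a or mode_b


low_res_many_colours = {
    "x-resolution": 1024, "y-resolution": 768, "colours": 16384}
high_res_few_colours = {
    "x-resolution": 1280, "y-resolution": 1024, "colours": 256}
high_res_many_colours = {
    "x-resolution": 1280, "y-resolution": 1024, "colours": 16384}
```

  The first two collections each satisfy one branch of the OR.  The
  third satisfies neither branch, which is the behaviour that treating
  the features independently would fail to capture.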

3.2.4 Array

  An array represents a mapping from one data type to another.  For
  example, the availability of pens in a pen plotter might be
  represented by an array which maps a pen number to a colour.

  If the array index which forms the basis for defining a feature set
  is assumed to be a constant, then each member can be designated by
  a feature name which incorporates the index value.  For example:
  Pen-1=black, pen-2=red, etc.

  Another example where an array might describe a media feature is a
  colour palette:  an array is used to associate a colour value (in
  terms of RGB or some other colour model) with a colour index value.
  In this case it is possible to envisage a requirement for a
  particular colour to be loaded in the palette without any knowledge
  of the index which maps to it.

  In this case, the colour might be treated as a named Boolean
  attribute:  if TRUE then that colour is deemed to be available in
  the palette.

  Feature selection based on a variable array index is more
  difficult, but it is believed that this is not required for media
  selection.

  [[I cannot think of any example of feature selection which involves
  a variable index into an array.  If such a feature is presented, an
  array type could be added to the set of allowable media feature
  types, and an array selection operator added to the algebra.]]

3.2.5 Powerset

  A powerset is a collection of zero, one or more values from some
  base set of values.  A colour palette may be viewed as a powerset
  of colour values, or the fonts available in a printer as a powerset
  of all available fonts.

  A powerset is very easily represented by a separate Boolean-valued
  feature for each member of the base set.  The value TRUE indicates
  that the corresponding value is a member of the powerset value.

3.2.6 Sequence

  A sequence is a list of values from some base set of values, which
  are accessed sequentially.

  A sequence can be modelled by an array if one assumes integer index
  values starting at (say) 1 and incrementing by 1 for each
  successive element of the sequence.

  Thus, the considerations described above relating to array values
  can be considered as also applying (in part) to sequence values.
  That is, if arrays are deemed to be adequately handled, then
  sequence values too can be handled.
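
  The flattening of composite values described in sections 3.2.2,
  3.2.4 and 3.2.5 can be sketched as follows.  This is a minimal
  illustration in Python, not a proposed encoding: the helper names
  and the 'font-' naming pattern are hypothetical, while the 'pen-'
  and resolution names follow the examples given earlier.

```python
def flatten_resolution(x, y):
    # Cartesian product: one separately named feature per component.
    return {"x-resolution": x, "y-resolution": y}


def flatten_pens(pens):
    # Array with constant indices: the index becomes part of the name.
    return {f"pen-{i}": colour for i, colour in enumerate(pens, start=1)}


def flatten_fonts(available, base_set):
    # Powerset: one Boolean-valued feature per member of the base set.
    return {f"font-{name}": (name in available) for name in base_set}


collection = {}
collection.update(flatten_resolution(300, 150))
collection.update(flatten_pens(["black", "red"]))
collection.update(flatten_fonts({"courier"}, ["courier", "times"]))
```

  Every value in the resulting feature collection is a Boolean, an
  enumerated value or a number, so the simple value types remain
  sufficient.  A sequence would be flattened in the same way as the
  pen array, using consecutive integer indices.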


4. Feature set predicates

  A model for data file selection is proposed, based on relational
  set definition and subset selection, using elements of the Prolog
  programming language [5] as a descriptive notation for this
  purpose.

       NOTE: The use of Prolog as a syntax for feature
       description is NOT being proposed;  rather, the Prolog-
       like notation is used to develop the semantics of an
       algebra.  Once the semantics have been developed, they
       can be mapped to some convenient syntax.

  For the purposes of developing this algebra, examples are drawn
  from the media features described in "Media Features for Display,
  Print, and Fax" [6], which in summary are:

     pix-x=n      (Image size, in pixels)
     pix-y=m

     res-x=n      (Image resolution, pixels per inch)
     res-y=m

     UA-media= screen|stationary|transparency|envelope|
               continuous-long
     papersize= na-letter|iso-A4|iso-B4|iso-A3|na-legal

     color=n      (Colour depth in bits)
     grey=n       (Grey scale depth in bits)

4.1 An algebra for data file format selection

  The basic idea proposed here is that a feature capability of the
  original content, sender, data file format or recipient is
  represented as a predicate applied to a collection of feature
  values.  Under universal quantification (i.e. selecting all
  possible values that satisfy it), a predicate indicates a range of
  possible combinations of feature values.

  This idea is inherent in Prolog clause notation, which is used in
  the example below to describe a predicate
  'acceptable_file_format(File)'.  This predicate yields a set of
  possible file transfer formats, using other predicates which
  indicate the file formats available to the sender and the feature
  capabilities of the file format, original content and recipient:

     acceptable_file_format(File) :-
       sender_available_file_format(File),
       match_format(File).

     match_format(File) :-
       pix_x(File,Px), content_pix_x(Px), recipient_pix_x(Px),
       pix_y(File,Py), content_pix_y(Py), recipient_pix_y(Py),
       res_x(File,Rx), content_res_x(Rx), recipient_res_x(Rx),
       res_y(File,Ry), content_res_y(Ry), recipient_res_y(Ry),
       colour(File,C), content_colour(C), recipient_colour(C),
       grey(File,G),   content_grey(G),  recipient_grey(G),
       ua_media(File,M),
          content_ua_media(M),
          recipient_ua_media(M),
       papersize(File,P),
          content_papersize(P),
          recipient_papersize(P).

  Essentially, this selects a set of file transfer formats from those
  available ('sender_available_file_format'), choosing any whose
  feature capabilities have a non-empty intersection with the feature
  capabilities of the original content and the recipient.

4.1.1 Describing file format features

  The above framework suggests a file format is described by a set of
  feature values.  As an abstract theory, this works fine but for
  practical use it has a couple of problems:

  (a)  description of features with a large number of possibilities

  (b)  describing features which are supported in specific
       combinations

  A typical case of (a) would be where a feature (e.g. size of image
  in pixels) can take any value from a range.  To present and test
  each value separately is not a practical proposition, even if it
  were possible.  (A guide here as to what constitutes a practical
  approach is to make a judgement about the feasibility of writing
  the corresponding Prolog program.)

  A typical case of (b) would be where different values for certain
  features can occur only in combinations (e.g. allowable
  combinations of resolution and colour depth on a given video
  display).  If the features are treated independently as suggested
  by the framework above, all possible combinations would be allowed,
  rather than the specifically allowable combinations.

4.1.1.1 Feature ranges

  The first issue can be addressed by considering the type of value
  which can represent the allowed features of a data file format.
  The features of a specific data file are represented as values from
  an enumeration (e.g. ua_media, papersize), or as numeric values
  (integer or rational).  The description of allowable file format
  features needs to represent all the allowable values.

  The Prolog clauses used above to describe file format features
  already allow for multiple enumerated values.  Each acts as a
  mathematical relation to select a subset of the set of file values
  allowed by the preceding predicates.

  Section 3 of this document describes proposed media feature value
  types.

  For numeric feature values, a sequence of two numbers to represent
  a closed interval is suggested, where either value may be replaced
  by an empty list to indicate no limiting value.  Thus:

     [m,n]  => { x : m <= x <= n }
     [m,[]] => { x : m <= x }
     [[],n] => { x : x <= n }

  The following Prolog could be used to describe such range matching:

     % Standard Prolog writes less-than-or-equal as '=<'.
     feature_match(_,[[],[]]).
     feature_match(X,[L,[]]) :- number(L), L =< X.
     feature_match(X,[[],H]) :- number(H), X =< H.
     feature_match(X,[L,H])  :- number(L), number(H), L =< X, X =< H.
     feature_match(X,X).

  (This example stretches standard Prolog, which does not support
  non-integer numbers.  The final clause allows 'feature_match' to
  deal with equality matching for the normal enumerated value case.)
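
  For readers who prefer a procedural reading, the same range
  matching can be sketched in Python.  This is an illustration under
  stated assumptions rather than a proposed implementation: 'None'
  stands in for the empty list '[]' used above to mean "no limiting
  value", and range patterns are assumed to be applied only to
  numeric feature values.

```python
def feature_match(x, pattern):
    """Match a value against a [low, high] range or a single value.

    None plays the role of '[]' (no limiting value at that end).
    """
    if isinstance(pattern, (list, tuple)) and len(pattern) == 2:
        low, high = pattern
        return ((low is None or low <= x) and
                (high is None or x <= high))
    # Equality matching covers the normal enumerated value case.
    return x == pattern


in_range = feature_match(300, [72, 600])
below_range = feature_match(71.5, [72, None])
same_paper = feature_match("iso-A4", "iso-A4")
```

  Here 'in_range' and 'same_paper' are true and 'below_range' is
  false.  Python compares integers and non-integer numbers uniformly,
  so the closed interval works for rational values as well.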

4.1.1.2 Feature combinations

  Representing allowed combinations of features is trickier.  Two
  possible approaches might be considered:

  (a)  use additional predicates to impose relationships between
       features.

  (b)  allow meta-features which are groupings of other features.

(A) Additional predicates

  If x- and y- resolutions were to be constrained to square or semi-
  square aspect-ratios, the following predicates might be added to
  the feature set description:

     ( feature_match(Rx,Ry) ;
       feature_match(Rx,2*Ry) ;
       feature_match(2*Rx,Ry) ),
     feature_match(Rx,[72,600]),
     feature_match(Ry,[72,600])

  (where the last two constraints might be imposed by the 'res_x' and
  'res_y' predicates).

  Another example might be:

     ( ( feature_match(Px,640),  feature_match(Py,480) ) ;
       ( feature_match(Px,800),  feature_match(Py,600) ) ;
       ( feature_match(Px,1024), feature_match(Py,768) ) )

  (This is based on the predicates 'pix_x(File,Px)', 'pix_y(File,Py)',
  'res_x(File,Rx)' and 'res_y(File,Ry)' from the initial framework
  above.)

(B) Meta-features as groupings of other features

  Applying this to the above examples would replace:

     pix_x(File,Px),
     pix_y(File,Py),
     res_x(File,Rx),
     res_y(File,Ry),

  with the meta-features 'pix' and 'res':

     pix(File,[Px,Py]),
     res(File,[Rx,Ry])

  where:

     pix(File,[640, 480]).
     pix(File,[800, 600]).
     pix(File,[1024,768]).
     res(File,[Rx,Ry]) :-
       feature_match(Rx,[72,600]),
       feature_match(Ry,[72,600]),
       ( feature_match(Rx,Ry) ;
         feature_match(Rx,2*Ry) ;
         feature_match(2*Rx,Ry) ).

  On closer examination, these two options turn out to be pretty much
  the same thing:  a requirement to impose additional constraint
  predicates on a file feature set.  They differ only in where the
  predicates are applied.

  This all suggests that file format capabilities can be described by
  feature set predicates:  arbitrary logical expressions using AND,
  OR, NOT logical combining operators, and media feature value
  matching.
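
  A feature set predicate of this kind can be pictured as a small
  tree of combinators.  The following Python sketch is one possible
  reading, not a definitive implementation: the helper names
  ('match', 'all_of', 'any_of', 'negate') are hypothetical, and the
  decision that a collection lacking a feature fails any test on it
  is an assumption of the sketch.  The image sizes are those listed
  for the 'pix' meta-feature, and the resolution limits are the
  [72,600] ranges used above.

```python
def match(tag, op, value):
    """Basic media feature value test; op is '=', '<=' or '>='."""
    tests = {
        "=": lambda v: v == value,
        "<=": lambda v: v <= value,
        ">=": lambda v: v >= value,
    }
    test = tests[op]
    # Assumption: a collection that lacks the feature fails the test.
    return lambda fc: tag in fc and test(fc[tag])


def all_of(*preds):          # logical AND
    return lambda fc: all(p(fc) for p in preds)


def any_of(*preds):          # logical OR
    return lambda fc: any(p(fc) for p in preds)


def negate(pred):            # logical NOT
    return lambda fc: not pred(fc)


pix = any_of(
    all_of(match("pix-x", "=", 640), match("pix-y", "=", 480)),
    all_of(match("pix-x", "=", 800), match("pix-y", "=", 600)),
    all_of(match("pix-x", "=", 1024), match("pix-y", "=", 768)),
)
res = all_of(match("res-x", ">=", 72), match("res-x", "<=", 600),
             match("res-y", ">=", 72), match("res-y", "<=", 600))
file_format = all_of(pix, res)
```

  The aspect-ratio constraint between 'res-x' and 'res-y' is left out
  of this sketch because it relates two features to each other rather
  than testing one feature against a value.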

4.1.2 Content, sender and recipient capabilities

  It has already been suggested that these are represented as
  predicates on the feature set of a particular data file.

  Having also shown that these same predicates can represent
  constraints on feature combinations, we proceed directly to a
  proposal that everything is represented by predicates.

4.2 Conclusion and proposal

  Data file features, original content features, sender features and
  recipient features (and user features) can all be represented as
  predicates.

  A key insight, which points to this conclusion, is that a
  collection of feature values can be viewed as describing a specific
  document rendered by a specific recipient.  The capabilities that
  we wish to describe, be they sender, file format, recipient or
  other capabilities, are sets of such feature collections, with the
  potential to ultimately render using any of the feature value
  collections in the set.

  This raises a terminology problem, because the term "feature set"
  has been used to mean both a collection of specific feature values
  and a range of possible feature values.  For this reason, more
  restricted definitions of "feature collection" and "feature set"
  are given in the terminology section of this document.

  Original content, data files and recipients (and users) all embody
  the potential capability to deal with a "feature set".  One of the
  aims of content negotiation is to select an available data file
  format (availability being circumscribed by the original content
  and sender capabilities) whose feature set intersection with the
  recipient feature set is non-empty.  (The further issue of
  preference is deferred for later consideration.)

  The concept of a mathematical relation as a subset defined by a
  predicate can be used to define feature sets, using universal
  quantification (i.e. using the predicate to select from some
  notional universe of all possible feature collections).

  Thus, a common framework of predicates can be used to represent the
  feature capabilities of original content, data file formats,
  recipients and any other participating entity which may impose
  constraints on the usable feature sets.
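
  The selection described in this section can be made concrete with a
  deliberately tiny universe of feature collections.  In the Python
  sketch below the universe, the three participant predicates and the
  helper names are all hypothetical; a realistic universe would be
  notional rather than enumerated.  The point shown is only that each
  participant is a predicate and that the usable feature set is the
  intersection of the sets they define.

```python
from itertools import product

# A hypothetical universe of all possible feature collections.
UNIVERSE = [
    {"pix-x": x, "color": c}
    for x, c in product([640, 800, 1024], [1, 8, 24])
]


def feature_set(pred):
    """All collections in the universe that satisfy the predicate."""
    return [fc for fc in UNIVERSE if pred(fc)]


def compatible(*preds):
    """Collections allowed by every participant (set intersection)."""
    return feature_set(lambda fc: all(p(fc) for p in preds))


def content(fc):             # original content capabilities
    return fc["color"] <= 8


def file_format(fc):         # data file format capabilities
    return fc["pix-x"] in (640, 800)


def recipient(fc):           # recipient capabilities
    return fc["pix-x"] <= 800 and fc["color"] >= 8


usable = compatible(content, file_format, recipient)
negotiation_possible = len(usable) > 0
```

  In this example two feature collections survive (640 or 800 pixels
  wide, with 8-bit colour), so the intersection is non-empty and a
  transfer format can be chosen.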

  Within this framework, it is sufficient to represent individual
  feature values as enumerated values or numeric ranges.  The thesis
  in section 3 of this document, combined with a study of "Media
  Features for Display, Print, and Fax" [6], indicates that more
  complex media feature values can be handled by predicates.


5. Indicating preferences

5.1 Combining preferences

  The general problem of describing and combining preferences among
  feature sets is very much more complex than simply describing
  allowable feature sets.  For example, given two feature sets:
     {A1,B1}
     {A2,B2}

  where:
     A1 is preferred over A2
     B2 is preferred over B1

  which of the feature sets is preferred?  In the absence of
  additional information or assumptions, there is no generally
  satisfactory answer to this.

  The proposed resolution of this issue is simply to assert that
  preference information cannot be combined.  Applied to the above
  example, any preference information about A1 in relation to A2, or
  B1 in relation to B2 is not presumed to convey any information
  about preference of {A1,B1} in relation to {A2,B2}.  (This approach
  was selected as being the simplest among those considered, and
  because there is no clear need for anything more).

  In practical terms, this restricts the application of preference
  information to top-level predicate clauses.  A top-level clause
  completely defines an allowable feature set;  clauses combined by
  logical-AND operators cannot be top-level clauses.

5.2 Representing preferences

  A convenient way to represent preferences is by numeric "quality
  values", as used in HTTP "Accept" headers, etc. (see RFC 2068 [9],
  section 3.9).

  It has been suggested that numeric quality values, as used in some
  HTTP negotiations, are misleading and are really just a way of
  ranking options.  Attempts to perform arithmetic on quality values
  do seem to degenerate into meaningless juggling of numbers.

  Numeric quality values in the range 0 to 1 (as defined by RFC 2068
  [9], section 3.9) are used to rank feature sets according to
  preference.  Higher values are preferred over lower values, and
  equal values are presumed to be equally preferred.  Beyond this,
  the actual number used has no significance, and should not be used
  as a basis for any arithmetic operation.

  In the absence of any explicitly applied quality value, a value of
  "1" is assumed, suggesting an option which is equally or more
  preferred than any other.

  This approach can be represented in the Prolog-based framework of
  an earlier example as follows:

     match_format(File,Qvalue) :-
       match_format(File),
       Qvalue=1.

     match_format(File) :-
       pix(File,[1024,768]),
       res(File,[Rx,Ry]).

     match_format(File,Qvalue) :-
       pix(File,[800, 600]),
       res(File,[Rx,Ry]),
       Qvalue=0.9.

     match_format(File,Qvalue) :-
       pix(File,[640, 480]),
       res(File,[Rx,Ry]),
       Qvalue=0.8.

     res(File,[Rx,Ry]) :-
       feature_match(Rx,[72,600]),
       feature_match(Ry,[72,600]),
       ( feature_match(Rx,Ry) ;
         feature_match(Rx,2*Ry) ;
         feature_match(2*Rx,Ry) ).

  This example applies image preference ranking based solely on the
  size of the image, provided that the resolution constraints are
  satisfied.
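
  The ranking rule of this section can be sketched in a few lines of
  Python.  The function name 'rank' and the (name, qvalue) pair
  representation are assumptions of the sketch; 'None' marks an
  option with no explicitly applied quality value.  In keeping with
  the text above, quality values are only compared and are never
  used in arithmetic.

```python
def rank(candidates):
    """Order (name, qvalue) pairs from most to least preferred.

    A missing quality value (None) is treated as 1.
    """
    def q(item):
        return 1.0 if item[1] is None else item[1]

    # sorted() is stable, so equally preferred options keep the
    # order in which they were presented.
    ordered = sorted(candidates, key=q, reverse=True)
    return [name for name, _ in ordered]


options = [("640x480", 0.8), ("1024x768", None), ("800x600", 0.9)]
ranked = rank(options)
```

  With these inputs the option without a quality value is ranked
  first, followed by the options with values 0.9 and 0.8.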


6. Feature set representation

  The foregoing sections have described a framework and semantics for
  defining feature sets with predicates applied to feature
  collections.  This section proposes some concrete representations
  for these feature set predicates.

  Rather than invent an all-new notation, this proposal adapts a
  notation already defined for directory access [7,8].  Observe that
  a feature collection is similar to a directory entry, in that it
  consists of a collection of named values.  Further, the semantics
  of the mechanism for selecting feature collections from a feature
  set is in most respects identical to selection of directory entries
  from a directory.

  Differences between directory selection (per [7]) and feature set
  selection described previously are:

  .  Directory selection provides substring-, approximate- and
     extensible- matching for attribute values.  Directory selection
     may also be based on the presence of an attribute without regard
     to its value.

  .  Directory selection provides for matching rules which are
     dependent upon the declared data type of an attribute value.

  .  Feature selection provides for the association of a quality value
     with a top-level feature predicate as a way of ranking the
     selected value collections.

  The idea of substring matching does not seem to be relevant to
  feature set selection, and is excluded from these proposals.

  Extensible matching and matching rules dependent upon data types
  are facets of a problem not addressed by this memo, but they do not
  necessarily affect the feature selection syntax.  An aspect which
  might have a bearing on the syntax would be a requirement to
  specify a matching rule explicitly as part of a selection
  expression.

  Testing for the presence of a feature may be useful in some
  circumstances, but does not sit comfortably within the semantic
  framework.  Feature sets are described by universal quantification
  over predicates, and the absence of reference to a given feature
  means the set is not constrained by that feature.  Given this, it
  is difficult to define what might be meant by the "presence" of a
  feature, so this option is not included in these proposals.

6.1 Text string representation

  The text representation of a feature set is closely based on RFC
  2254 "The String Representation of LDAP Search Filters" [8],
  excluding those elements not relevant to feature set selection
  (discussed above), and adding options to associate quality values
  with top-level predicates.

  The format of a feature predicate is defined by the production for
  "filter" in the following, using the syntax notation of [10]:

     filter     = "(" filtercomp [ ";" "q=" qvalue ] ")"
     qvalue     = ( "0" [ "." 0*3DIGIT ] )
                / ( "1" [ "." 0*3("0") ] )
     filtercomp = and / or / not / item
     and        = "&" filterlist
     or         = "|" filterlist
     not        = "!" filter
     filterlist = 1*filter
     item       = simple
     simple     = attr filtertype value
     filtertype = equal / greater / less
     equal      = "="
     approx     = "~="
     greater    = ">="
     less       = "<="
     attr       = <Feature tag, as defined in [3]>
     value      = <Feature value, per the named feature tag>
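
  As an illustration only, the grammar above can be exercised with a
  small recursive-descent parser.  The following Python sketch is not
  part of this proposal.  The function names, the tuple form of the
  parse tree, and the lexical forms assumed for "attr" and "value"
  (which the grammar leaves to [3]) are all assumptions made for the
  example, as are the feature tags used in the usage note.

```python
import re

# ASSUMPTION: the grammar does not fix the lexical form of "attr" or
# "value".  Here a feature tag is any run of characters other than
# ( ) = < > ; and a feature value is any run other than ( ) ;
ITEM = re.compile(r"([^()=<>;]+)(>=|<=|=)([^();]+)")
# qvalue = ( "0" [ "." 0*3DIGIT ] ) / ( "1" [ "." 0*3("0") ] )
QVALUE = re.compile(r";q=(0(?:\.[0-9]{0,3})?|1(?:\.0{0,3})?)")


def parse_filter(text, pos=0):
    """Parse one "filter" at text[pos]; return (node, next_pos).

    A node is a tuple (kind, payload, q): kind is "and", "or", "not"
    or "item", and q is the attached quality value or None.
    """
    if text[pos:pos + 1] != "(":
        raise ValueError(f"expected '(' at offset {pos}")
    pos += 1
    op = text[pos:pos + 1]
    if op in ("&", "|"):
        pos += 1
        children = []
        while text[pos:pos + 1] == "(":      # filterlist = 1*filter
            child, pos = parse_filter(text, pos)
            children.append(child)
        if not children:
            raise ValueError(f"empty filter list at offset {pos}")
        kind, payload = ("and" if op == "&" else "or"), children
    elif op == "!":
        payload, pos = parse_filter(text, pos + 1)
        kind = "not"
    else:
        match = ITEM.match(text, pos)        # simple = attr filtertype value
        if not match:
            raise ValueError(f"expected item at offset {pos}")
        kind, payload, pos = "item", match.groups(), match.end()
    q = None
    match = QVALUE.match(text, pos)          # optional ";q=" qvalue
    if match:
        q, pos = float(match.group(1)), match.end()
    if text[pos:pos + 1] != ")":
        raise ValueError(f"expected ')' at offset {pos}")
    return (kind, payload, q), pos + 1


def parse(text):
    """Parse a complete feature predicate string."""
    node, pos = parse_filter(text)
    if pos != len(text):
        raise ValueError("trailing characters after filter")
    return node
```

  For example, parse("(|(&(pix-x<=640)(pix-y<=480);q=0.9)(color=binary;q=0.4))")
  yields an "or" node whose first child is an "and" node carrying the
  quality value 0.9.  The sketch accepts no white space, since the
  grammar as written does not provide for any.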

  As described, the syntax permits a quality value to be attached to
  any "filter" value in the predicate (not just top-level values).
  But it should be noted that quality values which are enclosed by
  "not" or "and" constructs are not visible to the enclosing context.

  If a given feature collection is matched by more than one "filter"
  in an "or" clause, the highest associated quality value is applied.
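
  The two rules above can be sketched in Python as follows.  This is
  an illustration under stated assumptions, not a definitive
  evaluation procedure.  The tree form (kind, payload, q), the use of
  a dictionary for a feature collection, the treatment of a missing
  feature as a failed match, and the rule that a quality value
  attached directly to an "or" filter takes precedence over those of
  its alternatives are all assumptions not fixed by this memo.  The
  feature tags in the usage note are likewise illustrative.

```python
import operator

COMPARE = {"=": operator.eq, "<=": operator.le, ">=": operator.ge}


def evaluate(node, collection):
    """Return (matched, q) for a filter tree and a feature collection.

    q is None when no quality value is visible at this level.
    """
    kind, payload, q = node
    if kind == "item":
        tag, op, value = payload
        # ASSUMPTION: a collection lacking the tag does not match.
        matched = tag in collection and COMPARE[op](collection[tag], value)
        return matched, (q if matched else None)
    if kind == "not":
        # Quality values inside "not" are hidden; only its own q shows.
        matched = not evaluate(payload, collection)[0]
        return matched, (q if matched else None)
    results = [evaluate(child, collection) for child in payload]
    if kind == "and":
        # Quality values inside "and" are hidden; only its own q shows.
        matched = all(m for m, _ in results)
        return matched, (q if matched else None)
    # kind == "or": the highest quality value among matching filters.
    if not any(m for m, _ in results):
        return False, None
    if q is None:  # ASSUMPTION: an explicit q on the "or" itself wins.
        inner = [cq for m, cq in results if m and cq is not None]
        q = max(inner) if inner else None
    return True, q
```

  With the predicate

     prefs = ("or", [("and", [("item", ("pix-x", "<=", 640), None),
                              ("item", ("pix-y", "<=", 480), None)], 0.9),
                     ("item", ("pix-x", "<=", 1024), 0.5)], None)

  a collection {"pix-x": 640, "pix-y": 480} is matched by both
  alternatives and receives the higher quality value, while
  {"pix-x": 800, "pix-y": 600} is matched only by the second.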

       NOTE

       The flexible approach to allowable quality values in this
       syntax has been adopted for two reasons:  (a) to make it
       easy to combine separately constructed feature
       predicates, and (b) to allow that the mechanism used for
       quality values might, in future, be generalized to an
       extensible tagging mechanism (for example, to incorporate
       a conceivable requirement to explicitly specify a
       matching rule).

6.2 ASN.1 representation

  Should it be required, the LDAP search filter model provides the
  basis for an ASN.1 representation of a feature predicate.

  The following ASN.1 is adapted from RFC 2251 "Lightweight Directory
  Access Protocol (v3)" [7] (also contained in RFC 2254 "The String
  Representation of LDAP Search Filters" [8]) to mirror the
  adaptation of the string representation presented above.

  [[The following ASN.1 fragment does not include provision for
  quality value (and possibly other parameter values) to be attached
  to a filter value.  Also, if using an ASN.1-derived representation
  it would seem more appropriate to use an ISO object identifier for
  the feature tag, and an appropriate ASN.1 type for the feature
  value.  Such changes would remove any semblance of compatibility
  with LDAP, but that may not matter.]]

     Filter ::= CHOICE {
             and                [0] SET OF Filter,
             or                 [1] SET OF Filter,
             not                [2] Filter,
             equalityMatch      [3] AttributeValueAssertion,
             greaterOrEqual     [5] AttributeValueAssertion,
             lessOrEqual        [6] AttributeValueAssertion
     }

     AttributeValueAssertion ::= SEQUENCE {
             featureTag         OCTET STRING,
             featureValue       OCTET STRING
     }
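
  To show what such a representation might look like on the wire,
  the following Python sketch encodes a filter using the Basic
  Encoding Rules (BER).  It is an illustration only.  It assumes that
  the fragment above is compiled with implicit tagging, as in the
  LDAP module from which it is adapted (a tagged CHOICE such as "not"
  is then necessarily explicit), and that feature tags and values are
  carried as ASCII text.  The tuple form (kind, payload) and the
  function names are hypothetical.  In keeping with the ASN.1
  fragment, no quality value is encoded.

```python
OCTET_STRING = 0x04
CONTEXT_CONSTRUCTED = 0xA0          # context-specific class, constructed
ITEM_TAGS = {"=": 3, ">=": 5, "<=": 6}


def ber_length(n):
    """Encode a definite length in short or long form."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def tlv(tag, content):
    """Build one tag-length-value element (tag numbers below 31 only)."""
    return bytes([tag]) + ber_length(len(content)) + content


def encode_filter(node):
    """Encode a (kind, payload) filter tree as a BER Filter value."""
    kind, payload = node
    if kind == "and":               # and [0] SET OF Filter
        body = b"".join(encode_filter(child) for child in payload)
        return tlv(CONTEXT_CONSTRUCTED | 0, body)
    if kind == "or":                # or [1] SET OF Filter
        body = b"".join(encode_filter(child) for child in payload)
        return tlv(CONTEXT_CONSTRUCTED | 1, body)
    if kind == "not":               # not [2] Filter
        return tlv(CONTEXT_CONSTRUCTED | 2, encode_filter(payload))
    # ASSUMPTION: tag and value are ASCII text in OCTET STRINGs.
    feature_tag, op, value = payload
    body = (tlv(OCTET_STRING, feature_tag.encode("ascii"))
            + tlv(OCTET_STRING, value.encode("ascii")))
    return tlv(CONTEXT_CONSTRUCTED | ITEM_TAGS[op], body)
```

  For example, encode_filter(("item", ("pix-x", "<=", "640")))
  produces a lessOrEqual element (tag octet A6) containing the two
  OCTET STRING values "pix-x" and "640".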


7. Security considerations

  Some security considerations for content negotiation are raised in
  [1,2,3].

  The following are primary security concerns for capability
  identification mechanisms:

  .  Unintentional disclosure of private information through the
     announcement of capabilities or user preferences.

  .  Disruption to system operation caused by accidental or malicious
     provision of incorrect capability information.

  .  A capability identification mechanism might be used to probe a
     network (e.g. by identifying specific hosts used, and exploiting
     their known weaknesses).

  The most contentious security concerns are raised by mechanisms
  which automatically send capability identification data in response
  to a query from some unknown system.  Use of directory services
  (based on LDAP [7], etc.) seems to be less problematic because
  proper authentication mechanisms are available.

  Mechanisms which provide capability information when sending a
  message are less contentious, presumably because some intention can
  be inferred that the person whose details are disclosed wishes to
  communicate with the recipient of those details.  This does not,
  however, solve problems of spoofed supply of incorrect capability
  information.

  The use of format converting gateways may prove problematic because
  such systems would tend to defeat any message integrity and
  authenticity checking mechanisms that are employed.


8. Copyright

  Copyright (C) The Internet Society 1998.  All Rights Reserved.

  This document and translations of it may be copied and furnished to
  others, and derivative works that comment on or otherwise explain
  it or assist in its implementation may be prepared, copied,
  published and distributed, in whole or in part, without restriction
  of any kind, provided that the above copyright notice and this
  paragraph are included on all such copies and derivative works.
  However, this document itself may not be modified in any way, such
  as by removing the copyright notice or references to the Internet
  Society or other Internet organizations, except as needed for the
  purpose of developing Internet standards in which case the
  procedures for copyrights defined in the Internet Standards process
  must be followed, or as required to translate it into languages
  other than English.

  The limited permissions granted above are perpetual and will not be
  revoked by the Internet Society or its successors or assigns.

  This document and the information contained herein is provided on
  an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
  ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR
  IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
  THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
  WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


9. Acknowledgements

  My thanks to Larry Masinter for demonstrating to me the breadth of
  the media feature issue, and encouraging me to air my early ideas.

  Early discussions of ideas on the IETF-HTTP and IETF-FAX discussion
  lists led to useful inputs also from Koen Holtman, Ted Hardie and
  Dan Wing.

  The debate later moved to the IETF conneg WG mailing list, where Al
  Gilman was particularly helpful in refining the feature set
  algebra.  Several ideas for indicating preferences were suggested
  by Larry Masinter.


10. References

[1]  "Scenarios for the Delivery of Negotiated Content"
     T. Hardie, NASA Network Information Center
     Internet draft: <draft-ietf-http-negotiate-scenario-02.txt>
     Work in progress, November 1997.

[2]  "Requirements for protocol-independent content negotiation"
     G. Klyne, Integralis Ltd.
     Internet draft: <draft-ietf-conneg-requirements-00.txt>
     Work in progress, March 1998.

[3]  "Content feature tag registration procedures"
     Koen Holtman, TUE
     Andrew Mutz, Hewlett-Packard
     Ted Hardie, NASA
     Internet draft: <draft-ietf-http-feature-reg-03.txt>
     Work in progress, November 1997.

[4]  "Notes on data structuring"
     C. A. R. Hoare,
     in "Structured Programming"
     Academic Press, APIC Studies in Data Processing No. 8
     ISBN 0-12-200550-3 / 0-12-200556-2
     1972.

[5]  "Programming in Prolog" (2nd edition)
     W. F. Clocksin and C. S. Mellish,
     Springer Verlag
     ISBN 3-540-15011-0 / 0-387-15011-0
     1984.

[6]  "Media Features for Display, Print, and Fax"
     Larry Masinter, Xerox PARC
     Koen Holtman, TUE
     Andrew Mutz, Hewlett-Packard
     Dan Wing, Cisco Systems
     Internet draft: <draft-masinter-media-features-02.txt>
     Work in progress, January 1998.

[7]  RFC 2251, "Lightweight Directory Access Protocol (v3)"
     M. Wahl, Critical Angle Inc.
     T. Howes, Netscape Communications Corp.
     S. Kille, Isode Limited
     December 1997.

[8]  RFC 2254, "The String Representation of LDAP Search Filters"
     T. Howes, Netscape Communications Corp.
     December 1997.

[9]  RFC 2068, "Hypertext Transfer Protocol -- HTTP/1.1"
     R. Fielding, UC Irvine
     J. Gettys,
     J. Mogul, DEC
     H. Frystyk,
     T. Berners-Lee, MIT/LCS
     January 1997.

[10] RFC 2234, "Augmented BNF for Syntax Specifications: ABNF"
     D. Crocker (editor), Internet Mail Consortium
     P. Overell, Demon Internet Ltd.
     November 1997.


11. Author's address

  Graham Klyne
  Content Technologies Ltd
  Forum 1
  Station Road
  Theale
  Reading, RG7 4RA
  United Kingdom

  Telephone: +44 118 930 1300

  Facsimile: +44 118 930 1301

  E-mail: GK@ACM.ORG