Internet Draft                                               A. Beck
                                                             M. Hofmann
   Expires: January 2002                            Lucent Technologies
   Document: draft-beck-opes-irml-01.txt
                                                          July 20, 2001
   Category: Informational


       IRML: A Rule Specification Language for Intermediary Services


Status of this Memo

   This document is an Internet-Draft and is in full conformance
   with all provisions of Section 10 of RFC2026 [1].

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
        http://www.ietf.org/ietf/1id-abstracts.txt
   The list of Internet-Draft Shadow Directories can be accessed at
        http://www.ietf.org/shadow.html.


Abstract

   OPES intermediary services are a new class of applications running
   on network edge intermediary devices like caching proxies or
   dedicated service execution servers. They are described in [2] and
   [3]. These intermediary services can be executed on behalf of
   clients or content providers. In order to control the execution of
   intermediary services, both parties provide service-specific rules
   which trigger services if rule conditions are met for incoming or
   outgoing messages.

   The Intermediary Rule Markup Language (IRML) is an XML-based
   language that can be used to describe service-specific execution
   rules. It allows clients and content providers to specify when and
   how to execute OPES intermediary services.

Table of Contents

   1  Terminology
   2  Problem Description and Goals
   3  IRML Syntax and Grammar
   3.1  The "rulemodule" Element
   3.1.1 The "owner" Element
   3.1.2 The "name" Element
   3.1.3 The "id" Element
   3.2  The "protocol" Element
   3.3  Examples of the "owner", "name", "id", "protocol" Elements
   3.4  The "rule" Element
   3.4.1 The "property" Element
   3.4.2 The "action" Element
   3.4.3 Examples of the "rule", "property" and "action" Elements
   4  Order of Service Execution
   5  Security Considerations
   6  Acknowledgement
   7  References
   Authors' Addresses
   Appendix A - IRML DTD
   Appendix B - Rule Module Examples
   Full Copyright Statement

1  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [4].

   Other terminology used in this document is consistent with that
   defined and used in [2].

2  Problem Description and Goals

   The two parties that may wish to run OPES intermediary services as
   described in [2] and [3] are clients and content providers.

   Both parties must be able to express the conditions under which they
   wish to run a service. A content provider, for instance, may want to
   adapt its pages for users with small wireless devices. Web users may
   wish to have certain Web pages translated into a different language.

   These examples demonstrate the need for rules that tell the OPES
   intermediary device hosting these services when to run what service.
   Clients and content providers must provide their rules to the OPES
   intermediary or OPES admin server. They may also authorize the owner
   of an OPES intermediary (e.g. an ISP) to set up rules on the OPES
   intermediary on their behalf.

   A rule engine on the OPES intermediary device processes the rules
   that apply to incoming requests and outgoing responses in order to
   determine which service modules need to be executed and in what
   order.

   This document defines the Intermediary Rule Markup Language (IRML)
   in an attempt to create a standard rule format that will be
   supported by vendors of OPES intermediary devices and by third
   parties offering OPES service applications.

   The Intermediary Rule Markup Language defined in this document
   serves as a standard representation of rules for OPES intermediary
   services. Since IRML is human-readable, it also facilitates the
   exchange and discussion of these kinds of rules between and within
   groups of rule authors.

   It is beyond the scope of this document to define a secure and
   reliable mechanism for transferring rule files to intermediary
   devices. Likewise, this document does not describe the specifics of
   how to (efficiently) process rules on the intermediary device.


3  IRML Syntax and Grammar



   IRML is an application of XML. Thus, its syntax is governed by the
   rules of the XML syntax as defined in [5], and its grammar is
   specified by a DTD, or Document Type Definition. The IRML DTD can be
   found in Appendix A.

   Valid and well-formed IRML documents consist of one or more rule
   modules. Each rule module contains a set of rules and information
   about the rule module author. Rule modules can be authored by a
   content provider or by a client (although usually indirectly through
   an access provider). The rules contained in rule modules each
   consist of a number of conditions and a number of consequent actions
   that must be executed if the conditions are met. The conditions
   within a rule refer to message properties in the request or response
   message of a given Web transaction. They are met if the property
   value matches the pattern(s) specified in the rule condition(s).


3.1 The "rulemodule" Element


   The "rulemodule" element is the root element of every rule module.
   It MUST contain exactly one "owner" element, exactly one "protocol"
   element, and one or more "rule" elements, in that order (see also
   the IRML DTD in Appendix A).

3.1.1   The "owner" Element


   The "owner" element specifies the owner of the rule module. Each
   rule module MUST have exactly one owner. The "owner" element
   contains one "name" element and one "id" element.


   Attributes of "owner"

   Name         Values
   ----------------------------------------------------
   class        content provider|client


   The "class" attribute assigns a rule module owner to one of the two
   types of rule module authors: content providers and clients.


3.1.2   The "name" Element


   The "name" element contains a descriptive name for the rule module
   owner. This could be the company name for content providers and a
   customer login for clients. The name does not have to be unique
   among rule module owners.



3.1.3   The "id" Element


   The "id" element contains an identifier for the rule module owner.
   The identifier MUST be unique within a class of rule module
   providers. The "id" element determines whether a particular Web
   transaction is relevant to a rule module and thus, whether the
   contained rules have to be processed for this Web request/response.
   For example, a rule module provided by a content provider should
   only be processed for Web requests referring to Web resources owned
   by this particular content provider.

   Therefore, if the rule module owner is a content provider, the "id"
   element MUST contain the domain name(s) of the content provider. If
   a content provider owns more than one domain and the relevant rule
   module pertains to more than one of them, the "id" element MAY even
   contain more than one domain name separated by the "|" character
   (see "owner" example). The specified domain name(s) MAY also contain
   a port number. If no port number is specified, then the default port
   for the specified protocol is assumed, e.g. 80 for HTTP.
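
   As an illustration only, the following Python sketch shows one way
   an intermediary might decide whether a content provider rule module
   is relevant to a request. The function names and the DEFAULT_PORTS
   table are hypothetical and not part of IRML; only the "|" separator,
   the optional port number, and the HTTP default port 80 are taken
   from the text above. IPv6 address literals are not handled.

```python
# Hypothetical helpers; not defined by IRML.
DEFAULT_PORTS = {"http": 80}


def parse_owner_id(id_text, protocol="http"):
    """Split a content provider "id" value into (host, port) pairs."""
    default_port = DEFAULT_PORTS[protocol.lower()]
    entries = set()
    for item in id_text.split("|"):
        item = item.strip()
        if not item:
            continue
        # A port is present only if the entry contains a colon.
        host, sep, port = item.partition(":")
        entries.add((host.lower(), int(port) if sep else default_port))
    return entries


def module_applies(id_text, request_host, request_port, protocol="http"):
    """Return True if the request targets one of the owner's domains."""
    target = (request_host.lower(), request_port)
    return target in parse_owner_id(id_text, protocol)


print(module_applies("www.yahoo.com|dir.yahoo.com:8000",
                     "dir.yahoo.com", 8000))
```

   Comparing host names case-insensitively is a choice made for this
   sketch; the text above does not specify it.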

   If the rule module owner is a client, then a unique client
   identifier, e.g. a customer id, MUST be chosen in order to associate
   client rule modules with client requests. If an access provider
   assigns only static IP addresses to its customers, the "id" element
   can also contain the IP address of the module owner. Otherwise, the
   dynamic IP addresses of incoming client requests MUST be mapped to
   the unique client "id" element value in order to determine whether a
   specific rule module must be processed for a particular client
   request/server response.


3.2 The "protocol" Element


   The "protocol" element contains the acronym of the protocol to which
   the rule module pertains. Although most services operate on HTTP,
   IRML is not limited to HTTP messages. Any other message-based
   protocol that fits into the OPES framework can be used.


3.3 Examples of the "owner", "name", "id", "protocol" Elements


   <owner class="content provider">
     <name>Yahoo Inc.</name>
     <id>www.yahoo.com|dir.yahoo.com:8000</id>
   </owner>
   <protocol>http</protocol>

   <owner class="client">
     <name>abeck</name>
     <id>205.167.45.1</id>
   </owner>
   <protocol>http</protocol>


3.4 The "rule" Element


   The "rule" element contains one or more "property" and/or "action"
   elements.

   Attributes of "rule"

   Name                 Values
   ----------------------------
   processing-point     1|2|3|4


   The "processing-point" attribute specifies at which of the four
   points in figure 1 a rule must be processed by the rule engine on
   the intermediary device. The four common processing points of an
   OPES intermediary are further defined in [6]. Implementation
   architectures for other intermediary devices might define different
   or additional processing points.

   Figure 1 shows the typical HTTP data flow between a client, an OPES
   intermediary (in this case a caching proxy), and an origin server.
   The four processing points (1-4) represent locations in the round
   trip message flow where rules can be processed and service modules
   can be executed. Note that in a caching proxy the message flow may
   skip points 2 and 3 after point 1 if the requested object can be
   served from cache.

   +--------+       +-----------+       +--------+
   |        |<------|4         3|<------|        |
   | Client |       |  Caching  |       | Origin |
   |        |       |   Proxy   |       | Server |
   |        |------>|1         2|------>|        |
   +--------+       +-----------+       +--------+

   Figure 1: Rule Processing/Service Execution Points


   Depending on the service type, rules may be processed and services
   may be executed at any of the four points outlined in figure 1. A
   virus scanning service for instance could be executed at point 3 in
   figure 1 in order to scan all Web objects for viruses before they
   can be stored in the cache. A URL-based request filtering service on
   the other hand should be executed at point 1 and a language
   translation service should probably be executed at point 4.



3.4.1   The "property" Element


   The "property" element contains one or more nested "property" and/or
   "action" elements. "property" elements are conditions that, if met,
   lead to the execution of the service modules specified in the
   contained "action" elements. Nested "property" elements represent a
   hierarchical "AND" relationship. This means that an inner "property"
   condition is only evaluated, and can only be true, if the outer
   "property" condition is true, and so forth.

   Attributes of "property"

   Name                 Values                     Default
   -----------------------------------------------------------
   name                 CDATA
   type                 (message|system|service)   "message"
   matches              CDATA
   case-sensitive       (yes|no)                   "no"

   The "name" attribute specifies the name of the property that is to
   be matched. The "type" attribute specifies the kind of property the
   name refers to. By default, properties have the type "message",
   that is, the specified property name refers to a protocol-specific
   header name of the request or response message. For HTTP messages,
   for example, the list of protocol-specific header names is defined
   in [7]. IRML, however, is not limited to the message properties
   defined in protocol specifications. It also supports user-defined
   message properties (e.g. user-defined protocol headers).

   If the property "type" attribute is specified as "system", then the
   property name refers to system variables that are set by the OPES
   intermediary.

   For HTTP messages, IRML defines the following system variables:

   Property Name        Value
   --------------------------------------------------------------
   "request-line"       the first line of an HTTP request
   "response-line"      the first line of an HTTP response
   "request-host"       the host name of the origin server
   "request-path"       the relative path of the request URI


   In addition to these HTTP-specific system variables, IRML also
   defines the following general system variables:

   Property Name        Value
   --------------------------------------------------------------
   "system-date"        a timestamp using the Internet date/time
                        format as defined in [8]
   "client-ip"          the IP address of the user agent


   The "system-date" and "client-ip" variables MUST be supported by all
   OPES intermediaries. If the OPES intermediary supports HTTP, it MUST
   also support the above listed HTTP system properties.
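
   The following Python sketch illustrates how an intermediary might
   populate the request-side system variables for one HTTP request. It
   is not a normative algorithm. The function name is hypothetical, and
   several details are assumptions of this sketch: "request-host" is
   taken from an absolute request URI or else from the Host header,
   "request-path" excludes any query string, and an ISO 8601 timestamp
   is assumed to satisfy the format referenced in [8].

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit


def http_system_properties(request_line, headers, client_ip, now=None):
    """Build the "system" property values for one HTTP request."""
    method, target, version = request_line.split(" ", 2)
    parts = urlsplit(target)
    # A proxy may see an absolute URI; otherwise use the Host header
    # without its optional port (assumption of this sketch).
    host = parts.hostname or headers.get("Host", "").partition(":")[0]
    now = now or datetime.now(timezone.utc)
    return {
        "request-line": request_line,
        "request-host": host.lower(),
        "request-path": parts.path or "/",  # assumption: no query string
        "system-date": now.isoformat(),
        "client-ip": client_ip,
    }


props = http_system_properties(
    "GET http://www.lucent.com/index.html HTTP/1.1", {}, "205.167.45.1")
print(props["request-host"], props["request-path"])
```

   The resulting dictionary could serve as the lookup table for
   "property" elements whose "type" attribute is "system".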

   If the property "type" attribute is specified as "service", then the
   property name refers to service-specific environment variables that
   can be set and modified by OPES service modules. These can be used
   by OPES service modules to maintain state information beyond a
   particular session. If these service variables are referenced in
   IRML rule conditions, then OPES service modules can dynamically
   adapt the conditions that lead to the invocation of OPES services
   without altering the actual rule module.

   Service-specific variables can also be used for the communication
   between different OPES modules, e.g. if one service module sets a
   state variable that is subsequently read by another service module.

   The "matches" attribute specifies the pattern against which the
   property value MUST be matched by the rule engine on the
   intermediary device. The "matches" pattern MUST be a regular
   expression compliant with the regular expression syntax as defined
   in [9].

   If needed, the double-quote character (") MUST be represented in any
   attribute value as "&quot;" (as specified in [5]).

   The "case-sensitive" attribute specifies whether the specified
   pattern is matched case-sensitively. The default value for this
   attribute is "no", meaning that pattern matching is case
   insensitive unless otherwise specified.

   If a "rule" element contains an "action" element outside of a
   "property" element, then the specified action must be performed for
   all messages that pass through the specified processing point. A
   user profiling service, for example, may have to be triggered for
   all user requests.
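
   The evaluation described in this section can be sketched in Python
   as follows. This is a minimal illustration under stated assumptions,
   not a reference rule engine: the function names and the layout of
   the "values" dictionary are hypothetical, an absent property is
   assumed never to match, and Python's "re" module stands in for the
   regular expression syntax of [9], from which it differs in detail.

```python
import re
import xml.etree.ElementTree as ET


def property_matches(prop, values):
    """Evaluate a single "property" condition.

    values maps a property type ("message", "system" or "service") to
    a dict of property names and their current string values.
    """
    prop_type = prop.get("type", "message")          # default per DTD
    case_sensitive = prop.get("case-sensitive", "no") == "yes"
    value = values.get(prop_type, {}).get(prop.get("name"))
    if value is None:
        return False    # assumption: an absent property never matches
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.search(prop.get("matches"), value, flags) is not None


def collect_actions(element, values):
    """Return the action URIs triggered below a "rule" or "property"."""
    actions = []
    for child in element:                # children in document order
        if child.tag == "action":
            actions.append(child.text.strip())
        elif child.tag == "property" and property_matches(child, values):
            # Inner conditions are only reached if the outer one is met.
            actions.extend(collect_actions(child, values))
    return actions


rule = ET.fromstring("""<rule processing-point="4">
  <action>opes://logmaker.com/requestlog-v1.0</action>
  <property name="Content-Type" matches="text/html">
    <property name="Accept-Language" matches="^de|^fr|^it|^es">
      <action>opes://altavista.com/babelfish?mode=quick</action>
    </property>
  </property>
</rule>""")
values = {"message": {"Content-Type": "TEXT/HTML",
                      "Accept-Language": "fr"}}
print(collect_actions(rule, values))
```

   In this sketch the action placed directly inside the "rule" element
   is always returned, while the nested action is returned only when
   both enclosing conditions match.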


3.4.2   The "action" Element


   The "action" element identifies the OPES service module that is to
   be executed on the intermediary device or a dedicated service
   execution server. The "action" element does not, however, specify a
   specific instance of the OPES service module, e.g. a specific
   installation on a specific server. Instead, the OPES intermediary
   can resolve the identified OPES service to a specific instance at
   run-time in order to accommodate system or network conditions,
   e.g. the current system load on a particular remote callout server.

   The "action" element MUST contain an absolute URI that follows the
   URI syntax as defined in [10] and uniquely identifies an OPES
   service module including its version. The URI scheme to be used to
   identify OPES services is "opes". Note that although an OPES URI
   contains a hostname, it only serves as a unique identifier for a
   specific OPES service module.

   Any arguments to OPES service modules MAY be passed as part of the
   service module name using the standard "?"-encoding of attribute-
   value pairs used in HTTP [7].
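
   As a hedged illustration, the Python sketch below separates the
   identifier of an OPES service module from its arguments using the
   standard library's generic URI parser. The function name is
   hypothetical, and treating the part before the "?" as the service
   identifier is an assumption of this sketch.

```python
from urllib.parse import parse_qs, urlsplit


def parse_action_uri(uri):
    """Split an OPES service URI into its identifier and arguments."""
    parts = urlsplit(uri)
    if parts.scheme != "opes" or not parts.netloc:
        raise ValueError("not an absolute opes URI: %r" % uri)
    service_id = "opes://%s%s" % (parts.netloc, parts.path)
    # parse_qs returns a list per name because a name may be repeated.
    return service_id, parse_qs(parts.query)


print(parse_action_uri("opes://altavista.com/babelfish?mode=quick"))
```

   The host name in the result is used purely as part of the
   identifier; the sketch makes no attempt to contact it.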

   Each "action" element MUST contain exactly one OPES service URI. A
   "property" element, however, MAY contain several "action" elements.


3.4.3   Examples of the "rule", "property" and "action" Elements


   <rule processing-point="1">
     <!-- Log ALL user requests -->
     <action>opes://logmaker.com/requestlog-v1.0</action>
   </rule>

   <rule processing-point="4">
     <!-- Is the requested Web resource an HTML document? -->
     <property name="Content-Type" matches="text/html">
       <!-- Is the user's preferred language among the supported
            ones? -->
       <property name="Accept-Language" matches="^de|^fr|^it|^es">
         <!-- Invoke translation service module Babelfish -->
         <action>opes://altavista.com/babelfish?mode=quick</action>
       </property>
     </property>
   </rule>

   <rule processing-point="3">
     <!-- Is the requested Web resource an executable binary file? -->
     <property name="Content-Type" matches="application/">
       <!-- Invoke virus scanning service from McAffee -->
       <action>opes://mcaffee.com/scan?mode=remove</action>
     </property>
   </rule>


4  Order of Service Execution


   The order in which service modules on the intermediary device are
   executed may change the final result of OPES service processing. For
   example, a content analysis/filtering service executed against the
   result of a Web page translation service may produce a different
   result than the reverse execution order.

   Up to two rule modules may have to be processed by an OPES
   intermediary per Web transaction: one owned by the client and one
   owned by the content provider. The order in which these rule
   modules are processed MUST reflect the order in which
   request/response messages pass the rule module owners on their way
   through the network. This means that for incoming requests at points
   1 and 2 in figure 1, the order MUST be:

   1. Client rule module
   2. Content provider rule module

   For outgoing responses at points 3 and 4, the order MUST be:

   1. Content provider rule module
   2. Client rule module

   Within a single rule module, the intermediary device MUST process
   and execute all rules and actions IN THE ORDER THEY ARE SPECIFIED in
   the rule module (both within "property" and "rule" elements). If the
   rule processor determines that multiple actions must be executed for
   any given transaction, it MUST take into account that message
   property values may be modified by the execution of OPES service
   modules. This may require waiting for the completion of a triggered
   service module before rule conditions of subsequent rules can be
   evaluated.
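
   The ordering requirements of this section can be sketched in Python
   as follows. The sketch is illustrative only. The representation of
   a rule as a dictionary with "point", "condition" and "action" keys,
   the "execute" callback, and both function names are hypothetical.
   Each service is assumed to complete, and to return the possibly
   modified message, before the next rule condition is evaluated.

```python
def module_order(processing_point):
    """Return the rule module owner classes in processing order."""
    if processing_point in (1, 2):       # request path
        return ["client", "content provider"]
    if processing_point in (3, 4):       # response path
        return ["content provider", "client"]
    raise ValueError("unknown processing point: %r" % processing_point)


def run_rules(modules, processing_point, message, execute):
    """Process rule modules, and the rules within each, in order."""
    executed = []
    for owner_class in module_order(processing_point):
        for rule in modules.get(owner_class, []):
            if rule["point"] != processing_point:
                continue
            # Evaluate against the current message, which earlier
            # services may already have modified.
            if rule["condition"](message):
                message = execute(rule["action"], message)
                executed.append(rule["action"])
    return message, executed


def execute(action, message):
    """Toy service runner: only "translate" changes the message."""
    updated = dict(message)
    if action == "translate":
        updated["lang"] = "de"
    return updated


modules = {
    "client": [
        {"point": 4, "condition": lambda m: m["lang"] == "en",
         "action": "translate"},
        {"point": 4, "condition": lambda m: m["lang"] == "en",
         "action": "spellcheck-en"},
    ],
    "content provider": [
        {"point": 4, "condition": lambda m: True, "action": "adapt"},
    ],
}
print(run_rules(modules, 4, {"lang": "en"}, execute))
```

   In this toy run the second client rule is not triggered, because
   the translation service executed before it has already changed the
   message property that its condition tests.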


5  Security Considerations


   Although beyond the scope of this document, it is clearly necessary
   to define a secure mechanism for transferring rule modules to
   intermediary devices. This will include authenticating and
   authorizing rule module owners and OPES intermediaries or admin
   servers. The integrity of rule modules must also be guaranteed.

   Also, a security context must be established on the OPES
   intermediary device for each rule module to ensure that rule modules
   cannot execute service modules or call library functions on the
   intermediary without being authorized to do so.

6  Acknowledgement

   The authors would like to thank all active participants in the OPES
   mailing list for their thought-provoking discussion and for the many
   ideas and suggestions that have been incorporated into this
   document. We especially want to acknowledge the following people for
   their helpful contributions: Lily Yang, Christian Maciocco, Mark
   Nottingham, and Michael Condry.


7  References

   1  Bradner, S., "The Internet Standards Process -- Revision 3", BCP
      9, RFC 2026, October 1996.
   2  Tomlinson, G., et al., "A Model for Open Pluggable Edge
      Services", Work in Progress, Internet Draft
      draft-tomlinson-opes-model-00.txt, July 2001.
   3  McHenry, S., et al., "OPES Use Cases and Deployment Scenarios",
      Work in Progress, Internet Draft
      draft-mchenry-opes-deployment-scenarios-00.txt, July 2001.
   4  Bradner, S., "Key words for use in RFCs to Indicate Requirement
      Levels", RFC 2119, Harvard University, March 1997.
   5  Bray, T., et al., "Extensible Markup Language (XML) 1.0 (Second
      Edition)", http://www.w3.org/TR/2000/REC-xml-20001006, October
      2000.
   6  Rafalow, L., et al., "Policy Requirements for Edge Services",
      Work in Progress, Internet Draft
      draft-rafalow-opes-policy-requirements-00.txt, July 2001.
   7  Fielding, R., et al., "Hypertext Transfer Protocol -- HTTP/1.1",
      RFC 2616, June 1999.
   8  Klyne, G., et al., "Date and Time on the Internet: Timestamps",
      Work in Progress, Internet Draft
      draft-ietf-impp-datetime-04.txt, July 2001.
   9  ISO/IEC DIS 9945-2:1992, Information technology - Portable
      Operating System Interface (POSIX) - Part 2: Shell and Utilities
      (IEEE Std 1003.2-1992); X/Open CAE Specification, Commands and
      Utilities, Issue 4, 1992.
   10 Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource
      Identifiers (URI): Generic Syntax and Semantics", RFC 2396,
      August 1998.


Authors' Addresses


   Andre Beck
   Markus Hofmann
   Bell Labs Research
   Lucent Technologies
   101 Crawfords Corner Rd.
   Holmdel, NJ 07733
   Phone: (732) 332-5983
   Email: {abeck, hofmann}@bell-labs.com


Appendix A - IRML DTD


   <!ELEMENT rulemodule    (owner, protocol, rule+)>
   <!ELEMENT owner         (name, id)>
   <!ELEMENT name          (#PCDATA)>
   <!ELEMENT id            (#PCDATA)>
   <!ELEMENT protocol      (#PCDATA)>
   <!ELEMENT rule          (property | action)+>
   <!ELEMENT property      (property | action)+>
   <!ELEMENT action        (#PCDATA)>
   <!ATTLIST rulemodule    name               CDATA          #REQUIRED>
   <!ATTLIST rulemodule    version            CDATA          #REQUIRED>
   <!ATTLIST owner         class  (content provider |
                                   access provider | client) #REQUIRED>
   <!ATTLIST rule          processing-point   (1|2|3|4)      #REQUIRED>

   <!ATTLIST property      name               CDATA          #REQUIRED>
   <!ATTLIST property      type               (message|system|service)
                                                             "message">
   <!ATTLIST property      matches            CDATA          #REQUIRED>
   <!ATTLIST property      case-sensitive     (yes|no)            "no">



Appendix B - Rule Module Examples


   Content Provider Rule Module Example for Advertisement Insertion
   Service

   <?xml version="1.0"?>
   <rulemodule>
     <owner class="content provider">
       <name>Lucent Technologies</name>
       <id>www.lucent.com</id>
     </owner>
     <protocol>http</protocol>
     <rule processing-point="4">
       <!-- Is the requested Web document the home page? -->
       <property name="request-path" type="system"
                 matches="^/$|^/index.html$" case-sensitive="yes">
         <!-- Does the user send us a cookie for user
              identification? -->
         <property name="Cookie" matches="UserID=">
           <action>opes://doubleclick.net/insertad</action>
         </property>
       </property>
     </rule>
   </rulemodule>


   Client Rule Module Example for Language Translation and Virus
   Scanning Service

   <?xml version="1.0"?>
   <rulemodule name="Translation" version="1.01">
     <owner class="client">
       <name>Markus Hofmann</name>
       <id>2324264</id>
     </owner>
     <protocol>http</protocol>
     <rule processing-point="4">
       <!-- Is the requested Web resource an executable binary file? -->
       <property name="Content-Type" matches="application/">
         <action>opes://mcaffee.com/scan?mode=remove</action>
       </property>
     </rule>
     <rule processing-point="4">
       <!-- Is the requested Web resource text based? -->
       <property name="Content-Type" matches="text/">
         <!-- Does the top level domain of the origin host differ from
              ".de"? If so, the document language is probably not
              German and the page needs to be translated. -->
         <property name="Host" matches="[^e]$|[^d][e]$|[^.][d][e]$">
           <action>opes://altavista.net/translate</action>
         </property>
       </property>
     </rule>
   </rulemodule>

   Content Provider Rule Module Example for Content Adaptation Service
   for Wireless Web Access Devices

   <?xml version="1.0"?>
   <rulemodule>
     <owner class="content provider">
       <name>Yahoo Inc.</name>
       <id>www.yahoo.com</id>
     </owner>
     <protocol>http</protocol>
     <rule processing-point="4">
       <!-- Does the user have a wireless Web access device? -->
       <property name="User-Agent" matches="Nokia|Ericsson|Palm">
         <!-- Is the requested Web resource text based? -->
         <property name="Content-Type" matches="text/">
           <action>opes://wapgateway.nl/transcode</action>
         </property>
       </property>
     </rule>
   </rulemodule>


Full Copyright Statement


   Copyright (C) The Internet Society (2000). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


