Internet Draft                                            Andre Beck
                                                         Markus Hofmann
  Expires: August 2001                             Lucent Technologies
  Document: draft-beck-opes-irml-00.txt
  Updates: draft-beck-opes-psrl-00.txt                   February 2001
  Category: Informational


       IRML: A Rule Specification Language for Intermediary Services


Status of this Memo

   This document is an Internet-Draft and is in full conformance
   with all provisions of Section 10 of RFC2026 [1].


   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
        http://www.ietf.org/ietf/1id-abstracts.txt
   The list of Internet-Draft Shadow Directories can be accessed at
        http://www.ietf.org/shadow.html.


Abstract

   Intermediary services are a new class of applications running on
   network edge intermediaries like caching proxies or dedicated
   application servers. They are described in [2] and [3]. These
   intermediary services can be executed on behalf of clients, access
   providers, or content providers. In order to control the execution
   of intermediary services, these parties provide service-specific
   rules that trigger services if rule conditions are met for incoming
   or outgoing messages.

   The Intermediary Rule Markup Language (IRML) is an XML-based
   language that can be used to describe service-specific execution
   rules. It allows clients, access providers, and content providers to
   specify when and how to execute intermediary services.

Table of Contents

   Status of this Memo................................................1
   Abstract...........................................................1
   1. Terminology.....................................................3
    Beck, Hofmann        Expires August 2001                 [Page 1]


   Internet Draft                IRML                    February 2001

   2. Problem Description and Goals...................................3
   3. IRML Syntax and Grammar.........................................4
   3.1. The "rulemodule" Element......................................4
   3.2. The "owner" Element...........................................5
   3.2.1. The "name" Element..........................................5
   3.2.2. The "id" Element............................................5
   3.3. The "protocol" Element........................................6
   3.4. Examples of the "owner", "name", "id", "protocol" Elements....6
   3.5. The "rule" Element............................................6
   3.5.1. The "property" Element......................................7
   3.5.2. The "action" Element........................................9
   3.5.3. Examples of the "rule", "property" and "action" Elements....9
   4. Order of Service Execution.....................................10
   5. Security Considerations........................................10
   6. Acknowledgement................................................11
   7. Reference......................................................11
   Author's Addresses................................................11
   Appendix - IRML DTD...............................................12
   Appendix - Rule Module Examples...................................12
   Full Copyright Statement..........................................14

1. Terminology


   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [4].

   intermediary

   Intermediaries are application-aware devices located in the
   communication path between client and origin server, for example
   (caching) proxies, gateways, and switches.

   rule module

   A rule module contains a set of rules and information about the rule
   module owner.

   rule

   Rules contain conditions and actions that are to be executed if the
   conditions are met.

   action

   The execution of a local/remote service module. Message properties
   MAY be modified as the result of the execution.

   service module

   Service modules are executable code modules that can be executed in
   a local service execution environment on the intermediary or a
   remote service execution environment on a dedicated application
   server. They may run on behalf of content providers, access
   providers, and clients.


2. Problem Description and Goals


   The three parties that may wish to run intermediary services as
   described in [2] and [3] are the same parties that are involved
   in a typical Web transaction:

   1. Client
   2. Access provider (ISP, CDN etc.)
   3. Content provider

   Each party must be able to express the conditions under which it
   wishes to run a service. A content provider, for instance, may want
   to adapt its pages for users with small wireless devices. Providers
   of free Internet services may want to insert advertisements into all
   HTML pages served to their clients. Web users may wish to have
   certain Web pages translated into a different language.

   These examples demonstrate the need for rules that tell the
   intermediary hosting these services when to run which service. The
   three parties for which services may be executed must provide these
   rules to the intermediary. A rule engine on the intermediary
   processes the rules that apply to incoming requests and outgoing
   responses in order to determine which service modules need to be
   executed, when, and in what order.

   Since the intermediary processing the rules is not necessarily
   maintained by the parties that may wish to author rules, a standard
   specification language is required.

   This document defines the Intermediary Rule Markup Language (IRML)
   in an attempt to create a standard rule format that will be
   supported by vendors of service-enabled intermediaries and by third
   parties offering network edge service applications.

   The Intermediary Rule Markup Language defined in this document also
   serves as a standard representation of rules for intermediary
   services. This facilitates the exchange and discussion of such
   rules between and within groups of rule authors.

   It is beyond the scope of this document to define a secure and
   reliable mechanism for transferring rule files to intermediaries.
   Likewise, this document does not describe the specifics of how to
   (efficiently) process rules on the intermediaries.


3. IRML Syntax and Grammar


   IRML is an application of XML. Thus, its syntax is governed by the
   rules of the XML syntax as defined in [5], and its grammar is
   specified by a DTD, or Document Type Definition. The IRML DTD can be
   found in Appendix A.

   Valid and well-formed IRML documents consist of one or more rule
   modules. Each rule module contains a set of rules and information
   about the rule module provider. Rule modules can be provided by a
   content provider, an access provider, or by a client (although
   usually indirectly through an access provider). The rules contained
   in rule modules each consist of a number of conditions and a number
   of consequent actions that must be executed if the conditions are
   met. The conditions within a rule refer to message properties in the
   request or response message of a given Web transaction. They are met
   if the property value matches the pattern(s) specified in the rule
   condition(s).


3.1. The "rulemodule" Element




   The "rulemodule" element is the root element for all rule modules.
   It MUST contain exactly one "owner" element, exactly one "protocol"
   element, and one or more "rule" elements, in that order (see also
   the IRML DTD in Appendix A).

3.2. The "owner" Element


   The "owner" element specifies the owner of the rule module. Each
   rule module can have exactly one owner.


   Attributes of "owner"

   Name         Values
   ----------------------------------------------------
   class        content provider|access provider|client


   The "class" attribute assigns a rule module owner to one of the
   three types of rule module providers: content providers, access
   providers, and clients.


 3.2.1. The "name" Element


   The "name" element contains a descriptive name for the rule module
   owner. This could be the company name for content and access
   providers and a customer login for clients. The name does not have
   to be unique among rule module owners.


 3.2.2. The "id" Element


   The "id" element contains an identifier for the rule module owner.
   The identifier MUST be unique within a class of rule module
   providers. The "id" element determines whether a particular Web
   transaction is relevant to a rule module and thus, whether the
   contained rules have to be processed for this Web request/response.
   For example, a rule module provided by a content provider should
   only be processed for Web requests referring to Web resources owned
   by this particular content provider.

   Therefore, if the rule module owner is a content provider, the "id"
   element MUST contain the domain name(s) of the content provider. If
   a content provider owns more than one domain and the relevant rule
   module pertains to more than one of them, the "id" element MAY even
   contain more than one domain name separated by the "|" character
   (see "owner" example). The specified domain name(s) MAY also contain
   a port number. If no port number is specified, then the default port
   for the specified protocol is assumed, e.g. 80 for HTTP.
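
   The handling of content provider identifiers described above can be
   sketched in Python as follows. This is only an illustration under
   stated assumptions: the function names, the DEFAULT_PORTS table
   (beyond the HTTP value of 80 given above), and the case-insensitive
   comparison of domain names are choices of this sketch and are not
   defined by IRML.

```python
from typing import List, Tuple

# Default ports per protocol. Only the HTTP value (80) is given in the
# text; the table itself is an assumption of this sketch.
DEFAULT_PORTS = {"http": 80}


def parse_owner_id(id_text: str, protocol: str = "http") -> List[Tuple[str, int]]:
    """Split an "id" value such as "www.yahoo.com|dir.yahoo.com:8000"
    into (domain, port) pairs, filling in the protocol's default port."""
    default_port = DEFAULT_PORTS[protocol.lower()]
    entries = []
    for part in id_text.split("|"):
        part = part.strip()
        if not part:
            continue  # ignore empty segments such as a trailing "|"
        host, sep, port = part.partition(":")
        entries.append((host.lower(), int(port) if sep else default_port))
    return entries


def module_applies(id_text: str, host: str, port: int, protocol: str = "http") -> bool:
    """Return True if the request's host and port match one of the
    domains listed in the owner's "id" element."""
    return (host.lower(), port) in parse_owner_id(id_text, protocol)
```

   With the "id" value from the "owner" example in section 3.4, a
   request for dir.yahoo.com on port 8000 would select the rule module,
   whereas a request for www.yahoo.com on port 8080 would not.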



   If the rule module owner is an access provider, then the "id"
   element is of less importance since a particular intermediary is
   usually associated with only one specific access provider.

   If the rule module owner is a client, then a unique client
   identifier, e.g. a customer id, MUST be chosen in order to associate
   client rule modules with client requests. These client identifiers
   are most likely access provider-specific. If an access provider
   assigns only static IP addresses to its customers, the "id" element
   can also contain the IP address of the module owner. Otherwise, the
   dynamic IP addresses of incoming client requests MUST be mapped to
   the unique client "id" element value in order to determine whether a
   specific rule module must be processed for a particular client
   request/server response.


3.3. The "protocol" Element


   The "protocol" element contains the name (acronym) of the protocol
   the rule module pertains to. Although most services operate on HTTP,
   IRML is not limited to HTTP messages. Any other message-based
   protocol that fits into the IRML framework can be used.


3.4. Examples of the "owner", "name", "id", "protocol" Elements


   <owner class="content provider">
     <name>Yahoo Inc.</name>
     <id>www.yahoo.com|dir.yahoo.com:8000</id>
   </owner>
   <protocol>http</protocol>

   <owner class="client">
     <name>abeck</name>
     <id>205.167.45.1</id>
   </owner>
   <protocol>http</protocol>


3.5. The "rule" Element


   The "rule" element contains one or more "property" and/or "action"
   elements.

   Attributes of "rule"

   Name                 Values
   ----------------------------
   processing-point     1|2|3|4



   The "processing-point" attribute specifies at which of the four
   points in figure 1 the rule engine on the intermediary must process
   a rule. The four "processing-points" are derived from the Extensible
   Proxy Services Framework as described in [2] and only apply to
   caching proxies. Implementation architectures for other
   intermediaries might define different or additional "processing-
   points".

   Figure 1 shows the typical HTTP data flow between a client, a
   caching proxy, and an origin server. The four processing points
   (1-4) represent locations in the round trip message flow where rules
   can be processed and service modules can be executed. Note that the
   message flow may skip points 2 and 3 after point 1 if the requested
   object can be served from cache.

   +--------+       +-----------+       +--------+
   |        |<------|4         3|<------|        |
   | Client |       |  Caching  |       | Origin |
   |        |       |   Proxy   |       | Server |
   |        |------>|1         2|------>|        |
   +--------+       +-----------+       +--------+

   Figure 1: Rule Processing/Service Execution Points

   Point 1: Client Request
        An HTTP request from a client has been received. A possible
        cache lookup has not yet occurred.

   Point 2: Proxy Request
        The requested Web object cannot be served from the cache and
        the origin server is about to be contacted for the HTTP
        resource.

   Point 3: Origin Server Response
        The HTTP response from the origin server has been received. It
        has not yet been stored in the cache.

   Point 4: Proxy Response
        The HTTP response from the cache or the origin server is about
        to be sent back to the client.

   Depending on the service type, rules may be processed and services
   may be executed at any of the four points outlined in figure 1. A
   virus scanning service for instance could be executed at point 3 in
   figure 1 in order to scan all Web objects for viruses before they
   can be stored in the cache. A URL-based request filtering service on
   the other hand should be executed at point 1 and an ad insertion
   service will probably be executed at point 4.
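
   As an illustration of the point definitions above, the following
   Python sketch enumerates the four processing points and lists the
   ones a transaction passes through. It assumes, based on the
   definitions of points 2 and 4, that a cache hit bypasses points 2
   and 3 while the cached response still passes point 4. The names
   ProcessingPoint and points_visited are illustrative and are not part
   of IRML.

```python
from enum import IntEnum
from typing import List


class ProcessingPoint(IntEnum):
    CLIENT_REQUEST = 1          # request received, before any cache lookup
    PROXY_REQUEST = 2           # cache miss, origin server about to be contacted
    ORIGIN_SERVER_RESPONSE = 3  # origin response received, not yet cached
    PROXY_RESPONSE = 4          # response about to be sent to the client


def points_visited(cache_hit: bool) -> List[ProcessingPoint]:
    """Return the processing points passed by one transaction, in order."""
    if cache_hit:
        # Served from cache: the origin server is never contacted.
        return [ProcessingPoint.CLIENT_REQUEST, ProcessingPoint.PROXY_RESPONSE]
    return list(ProcessingPoint)
```

   A rule engine built along these lines would evaluate only the rules
   whose "processing-point" attribute equals the point currently being
   passed.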


 3.5.1. The "property" Element




   The "property" element contains one or more "property" and/or
   "action" elements. "property" elements are conditions that, if met,
   lead to the execution of the service modules specified in the
   contained "action" elements. Nested "property" elements represent a
   hierarchical "AND" relationship. This means that an inner "property"
   condition can only be true if the outer "property" condition is
   true, and so forth.

   Attributes of "property"

   Name                 Values     Default
   ----------------------------------------
   name                 CDATA
   matches              CDATA
   case-sensitive       (yes|no)   "no"

   The "name" attribute specifies the name of the message property that
   is to be matched. This can be either a request or a response message
   property. The specified property names usually refer to protocol-
   specific header names. For HTTP messages for example, the list of
   protocol-specific header names is defined in [6].

   IRML, however, is not limited to the message properties defined in
   protocol specifications. It also supports any user-defined message
   properties. Service modules for instance could add user-defined
   headers to request or response messages that would be processed by
   other service modules.

   For HTTP messages, IRML also defines the following property names
   that cannot be directly mapped to HTTP headers:

   Property Name        Value
   --------------------------------------------------------------
   "request-line"       the first line of an HTTP request
   "response-line"      the first line of an HTTP response
   "request-path"       the relative path of the request URI
   "user-id"            a value to identify a user, assigned by
                        the access provider and unique for all
                        customers of the same access provider

   In addition to these HTTP-specific properties, IRML defines
   environment properties that are independent of the protocol in use:

   Property Name        Value
   --------------------------------------------------------------
   "system-date"        system date in the format "yyyymmdd"
   "system-time"        system time in the format "hhmmss"


   The values of the environment properties listed above MUST be
   provided by the service platform.
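
   A service platform could derive the two environment properties from
   its clock roughly as shown in the following Python sketch. The
   function name is hypothetical, and the use of local time is an
   assumption: this document does not state a time zone.

```python
from datetime import datetime
from typing import Dict, Optional


def environment_properties(now: Optional[datetime] = None) -> Dict[str, str]:
    """Return "system-date" (yyyymmdd) and "system-time" (hhmmss)."""
    now = now or datetime.now()  # local time; the time zone is an assumption
    return {
        "system-date": now.strftime("%Y%m%d"),
        "system-time": now.strftime("%H%M%S"),
    }
```

   Because both values are fixed-width digit strings, time windows can
   be expressed as patterns, e.g. a "matches" value of "^0[0-5]" for
   "system-time" selects the hours before 6 a.m.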

   The "matches" attribute specifies the pattern against which the
   property value MUST be matched by the rule engine on the
   intermediary. The "matches" pattern MUST be a regular expression
   compliant with the basic or extended regular expression syntax as
   defined in [7].

   If needed, the double-quote character (") MUST be represented in any
   attribute value as "&quot;" (as specified in [5]).

   The "case-sensitive" attribute specifies whether the specified
   pattern is matched case-sensitively. The default value for this
   attribute is "no", which means that pattern matching is
   case-insensitive unless otherwise specified.
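
   The interplay of the "matches" and "case-sensitive" attributes can
   be sketched in Python as follows. Note the assumptions: Python's
   "re" module is used as a stand-in and is not a POSIX regular
   expression engine as required by [7], although the patterns used in
   this document behave the same way in both. The sketch also assumes
   that a pattern may match anywhere in the property value unless it is
   anchored with "^" or "$". The function name is illustrative.

```python
import re


def property_matches(value: str, pattern: str, case_sensitive: bool = False) -> bool:
    """Return True if the property value matches the "matches" pattern.

    case_sensitive mirrors the "case-sensitive" attribute, whose
    default of "no" corresponds to False here.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.search(pattern, value, flags) is not None
```

   For example, the pattern "text/html" matches the Content-Type value
   "text/HTML; charset=utf-8" under the default setting, while
   "^/$|^/index.html$" with case-sensitive matching rejects the path
   "/Index.html".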

   If a "rule" element contains an "action" element that is not nested
   in one or more "property" elements, then the specified action must
   be performed for all messages that pass through the specified
   processing point. A user profiling service, for example, may have to
   be triggered for all user requests.
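
   The evaluation of nested "property" elements and of unconditional
   "action" elements can be sketched in Python as follows. This is a
   minimal illustration and not a prescribed algorithm. It assumes that
   the message is available as a dictionary keyed by lower-case
   property names, that a property missing from the message never
   matches, and that Python's "re" module is an acceptable stand-in for
   the regular expression syntax of [7]. The name collect_actions and
   the "requestlog" service in RULE are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, List

RULE = """<rule processing-point="4">
  <action>proxylet://localhost/requestlog</action>
  <property name="Content-Type" matches="text/html">
    <property name="Accept-Language" matches="^de|^fr|^it|^es">
      <action>proxylet://localhost/translate?target=de</action>
    </property>
  </property>
</rule>"""


def collect_actions(element: ET.Element, message: Dict[str, str]) -> List[str]:
    """Walk a "rule" or "property" element in document order and return
    the URIs of all actions whose enclosing conditions are met."""
    actions = []
    for child in element:
        if child.tag == "action":
            # Not guarded by any further condition at this level.
            actions.append((child.text or "").strip())
        elif child.tag == "property":
            value = message.get(child.get("name", "").lower())
            if value is None:
                continue  # property absent from the message: not met
            flags = 0 if child.get("case-sensitive", "no") == "yes" else re.IGNORECASE
            if re.search(child.get("matches", ""), value, flags):
                # Inner conditions are evaluated only if the outer one
                # holds, which yields the hierarchical AND relationship.
                actions.extend(collect_actions(child, message))
    return actions
```

   For a message with Content-Type "text/html" and Accept-Language
   "de-DE,en", collect_actions(ET.fromstring(RULE), message) returns
   both action URIs; for an image response only the unconditional
   logging action is returned. The sketch collects all actions up
   front and therefore ignores that executed services may modify
   message properties (see section 4).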


 3.5.2. The "action" Element


   The "action" element contains a URI specifying the name, location
   and optional parameters of the service module that is to be executed
   on the intermediary or a dedicated application server.

   For local service modules, the "proxylet" scheme as defined in [8]
   MUST be used.

   If the service module resides on a dedicated application server and
   ICAP [9] is used as the transport protocol, the "action" element
   MUST contain an ICAP-URI as defined in the current version of the
   ICAP specification [9].

   In both cases, any arguments MAY be passed as part of the service
   module name using the standard "?"-encoding of attribute-value pairs
   used in HTTP [6].
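
   How an intermediary might take such a service URI apart is sketched
   below in Python, using the standard urllib.parse module. The
   ServiceCall type and the parse_action function are hypothetical, and
   restricting the scheme to "proxylet" and "icap" is an assumption of
   this sketch; other transport protocols may define further schemes.

```python
from typing import Dict, NamedTuple
from urllib.parse import parse_qsl, urlsplit


class ServiceCall(NamedTuple):
    scheme: str             # "proxylet" (local) or "icap" (remote)
    host: str               # location of the service module
    service: str            # name of the service module
    params: Dict[str, str]  # optional "?"-encoded arguments


def parse_action(uri: str) -> ServiceCall:
    """Split the content of an "action" element into its components."""
    parts = urlsplit(uri.strip())
    if parts.scheme not in ("proxylet", "icap"):
        raise ValueError(f"unsupported action scheme: {parts.scheme!r}")
    return ServiceCall(
        scheme=parts.scheme,
        host=parts.netloc,
        service=parts.path.lstrip("/"),
        params=dict(parse_qsl(parts.query)),
    )
```

   Applied to "proxylet://localhost/translate?target=de", the sketch
   yields the scheme "proxylet", the host "localhost", the service name
   "translate", and the single argument target=de.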

   Only one service URI MAY be specified per "action" element. A
   "property" element, however, MAY contain several "action" elements.


 3.5.3. Examples of the "rule", "property" and "action" Elements


   <rule processing-point="1">
     <!-- Log ALL user requests -->
     <action>proxylet://localhost/requestlog</action>
   </rule>

   <rule processing-point="4">
     <!-- Is the requested Web resource an HTML document? -->
     <property name="Content-Type" matches="text/html">
       <!-- Is the user's preferred language among the supported
            ones? -->
       <property name="Accept-Language" matches="^de|^fr|^it|^es">
         <!-- Invoke local translation service -->
         <action>proxylet://localhost/translate?target=de</action>
       </property>
     </property>
   </rule>

   <rule processing-point="3">
     <!-- Is the requested Web resource an executable binary file? -->
     <property name="Content-Type" matches="application/">
       <!-- Invoke virus scanning service on mcaffee.com -->
       <action>icap://mcaffee.com/viruscheck</action>
     </property>
   </rule>


4. Order of Service Execution


   The order in which service modules on the intermediary are executed
   may change the final result of a Web transaction. For example, an ad
   insertion service executed against the result of a Web page
   translation service may produce a different result than a reverse
   execution order.

   Up to three rule modules may have to be processed by a service-
   enabled intermediary per Web transaction. The order in which these
   rule modules are processed MUST reflect the order in which
   request/response messages pass by the rule module authors. This
   means that for incoming requests at points 1 and 2 in figure 1, the
   order MUST be:

   1. Client rule module
   2. Access provider rule module
   3. Content provider rule module

   For outgoing responses at points 3 and 4, the order MUST be:

   1. Content provider rule module
   2. Access provider rule module
   3. Client rule module

   Within a single rule module, the intermediary MUST process and
   execute all rules and actions IN THE ORDER THEY ARE SPECIFIED in the
   rule module (both within "property" and "rule" elements). If the
   rule processor determines that multiple actions must be executed for
   any given transaction, it MUST take into account that message
   property values may be modified by executed service modules. This
   may require waiting for the completion of a triggered service module
   before rule conditions of subsequent rules can be evaluated.
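
   The ordering of rule modules described above can be sketched in
   Python as follows. The sketch assumes that each rule module is
   represented as a dictionary carrying the owner's "class" attribute
   under the key "class"; this representation and the function names
   are illustrative only.

```python
from typing import Dict, List

# Order in which a request passes the rule module authors.
REQUEST_ORDER = ["client", "access provider", "content provider"]


def module_order(processing_point: int) -> List[str]:
    """Return the owner classes in the order their modules are processed."""
    if processing_point in (1, 2):   # incoming requests
        return list(REQUEST_ORDER)
    if processing_point in (3, 4):   # outgoing responses
        return list(reversed(REQUEST_ORDER))
    raise ValueError(f"unknown processing point: {processing_point}")


def sort_modules(modules: List[Dict[str, str]], processing_point: int) -> List[Dict[str, str]]:
    """Sort the rule modules that apply to a transaction for one point."""
    order = module_order(processing_point)
    return sorted(modules, key=lambda module: order.index(module["class"]))
```

   Within each module returned by sort_modules, rules and actions would
   still have to be processed strictly in document order, re-reading
   message properties after each executed service module.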


5. Security Considerations


   Although beyond the scope of this document, it is clearly necessary
   to define a secure mechanism for transferring rule modules to
   intermediaries. This will include authenticating and authorizing
   rule module owners and service-enabled intermediaries. The integrity
   of rule modules must also be guaranteed.

   Also, a security context must be established on the intermediary for
   each rule module to ensure that rule modules cannot execute service
   modules or call library functions on the intermediary without being
   authorized to do so.

6. Acknowledgement

   The authors would like to thank all the active participants in the
   OPES mailing list for their thought-provoking discussions; many of
   their ideas and suggestions have been incorporated into this
   document. In particular, we want to acknowledge the following people
   for their helpful contributions: Lily Yang, Christian Maciocco, Mark
   Nottingham, and Michael Condry.


7. Reference

   1  Bradner, S., "The Internet Standards Process -- Revision 3", BCP
      9, RFC 2026, October 1996.
   2  Tomlinson, G., et al., "Extensible Proxy Services Framework",
      http://www.ietf.org/internet-drafts/draft-tomlinson-epsfw-00.txt,
      July 2000
   3  Hofmann, M., Beck, A., "Example Services for Network Edge
      Proxies",
      http://www.ietf.org/internet-drafts/draft-beck-opes-esfnep-01.txt
   4  Bradner, S., "Key words for use in RFCs to Indicate Requirement
      Levels", Request for Comments 2119, Harvard University, March
      1997
   5  Bray, T., et al., Extensible Markup Language (XML) 1.0 (Second
      Edition), http://www.w3.org/TR/2000/REC-xml-20001006, October
      2000
   6  Fielding, R., et al., "Hypertext Transfer Protocol -- HTTP/1.1",
      Request for Comments 2616, June 1999
   7  ISO/IEC DIS 9945-2:1992, Information technology - Portable
      Operating System Interface (POSIX) - Part 2: Shell and Utilities
      (IEEE Std 1003.2-1992); X/Open CAE Specification, Commands and
      Utilities, Issue 4, 1992
   8  Maciocco, C., Hofmann, M., "OMML: OPES Meta-data Markup
      Language", draft-maciocco-opes-omml-00.txt
   9  Elson, J., et al., "ICAP, the Internet Content Adaptation
      Protocol",
      http://www.ietf.org/internet-drafts/draft-elson-opes-icap-00.txt,
      December 2000



Author's Addresses


   Andre Beck
   Markus Hofmann
   Bell Laboratories
   Lucent Technologies
   101 Crawfords Corner Rd.
   Holmdel, New Jersey 07733
   Phone: (732) 332-5983
   Email: {abeck, hofmann}@bell-labs.com


Appendix - IRML DTD


   <!ELEMENT rulemodule    (owner, protocol, rule+)>
   <!ELEMENT owner         (name, id)>
   <!ELEMENT name          (#PCDATA)>
   <!ELEMENT id            (#PCDATA)>
   <!ELEMENT protocol      (#PCDATA)>
   <!ELEMENT rule          (property | action)+>
   <!ELEMENT property      (property | action)+>
   <!ELEMENT action        (#PCDATA)>
   <!ATTLIST owner         class  (content provider |
                                   access provider | client) #REQUIRED>
   <!ATTLIST rule          processing-point   (1|2|3|4)      #REQUIRED>

   <!ATTLIST property      name               CDATA          #REQUIRED>
   <!ATTLIST property      matches            CDATA          #REQUIRED>
   <!ATTLIST property      case-sensitive     (yes|no)       "no">
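
   The content model expressed by the DTD can also be checked by hand,
   which may be useful on platforms without a validating XML parser.
   The following Python sketch uses the standard
   xml.etree.ElementTree module, which checks well-formedness but does
   not validate against a DTD. The checks are a manual transcription of
   the declarations above, and the function name and error messages
   are illustrative.

```python
import xml.etree.ElementTree as ET
from typing import List

OWNER_CLASSES = {"content provider", "access provider", "client"}
PROCESSING_POINTS = {"1", "2", "3", "4"}


def check_rule_module(xml_text: str) -> List[str]:
    """Return a list of content-model violations (empty if none found)."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
    if root.tag != "rulemodule":
        return ["root element must be rulemodule"]
    errors = []

    # <!ELEMENT rulemodule (owner, protocol, rule+)>
    tags = [child.tag for child in root]
    starts_ok = tags[:2] == ["owner", "protocol"]
    rules_ok = len(tags) >= 3 and set(tags[2:]) == {"rule"}
    if not (starts_ok and rules_ok):
        errors.append("rulemodule: expected owner, protocol, rule+")

    # <!ELEMENT owner (name, id)> and its required "class" attribute
    for owner in root.findall("owner"):
        if [child.tag for child in owner] != ["name", "id"]:
            errors.append("owner: expected name, id")
        if owner.get("class") not in OWNER_CLASSES:
            errors.append("owner: missing or unknown class")

    # <!ELEMENT rule (property | action)+>, same model for "property"
    for node in root.iter():
        if node.tag not in ("rule", "property"):
            continue
        kids = [child.tag for child in node]
        if not kids or any(k not in ("property", "action") for k in kids):
            errors.append(f"{node.tag}: expected (property | action)+")
        if node.tag == "rule" and node.get("processing-point") not in PROCESSING_POINTS:
            errors.append("rule: processing-point must be 1, 2, 3, or 4")
        if node.tag == "property":
            if node.get("name") is None or node.get("matches") is None:
                errors.append("property: name and matches are required")
            if node.get("case-sensitive", "no") not in ("yes", "no"):
                errors.append("property: case-sensitive must be yes or no")
    return errors
```

   An empty result means that none of the checks coded here failed; it
   is not a substitute for validation against the DTD.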



Appendix - Rule Module Examples


   Content Provider Rule Module Example for Advertisement Insertion
   Service

   <?xml version="1.0"?>
   <rulemodule>
     <owner class="content provider">
       <name>Lucent Technologies</name>
       <id>www.lucent.com</id>
     </owner>
     <protocol>http</protocol>
     <rule processing-point="4">
       <!-- Is the requested Web document the home page? -->
       <property name="request-path" matches="^/$|^/index.html$"
                 case-sensitive="yes">
         <!-- Does the user send us a cookie for user
              identification? -->
         <property name="Cookie" matches="UserID=">
           <action>icap://adserver.net/insertad</action>
         </property>
       </property>
     </rule>
   </rulemodule>

   Access Provider Rule Module Example for Advertisement Insertion
   Service for Free Internet Service

   <?xml version="1.0"?>
   <rulemodule>
     <owner class="access provider">
       <name>Comcast Free Internet Service</name>
       <id>www.comcast.com</id>
     </owner>
     <protocol>http</protocol>
     <rule processing-point="4">
       <!-- Is the requested Web resource an HTML document? -->
       <property name="Content-Type" matches="text/html">
         <!-- Is the user a customer of the free Internet service? -->
         <property name="user-id" matches="^123[.]54[.]34[.]">
           <action>icap://adserver.com/insert_ad</action>
         </property>
       </property>
     </rule>
   </rulemodule>


   Client Rule Module Example for Language Translation and Virus
   Scanning Service

   <?xml version="1.0"?>
   <rulemodule>
     <owner class="client">
       <name>Markus Hofmann</name>
       <id>23242</id>
     </owner>
     <protocol>http</protocol>
     <rule processing-point="4">
       <!-- Is the requested Web resource an executable binary file? -->
       <property name="Content-Type" matches="application/">
         <action>icap://mcaffee.com/virus_scan?mode=respmod</action>
       </property>
     </rule>
     <rule processing-point="4">
       <!-- Is the requested Web resource text based? -->
       <property name="Content-Type" matches="text/">
         <!-- Does the top-level domain of the origin host differ from
              ".de"? If so, the document is probably not in German and
              needs to be translated. -->
         <property name="Host" matches="[^e]$|[^d][e]$|[^.][d][e]$">
           <action>proxylet://localhost/translate?target=de</action>
         </property>
       </property>
     </rule>
   </rulemodule>

   Content Provider Rule Module Example for Content Adaptation Service
   for Wireless Web Access Devices

   <?xml version="1.0"?>
   <rulemodule>
     <owner class="content provider">
       <name>Yahoo Inc.</name>
       <id>www.yahoo.com</id>
     </owner>
     <protocol>http</protocol>
     <rule processing-point="4">
       <!-- Does the user have a wireless Web access device? -->
       <property name="User-Agent" matches="Nokia|Palm">
         <!-- Is the requested Web resource text based? -->
         <property name="Content-Type" matches="text/">
           <action>icap://wapgateway.nl/transcode</action>
         </property>
       </property>
     </rule>
   </rulemodule>


Full Copyright Statement


   Copyright (C) The Internet Society (2000). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.





