Internet Draft A. Beck
M. Hofmann
Expires: May 2001 Lucent Technologies
Document: draft-beck-opes-psrl-00.txt November 17, 2000
Category: Informational
PSRL: A Rule Specification Language for Proxy Services
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026 [1].
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
Proxy services are a new class of applications running on caching
proxies or dedicated application servers, preferably at the network
edge. They are described in [2] and [3]. Execution of proxy services
is triggered by certain conditions. These conditions are
service-specific and have to be provided by the party on behalf of
which the affected service modules are executed.
The Proxy Services Rule Specification Language (PSRL) is an XML-based
language that can be used to describe service-specific execution
rules. It allows a service provider to tell a proxy caching provider
when and how the services should be executed.
Beck, Hofmann [Page 1]
Internet Draft PSRL November 17, 2000
Table of Contents
1 Terminology....................................................2
2 Problem Description and Goals..................................3
3 PSRL Syntax and Grammar........................................4
3.1 The "rulemodule" Element.....................................4
3.2 The "owner" Element..........................................4
3.2.1 Attributes of "owner".......................................4
3.2.2 The "name" Element..........................................4
3.2.3 The "id" Element............................................5
3.3 The "protocol" Element.......................................5
3.4 Examples of the "owner", "name", "id", "protocol" Elements...5
3.5 The "rule" Element...........................................6
3.5.1 Attributes of "rule"........................................6
3.5.2 The "property" Element......................................7
3.5.3 The "action" Element........................................8
3.5.4 Examples of the "rule", "property" and "action" elements....8
4 Order of Service Execution.....................................9
5 Security Considerations........................................9
6 References....................................................10
7 Author's Addresses............................................10
A Appendix - PSRL DTD...........................................11
B Appendix - Rule Module Examples...............................11
1 Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [4].
rule module
A rule module contains a set of rules and information about the rule
module owner.
rule
Rules contain conditions and actions that are to be executed if the
conditions are met.
action
The execution of a local/remote service module or a proxy library
function. Message properties MAY be modified as the result of the
execution.
service module
Service modules are executable code modules that can be executed in
a local service execution environment on the caching proxy or a
remote service execution environment on a dedicated application
server. They may run on behalf of content providers, access
providers, and clients.
2 Problem Description and Goals
The three parties that may wish to run value-added proxy services
(as described in [2] and [3]) are the same parties that are involved
in a typical Web transaction:
1. Client
2. Access provider (e.g. an ISP)
3. Content provider
Each party must be able to express the conditions under which they
wish to run a service. A content provider, for instance, might want
to adapt its pages for users with small wireless devices. Providers
of free Internet services might want to insert advertisements into
all HTML pages served to their clients. Web users may wish to have
certain Web pages translated into a different language.
These examples demonstrate the need for rules that tell the caching
proxy when to run what service. These rules must be provided to the
caching proxy by the three parties on behalf of which services may
be executed. A rule engine on the caching proxy evaluates rules that
apply to incoming requests/outgoing responses in order to determine
which service modules need to be executed, when, and in what order.
As the caching proxy processing the rules is not necessarily
maintained by the party that authors the rules, a standard
specification language is required.
This document defines the Proxy Services Rule Specification Language
(PSRL) in an attempt to create a standard rule format that will be
supported by vendors of service-enabled caching proxies and by third
parties offering proxy service applications.
The Proxy Services Rule Specification Language defined in this
document also serves as a standard representation of rules for proxy
services. This facilitates the exchange and discussion of this kind
of rule between and within groups of rule authors.
It is beyond the scope of this document to define a secure and
reliable mechanism for transferring rule files to caching proxies.
Likewise, this document does not describe the specifics of how to
(efficiently) process rules on the caching proxy.
3 PSRL Syntax and Grammar
PSRL is an application of XML. Thus, its syntax is governed by the
rules of the XML syntax as defined in [5], and its grammar is
specified by a DTD, or Document Type Definition. The PSRL DTD can be
found in Appendix A.
Valid and well-formed PSRL documents consist of one or more rule
modules. Each rule module contains a set of rules and information
about the rule module provider. Rule modules are provided by a
content provider, an access provider, or by a client (although
usually indirectly through an access provider). The rules contained
in rule modules each consist of a number of conditions and a number
of consequent actions that must be executed if the conditions are
met. The conditions within a rule refer to message properties in the
request or response of a given Web transaction. They are met if the
property value matches the pattern specified in the condition.
3.1 The "rulemodule" Element
The "rulemodule" element is the root element for all rule modules.
It MUST contain exactly one "owner" element, followed by exactly one
"protocol" element, followed by one or more "rule" elements (see
also the PSRL DTD in Appendix A).
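The following Python sketch illustrates a structural check of a rule
module, assuming the content model "(owner, protocol, rule+)" from
the DTD in Appendix A. It is a hand-written approximation rather than
a DTD validation. The function name "check_rulemodule" and the
service name "example-service" are hypothetical and not part of PSRL.

```python
import xml.etree.ElementTree as ET


def check_rulemodule(xml_text):
    """Return a list of problems; an empty list means the structure looks right."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "rulemodule":
        problems.append("root element is not 'rulemodule'")
        return problems
    # Comments are dropped by the default parser, so only elements remain.
    tags = [child.tag for child in root]
    # Content model assumed from Appendix A: (owner, protocol, rule+)
    if tags[:2] != ["owner", "protocol"]:
        problems.append("expected 'owner' followed by 'protocol'")
    rules = tags[2:]
    if not rules or any(tag != "rule" for tag in rules):
        problems.append("expected one or more 'rule' elements after 'protocol'")
    return problems


# Hypothetical rule module used only to exercise the check.
EXAMPLE = """<rulemodule>
  <owner class="client"><name>abeck</name><id>205.167.45.1</id></owner>
  <protocol>http</protocol>
  <rule processing-point="1">
    <property name="request-path" matches="^/$">
      <action>example-service</action>
    </property>
  </rule>
</rulemodule>"""

print(check_rulemodule(EXAMPLE))  # prints []
```

A production rule engine would more likely rely on a validating XML
parser; the sketch only makes the required order of child elements
concrete.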
3.2 The "owner" Element
The "owner" element specifies the owner of the rule module. Each
rule module MUST have exactly one owner.
3.2.1 Attributes of "owner"
Name     Values
----------------------------------------------------
class    content provider|access provider|client
The "class" attribute assigns a rule module owner to one of the
three types of rule module providers: content providers, access
providers, and clients.
3.2.2 The "name" Element
The "name" element contains a descriptive name for the rule module
owner. This could be the company name for content and access
providers and a customer login for clients. The name does not have
to be unique among rule module owners.
3.2.3 The "id" Element
The "id" element contains an identifier for the rule module owner.
The identifier MUST be unique within a class of rule module
providers. The "id" element determines whether a particular Web
transaction is relevant to a rule module and thus, whether the
contained rules have to be processed for this particular Web
request/response. For example, a rule module provided by a content
provider should only be processed for Web requests referring to Web
resources owned by the same content provider.
Therefore, if the rule module owner is a content provider, the "id"
element MUST contain the domain name(s) of the content provider. If
a content provider owns more than one domain and the relevant rule
module pertains to more than one of them, the "id" element MAY even
contain more than one domain name separated by the "|" character
(see "owner" example). The specified domain name(s) MAY also contain
a port number. If no port number is specified, then the default port
for the specified protocol is assumed, e.g. 80 for HTTP.
If the rule module owner is an access provider, then the "id"
element is of less importance, since a particular caching proxy is
usually associated with only one specific access provider.
If the rule module owner is a client, then a unique client
identifier, e.g. a customer id, MUST be chosen in order to associate
client rule modules with client requests. If the client's access
provider does not assign dynamic IP addresses to its customers, the
"id" element can also contain the IP address of the module owner.
Otherwise, the dynamic IP addresses of incoming client requests MUST
be mapped to the unique client "id" element value in order to
determine whether a specific rule module must be processed.
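As an illustration only, matching a content provider "id" value
against the host and port of a Web request might be sketched in
Python as follows. The function names and the case-insensitive host
comparison are assumptions of this sketch. The default port table
covers only "http", the one protocol currently supported.

```python
# Default ports per protocol; only "http" is defined by this document.
DEFAULT_PORTS = {"http": 80}


def parse_owner_id(id_text, protocol="http"):
    """Split an "id" value such as "a.example|b.example:8000" into (host, port) pairs."""
    pairs = set()
    for entry in id_text.split("|"):
        host, sep, port = entry.strip().partition(":")
        # Without an explicit port, the protocol's default port is assumed.
        pairs.add((host.lower(), int(port) if sep else DEFAULT_PORTS[protocol]))
    return pairs


def module_applies(id_text, request_host, request_port=None, protocol="http"):
    """Decide whether a content provider rule module is relevant to a request."""
    if request_port is None:
        request_port = DEFAULT_PORTS[protocol]
    return (request_host.lower(), request_port) in parse_owner_id(id_text, protocol)


ids = "www.yahoo.com|dir.yahoo.com:8000"
print(module_applies(ids, "www.yahoo.com"))        # True: default port 80
print(module_applies(ids, "dir.yahoo.com", 8000))  # True: explicit port matches
print(module_applies(ids, "dir.yahoo.com"))        # False: port 80 is not listed
```

Client rule modules would need a different lookup, for example a
table that maps the current IP address of a client to its "id" value.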
3.3 The "protocol" Element
The "protocol" element contains the acronym of the protocol the
rule module pertains to. For now, only "http" is supported. In a
future version of this document other protocols will be supported as
well.
3.4 Examples of the "owner", "name", "id", "protocol" Elements
<owner class="content provider">
  <name>Yahoo Inc.</name>
  <id>www.yahoo.com|dir.yahoo.com:8000</id>
</owner>
<protocol>http</protocol>

<owner class="client">
  <name>abeck</name>
  <id>205.167.45.1</id>
</owner>
<protocol>http</protocol>
3.5 The "rule" Element
The "rule" element contains one or more "property" elements.
3.5.1 Attributes of "rule"
Name               Values
----------------------------
processing-point   1|2|3|4
The "processing-point" attribute specifies at which of the four
points in figure 1 a rule must be processed by the rule engine on
the caching proxy. The four "processing-points" are derived from the
Extensible Proxy Services Framework as described in [2]. Other
implementation architectures might define additional "processing-
points", which can be specified with PSRL by allowing additional
values for the "processing-point" attribute.
Figure 1 shows the typical HTTP data flow between a client, a
caching proxy, and an origin server. The four processing points (1-
4) represent locations in the round trip message flow where rules
can be processed and service modules can be executed. Note that the
message flow may skip points 2 and 3 after point 1 if the requested
object can be served from cache.
+--------+ +-----------+ +--------+
| |<------|4 3|<------| |
| Client | | Caching | | Origin |
| | | Proxy | | Server |
| |------>|1 2|------>| |
+--------+ +-----------+ +--------+
Figure 1: Rule Processing/Service Execution Points
Point 1: Client Request
An HTTP request from a client has been received. A possible
cache lookup has not yet occurred.
Point 2: Proxy Request
The requested Web object cannot be served from the cache and
the origin server is about to be contacted for the HTTP
resource.
Point 3: Origin Server Response
The HTTP response from the origin server has been received. It
has not yet been stored in the cache.
Point 4: Proxy Response
The HTTP response from the cache or the origin server is about
to be sent back to the client.
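Following the point definitions above, a request that is served from
the cache passes through points 1 and 4 only, since point 2 is
reached only when the object cannot be served from the cache. A
minimal Python sketch of this flow is given below. The names
"ProcessingPoint" and "points_visited" are hypothetical.

```python
from enum import IntEnum


class ProcessingPoint(IntEnum):
    """The four processing points of Figure 1."""
    CLIENT_REQUEST = 1
    PROXY_REQUEST = 2
    ORIGIN_SERVER_RESPONSE = 3
    PROXY_RESPONSE = 4


def points_visited(cache_hit):
    """Return the processing points a transaction passes through, in order."""
    if cache_hit:
        # The origin server is not contacted, so points 2 and 3 are skipped.
        return [ProcessingPoint.CLIENT_REQUEST, ProcessingPoint.PROXY_RESPONSE]
    return list(ProcessingPoint)


print([int(p) for p in points_visited(cache_hit=True)])   # prints [1, 4]
print([int(p) for p in points_visited(cache_hit=False)])  # prints [1, 2, 3, 4]
```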
Depending on the service type, rules may be processed and services
may be executed at any of the four points outlined in figure 1. A
virus scanning service for instance should be executed at point 3 in
figure 1 in order to scan all Web objects for viruses before they
can be stored in the cache. A URL-based request filtering service on
the other hand should be executed at point 1 and an ad insertion
service will probably be executed at point 4.
We can imagine that in the future there will be a need to have more
processing points (at a finer granularity) than the ones mentioned
above. This will be reflected in a future version of this document.
3.5.2 The "property" Element
The "property" element contains zero or more nested "property"
elements and one or more "action" elements. "property" elements are
conditions that, if met, lead to the execution of the service
modules specified in the contained "action" elements. Nested
"property" elements represent a hierarchical "AND" relationship.
This means that an inner "property" condition can only be true if
the outer "property" condition is true, and so forth.
Attributes of "property"
Name      Values
----------------------------
name      CDATA
matches   CDATA
The "name" attribute specifies the name of the message property that
is to be matched. This can be either a request or a response message
property. The protocol specified in the "protocol" element
determines which property names are legal. If the message property is
an HTTP request or response header, the list of legal header names
can be taken from [6].
For HTTP messages, the following property names are defined in
addition to the list of legal HTTP headers in [6]:
Property Name     Refers to
--------------------------------------------------------------
"request-line"    the first line of an HTTP request
"response-line"   the first line of an HTTP response
"request-path"    the relative path of the request URI
"request-body"    the body of an HTTP request (POST)
"response-body"   the body of an HTTP response
"user-id"         a value to identify a user, assigned by
                  the access provider and unique for all
                  customers of the same access provider
The "matches" attribute specifies the pattern against which
the property value MUST be compared by the rule engine on the
caching proxy. The "matches" pattern MUST be a regular expression
compliant with the basic or extended regular expression syntax as
defined in [7].
If needed, the double-quote character (") MUST be represented in any
attribute value as "&quot;" or "&#34;" (as specified in [5]).
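The hierarchical "AND" relationship of nested "property" elements
can be sketched in Python as shown below. Several points are
assumptions of this sketch and not part of PSRL: the message is
modeled as a dictionary of property names and values, the function
name "matching_actions" is hypothetical, and Python's "re" module
stands in for the regular expression syntax of [7], from which it
differs in some details. The sketch only collects actions and
ignores any changes a service module might make to the message.

```python
import re
import xml.etree.ElementTree as ET


def matching_actions(element, message):
    """Collect the text of all "action" elements whose enclosing conditions are met."""
    actions = []
    for child in element:
        if child.tag == "action":
            actions.append(child.text.strip())
        elif child.tag == "property":
            value = message.get(child.get("name"))
            # A missing property never matches. Inner conditions are only
            # examined when the outer condition is met (the "AND" relation).
            if value is not None and re.search(child.get("matches"), value):
                actions.extend(matching_actions(child, message))
    return actions


RULE = ET.fromstring("""
<rule processing-point="4">
  <property name="Content-Type" matches="text/html">
    <property name="Accept-Language" matches="^de|^fr|^it|^es">
      <action>icap://trans.net/translate?mode=respmod</action>
    </property>
  </property>
</rule>""")

german = {"Content-Type": "text/html", "Accept-Language": "de-DE"}
english = {"Content-Type": "text/html", "Accept-Language": "en-US"}
print(matching_actions(RULE, german))   # one action is selected
print(matching_actions(RULE, english))  # prints []
```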
3.5.3 The "action" Element
The "action" element contains the name of the service module that is
to be executed on the caching proxy or a dedicated application
server. Instead of a service name the "action" element MAY also
contain the name of a built-in proxy library function.
Any arguments MAY be passed as part of the service module name,
using the standard "?"-encoding of attribute-value pairs used in
HTTP [6]. If the service module resides on a dedicated application
server and ICAP [8] will be used as the transport protocol, the
"action" element MAY contain an ICAP-URI as defined in the current
version of the ICAP specification [8].
Only one service/function/ICAP-URI MAY be specified per "action"
element. A "property" element, however, MAY contain several "action"
elements.
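As a rough sketch, the content of an "action" element could be split
into its parts with Python's standard "urllib.parse" module, as shown
below. The function name "parse_action", the shape of the returned
dictionary, and the local service name "compress" are hypothetical.
How a proxy actually resolves service names is left open by this
document.

```python
from urllib.parse import parse_qs, urlsplit


def parse_action(action_text):
    """Split an action into scheme, host, service name, and arguments."""
    parts = urlsplit(action_text.strip())
    return {
        "scheme": parts.scheme,               # "icap" for ICAP-URIs, "" otherwise
        "host": parts.netloc,                 # empty for local services/functions
        "service": parts.path.lstrip("/"),
        "arguments": parse_qs(parts.query),   # "?"-encoded attribute-value pairs
    }


print(parse_action("icap://trans.net/translate?mode=respmod"))
print(parse_action("compress?level=9"))
```

Each argument value is returned as a list, because the "?"-encoding
allows an attribute to appear more than once.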
3.5.4 Examples of the "rule", "property" and "action" elements
<rule processing-point="4">
  <!-- Is the requested Web resource an HTML document? -->
  <property name="Content-Type" matches="text/html">
    <!-- Is the user's preferred language among the supported ones? -->
    <property name="Accept-Language" matches="^de|^fr|^it|^es">
      <!-- Invoke translation service on trans.net server -->
      <action>icap://trans.net/translate?mode=respmod</action>
    </property>
  </property>
</rule>

<rule processing-point="3">
  <!-- Is the requested Web resource an executable binary file? -->
  <property name="Content-Type" matches="application/">
    <!-- Invoke virus scanning service on mcaffee.com -->
    <action>icap://mcaffee.com/viruscheck?mode=respmod</action>
  </property>
</rule>
4 Order of Service Execution
The order in which service modules on the caching proxy are executed
may change the final result of a Web transaction. For example, an ad
insertion service executed against the result of a Web page
translation service may produce a different result than a reverse
execution order.
Up to three rule modules may have to be processed by a caching proxy
per Web transaction. The order in which these rule modules are
processed MUST reflect the order in which the message flow reaches
the rule module owners. This means that for incoming requests at
points 1 and 2 in figure 1, the order MUST be:
1. Client rule module
2. Access provider rule module
3. Content provider rule module
For outgoing responses at points 3 and 4, the order MUST be:
1. Content provider rule module
2. Access provider rule module
3. Client rule module
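The two orderings can be expressed compactly. The Python sketch below
takes points 1 and 2 of Figure 1 as the request side and points 3 and
4 as the response side. The names "REQUEST_ORDER" and "order_modules"
are hypothetical.

```python
# Order in which a request reaches the rule module owners.
REQUEST_ORDER = ["client", "access provider", "content provider"]


def order_modules(owner_classes, processing_point):
    """Sort the owner classes of the applicable rule modules for a processing point."""
    if processing_point in (1, 2):
        order = REQUEST_ORDER                  # request side
    else:
        order = list(reversed(REQUEST_ORDER))  # response side
    return sorted(owner_classes, key=order.index)


print(order_modules(["content provider", "client", "access provider"], 1))
print(order_modules(["client", "content provider"], 4))
```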
Within a single rule module, the caching proxy MUST process and
execute all rules and actions IN THE ORDER THEY ARE SPECIFIED in the
rule module (both within "property" and "rule" elements). If the
rule processor determines that an action must be executed, it MUST
do so BEFORE continuing the rule matching process, since service
modules MAY modify message property values. This may influence the
result of subsequent pattern matches.
The authors of rule modules should therefore pay special attention
to the order of the "action" elements in their rule modules, as this
may have an effect on the final result.
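The effect of executing actions before rule matching continues can
be seen in the following Python sketch. All names in it are
hypothetical: the message is modeled as a dictionary, the two
services "strip-markup" and "compress" are invented for the example,
and Python's "re" module stands in for the syntax of [7]. The rule
module is abbreviated to its "rule" elements.

```python
import re
import xml.etree.ElementTree as ET


def process(element, message, services, log):
    """Walk children in document order; run each action before matching continues."""
    for child in element:
        if child.tag == "action":
            name = child.text.strip()
            services[name](message)  # may modify message properties in place
            log.append(name)
        elif child.tag == "property":
            value = message.get(child.get("name"))
            if value is not None and re.search(child.get("matches"), value):
                process(child, message, services, log)
        elif child.tag == "rule":
            process(child, message, services, log)


# Abbreviated module: "owner" and "protocol" are omitted for brevity.
MODULE = ET.fromstring("""
<rulemodule>
  <rule processing-point="4">
    <property name="Content-Type" matches="^text/html">
      <action>strip-markup</action>
    </property>
  </rule>
  <rule processing-point="4">
    <property name="Content-Type" matches="^text/plain">
      <action>compress</action>
    </property>
  </rule>
</rulemodule>""")


def strip_markup(message):
    message["Content-Type"] = "text/plain"


def compress(message):
    message["Content-Encoding"] = "gzip"


SERVICES = {"strip-markup": strip_markup, "compress": compress}
message = {"Content-Type": "text/html"}
log = []
process(MODULE, message, SERVICES, log)
print(log)  # prints ['strip-markup', 'compress']
```

The second rule matches only because the first action has already
changed the "Content-Type" property. With the two rules in reverse
order, "compress" would not be executed.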
5 Security Considerations
Although beyond the scope of this document, it is clearly necessary
to define a secure mechanism for transferring rule modules to
caching proxies. This will include authenticating and authorizing
rule module owners and caching proxies. The integrity of rule
modules must be guaranteed through the use of strong encryption as
they are transferred over the Internet.
Also, a security context must be established on the caching proxy
for each rule module to ensure that rule modules may not execute
service modules or call proxy library functions without
being authorized to do so. Service modules running on the caching
proxy also must be restrained from consuming too many resources on
the caching proxy.
6 References
1 Bradner, S., "The Internet Standards Process -- Revision 3", BCP
9, RFC 2026, October 1996
2 Tomlinson, G., et al., "Extensible Proxy Services Framework",
http://www.ietf.org/internet-drafts/draft-tomlinson-epsfw-00.txt,
July 2000
3 Hofmann, M., Beck, A., "Example Services for Network Edge
Proxies", Workshop on Extensible Proxy Services Framework, San
Jose, CA, USA, September 13, 2000. Available at
http://www.cs.utah.edu/~horman/opencache/draft-hofmann-isfnep-00.txt
4 Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", Request for Comments 2119, Harvard University, March
1997
5 Bray, T., et al., Extensible Markup Language (XML) 1.0 (Second
Edition), http://www.w3.org/TR/2000/REC-xml-20001006,
October 2000
6 Fielding, R., et al., "Hypertext Transfer Protocol -- HTTP/1.1",
Request for Comments 2616, June 1999
7 ISO/IEC DIS 9945-2:1992, Information technology - Portable
Operating System Interface (POSIX) - Part 2: Shell and Utilities
(IEEE Std 1003.2-1992); X/Open CAE Specification, Commands and
Utilities, Issue 4, 1992
8 Elson, J., et al., "ICAP, the Internet Content Adaptation
Protocol", http://www.i-cap.org/icap_v1-25.txt, January 2000
7 Author's Addresses
Andre Beck
Markus Hofmann
Bell Laboratories
Lucent Technologies
101 Crawfords Corner Rd.
Holmdel, New Jersey 07733
Phone: (732) 332-5983
Email: {abeck, hofmann}@bell-labs.com
A Appendix - PSRL DTD
<!ELEMENT rulemodule (owner, protocol, rule+)>
<!ELEMENT owner (name, id)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT protocol (#PCDATA)>
<!ELEMENT rule (property+)>
<!ELEMENT property (property*, action+)>
<!ELEMENT action (#PCDATA)>
<!ATTLIST owner class (content provider |
access provider | client) #REQUIRED>
<!ATTLIST rule processing-point (1|2|3|4) #REQUIRED>
<!ATTLIST property name CDATA #REQUIRED>
<!ATTLIST property matches CDATA #REQUIRED>
B Appendix - Rule Module Examples
Content Provider Rule Module Example for Advertisement Insertion
Service
<?xml version="1.0"?>
<rulemodule>
  <owner class="content provider">
    <name>Lucent Technologies</name>
    <id>www.lucent.com</id>
  </owner>
  <protocol>http</protocol>
  <rule processing-point="4">
    <!-- Is the requested Web document the home page? -->
    <property name="request-path" matches="^/$|^/index[.]html$">
      <!-- Does the user send us a cookie for user identification? -->
      <property name="Cookie" matches="UserID=">
        <action>icap://adserver.net/insertad?mode=respmod</action>
      </property>
    </property>
  </rule>
</rulemodule>
Access Provider Rule Module Example for Advertisement Insertion
Service for Free Internet Service
<?xml version="1.0"?>
<rulemodule>
  <owner class="access provider">
    <name>Comcast Free Internet Service</name>
    <id>comcast</id>
  </owner>
  <protocol>http</protocol>
  <rule processing-point="4">
    <!-- Is the requested Web resource an HTML document? -->
    <property name="Content-Type" matches="text/html">
      <!-- Is the user a customer of the free Internet service? -->
      <property name="user-id" matches="^123[.]54[.]34[.]">
        <action>icap://adserver.com/insert_ad?mode=respmod</action>
      </property>
    </property>
  </rule>
</rulemodule>
Client Rule Module Example for Language Translation and Virus
Scanning Service
<?xml version="1.0"?>
<rulemodule>
  <owner class="client">
    <name>Markus Hofmann</name>
    <id>23242</id>
  </owner>
  <protocol>http</protocol>
  <rule processing-point="4">
    <!-- Is the requested Web resource an executable binary file? -->
    <property name="Content-Type" matches="application/">
      <action>icap://mcaffee.com/virus_scan?mode=respmod</action>
    </property>
  </rule>
  <rule processing-point="4">
    <!-- Is the requested Web resource text based? -->
    <property name="Content-Type" matches="text/">
      <!-- Does the top level domain of the origin host not equal
           ".de"? If so, the document language is probably not
           German and the page needs to be translated. -->
      <property name="Host" matches="[^e]$|[^d][e]$|[^.][d][e]$">
        <action>icap://icap.net/translate?mode=respmod</action>
      </property>
    </property>
  </rule>
</rulemodule>
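The "Host" pattern in the client rule module above expresses a
negation ("does not end in .de") purely through alternation. Its
behavior can be checked with the short Python sketch below. Python's
"re" module is used here as a stand-in for the syntax of [7], and the
host names are made up for the test.

```python
import re

# Matches any host name that does NOT end in ".de".
NOT_DE = re.compile(r"[^e]$|[^d][e]$|[^.][d][e]$")

for host in ["www.example.de", "www.example.com", "www.example.fr", "www.trade"]:
    # True means the translation service would be invoked for this host.
    print(host, bool(NOT_DE.search(host)))
```

Only "www.example.de" fails to match. A host such as "www.trade"
still matches, because it ends in "de" but not in ".de".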
Content Provider Rule Module Example for Content Adaptation Service
for Wireless Web Access Devices
<?xml version="1.0"?>
<rulemodule>
  <owner class="content provider">
    <name>Yahoo Inc.</name>
    <id>www.yahoo.com</id>
  </owner>
  <protocol>http</protocol>
  <rule processing-point="4">
    <!-- Does the user have a wireless Web access device? -->
    <property name="User-Agent" matches="Nokia|Ericsson|Palm">
      <!-- Is the requested Web resource text based? -->
      <property name="Content-Type" matches="text/">
        <action>icap://wapgateway.nl/transcode?mode=respmod</action>
      </property>
    </property>
  </rule>
</rulemodule>
Full Copyright Statement
Copyright (C) The Internet Society (2000). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
Funding for the RFC editor function is currently provided by the
Internet Society.