
Introduction

In addition to static content (that is, hand-written documents produced by web authors), web publishing also requires dynamic content generation. In dynamic content generation, XML documents or fragments are programmatically produced at request time.

In this context, content is the result of a computation based on request parameters and, frequently, on access to external data sources such as databases or remote server processes. This distinction in content origin justifies the extension of the "traditional" regions of web publishing (content and presentation) to also encompass that of logic.


Origins

The Cocoon community has long recognized the need for dynamic content generation capabilities. In response to this requirement, the Cocoon project has proposed XSP (eXtensible Server Pages). XSP defines a new XML DTD and namespace that addresses a complete region of web publishing, that of logic-based, dynamic content generation. XSP is a key component of future Cocoon versions and is currently in development.

DCP (Dynamic Content Processor), on the other hand, aims at providing easy-to-use dynamic content generation capabilities in the context of the current version of Cocoon. DCP is also a testbed for implementation-related issues in the upcoming development of XSP. These issues include aspects such as multiple language support, automatic code reloading and code reuse.


Goals

DCP has been designed to provide dynamic content generation capabilities to Cocoon with the following goals in mind:

  • Minimal changes to the current architecture.
  • Maximal ease of use for authors and developers.
  • Avoiding XSP implementation complexities while still providing a useful facility for dynamic content generation.

In order to maximize ease of use, the following early decisions were made for DCP:

  • Other than the source XML document itself, no external documents should be required to map inline dynamic content generation directives to external programs.
  • External programs should be as easy to write as possible. In addition to Java, it should be possible to also write external programs in easy-to-use scripting languages.

Because external documents (such as XSP libraries) may not be used to map content generation directives to external programs, the obvious choice was to use processing instructions (e.g. <?dcp-object?>, <?dcp-content?>).

This decision results in a number of limitations when compared to the more general mechanism of transforming DOM elements (as opposed to processing instructions).

One such limitation is that passing [static] parameters to external programs is limited to the single-valued pseudo-attributes used in processing instructions. Closer inspection reveals, however, that this mechanism is appropriate for a large number of dynamic content generation requirements.

Keeping external program writing simple means not requiring programmers to learn a new API or to follow restrictive coding conventions. The ability to write programs in easy-to-use scripting languages also contributes to simplifying development. This is particularly appealing, for instance, to web authors already familiar with Javascript, which is currently supported.

Jean-Marc Lugrin's FESI (Free EcmaScript Interpreter) is used to provide support for Javascript.


Relationship with Existing Technologies

DCP (and XSP, for that matter) differs from existing dynamic web content generation technologies in that it deals with DOM trees rather than with the textual representation of HTML documents.

Such technologies, however, have had a strong influence in DCP's design both because they pioneered programmatic web content generation and because DCP (and, again, XSP) aims at overcoming their limitations in the realm of XML-based document processing.

JSP, in particular, is a widely used standard in the Java environment. Other comparable technologies are Microsoft's ASP, Cold Fusion, Sun's [deprecated] Page Compilation, Webmacro and GSP.

These technologies share three common characteristics:

  • Text-based. Not being XML-aware, these technologies deal with textual streams, rather than with DOM trees or SAX events.
  • HTML-oriented. Generation capabilities have been designed with HTML in mind and do not lend themselves easily to producing XML.
  • No separation of logic and content. Probably their most problematic area; these technologies mix content and program logic in the same document. This impairs labor division in web publishing.

DCP and XSP, on the other hand, aim at a complete separation of logic and content.

In DCP, web authors specify dynamic content insertion using a simple, standard XML syntax while programmers concentrate on content generation without being concerned by context or presentation issues.

Finally, the difference between DCP and XSP is that while DCP is interpreted and executed at runtime, XSP pages are compiled and executed directly as document producers. This allows better separation of content and logic (since XSP pages can at first be processed like regular documents) and increased performance (since no interpretation is required and compiled pages are cached).


A Simple Javascript Example

Consider the following dynamic Cocoon XML document (sample.xml):

[Figure: Ecmascript Example]

In this example, portions shown in red are to be dynamically generated every time the document is requested.

For this to be achieved, three separate components must be written:

  • A source XML file containing the static portions of the document and some dynamic content insertion directives.
  • A DCP script containing DOM node-generation functions. This script can be used by many different XML documents.
  • An XSL stylesheet containing transformation rules to generate HTML from the (expanded) XML document. Again, this stylesheet can be used by many different XML documents.

The following processing instructions are recognized:

  • <?dcp-object name="objectName" [language="languageName"] code="codeLocation"?>

    This instruction declares an external program (or DCP script) that contains node-generation methods. These methods will be invoked during document processing as dictated by the appearance of subsequent <?dcp-content?> directives (explained below).

    • Attribute name specifies an author-defined objectName that will be used to qualify method names in the DCP script. This name must be unique within the document.
    • Attribute language specifies the programming language in which the DCP script is written. Currently supported values for this attribute are java and javascript (also referred to as ecmascript). This attribute is optional; its default value is java. Other languages may be added in the future. It is valid for the same XML document to use multiple DCP scripts written in different languages.
    • Attribute code specifies the actual DCP script location. Interpretation of this mandatory attribute is language-dependent. For Java, it is a fully qualified class name. For Javascript, it is a script filename relative to the path of the invoking XML document. The same code can be specified multiple times in a given document, provided a different objectName is used in each case.
  • <?dcp-content method="object.method" [param1="value" param2="value" ...] ?>

    This instruction requests the substitution of its corresponding node by the return value of a named method defined in a DCP script.

    Single-valued, named parameters can be passed to node-generation methods by specifying additional attributes in the <?dcp-content?> processing instruction. These attributes are made available to the method through a Dictionary argument.

    Attribute method defines what method to invoke on a given object. The object name must have been associated with a DCP script by means of a previous <?dcp-object?> processing instruction. Node-generation methods must be public and conform to the following signature:

    methodName( [java.util.Dictionary parameters], [org.w3c.dom.Node source] )

    where the [optional] function arguments are:

    • parameters. A dictionary containing any optional named parameters specified as additional attributes to the <?dcp-content?> processing instruction.
    • source. The processing instruction node corresponding to the <?dcp-content?> directive itself. This is useful for methods that need access to siblings or ancestors in the DOM tree.


    Methods can return any type of value, including primitive types, void and null. Void and null are understood as a request to remove the corresponding node. Returned values that are instances of org.w3c.dom.Node are simply inserted into the corresponding DOM tree position. Primitive types and regular objects are wrapped as strings in org.w3c.dom.Text nodes. Arrays are wrapped as org.w3c.dom.DocumentFragment nodes containing as many children as there are elements in the array; each element is recursively wrapped according to the above rules.
  • <?dcp-var name1="value1" [name2="value2" ...]?>

    This instruction declares one or more global variables that will be passed in each subsequent method invocation as if explicitly specified as parameters.

    This mechanism is basically a convenience shorthand to avoid cluttering <?dcp-content?> instructions with too long parameter lists.

    Declared variables are global to all subsequent method invocations. For a method to use a given global variable as a parameter, it must have been previously declared in the same document.
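The way pseudo-attributes and <?dcp-var?> globals end up in the parameters Dictionary can be illustrated with a minimal Java sketch. It is not DCP's actual code: the class PseudoAttributes and its methods are hypothetical, and two behaviors are assumptions of the sketch rather than documented facts, namely that explicit <?dcp-content?> attributes take precedence over globals of the same name and that the method attribute itself is not passed on as a parameter.

```java
import java.util.Dictionary;
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DCP: shows one way pseudo-attributes
// and <?dcp-var?> globals could be turned into a Dictionary.
final class PseudoAttributes {
  // Matches name="value" or name='value' pairs in processing instruction data.
  private static final Pattern PAIR =
      Pattern.compile("([A-Za-z_][\\w.-]*)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

  /** Parses the data of a processing instruction into a name/value table. */
  static Hashtable<String, String> parse(String data) {
    Hashtable<String, String> table = new Hashtable<>();
    Matcher m = PAIR.matcher(data);
    while (m.find()) {
      String value = m.group(3) != null ? m.group(3) : m.group(4);
      table.put(m.group(1), value);
    }
    return table;
  }

  /**
   * Builds the parameters for one invocation: globals are copied first, then
   * the attributes of the <?dcp-content?> instruction. Letting explicit
   * attributes override globals is an assumption of this sketch.
   */
  static Dictionary<String, String> merge(
      Dictionary<String, String> globals, Dictionary<String, String> attributes) {
    Hashtable<String, String> merged = new Hashtable<>();
    copy(globals, merged);
    copy(attributes, merged);
    merged.remove("method"); // assumed: the method name is not itself a parameter
    return merged;
  }

  private static void copy(Dictionary<String, String> from, Hashtable<String, String> to) {
    for (Enumeration<String> keys = from.keys(); keys.hasMoreElements();) {
      String key = keys.nextElement();
      to.put(key, from.get(key));
    }
  }
}
```

Under these assumptions, a node-generation method reads a value with parameters.get("format") and does not need to know whether it came from a <?dcp-var?> declaration or from the invoking instruction.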

That said, the source XML document for the above example would be:

[Figure: Ecmascript Example Source]

In this document:

  • The processing instruction:

    <?dcp-object name="util" language="javascript" code="test.es"?>

    declares the existence of an external Javascript program contained in a file called test.es.

    Subsequent references to file test.es will use the alias util.
  • The processing instruction:

    <?dcp-content method="util.getSystemDate" format="MM/dd/yyyy"?>

    specifies that function getSystemDate (contained in file test.es) must be called and its return value substituted in the document position where the <?dcp-content?> directive originally appeared.

    Furthermore, when this function is called, it is passed a parameter of name format and value MM/dd/yyyy.
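The substitution step described above, where a <?dcp-content?> node is replaced in place by a generated value, can be sketched with the standard DOM API. This is only an illustration under stated assumptions: the class ContentSubstitution and the generator callback are hypothetical stand-ins for DCP's real driver and for the script lookup it performs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

// Hypothetical sketch, not DCP's actual driver.
final class ContentSubstitution {
  /** Parses an XML string into a DOM document. */
  static Document parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalArgumentException("Cannot parse XML", e);
    }
  }

  /**
   * Replaces every <?dcp-content?> node with the node returned by the
   * generator; a null result removes the instruction from the tree.
   * Instructions are assumed to appear inside elements, not at document level.
   */
  static void expand(Document doc, Function<ProcessingInstruction, Node> generator) {
    List<ProcessingInstruction> found = new ArrayList<>();
    collect(doc, found); // collect first so the tree is not modified while walking it
    for (ProcessingInstruction pi : found) {
      Node replacement = generator.apply(pi);
      if (replacement == null) {
        pi.getParentNode().removeChild(pi);
      } else {
        pi.getParentNode().replaceChild(replacement, pi);
      }
    }
  }

  private static void collect(Node node, List<ProcessingInstruction> found) {
    for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
      if (c instanceof ProcessingInstruction
          && "dcp-content".equals(((ProcessingInstruction) c).getTarget())) {
        found.add((ProcessingInstruction) c);
      } else {
        collect(c, found);
      }
    }
  }
}
```

Because the replacement happens on DOM nodes, the generated value occupies exactly the position of the original directive, and later processing stages see an ordinary XML document.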

The initial portion of the script file test.es contains:

    var count = 0;

    /* Node Generation Functions */
    function getCount() {
      /* To reference variables as static, prepend "global." */
      return formatCount(++global.count);
    }

    function getSystemDate(parameters) {
      var now = new Date();
      var format = parameters.get("format");

      if (format != null) {
        return formatDate(now, format);
      }

      return now;
    }
   

DCP automatically reloads Javascript script files whenever they change on disk.
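Reloading on change is commonly implemented by comparing a file's modification timestamp with the one recorded when it was last read. The Java sketch below shows that general technique; it is an assumption about how such a cache could work, not a description of DCP's implementation, and the class ScriptCache is hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: re-reads a script file only when its timestamp changes.
final class ScriptCache {
  private static final class Entry {
    final long lastModified;
    final String source;

    Entry(long lastModified, String source) {
      this.lastModified = lastModified;
      this.source = source;
    }
  }

  private final Map<File, Entry> entries = new HashMap<>();
  private int loads = 0; // number of times a file was actually read from disk

  /** Returns the script source, reloading it if the file changed on disk. */
  synchronized String get(File file) {
    long stamp = file.lastModified();
    Entry cached = entries.get(file);
    if (cached == null || cached.lastModified != stamp) {
      cached = new Entry(stamp, read(file));
      entries.put(file, cached);
      loads++;
    }
    return cached.source;
  }

  synchronized int loadCount() {
    return loads;
  }

  private static String read(File file) {
    try {
      return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A real script cache would also re-evaluate the script after reloading it, which means any state held in script variables is reset at that point.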

When a global variable must be treated as static, references to it must be qualified by the global modifier. This is convenient when the programmer wants the variable to retain its value across requests.

For functions returning simple object values, DCP takes care of wrapping the returned value as an org.w3c.dom.Text node containing the toString() form of the object. When a function returns null, the corresponding node is removed from the DOM tree.
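The wrapping rules for return values (null removes the node, DOM nodes are inserted as they are, arrays become document fragments, anything else becomes a text node) can be written down as a short recursive Java method. This is a sketch of the rules as stated in this document, not DCP's source; the class ResultWrapper is hypothetical, and skipping null array elements is an assumption.

```java
import java.lang.reflect.Array;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Node;

// Hypothetical sketch of the return-value wrapping rules.
final class ResultWrapper {
  /** Returns the node to insert, or null when the original node should be removed. */
  static Node wrap(Document doc, Object value) {
    if (value == null) {
      return null; // null (or void) means "remove the node"
    }
    if (value instanceof Node) {
      return (Node) value; // DOM nodes are inserted unchanged
    }
    if (value.getClass().isArray()) {
      // Arrays become a fragment whose children are the wrapped elements
      DocumentFragment fragment = doc.createDocumentFragment();
      int length = Array.getLength(value);
      for (int i = 0; i < length; i++) {
        Node child = wrap(doc, Array.get(value, i));
        if (child != null) { // assumption: null elements are skipped
          fragment.appendChild(child);
        }
      }
      return fragment;
    }
    // Primitive wrappers and regular objects become text nodes
    return doc.createTextNode(value.toString());
  }

  /** Creates an empty document to use as a node factory. */
  static Document newDocument() {
    try {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    } catch (ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Using java.lang.reflect.Array lets the same code handle both object arrays and primitive arrays such as int[].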

Of course, returned values can be instances of a DOM Node type. This is illustrated by the function getParameters below:

    function getParameters() {
      var parameterNames = request.getParameterNames();

      if (!parameterNames.hasMoreElements()) {
        return null;
      }

      var parameterList = createElement("parameters");

      while (parameterNames.hasMoreElements()) {
        var parameterName = parameterNames.nextElement();

        var parameterElement = createElement("parameter");
        parameterElement.setAttribute("name", parameterName);

        var parameterValues = request.getParameterValues(parameterName);

        for (var i = 0; i < parameterValues.length; i++) {
          var valueElement = createElement("parameter-value");
          valueElement.appendChild(createTextNode(parameterValues[i]));
          parameterElement.appendChild(valueElement);
        }

        parameterList.appendChild(parameterElement);
      }

      return parameterList;
    }
   

Thus, if our example processes the request:

    sample.xml?me=Tarzan&you=Jane&you=Cheetah 
   

the above function would generate a DOM subtree equivalent to the following XML fragment:

    <parameters>
  
      <parameter name="me">
        <parameter-value>Tarzan</parameter-value>
      </parameter>
  
      <parameter name="you">
        <parameter-value>Jane</parameter-value>
        <parameter-value>Cheetah</parameter-value>
      </parameter>
  
    </parameters>  
   

The general signature for a dynamic content generation Javascript function is:

    function functionName(parameters, source) 
   

where:

  • parameters is an instance of java.util.Dictionary containing user-supplied parameters specified as <?dcp-content?> pseudo-attributes. Example: parameter format in function getSystemDate.
  • source is an instance of org.w3c.dom.ProcessingInstruction corresponding to the actual <?dcp-content?> processing instruction. This node is useful for performing context-dependent processing such as examining sibling or parent DOM nodes.

Note: Programmers may omit any or all of these arguments if they are not actually needed by the task at hand.

The following objects are always made available to external Javascript programs as global variables:

  • javax.servlet.http.HttpServletRequest request
  • org.w3c.dom.Document document

The following convenience functions are made accessible by DCP to external Javascript programs:

  • DOM factory functions are provided by DCP for easy construction of DOM nodes: createTextNode(data), createElement(tagName), etc. In general, all DOM factory methods defined for interface org.w3c.dom.Document are available as global Javascript functions.
  • Formatting functions for numbers and dates:
    • function formatCount(number),
    • function formatCurrency(number),
    • function formatPercentage(number) and
    • function formatDate(date, format). Date format strings conform to the pattern syntax defined by class java.text.SimpleDateFormat.
  • A JDBC access function, sqlRowSet(connectionName, selectStatement), that returns an array of Javascript objects whose member names are given by the lowercase form of each SELECT column label. The array contains as many elements as rows are returned by the SELECT statement. Parameters to this function are:
    • connectionName. A connection pool name as specified by Gefion Software's DBConnectionManager in the resource file db.properties (see below).
    • selectStatement. A SQL SELECT statement for the database manager in use.
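For Java programmers, the formatting functions listed above have close counterparts in the java.text package. The sketch below shows plausible equivalents; it is an assumption that DCP's functions behave this way, the class Formats is hypothetical, and the explicit Locale and TimeZone arguments are additions of this sketch that make the output reproducible.

```java
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical Java counterparts of the Javascript formatting functions.
final class Formats {
  /** Formats an integer count with locale-specific grouping. */
  static String formatCount(long number, Locale locale) {
    return NumberFormat.getIntegerInstance(locale).format(number);
  }

  /** Formats an amount using the locale's currency conventions. */
  static String formatCurrency(double amount, Locale locale) {
    return NumberFormat.getCurrencyInstance(locale).format(amount);
  }

  /** Formats a fraction (0.25) as a percentage (25%). */
  static String formatPercentage(double fraction, Locale locale) {
    return NumberFormat.getPercentInstance(locale).format(fraction);
  }

  /** Formats a date using a SimpleDateFormat pattern such as "MM/dd/yyyy". */
  static String formatDate(Date date, String pattern, Locale locale, TimeZone zone) {
    SimpleDateFormat dateFormat = new SimpleDateFormat(pattern, locale);
    dateFormat.setTimeZone(zone);
    return dateFormat.format(date);
  }
}
```

An invalid pattern makes the SimpleDateFormat constructor throw IllegalArgumentException, so callers that accept patterns from document attributes may want to catch it and fall back to a default form.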

Given the Oracle demo connection defined in file db.properties:

  logfile=/tmp/dbcm.log
  drivers=postgresql.Driver oracle.jdbc.driver.OracleDriver

  dictionary.url=jdbc:postgresql:translator
  dictionary.maxconn=8
  dictionary.user=clark
  dictionary.password=kent

  demo.url=jdbc:oracle:thin:@localhost:1521:orcl
  demo.maxconn=4
  demo.user=scott
  demo.password=tiger
  

a sample Javascript user function would look like:

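    var selectStatement =
      "SELECT   EMPNO, " +
      "         ENAME, " +
      "         SAL + NVL(COMM, 0) AS INCOME " +
      "FROM     EMP " +
      "ORDER BY EMPNO";

    /* "demo" is the connection pool name declared in db.properties */
    var emps = sqlRowSet("demo", selectStatement);

    /* Member names are the lowercase column labels; addEmp is a user-defined function not shown here */
    for (var i = 0; i < emps.length; i++) {
      addEmp(emps[i].empno, emps[i].ename, emps[i].income);
    }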
   

Finally, it is possible, in general, to:

  • Declare multiple external programs in the same XML document.

    <?dcp-object name="emp" language="javascript" code="emp.es"?>
    <?dcp-object name="dept" language="javascript" code="dept.es"?>
  • Declare the same external program multiple times in the same XML document, provided different names are used for each declaration.

    <?dcp-object name="emp" language="ecmascript" code="emp.es"?>
    <?dcp-object name="boss" language="ecmascript" code="emp.es"?>
  • Mix external programs written in different languages in the same XML document.

    <?dcp-object name="emp" language="ecmascript" code="emp.es"?>
    <?dcp-object name="dept" language="java" code="payroll.Department"?>

Java DCP Programming

For the Java language, the attribute code in the declaration

    <?dcp-object name="util" language="java" code="payroll.Employee"?>
   

is interpreted as a class name. Such a class must be accessible through the servlet engine's classpath setting.

Node-generation methods in Java conform to the following signature:

    public returnType methodName(
      [java.util.Dictionary parameters],
      [org.w3c.dom.Node source]
    )
   

As in Javascript, these arguments are optional. The return type can be any Java type, including void.
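Since both arguments are optional, a driver has to discover which form of the signature a given method uses. One way to do this with Java reflection is to try each form in turn, as in the sketch below. It is an illustration only: the classes MethodInvoker and SampleGenerator are hypothetical, and the order in which the forms are tried is an assumption.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Dictionary;
import org.w3c.dom.Node;

// Hypothetical sketch of locating a node-generation method by reflection.
final class MethodInvoker {
  /** Invokes the named public method using whichever optional arguments it declares. */
  static Object invoke(Object target, String name, Dictionary<?, ?> parameters, Node source) {
    // The accepted argument forms, most specific first (assumed order)
    Class<?>[][] forms = {{Dictionary.class, Node.class}, {Dictionary.class}, {Node.class}, {}};
    Object[][] arguments = {{parameters, source}, {parameters}, {source}, {}};

    for (int i = 0; i < forms.length; i++) {
      try {
        Method method = target.getClass().getMethod(name, forms[i]);
        return method.invoke(target, arguments[i]); // null for void methods
      } catch (NoSuchMethodException e) {
        // Not declared with this form; try the next one
      } catch (IllegalAccessException | InvocationTargetException e) {
        throw new IllegalStateException("Invocation of " + name + " failed", e);
      }
    }
    throw new IllegalArgumentException("No node-generation method named " + name);
  }
}

// Hypothetical DCP object used to exercise the invoker.
final class SampleGenerator {
  private int count = 0;

  public int getCount() {
    return ++count;
  }

  public String describe(Dictionary<?, ?> parameters) {
    return "format=" + parameters.get("format");
  }
}
```

Because Method.invoke returns null for void methods, a void method and a method returning null are indistinguishable to the caller, which matches the rule that both request removal of the node.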

Java classes used as DCP objects need not implement/extend any particular interface or class. In the Cocoon environment, however, it is strongly recommended to extend class:

    org.apache.cocoon.processor.dcp.java.ServletDCPProcessor
   

This class provides the following convenience services:

  • Direct access to the servlet request object
  • Direct access to the document being processed
  • Factory methods to create all types of DOM nodes (elements, text nodes, document fragments, processing instructions, etc)

If developers choose not to extend this convenience class, the following requirements must be honored:

  • The class must have a no-argument (empty) constructor.
  • The class must have at least one method that conforms to the above signature.

Since the class cannot rely on a non-empty constructor, if it does require initialization it can implement:

    org.apache.cocoon.framework.Configurable
   

In this case, the DCP processor will invoke the class' init method immediately after instantiation. The configuration values passed in this case are:

  • The document being processed
  • The parameters passed to the processor by the invoking environment

For the Cocoon environment, parameters contain:

  • The javax.servlet.http.HttpServletRequest request object corresponding to the current web server's request.
  • The java.lang.String path associated with the current source XML document.
  • The java.lang.String browser associated with the current request's User-Agent HTTP header.

Based on the above, for our tutorial example, the corresponding Java class would be:

    import java.util.*;
    import java.text.*;
    import org.w3c.dom.*;
    import javax.servlet.http.*;
    import org.apache.cocoon.processor.dcp.java.ServletDCPProcessor;

    public class Util extends ServletDCPProcessor {
      // Static so that the count is retained across requests
      private static int count = 0;

      public int getCount() {
        // The counter is static, so lock on the class rather than on the instance
        synchronized (Util.class) {
          return ++count;
        }
      }

      public String getSystemDate(Dictionary parameters) {
        Date now = new Date();
        String formattedDate = now.toString();
        String format = (String) parameters.get("format");

        if (format != null) {
          try {
            SimpleDateFormat dateFormat = new SimpleDateFormat(format);
            formattedDate = dateFormat.format(now);
          } catch (IllegalArgumentException e) {
            // Bad format pattern: ignore it and return the default form
          }
        }

        return formattedDate;
      }

      public Element getRequestParameters() {
        Enumeration names = this.request.getParameterNames();

        if (!names.hasMoreElements()) { // No parameters given, remove node from document
          return null;
        }

        Element parameterList = createElement("parameters");

        while (names.hasMoreElements()) {
          String name = (String) names.nextElement();
          String[] values = this.request.getParameterValues(name);

          Element parameterElement = createElement("parameter");
          parameterElement.setAttribute("name", name);

          for (int i = 0; i < values.length; i++) {
            Element parameterValue = createElement("parameter-value");
            parameterValue.appendChild(createTextNode(values[i]));
            parameterElement.appendChild(parameterValue);
          }

          parameterList.appendChild(parameterElement);
        }

        return parameterList;
      }
    }
   

Known Problems
  • Restricted Static Parameter Passing. Due to the use of processing instructions as a means of inserting dynamic content in XML documents (as opposed to DOM elements), structured parameter passing can become too complex.

    Consider the case when an employee list must be dynamically generated. Using elements to pass parameters to node-generation methods would allow for complex forms.

    Other nested, multivalued parameter forms simply cannot be expressed by means of single-valued processing instruction pseudo-attributes.

    A workaround for this is the traditional HTML idiom of using hidden fields in HTTP forms to pass static parameters.

    Note that XSP uses elements (instead of processing instructions) to specify dynamic content substitution.
  • Javascript Performance. While the current Javascript DCP performance is acceptable for small and medium-sized applications with light traffic, its response time is perceivably slower than that of Java.

    This is a natural consequence of Javascript being interpreted by Java (itself interpreted by the underlying VM). As such, this restriction may also apply to other scripting languages such as WebL.

    An alternative would be the use of Netscape's Rhino (which supports compilation to class files), but this may create incompatibilities with existing programs that depend on FESI features.

    In the meantime, some FESI-based workarounds are in place, most notably evaluator pooling, a technique based on dynamically cloning script evaluators when multiple concurrent requests use the same Javascript external program.
  • No Java Class Reloading. While Javascript files are automatically reloaded when they change on disk, currently there is no provision for automatic Java class reloading.

    Implementing this feature requires a specialized class loader. Note that servlet engine-provided class reloading does not apply to DCP because external Java programs are not servlets but, rather, regular classes dynamically instantiated by the DCP driver instead of by the underlying servlet engine.

    The latter is also true if the user-supplied class extends DCP's convenience class org.apache.cocoon.processor.dcp.java.ServletDCPProcessor.
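The evaluator pooling technique mentioned under Javascript Performance can be illustrated with a small generic pool: an idle evaluator is reused when one is available, and the prototype is cloned only when every existing evaluator is busy. This is a sketch of the general idea, not DCP's or FESI's code; the class EvaluatorPool and the cloner callback are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.UnaryOperator;

// Hypothetical sketch of evaluator pooling; E stands for a script evaluator type.
final class EvaluatorPool<E> {
  private final E prototype;
  private final UnaryOperator<E> cloner;
  private final Deque<E> idle = new ArrayDeque<>();
  private int created = 0; // number of clones made so far

  EvaluatorPool(E prototype, UnaryOperator<E> cloner) {
    this.prototype = prototype;
    this.cloner = cloner;
  }

  /** Hands out an idle evaluator, cloning the prototype when none is free. */
  synchronized E acquire() {
    E evaluator = idle.pollFirst();
    if (evaluator == null) {
      evaluator = cloner.apply(prototype);
      created++;
    }
    return evaluator;
  }

  /** Returns an evaluator to the pool once its request has been served. */
  synchronized void release(E evaluator) {
    idle.addFirst(evaluator);
  }

  synchronized int createdCount() {
    return created;
  }
}
```

With this scheme the number of evaluators grows only to the peak number of concurrent requests for a given script. Callers would normally release the evaluator in a finally block so that a failing script does not leak it.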

Additional Examples

In addition to the examples presented in this document, there is a more complex application written entirely in Cocoon using Java DCP: the Cocoon Multilingual Dictionary.

This application lets users look up terms and their translations in a number of European languages using Esperanto as the intermediate language. The entire example (source code and data, ~750K) can be downloaded from the above location.


Future Directions

DCP will be deprecated in favor of XSP. Therefore, development is currently limited to bug fixes.


Acknowledgments

The following people have contributed to the definition of DCP:

  • Assaf Arkin, <arkin@trendline.co.il>
  • Brett Knights, <bknights@uniserve.com>
  • Donald Ball, <balld@apache.org>
  • Keith Visco, <kvisco@ziplink.net>
  • Stefano Mazzocchi, <stefano@apache.org>


Copyright © 1999-2000 The Apache Software Foundation. All Rights Reserved.