RFC2629throughXSLT  J. F. Reschke 
  greenbytes 
  May 2004 

Transforming RFC2629-formatted XML through XSLT



Table of Contents

1  Introduction
2  Supported RFC2629 elements
 2.1  Extension elements
3  Processing Instructions
 3.1  Supported xml2rfc-compatible PIs
 3.2  Unsupported xml2rfc-compatible PIs
 3.3  Extension PIs
4  Anchors
5  Supported XSLT engines
 5.1  Standalone Engines
 5.2  In-Browser Engines
6  Transforming to HTML
 6.1  HTML compliance
 6.2  Standard HTML LINK elements
 6.3  Standard HTML metadata
 6.4  Dublin Core (RFC2731) metadata
7  Transforming to XHTML
8  Transforming to CHM (Microsoft Compiled Help)
9  Transforming to PDF via XSL-FO
 9.1  Extension feature matrix
 9.2  Example: producing output for Apache FOP
10  Utilities
 10.1  Checking References
 10.2  Producing reference entries for books
11  Informative References
§  Author's Address
§  Index



1 Introduction

This document describes a set of XSLT transformations that can be used to transform RFC2629-compliant XML (see [RFC2629]) to various output formats, such as HTML and PDF. The main topics are



2 Supported RFC2629 elements

rfc2629.xslt supports all RFC2629 grammar elements as well as the extensions implemented in xml2rfc 1.21.

2.1 Extension elements

In addition, rfc2629.xslt supports a set of extension elements, using elements and attributes in the namespace "http://greenbytes.de/2002/rfcedit". They are used for

Note that these extensions are experimental. Please email the author in case you're interested in using these extensions.



3 Processing Instructions

All PIs can also be set as XSLT parameters, which override any value found in the source file being transformed.

Using processing instructions:

<?rfc toc="yes"?>
<?rfc-ext support-rfc2731="no"?>

Using XSLT parameters:

saxon foo.xml rfc2629.xslt xml2rfc-toc=yes \
  xml2rfc-ext-support-rfc2731=no > result.html
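
The precedence rule described above (defaults, then PIs in the source, then XSLT parameters) can be sketched in Python. This is only an illustration and not part of the stylesheets: the function names read_pis and effective_settings are hypothetical, only top-level PIs are considered, the defaults are copied from a few rows of the tables in Sections 3.1 and 3.3, and the mapping from PI target to parameter prefix is inferred from those tables.

```python
import re
from xml.dom import minidom, Node

# Parameter-name prefix per PI target (inferred from Sections 3.1 and 3.3).
PREFIX = {"rfc": "xml2rfc-", "rfc-ext": "xml2rfc-ext-"}

# A few defaults copied from the tables; deliberately not the complete list.
DEFAULTS = {
    "xml2rfc-toc": "no",
    "xml2rfc-symrefs": "no",
    "xml2rfc-ext-support-rfc2731": "yes",
}

# Matches name="value" or name='value' inside the PI data.
PSEUDO_ATTR = re.compile(r"""([\w.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def read_pis(xml_text):
    """Collect top-level rfc / rfc-ext pseudo-attributes, keyed by parameter name."""
    found = {}
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE and node.target in PREFIX:
            for name, dq, sq in PSEUDO_ATTR.findall(node.data):
                found[PREFIX[node.target] + name] = dq or sq
    return found


def effective_settings(xml_text, params=None):
    """Apply defaults, then PIs from the source, then XSLT parameters."""
    settings = dict(DEFAULTS)
    settings.update(read_pis(xml_text))  # PIs override defaults
    settings.update(params or {})        # parameters override PIs
    return settings


source = '<?rfc toc="yes"?><?rfc-ext support-rfc2731="yes"?><rfc/>'
print(effective_settings(source, {"xml2rfc-ext-support-rfc2731": "no"}))
```

In this sketch the parameter passed on the command line wins over the value in the source, which mirrors the saxon invocation shown above.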

3.1 Supported xml2rfc-compatible PIs

PI target   PI pseudo-attribute   XSLT parameter name   default   comment
rfc   background   xml2rfc-background   (not set)    
rfc   compact   xml2rfc-compact   "no"   only applies to HTML output method when printing  
rfc   comments   xml2rfc-comments   (not set)    
rfc   editing   xml2rfc-editing   "no"    
rfc   footer   xml2rfc-footer   (not set)    
rfc   header   xml2rfc-header   (not set)    
rfc   inline   xml2rfc-inline   (not set)    
rfc   iprnotified   xml2rfc-iprnotified   "no"    
rfc   linkmailto   xml2rfc-linkmailto   "yes"    
rfc   private   xml2rfc-private   (not set)    
rfc   sortrefs   xml2rfc-sortrefs   "no"    
rfc   symrefs   xml2rfc-symrefs   "no"    
rfc   toc   xml2rfc-toc   "no"    
rfc   tocdepth   xml2rfc-tocdepth   99    
rfc   topblock   xml2rfc-topblock   "yes"    

3.2 Unsupported xml2rfc-compatible PIs

PI target   PI pseudo-attribute   comment
rfc   include   incompatible with XML/XSLT processing model  
rfc   needLines    
rfc   slides    
rfc   strict    
rfc   subcompact    
rfc   tocindent   (defaults to "yes")  
rfc   tocompact    

3.3 Extension PIs

PI target   PI pseudo-attribute   XSLT parameter name   default   description
rfc-ext   support-rfc2731   xml2rfc-ext-support-rfc2731   "yes"   Decides whether the HTML transformation should generate META tags according to Section 6.4.  


4 Anchors

The transformation automatically generates anchors that are supposed to be stable and predictable and that can be used to identify specific parts of the document. Anchors are generated both in HTML and XSL-FO content (but the latter will only be used for PDF output when the XSL-FO engine supports producing PDF anchors).

The following anchors get auto-generated:

Anchor name   Description
rfc.abstract   Abstract  
rfc.authors   Authors section  
rfc.copyright   Copyright section  
rfc.copyrightnotice   Copyright notice  
rfc.figure.n   Figures (titled)  
rfc.figure.u.n   Figures (untitled)  
rfc.index   Index  
rfc.ipr   Intellectual Property  
rfc.iref.n   Internal references  
rfc.note.n   Notes (from front section)  
rfc.references   References  
rfc.references.n   Additional references  
rfc.section.n   Section n  
rfc.section.n.p.m   Section n, paragraph m  
rfc.status   Status of memo  
rfc.toc   Table of contents  
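
Because the anchor names follow a fixed pattern, links into a generated document can be built mechanically. The following Python sketch illustrates this; the helper names are hypothetical, and it assumes that n is the section number as printed (for example 6.2) and that m is the running paragraph number within that section.

```python
def section_anchor(number, paragraph=None):
    """Return the anchor for a section, or for one paragraph within it."""
    anchor = "rfc.section." + str(number)
    if paragraph is not None:
        anchor += ".p." + str(paragraph)
    return anchor


def figure_anchor(n, titled=True):
    """Return the anchor for the n-th titled or untitled figure."""
    return ("rfc.figure." if titled else "rfc.figure.u.") + str(n)


print(section_anchor(4))         # rfc.section.4
print(section_anchor("6.2", 3))  # rfc.section.6.2.p.3
print(figure_anchor(1, titled=False))  # rfc.figure.u.1
```

A link to the third paragraph of Section 6.2 would then be the document URI followed by "#" and the returned anchor.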


5 Supported XSLT engines

The transformation requires a non-standard extension function (see exsl:node-set), which is nevertheless widely available. XSLT processors that do not support this extension (or a functional equivalent) are currently not supported.

5.1 Standalone Engines

The following XSLT engines are believed to work well:

5.2 In-Browser Engines

The following browsers seem to work fine:

The following browsers are known not to work properly:



6 Transforming to HTML

Transformation to HTML can be done inside the browser if it supports XSLT. To enable this, add the following processing instruction to the start of the source file:

  <?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>

(and ensure that rfc2629.xslt is present).
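
When many source files need this processing instruction, it can be added with a small script. The Python sketch below is one possible approach under stated assumptions: the function name add_stylesheet_pi is hypothetical, the XML declaration (if any) is assumed to be at the very start of the text, and a file that already contains an xml-stylesheet PI is left unchanged.

```python
import re

PI = "<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>"


def add_stylesheet_pi(xml_text, pi=PI):
    """Insert the stylesheet PI after the XML declaration, or at the start."""
    if "<?xml-stylesheet" in xml_text:
        return xml_text  # already present, nothing to do
    # The XML declaration must come first, so the PI goes right after it.
    decl = re.match(r"<\?xml\s[^?]*\?>\s*", xml_text)
    if decl:
        return xml_text[:decl.end()] + pi + "\n" + xml_text[decl.end():]
    return pi + "\n" + xml_text


print(add_stylesheet_pi('<?xml version="1.0"?>\n<rfc/>'))
```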

6.1 HTML compliance

The transformation result is supposed to conform to the HTML 4.01 strict DTD [HTML]. This can be checked using the W3C's online validator at <http://validator.w3.org>.

6.2 Standard HTML LINK elements

LINK elements have existed since HTML 2.0. They can be used to embed content-independent links inside the document. Unfortunately, only a few user agents fully support this element, notably Mozilla, where the feature is called "Site Navigation Bar" (and is disabled by default).

The following LINK elements are produced:

LINK type   description
alternate   for RFCs, a link to the authoritative ASCII version on the IETF web site  
appendix   pointer to all top-level appendices  
author   pointer to "authors" section  
chapter   pointer to all top-level sections  
contents   pointer to table of contents  
copyright   pointer to copyright statement  
index   pointer to index  

The figure below shows how Mozilla Firebird displays the Site Navigation Bar for rfc2396.xml.


(LINK elements displayed in Mozilla Firebird for RFC2396.xml)

6.3 Standard HTML metadata

The following standard HTML META elements are produced:

META name   description
generator   from XSLT engine version and stylesheet version  
keywords   from keyword elements in front section  

6.4 Dublin Core (RFC2731) metadata

Unless turned off using the "rfc-ext support-rfc2731" processing instruction, the transformation will generate metadata according to [RFC2731].

The following DCMI properties are produced:

META name   description
DC.Creator   from author information in front section  
DC.Date.Issued   from date information in front section  
DC.Description.Abstract   from abstract  
DC.Identifier   document URN [RFC2648] from "docName" attribute  
DC.Relation.Replaces   from "obsoletes" attribute  


7 Transforming to XHTML

Transforming to XHTML requires slightly different XSLT output options and is implemented by the derived transformation script rfc2629toXHTML.xslt.

Note: Microsoft Internet Explorer does not support XHTML. Therefore it usually makes more sense to generate plain old HTML.



8 Transforming to CHM (Microsoft Compiled Help)

To generate a CHM file using Microsoft's HTML Help Compiler (hhc), three files are required in addition to the HTML file.

  1. hhc - table of contents file (HTML)
  2. hhk - index file (HTML)
  3. hhp - project file (plain text)

The three files are generated with three specific transformations, each requiring the additional XSLT parameter "basename" to specify the filename prefix.

Example:

saxon rfc2616.xml rfc2629toHhp.xslt basename=rfc2616  > rfc2616.hhp
saxon rfc2616.xml rfc2629toHhc.xslt basename=rfc2616  > rfc2616.hhc
saxon rfc2616.xml rfc2629toHhk.xslt basename=rfc2616  > rfc2616.hhk
hhc rfc2616.hhp
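
Since the three transformations differ only in the file extension, the command lines can be generated from the base name. The Python sketch below only prints the commands and does not run them; the function name chm_commands is hypothetical, and the stylesheet names and the saxon command-line form are taken from the example above.

```python
import shlex


def chm_commands(basename, processor="saxon"):
    """Return the shell command lines for building a CHM file from basename.xml."""
    lines = []
    for ext in ("hhp", "hhc", "hhk"):
        # "hhp" -> rfc2629toHhp.xslt, and so on.
        stylesheet = "rfc2629to" + ext.capitalize() + ".xslt"
        argv = [processor, basename + ".xml", stylesheet, "basename=" + basename]
        lines.append(shlex.join(argv) + " > " + shlex.quote(basename + "." + ext))
    # Finally run the HTML Help Compiler on the project file.
    lines.append(shlex.join(["hhc", basename + ".hhp"]))
    return lines


for line in chm_commands("rfc2616"):
    print(line)
```

For "rfc2616" this reproduces the four command lines of the example.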


9 Transforming to PDF via XSL-FO

Transformation to XSL-FO [XSL-FO] format is available through rfc2629toFO.xslt (which includes rfc2629.xslt, so keep both in the same folder).

Unlike HTML user agents, XSL-FO engines are unfortunately either open source (for instance, Apache FOP) or feature-complete (for instance, AntennaHouse XSL Formatter), but not both.

As Apache FOP needs special workarounds (page breaking, table layout), and some popular extensions aren't standardized yet, the translation produces a generic output (hopefully) conforming to [XSL-FO-11-WD]. Specific backends (xsl11toFop.xslt, xsl11toXep.xslt, xsl11toAn.xslt) then provide post-processing for the individual processors.

9.1 Extension feature matrix

Format/processor   PDF anchors   PDF bookmarks   PDF document information   Index cleanup
XSL 1.1 WD   no, but can be auto-generated from "id" attributes   yes   no, but uses XEP output extensions   yes  
Antenna House XSL formatter   no   yes (from XSL 1.1 bookmarks)   yes (from XEP document info)   yes (just page duplicate elimination, from XSL 1.1 page index)  
Apache FOP   yes   yes (from XSL 1.1 bookmarks)   no   no  
RenderX XEP   no   yes (from XSL 1.1 bookmarks)   yes   yes (from XSL 1.1 page index)  

9.2 Example: producing output for Apache FOP

Example:

saxon rfc2616.xml rfc2629toFO.xslt > tmp.fo
saxon tmp.fo xsl11toFop.xslt > rfc2616.fo


10 Utilities

10.1 Checking References

check-ietf-references.xslt can be used to check all references to RFC-series IETF publications (note this script requires a local copy of <ftp://ftp.isi.edu/in-notes/rfc-index.xml>). For instance:

> saxon rfc2518.xml check-ietf-references.xslt
Normative References:
RFC1766: [PROPOSED STANDARD] obsoleted by RFC3066 RFC3282
RFC2277: [BEST CURRENT PRACTICE] (-> BCP0018) ok
RFC2119: [BEST CURRENT PRACTICE] (-> BCP0014) ok
RFC2396: [DRAFT STANDARD] ok
RFC2069: [PROPOSED STANDARD] obsoleted by RFC2617
RFC2068: [PROPOSED STANDARD] obsoleted by RFC2616
RFC2141: [PROPOSED STANDARD] ok
RFC2279: [PROPOSED STANDARD] obsoleted by RFC3629
Informational References:
RFC2026: [BEST CURRENT PRACTICE] (-> BCP0009) ok
RFC1807: [INFORMATIONAL] ok
RFC2291: [INFORMATIONAL] ok
RFC2413: [INFORMATIONAL] ok
RFC2376: [INFORMATIONAL] obsoleted by RFC3023
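
The report is regular enough to be post-processed, for example to list only the references that need updating. The Python sketch below assumes that every entry has exactly the line format shown in the sample output above; the function name obsoleted is hypothetical, and the real script's output may differ in details not visible here.

```python
import re

# RFCnnnn: [STATUS] optional "(-> BCPnnnn)" then "ok" or "obsoleted by ..."
LINE = re.compile(
    r"^(RFC\d+): \[([^\]]+)\] (?:\(-> (\w+)\) )?(ok|obsoleted by (.+))$"
)


def obsoleted(report):
    """Map each obsoleted RFC to the list of documents that replace it."""
    result = {}
    for line in report.splitlines():
        m = LINE.match(line.strip())
        if m and m.group(5):
            result[m.group(1)] = m.group(5).split()
    return result


report = """\
Normative References:
RFC1766: [PROPOSED STANDARD] obsoleted by RFC3066 RFC3282
RFC2119: [BEST CURRENT PRACTICE] (-> BCP0014) ok
RFC2396: [DRAFT STANDARD] ok
"""
print(obsoleted(report))
```

Section headings such as "Normative References:" do not match the pattern and are skipped.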

10.2 Producing reference entries for books

amazon-asin.xslt uses the Amazon web services to generate a <reference> element for a given ASIN (ISBN). For instance:

<?xml version="1.0" encoding="utf-8"?>
<references>
 <reference target="urn:isbn:0134516591">
   <front>
     <title>Simple Book, The: An Introduction to Internet Management,
               Revised Second Edition</title>
     <author surname="Rose"
                fullname="Marshall T. Rose" initials="M. T. "/>
     <author surname="Marshall"
                fullname="Rose T. Marshall" initials="R. T."/>
     <seriesInfo name="Prentice Hall" value=""/>
     <date year="1996" month="March"/>
   </front>
 </reference>
</references>

Note that the resulting XML usually requires checking; in this case, Amazon's database is playing tricks with Marshall's name.



11 Informative References

[RFC2629] Rose, M.T., "Writing I-Ds and RFCs using XML", RFC 2629, June 1999.
[RFC2648] Moats, R., "A URN Namespace for IETF Documents", RFC 2648, August 1999.
[RFC2731] Kunze, J.A., "Encoding Dublin Core Metadata in HTML", RFC 2731, December 1999.
[HTML] Raggett, D., Hors, A. and I. Jacobs, "HTML 4.01 Specification", W3C REC REC-html401-19991224, December 1999.
[XSL-FO] Adler, S., Berglund, A., Caruso, J., Deach, S., Graham, T., Grosso, P., Gutentag, E., Milowski, R., Parnell, S., Richman, J. and S. Zilles, "Extensible Stylesheet Language (XSL) Version 1.0", W3C REC REC-xsl-20011015, October 2001.
[XSL-FO-11-WD] Berglund, A., "Extensible Stylesheet Language (XSL) Version 1.1", W3C WD WD-xsl11-20031217, December 2003.


Author's Address

  Julian F. Reschke
  greenbytes GmbH
  Salzmannstrasse 152
  Muenster, NW 48159
  Germany
Phone:  +49 251 2807760
Fax:  +49 251 2807761
EMail:  julian.reschke@greenbytes.de
URI:  http://greenbytes.de/tech/webdav/
 


Index

A B C D E F G H I K L M N P R S T X

A
  alternate HTML LINK element   6.2
  Anchors 
    rfc.abstract   4
    rfc.authors   4
    rfc.copyright   4
    rfc.copyrightnotice   4
    rfc.figure.n   4
    rfc.figure.u.n   4
    rfc.index   4
    rfc.ipr   4
    rfc.iref.n   4
    rfc.note.n   4
    rfc.references   4,  4
    rfc.section.n   4
    rfc.section.n.p.m   4
    rfc.status   4
    rfc.toc   4
  AntennaHouse XSL Formatter   9
  Apache FOP   9
  appendix HTML LINK element   6.2
  author HTML LINK element   6.2
B
  background PI pseudo-attribute   3.1
C
  chapter HTML LINK element   6.2
  CHM format   8
  comments PI pseudo-attribute   3.1
  compact PI pseudo-attribute   3.1
  contents HTML LINK element   6.2
  copyright HTML LINK element   6.2
  Creator DCMI property   6.4
D
  Date.Issued DCMI property   6.4
  DCMI properties 
    Creator   6.4
    Date.Issued   6.4
    Description.Abstract   6.4
    Identifier   6.4
    Relation.Replaces   6.4
  Description.Abstract DCMI property   6.4
E
  editing PI pseudo-attribute   3.1
F
  footer PI pseudo-attribute   3.1
G
  generator HTML META element   6.3
H
  header PI pseudo-attribute   3.1
  HTML compliance   6.1
  HTML LINK elements 
    alternate   6.2
    appendix   6.2
    author   6.2
    chapter   6.2
    contents   6.2
    copyright   6.2
    index   6.2
  HTML META elements 
    generator   6.3
    keywords   6.3
I
  Identifier DCMI property   6.4
  include PI pseudo-attribute   3.2
  index HTML LINK element   6.2
  inline PI pseudo-attribute   3.1
  Internet Explorer 5.5   5.2
  Internet Explorer 6   5.2
  iprnotified PI pseudo-attribute   3.1
K
  keywords HTML META element   6.3
L
  linkmailto PI pseudo-attribute   3.1
M
  Microsoft Help   8
  Mozilla   5.2
  MSXML3   5.1
  MSXML4   5.1
N
  needLines PI pseudo-attribute   3.2
P
  Parameters 
    xml2rfc-background   3.1
    xml2rfc-comments   3.1
    xml2rfc-compact   3.1
    xml2rfc-editing   3.1
    xml2rfc-ext-support-rfc2731   3.3
    xml2rfc-footer   3.1
    xml2rfc-header   3.1
    xml2rfc-inline   3.1
    xml2rfc-iprnotified   3.1
    xml2rfc-linkmailto   3.1
    xml2rfc-private   3.1
    xml2rfc-sortrefs   3.1
    xml2rfc-symrefs   3.1
    xml2rfc-toc   3.1
    xml2rfc-tocdepth   3.1
    xml2rfc-topblock   3.1
  private PI pseudo-attribute   3.1
  Processing Instruction pseudo attributes 
    background   3.1
    comments   3.1
    compact   3.1
    editing   3.1
    footer   3.1
    header   3.1
    include   3.2
    inline   3.1
    iprnotified   3.1
    linkmailto   3.1
    needLines   3.2
    private   3.1
    slides   3.2
    sortrefs   3.1
    strict   3.2
    subcompact   3.2
    support-rfc2731   3.3
    symrefs   3.1
    toc   3.1
    tocdepth   3.1
    tocindent   3.2
    tocompact   3.2
    topblock   3.1
R
  Relation.Replaces DCMI property   6.4
  rfc.abstract anchor   4
  rfc.authors anchor   4
  rfc.copyright anchor   4
  rfc.copyrightnotice anchor   4
  rfc.figure.n anchor   4
  rfc.figure.u.n anchor   4
  rfc.index anchor   4
  rfc.ipr anchor   4
  rfc.iref.n anchor   4
  rfc.note.n anchor   4
  rfc.references anchor   4
  rfc.references.n anchor   4
  rfc.section.n anchor   4
  rfc.section.n.p.m anchor   4
  rfc.status anchor   4
  rfc.toc anchor   4
S
  Saxon   5.1
  slides PI pseudo-attribute   3.2
  sortrefs PI pseudo-attribute   3.1
  strict PI pseudo-attribute   3.2
  subcompact PI pseudo-attribute   3.2
  support-rfc2731 PI pseudo-attribute   3.3
  symrefs PI pseudo-attribute   3.1
T
  toc PI pseudo-attribute   3.1
  tocdepth PI pseudo-attribute   3.1
  tocindent PI pseudo-attribute   3.2
  tocompact PI pseudo-attribute   3.2
  topblock PI pseudo-attribute   3.1
X
  Xalan   5.1
  xml-stylesheet PI   6
  xml2rfc-background parameter   3.1
  xml2rfc-comments parameter   3.1
  xml2rfc-editing parameter   3.1,  3.1
  xml2rfc-ext-support-rfc2731 parameter   3.3
  xml2rfc-footer parameter   3.1
  xml2rfc-header parameter   3.1
  xml2rfc-inline parameter   3.1
  xml2rfc-iprnotified parameter   3.1
  xml2rfc-linkmailto parameter   3.1
  xml2rfc-private parameter   3.1
  xml2rfc-sortrefs parameter   3.1
  xml2rfc-symrefs parameter   3.1
  xml2rfc-toc parameter   3.1
  xml2rfc-tocdepth parameter   3.1
  xml2rfc-topblock parameter   3.1
  xsltproc   5.1