RFC2629throughXSLT | J. F. Reschke |
greenbytes | |
May 2004 |
Transforming RFC2629-formatted XML through XSLT
1 Introduction
2 Supported RFC2629 elements
  2.1 Extension elements
3 Processing Instructions
  3.1 Supported xml2rfc-compatible PIs
  3.2 Unsupported xml2rfc-compatible PIs
  3.3 Extension PIs
4 Anchors
5 Supported XSLT engines
  5.1 Standalone Engines
  5.2 In-Browser Engines
6 Transforming to HTML
  6.1 HTML compliance
  6.2 Standard HTML LINK elements
  6.3 Standard HTML metadata
  6.4 Dublin Core (RFC2731) metadata
7 Transforming to XHTML
8 Transforming to CHM (Microsoft Compiled Help)
9 Transforming to PDF via XSL-FO
  9.1 Extension feature matrix
  9.2 Example: producing output for Apache FOP
10 Utilities
  10.1 Checking References
  10.2 Producing reference entries for books
11 Informative References
§ Author's Address
§ Index
This document describes a set of XSLT transformations that can be used to transform RFC2629-compliant XML (see [RFC2629]) to various output formats, such as HTML and PDF. The main topics are
rfc2629.xslt supports all RFC2629 grammar elements as well as the extensions implemented in xml2rfc 1.21.
In addition, rfc2629.xslt supports a set of extension elements, using elements and attributes in the namespace "http://greenbytes.de/2002/rfcedit". They are used for
Note that these extensions are experimental. Please email the author in case you're interested in using these extensions.
All PIs can also be set as XSLT parameters, which override any value found in the source file to be transformed.
Using processing instructions:
<?rfc toc="yes"?>
<?rfc-ext support-rfc2731="no"?>
Using XSLT parameters:
saxon foo.xml rfc2629.xslt xml2rfc-toc=yes \
    xml2rfc-ext-support-rfc2731=no > result.html
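The precedence rule described above (an XSLT parameter overrides the corresponding PI) can be modelled roughly as follows. This is a Python sketch for illustration, not part of the stylesheets. The function name effective_settings is hypothetical. The sketch also assumes that a later PI replaces an earlier one with the same pseudo-attribute, which the text above does not state.

```python
import re
from xml.dom import minidom

# Matches pseudo-attributes such as toc="yes" inside the data of a PI.
PSEUDO_ATTR = re.compile(r'([\w.-]+)\s*=\s*(["\'])(.*?)\2')

# PI target -> prefix of the corresponding XSLT parameter name.
PREFIXES = {"rfc": "xml2rfc-", "rfc-ext": "xml2rfc-ext-"}


def effective_settings(xml_text, params=None):
    """Collect <?rfc ...?> and <?rfc-ext ...?> settings, then apply overrides.

    Keys use the XSLT parameter names, e.g. "xml2rfc-toc".
    Assumption: when a pseudo-attribute occurs more than once, the last
    occurrence in document order wins.
    """
    settings = {}

    def walk(node):
        is_pi = node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        if is_pi and node.target in PREFIXES:
            for name, _quote, value in PSEUDO_ATTR.findall(node.data):
                settings[PREFIXES[node.target] + name] = value
        for child in node.childNodes:
            walk(child)

    walk(minidom.parseString(xml_text))
    # Parameters passed on the command line take precedence over PIs.
    settings.update(params or {})
    return settings
```

Called with the source file's text and a dictionary such as {"xml2rfc-ext-support-rfc2731": "no"}, it returns the merged settings with the parameter value replacing whatever the PI specified.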
PI target | PI pseudo-attribute | comment |
---|---|---|
rfc | include | incompatible with XML/XSLT processing model |
rfc | needLines | |
rfc | slides | |
rfc | strict | |
rfc | subcompact | |
rfc | tocindent | (defaults to "yes") |
rfc | tocompact | |
PI target | PI pseudo-attribute | XSLT parameter name | default | description |
---|---|---|---|---|
rfc-ext | support-rfc2731 | xml2rfc-ext-support-rfc2731 | "yes" | Decides whether the HTML transformation should generate META tags according to Section 6.4. |
The transformation automatically generates anchors that are supposed to be stable and predictable and that can be used to identify specific parts of the document. Anchors are generated both in HTML and XSL-FO content (but the latter will only be used for PDF output when the XSL-FO engine supports producing PDF anchors).
The following anchors get auto-generated:
The transformation requires a non-standard extension function (see exsl:node-set), which is, however, widely available. XSLT processors that support neither this extension nor a functional equivalent are currently not supported.
The following XSLT engines are believed to work well:
The following browsers seem to work fine:
The following browsers are known not to work properly:
Transformation to HTML can be done inside the browser if it supports XSLT. To enable this, add the following processing instruction to the start of the source file:
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
(and ensure that rfc2629.xslt is present).
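Adding the processing instruction to many source files can be scripted. The following Python sketch places it right after the XML declaration. The helper name add_stylesheet_pi is hypothetical, and the sketch assumes the declaration, if present, is the very first thing in the file.

```python
import re

STYLESHEET_PI = "<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>"


def add_stylesheet_pi(xml_text, pi=STYLESHEET_PI):
    """Return xml_text with the stylesheet PI placed after the XML declaration."""
    if "<?xml-stylesheet" in xml_text:
        return xml_text  # already present; leave the text unchanged
    # "<?xml " followed by the declaration's pseudo-attributes and "?>".
    decl = re.match(r"<\?xml\s[^?]*\?>", xml_text)
    if decl:
        end = decl.end()
        return xml_text[:end] + "\n" + pi + xml_text[end:]
    # No XML declaration: the PI simply becomes the first line.
    return pi + "\n" + xml_text
```

The function is idempotent, so running it twice over the same file does not duplicate the instruction.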
The transformation result is supposed to conform to the HTML 4.01 strict DTD [HTML]. This can be checked using the W3C's online validator at <http://validator.w3.org>.
LINK elements have existed since HTML 2.0. They can be used to embed content-independent links inside the document. Unfortunately, only a few user agents fully support this element, notably Mozilla, where the feature is called "Site Navigation Bar" (and is disabled by default).
The following LINK elements are produced:
The figure below shows how Mozilla Firebird displays the Site Navigation Bar for rfc2396.xml.
The following standard HTML META elements are produced:
META name | description |
---|---|
generator | from XSLT engine version and stylesheet version |
keywords | from keyword elements in front section |
Unless turned off using the "rfc-ext support-rfc2731" processing instruction, the transformation will generate metadata according to [RFC2731].
The following DCMI properties are produced:
META name | description |
---|---|
DC.Creator | from author information in front section |
DC.Date.Issued | from date information in front section |
DC.Description.Abstract | from abstract |
DC.Identifier | document URN [RFC2648] from "docName" attribute |
DC.Relation.Replaces | from "obsoletes" attribute |
Transforming to XHTML requires slightly different XSLT output options and is implemented by the derived transformation script rfc2629toXHTML.xslt.
Note: Microsoft Internet Explorer does not support XHTML. Therefore it usually makes more sense to generate plain old HTML.
To generate a CHM file using Microsoft's HTML Help Compiler (hhc), three files are required in addition to the HTML file.
The three files are generated with three specific transformations, each requiring the additional XSLT parameter "basename" to specify the filename prefix.
Example:
saxon rfc2616.xml rfc2629toHhp.xslt basename=rfc2616 > rfc2616.hhp
saxon rfc2616.xml rfc2629toHhc.xslt basename=rfc2616 > rfc2616.hhc
saxon rfc2616.xml rfc2629toHhk.xslt basename=rfc2616 > rfc2616.hhk
hhc rfc2616.hhp
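Since the three transformations differ only in the stylesheet and the output extension, the command sequence can be derived from the basename. The Python sketch below only builds the command lines and does not run them. The function name chm_commands is hypothetical, and the sketch assumes the source file is named after the basename unless stated otherwise.

```python
# (stylesheet, output extension) for each of the three helper files.
CHM_STEPS = [
    ("rfc2629toHhp.xslt", "hhp"),
    ("rfc2629toHhc.xslt", "hhc"),
    ("rfc2629toHhk.xslt", "hhk"),
]


def chm_commands(basename, source=None):
    """Return the shell commands that produce the CHM input files and compile them."""
    source = source or basename + ".xml"
    commands = [
        f"saxon {source} {stylesheet} basename={basename} > {basename}.{ext}"
        for stylesheet, ext in CHM_STEPS
    ]
    # The HTML Help Compiler is run last, on the generated project file.
    commands.append(f"hhc {basename}.hhp")
    return commands
```

For example, chm_commands("rfc2616") yields the same four commands as the example above.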
Transformation to XSL-FO [XSL-FO] format is available through rfc2629toFO.xslt (which includes rfc2629.xslt, so keep both in the same folder).
Compared to HTML user agents, XSL-FO engines unfortunately either come as open source (for instance, Apache FOP) or feature-complete (for instance, AntennaHouse XSL Formatter), but not both at the same time.
As Apache FOP needs special workarounds (page breaking, table layout), and some popular extensions aren't standardized yet, the translation produces generic output that (hopefully) conforms to [XSL-FO-11-WD]. Specific backends (xsl11toFop.xslt, xsl11toXep.xslt, xsl11toAn.xslt) then provide post-processing for the individual processors.
Format / processor | PDF anchors | PDF bookmarks | PDF document information | Index cleanup |
---|---|---|---|---|
XSL 1.1 WD | no, but can be auto-generated from "id" attributes | yes | no, but uses XEP output extensions | yes |
Antenna House XSL formatter | no | yes (from XSL 1.1 bookmarks) | yes (from XEP document info) | yes (just page duplicate elimination, from XSL 1.1 page index) |
Apache FOP | yes | yes (from XSL 1.1 bookmarks) | no | no |
RenderX XEP | no | yes (from XSL 1.1 bookmarks) | yes | yes (from XSL 1.1 page index) |
Example:
saxon rfc2616.xml rfc2629toFO.xslt > tmp.fo
saxon tmp.fo xsl11toFop.xslt > rfc2629.fo
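The same two-stage pattern (generic XSL-FO first, then processor-specific post-processing) applies to the other backends. The Python sketch below builds the commands for a chosen backend without running them. The function name fo_commands and the short backend keys are hypothetical. The stylesheet names are those given in the text, and the assumption that xsl11toAn.xslt targets Antenna House is inferred from its name.

```python
# Backend key -> post-processing stylesheet named in the text above.
BACKENDS = {
    "fop": "xsl11toFop.xslt",   # Apache FOP
    "xep": "xsl11toXep.xslt",   # RenderX XEP
    "an": "xsl11toAn.xslt",     # presumably Antenna House XSL Formatter
}


def fo_commands(source, backend, intermediate="tmp.fo", output=None):
    """Return the two saxon invocations for the chosen XSL-FO backend."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    # Default output name: the source name with its extension replaced by .fo
    output = output or source.rsplit(".", 1)[0] + ".fo"
    return [
        f"saxon {source} rfc2629toFO.xslt > {intermediate}",
        f"saxon {intermediate} {BACKENDS[backend]} > {output}",
    ]
```

The resulting .fo file is then passed to the respective formatter to produce the PDF.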
check-ietf-references.xslt can be used to check all references to RFC-series IETF publications (note this script requires a local copy of <ftp://ftp.isi.edu/in-notes/rfc-index.xml>). For instance:
> saxon rfc2518.xml check-ietf-references.xslt
Normative References:
RFC1766: [PROPOSED STANDARD] obsoleted by RFC3066 RFC3282
RFC2277: [BEST CURRENT PRACTICE] (-> BCP0018) ok
RFC2119: [BEST CURRENT PRACTICE] (-> BCP0014) ok
RFC2396: [DRAFT STANDARD] ok
RFC2069: [PROPOSED STANDARD] obsoleted by RFC2617
RFC2068: [PROPOSED STANDARD] obsoleted by RFC2616
RFC2141: [PROPOSED STANDARD] ok
RFC2279: [PROPOSED STANDARD] obsoleted by RFC3629
Informational References:
RFC2026: [BEST CURRENT PRACTICE] (-> BCP0009) ok
RFC1807: [INFORMATIONAL] ok
RFC2291: [INFORMATIONAL] ok
RFC2413: [INFORMATIONAL] ok
RFC2376: [INFORMATIONAL] obsoleted by RFC3023
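For longer documents it can be handy to filter such a report down to the references that need updating. The following Python sketch parses the report text and collects the obsoleted entries. It assumes one reference per line in the layout shown above. The function name obsolete_references is hypothetical and not part of the distributed scripts.

```python
import re

# One report line: "RFCnnnn: [STATUS] (-> ALIAS) ok" or "... obsoleted by RFCa RFCb".
REPORT_LINE = re.compile(
    r"^(?P<rfc>RFC\d+): \[(?P<status>[^\]]+)\]"
    r"(?: \(-> (?P<alias>\w+)\))?"
    r" (?:ok|obsoleted by (?P<obsoleted_by>.+))$"
)


def obsolete_references(report):
    """Map each obsoleted RFC in the report to the list of its replacements."""
    result = {}
    for line in report.splitlines():
        match = REPORT_LINE.match(line.strip())
        # Section headings and "ok" entries are skipped.
        if match and match.group("obsoleted_by"):
            result[match.group("rfc")] = match.group("obsoleted_by").split()
    return result
```

Applied to the report above, the result contains an entry for each "obsoleted by" line, for instance RFC1766 mapped to RFC3066 and RFC3282.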
amazon-asin.xslt uses the Amazon web services to generate a <reference> element for a given ASIN (ISBN). For instance:
<?xml version="1.0" encoding="utf-8"?>
<references>
  <reference target="urn:isbn:0134516591">
    <front>
      <title>Simple Book, The: An Introduction to Internet Management,
             Revised Second Edition</title>
      <author surname="Rose"
              fullname="Marshall T. Rose" initials="M. T. "/>
      <author surname="Marshall"
              fullname="Rose T. Marshall" initials="R. T."/>
      <seriesInfo name="Prentice Hall" value=""/>
      <date year="1996" month="March"/>
    </front>
  </reference>
</references>
Note that the resulting XML usually requires checking; in this case, Amazon's database is playing tricks with Marshall's name.
[RFC2629] | Rose, M.T., "Writing I-Ds and RFCs using XML", RFC 2629, June 1999. |
[RFC2648] | Moats, R., "A URN Namespace for IETF Documents", RFC 2648, August 1999. |
[RFC2731] | Kunze, J.A., "Encoding Dublin Core Metadata in HTML", RFC 2731, December 1999. |
[HTML] | Raggett, D., Hors, A. and I. Jacobs, "HTML 4.01 Specification", W3C REC REC-html401-19991224, December 1999. |
[XSL-FO] | Adler, S., Berglund, A., Caruso, J., Deach, S., Graham, T., Grosso, P., Gutentag, E., Milowski, R., Parnell, S., Richman, J. and S. Zilles, "Extensible Stylesheet Language (XSL) Version 1.0", W3C REC REC-xsl-20011015, October 2001. |
[XSL-FO-11-WD] | Berglund, A., "Extensible Stylesheet Language (XSL) Version 1.1", W3C WD WD-xsl11-20031217, December 2003. |
Julian F. Reschke | |
greenbytes GmbH | |
Salzmannstrasse 152 | |
Muenster, NW 48159 | |
Germany | |
Phone: | +49 251 2807760 |
Fax: | +49 251 2807761 |
EMail: | julian.reschke@greenbytes.de |
URI: | http://greenbytes.de/tech/webdav/ |
A |
alternate HTML LINK element 6.2 |
Anchors |
rfc.abstract 4 |
rfc.authors 4 |
rfc.copyright 4 |
rfc.copyrightnotice 4 |
rfc.figure.n 4 |
rfc.figure.u.n 4 |
rfc.index 4 |
rfc.ipr 4 |
rfc.iref.n 4 |
rfc.note.n 4 |
rfc.references 4, 4 |
rfc.section.n 4 |
rfc.section.n.p.m 4 |
rfc.status 4 |
rfc.toc 4 |
AntennaHouse XSL Formatter 9 |
Apache FOP 9 |
appendix HTML LINK element 6.2 |
author HTML LINK element 6.2 |
B |
background PI pseudo-attribute 3.1 |
C |
chapter HTML LINK element 6.2 |
CHM format 8 |
comments PI pseudo-attribute 3.1 |
compact PI pseudo-attribute 3.1 |
contents HTML LINK element 6.2 |
copyright HTML LINK element 6.2 |
Creator DCMI property 6.4 |
D |
Date.Issued DCMI property 6.4 |
DCMI properties |
Creator 6.4 |
Date.Issued 6.4 |
Description.Abstract 6.4 |
Identifier 6.4 |
Relation.Replaces 6.4 |
Description.Abstract DCMI property 6.4 |
E |
editing PI pseudo-attribute 3.1 |
F |
footer PI pseudo-attribute 3.1 |
G |
generator HTML META element 6.3 |
H |
header PI pseudo-attribute 3.1 |
HTML compliance 6.1 |
HTML LINK elements |
alternate 6.2 |
appendix 6.2 |
author 6.2 |
chapter 6.2 |
contents 6.2 |
copyright 6.2 |
index 6.2 |
HTML META elements |
generator 6.3 |
keywords 6.3 |
I |
Identifier DCMI property 6.4 |
include PI pseudo-attribute 3.2 |
index HTML LINK element 6.2 |
inline PI pseudo-attribute 3.1 |
Internet Explorer 5.5 5.2 |
Internet Explorer 6 5.2 |
iprnotified PI pseudo-attribute 3.1 |
K |
keywords HTML META element 6.3 |
L |
linkmailto PI pseudo-attribute 3.1 |
M |
Microsoft Help 8 |
Mozilla 5.2 |
MSXML3 5.1 |
MSXML4 5.1 |
N |
needLines PI pseudo-attribute 3.2 |
P |
Parameters |
xml2rfc-background 3.1 |
xml2rfc-comments 3.1 |
xml2rfc-compact 3.1 |
xml2rfc-editing 3.1 |
xml2rfc-ext-support-rfc2731 3.3 |
xml2rfc-footer 3.1 |
xml2rfc-header 3.1 |
xml2rfc-inline 3.1 |
xml2rfc-iprnotified 3.1 |
xml2rfc-linkmailto 3.1 |
xml2rfc-private 3.1 |
xml2rfc-sortrefs 3.1 |
xml2rfc-symrefs 3.1 |
xml2rfc-toc 3.1 |
xml2rfc-tocdepth 3.1 |
xml2rfc-topblock 3.1 |
private PI pseudo-attribute 3.1 |
Processing Instruction pseudo attributes |
background 3.1 |
comments 3.1 |
compact 3.1 |
editing 3.1 |
footer 3.1 |
header 3.1 |
include 3.2 |
inline 3.1 |
iprnotified 3.1 |
linkmailto 3.1 |
needLines 3.2 |
private 3.1 |
slides 3.2 |
sortrefs 3.1 |
strict 3.2 |
subcompact 3.2 |
support-rfc2731 3.3 |
symrefs 3.1 |
toc 3.1 |
tocdepth 3.1 |
tocindent 3.2 |
tocompact 3.2 |
topblock 3.1 |
R |
Relation.Replaces DCMI property 6.4 |
rfc.abstract anchor 4 |
rfc.authors anchor 4 |
rfc.copyright anchor 4 |
rfc.copyrightnotice anchor 4 |
rfc.figure.n anchor 4 |
rfc.figure.u.n anchor 4 |
rfc.index anchor 4 |
rfc.ipr anchor 4 |
rfc.iref.n anchor 4 |
rfc.note.n anchor 4 |
rfc.references anchor 4 |
rfc.references.n anchor 4 |
rfc.section.n anchor 4 |
rfc.section.n.p.m anchor 4 |
rfc.status anchor 4 |
rfc.toc anchor 4 |
S |
Saxon 5.1 |
slides PI pseudo-attribute 3.2 |
sortrefs PI pseudo-attribute 3.1 |
strict PI pseudo-attribute 3.2 |
subcompact PI pseudo-attribute 3.2 |
support-rfc2731 PI pseudo-attribute 3.3 |
symrefs PI pseudo-attribute 3.1 |
T |
toc PI pseudo-attribute 3.1 |
tocdepth PI pseudo-attribute 3.1 |
tocindent PI pseudo-attribute 3.2 |
tocompact PI pseudo-attribute 3.2 |
topblock PI pseudo-attribute 3.1 |
X |
Xalan 5.1 |
xml-stylesheet PI 6 |
xml2rfc-background parameter 3.1 |
xml2rfc-comments parameter 3.1 |
xml2rfc-editing parameter 3.1, 3.1 |
xml2rfc-ext-support-rfc2731 parameter 3.3 |
xml2rfc-footer parameter 3.1 |
xml2rfc-header parameter 3.1 |
xml2rfc-inline parameter 3.1 |
xml2rfc-iprnotified parameter 3.1 |
xml2rfc-linkmailto parameter 3.1 |
xml2rfc-private parameter 3.1 |
xml2rfc-sortrefs parameter 3.1 |
xml2rfc-symrefs parameter 3.1 |
xml2rfc-toc parameter 3.1 |
xml2rfc-tocdepth parameter 3.1 |
xml2rfc-topblock parameter 3.1 |
xsltproc 5.1 |