For the applications you have in mind, DocBook “out of the box” may not be exactly what you need. Perhaps you need additional inline elements or perhaps you want to remove elements that you never want your authors to use. By design, DocBook makes this sort of customization easy.
This chapter explains how to make your own customization layer. You might do this in order to:
Add new elements
Remove elements
Change the structure of existing elements
Add new attributes
Remove attributes
Broaden the range of values allowed in an attribute
Narrow the range of values in an attribute to a specific list or a fixed value
You can use customization layers to extend DocBook or subset it. Creating a schema that is a strict subset of DocBook means that all of your instances are still completely valid DocBook instances, which may be important to your tools and stylesheets, and to other people with whom you share documents. An extension adds new structures, or changes the schema in a way that is not compatible with DocBook. Extensions can be very useful, but might have a great impact on your environment.
Customization layers can be as small as restricting an attribute value or as large as adding an entirely different hierarchy on top of the inline elements.
Changing a schema can have a wide-ranging impact on the tools and stylesheets that you use. It can have an impact on your authors and on your legacy documents. This is especially true if you make an extension. If you rely on your support staff to install and maintain your authoring and publishing tools, check with them before you invest a lot of time modifying the schema. There may be additional issues that are outside your immediate control. Proceed with caution.
That said, DocBook is designed to be easy to modify. This chapter assumes that you are comfortable with XML and RELAX NG grammar syntax, but the examples presented should be a good springboard to learning the syntax if it's not already familiar to you.
Starting with DocBook V5.0, DocBook is identified by its namespace, http://docbook.org/ns/docbook. The particular version of DocBook to which an element conforms is identified by its version attribute. If the element does not specify a version, the version of the closest ancestor DocBook element that does specify a version is assumed. The version attribute is required on the root DocBook element.
If you make any changes to the DocBook schema, it is imperative that you provide an alternative version identifier that you use for the schema and the modules you changed. The license agreement under which DocBook is distributed gives you complete freedom to change, modify, reuse, and generally hack the schema in any way you want, except that you must not call your alterations “DocBook”.
The following format is recommended:
base_version-[subset|extension|variant] (name[-version])+
For example, version 1.0 of Acme Corporation's extension of DocBook V5.0 could be identified as “5.0-extension acme-1.0”.
A document that relied on the version 3.2 of Example Corporation's subset of DocBook V5.0, MathML 2.0, and SVG 1.1 could be identified as: “5.0-subset example-3.2 mathml-2.0 svg-1.1”.
If your schema is a proper subset, you can advertise this status by using the subset keyword in the version. If your schema contains any markup model extensions, you can advertise this status by using the extension keyword. If you'd rather not characterize your variant specifically as a subset or an extension, you can leave out this field entirely or, if you prefer, use the variant keyword.
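A customization layer can also enforce its own identifier. The following is a minimal sketch, assuming the hypothetical Acme identifier from the example above and assuming that the base schema defines the version attribute in a pattern named db.version.attribute (the name used later in this chapter):

```rnc
namespace db = "http://docbook.org/ns/docbook"

include "docbook.rnc" {
   # Hypothetical: accept only the Acme extension's version identifier.
   # db.version.attribute is assumed to be the pattern that defines
   # the version attribute in the base schema.
   db.version.attribute =
      attribute version { "5.0-extension acme-1.0" }
}
```

With this redefinition, a document that claims any other version is rejected by the customized schema.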
Although not directly supported by RELAX NG, in some cases it may still be valuable to identify a DocBook V5.0 customization layer with a public identifier. A public identifier for DocBook V5.0 is:
-//OASIS//DTD DocBook V5.0//EN
If you make any changes to the structure of the schema, it is imperative that you alter the public identifier that you use to identify it.
You should change both the owner identifier and the description. Formal public identifiers for the base DocBook modules would have identifiers with the following syntax:
-//OASIS//text-class DocBook description Vversion//EN
Your own formal public identifiers should use the following syntax in order to record their DocBook derivation:
-//your-owner-ID//text-class DocBook Vversion-Based [Subset|Extension|Variant] your-descrip-and-version//lang
For example:
-//O'Reilly//DTD DocBook V5.0-Based Subset V1.1//EN
If your schema is a proper subset, you can advertise this status by using the Subset keyword in the description. If your schema contains any markup model extensions, you can advertise this status by using the Extension keyword. If you'd rather not characterize your variant specifically as a subset or an extension, you can leave out this field entirely, or, if you prefer, use the Variant keyword.
A RELAX NG grammar is a collection of patterns. These patterns can be stored in a single file or in a collection of files that import each other. Patterns can augment each other in a variety of ways. A complete grammar is the logical union of the specified patterns.
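To make “augmenting” concrete, here is a small sketch of a toy grammar that has nothing to do with DocBook. All of the pattern and element names are invented for illustration. It shows the two combining operators of the compact syntax:

```rnc
# A toy grammar, unrelated to DocBook, showing how definitions combine.
start = element para { ex.attlist, (text | ex.inlines)* }

# Initial definitions.
ex.inlines = element code { text } | element emphasis { text }
ex.attlist = attribute role { text }?

# "|=" adds another alternative to the existing choice.
ex.inlines |= element keycap { text }

# "&=" interleaves another pattern with the existing one.
ex.attlist &= attribute condition { text }?
```

After the last two definitions, para accepts keycap wherever code and emphasis were allowed, and it accepts an optional condition attribute alongside role.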
For convenience, the DocBook grammar is distributed in a single file.
There are two standard syntaxes for RELAX NG, an XML syntax and a “compact” text syntax. The two forms have the same expressive power; it is possible to transform between them with no loss of information.
Many users find the relative terseness of the compact syntax makes it a convenient form for reading and writing RELAX NG. That is the form we will use in the following examples.
The names of the patterns used in a RELAX NG grammar are arbitrary; they have nothing to do with the names of the elements and attributes defined by the schema. The DocBook RELAX NG grammar employs a number of naming conventions to make the grammar easier to navigate.
db.*.attlist
Defines the list of attributes associated with an element. For example, db.emphasis.attlist is the pattern that matches all of the attributes of the emphasis element.
db.*.attribute
Defines a single attribute. For example, db.conformance.attribute is the pattern that matches the conformance attribute on all of the elements where it occurs.
db.*.attributes
Defines a collection of attributes. For example, db.effectivity.attributes matches all of the effectivity attributes (arch, audience, etc.).
db.*.blocks
Defines a choice among a set of related block elements. For example, db.list.blocks is a pattern that matches any of the list elements.
db.*.contentmodel
Defines a fragment of content model shared by several elements.
db.*.enumeration
Defines an enumeration, usually one used in an attribute value. For example, db.revisionflag.enumeration is a pattern that matches the list of values that can be used as the value of a revisionflag attribute.
db.*.info
Defines the info element for a particular element. For example, db.example.info is the pattern that matches info on example.
Almost all of the info elements are the same, but they are described with distinct patterns so that customizers can change them selectively.
db.*.inlines
Defines a choice among a set of related inline elements. For example, db.link.inlines is a pattern that matches any of the linking-related elements.
db.*.role.attribute
Defines the role attribute for a particular element. For example, db.emphasis.role.attribute is the pattern that matches role on emphasis.
All of the role attributes are the same, but they are described with distinct patterns so that customizers can change them selectively.
db.*
Is the pattern that matches a particular DocBook element. For example, db.title is the pattern that matches title.
RELAX NG allows multiple patterns to match the same element, so sometimes these patterns come in flavors, for example, db.indexterm.singular, db.indexterm.startofrange, and db.indexterm.endofrange. Each of these patterns matches an indexterm with varying attributes.
These are conventions, not hard and fast rules. There are patterns that don't follow these conventions.
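The idea of several patterns matching one element can be sketched with a toy grammar. The names below (doc, marker, and the ex.* patterns) are invented for this illustration and are not part of DocBook:

```rnc
# Two "flavors" of the same element, distinguished by their attributes.
start = element doc { (text | ex.marker.start | ex.marker.end)* }

# A marker that opens a range must carry a name.
ex.marker.start =
   element marker {
      attribute class { "start" },
      attribute name { text }
   }

# A marker that closes a range must point back at the opening marker.
ex.marker.end =
   element marker {
      attribute class { "end" },
      attribute startref { text }
   }
```

A marker element is valid if it matches either pattern, so the attributes required on it depend on the value of class. The indexterm flavors are described above as working along the same lines, with varying attributes.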
Although customization layers vary in complexity, most of them have the same general structure as other customization layers of similar complexity.
In the most common case, you probably want to include all of DocBook, but you want to make some small changes. These customization layers tend to look like this:
    namespace db = "http://docbook.org/ns/docbook"
    # perhaps other namespace declarations

    include "docbook.rnc"

    # new patterns and augmented patterns

Start by importing the base DocBook schema.

Then you can add new patterns or augment existing patterns.
If you want to completely replace a pattern (for example, to remove or completely change an element), the template is a little different.
    namespace db = "http://docbook.org/ns/docbook"
    # perhaps other namespace declarations

    include "docbook.rnc" {
       # redefinitions of DocBook patterns
    }

    # new patterns and augmented patterns

You can redefine patterns in the body of an include statement. These patterns completely replace any that appear in the included schema.

As before, patterns outside the include statement can augment existing patterns (even redefined ones).
There are other possibilities as well; these examples are illustrative, not exhaustive.
The procedure for creating, testing, and using a customization layer is always about the same. In this section, we'll go through the process in some detail. The rest of the sections in this chapter describe a range of useful customization layers.
If you're considering writing a customization layer, there must be something that you want to change. Perhaps you want to add an element or attribute, remove one, or change some other aspect of the schema.
Adding an element, particularly an inline element, is one possibility. If you're writing about cryptography, you might want to add a “cleartext” element, for example.
Figuring out what to change may be the hardest part of the process. Finding something similar usually provides a good model for new changes.
Depending on the exact focus of your document, there are probably several candidates. In this case, all of the following look plausible: technical inlines, programming inlines, and domain inlines. Let's suppose you chose the domain inlines.
As shown in Example 5.1, “Adding cleartext with a Customization Layer”, your customization would import the DocBook schema, extend the domain inlines, and then provide a pattern that matches the new element.
Example 5.1. Adding cleartext with a Customization Layer

    namespace db = "http://docbook.org/ns/docbook"
    default namespace = "http://docbook.org/ns/docbook"

    include "docbook.rnc"

    db.domain.inlines |= db.cleartext

    # Define a new cleartext element:

    db.cleartext.role.attribute = attribute role { text }

    db.cleartext.attlist =
       db.cleartext.role.attribute?
     & db.common.attributes
     & db.common.linking.attributes

    db.cleartext =
       element cleartext {
          db.cleartext.attlist,
          db._text
       }
The “|=” operator adds db.cleartext as a new choice to the db.domain.inlines pattern.

Next, we create the patterns for the new cleartext element.

Defining a separate pattern for the role attribute makes it easy for customizers to change it on a per-element basis.

Defining a separate pattern for the attributes makes it easy for customizers to change them on a per-element basis.

The pattern for the element pulls it all together. The pattern “db._text” supplies the content of the element.
Schemas, by their nature, contain many complex, interrelated patterns. Whenever you make a change to the schema, it's always wise to use a validator to double-check your work.
Start by validating a document that's plain, vanilla DocBook, one that you know is valid according to the DocBook standard schema. This will help you identify errors that you've introduced to the schema itself. After you are confident that the schema is correct, begin testing with instances that you expect (and don't expect) to be valid against it.
DocBook has a large number of elements. In some authoring environments, it may be useful or necessary to remove some of these elements.
msgset is a favorite target. It has a complex internal structure designed for describing interrelated error messages, especially on systems that may exhibit messages from several different components. Many technical documents can do without it, and removing it leaves one less complexity to explain to your authors.
Example 5.2, “Removing msgset” shows a customization layer that removes the msgset element from DocBook:
Example 5.2. Removing msgset

    namespace db = "http://docbook.org/ns/docbook"

    include "docbook.rnc" {
       db.msgset = notAllowed
    }
The complexity of msgset is really in its msgentry children. DocBook V4.5 introduced a simple alternative, simplemsgentry. Example 5.3, “Removing msgentry” demonstrates how you could allow msgset but only support the simpler alternative.
Example 5.3. Removing msgentry

    namespace db = "http://docbook.org/ns/docbook"

    include "docbook.rnc" {
       db.msgentry = notAllowed
    }
Closer examination of the msgentry content model will reveal that it contains a number of descendants. It isn't necessary, but it wouldn't be wrong, to define their patterns as notAllowed as well.
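A more thorough version of Example 5.3 might look like the following sketch. The pattern names for the descendants are assumptions based on the db.* naming convention; check them against your copy of docbook.rnc, because redefining a pattern that the included schema does not define is an error.

```rnc
namespace db = "http://docbook.org/ns/docbook"

include "docbook.rnc" {
   db.msgentry = notAllowed

   # Descendants that are assumed to be reachable only through msgentry.
   # These names follow the db.* convention but are not confirmed here.
   db.msg = notAllowed
   db.msgmain = notAllowed
   db.msgsub = notAllowed
   db.msgrel = notAllowed
   db.msginfo = notAllowed
   db.msglevel = notAllowed
   db.msgorig = notAllowed
   db.msgaud = notAllowed

   # msgexplan is deliberately left alone, on the assumption that
   # simplemsgentry uses it as well.
}
```

The result validates the same documents as Example 5.3; the extra redefinitions only make the intent explicit.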
DocBook contains a large number of computer inlines. The DocBook inlines define a domain-specific vocabulary. If you're working in another domain, many of them may be unnecessary.
They're defined in a set of patterns that ultimately roll up to the “db.domain.inlines” pattern. If you make that pattern “notAllowed”, you'll remove them all in one fell swoop.
Example 5.4. Removing Computer Inlines

    namespace db = "http://docbook.org/ns/docbook"

    include "docbook.rnc" {
       db.domain.inlines = notAllowed
    }
If you want to be more selective, you might consider making one or more of the set not allowed instead: “db.error.inlines”, errors and error messages; “db.gui.inlines”, GUI elements; “db.keyboard.inlines”, key and keyboard elements; “db.markup.inlines”, markup elements; “db.math.inlines”, mathematical expressions; “db.os.inlines”, operating system inlines; and “db.programming.inlines”, programming-related inlines.
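As a sketch of the selective approach, the following customization layer removes only the GUI and keyboard inlines and leaves the rest of the domain inlines in place. The choice of these two sets is arbitrary; the pattern names are the ones listed above.

```rnc
namespace db = "http://docbook.org/ns/docbook"

include "docbook.rnc" {
   # Remove GUI elements and key and keyboard elements only.
   db.gui.inlines = notAllowed
   db.keyboard.inlines = notAllowed
}
```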
It's likely that a customization layer that removed this many technical inlines would also remove some larger technical structures (msgset, funcsynopsis).
Another possibility is removing the complex synopsis elements. The customization layer in Example 5.5, “Removing CmdSynopsis and FuncSynopsis” removes cmdsynopsis and funcsynopsis.
Example 5.5. Removing CmdSynopsis and FuncSynopsis

    namespace db = "http://docbook.org/ns/docbook"

    include "docbook.rnc" {
       db.funcsynopsis = notAllowed
       db.cmdsynopsis = notAllowed
    }
Perhaps you want to restrict your authors to only three levels of sectioning. To do that, you must remove the sect4 and sect5 elements, as shown in Example 5.6, “Removing sect4 and sect5 Elements”.
Example 5.6. Removing sect4 and sect5 Elements

    namespace db = "http://docbook.org/ns/docbook"

    include "docbook.rnc" {
       db.sect4 = notAllowed

       # Strictly speaking, we don't need to remove sect5 because, having removed
       # sect4, there's no way to reach it. But it seems cleaner to do so.
       db.sect5 = notAllowed
    }
This technique works if your authors are using numbered sections. You could require them to do so by removing section. But suppose instead you want to allow them to use recursive sections and still limit them to only three levels.
One way to do this would be to define new “section2” and “section3” patterns, as shown in Example 5.7, “Limiting recursive sections to three levels”.
Example 5.7. Limiting recursive sections to three levels

    namespace db = "http://docbook.org/ns/docbook"
    default namespace = "http://docbook.org/ns/docbook"

    include "docbook.rnc" {
       db.section =
          element section {
             db.section.attlist,
             db.section.info,
             db.recursive.blocks.or.section2s,
             db.navigation.components*
          }
    }

    db.recursive.section2s =
       (db.section2+, db.simplesect*)
     | db.simplesect+

    db.recursive.blocks.or.section2s =
       (db.all.blocks+, db.recursive.section2s?)
     | db.recursive.section2s

    db.section2 =
       element section {
          db.section.attlist,
          db.section.info,
          db.recursive.blocks.or.section3s,
          db.navigation.components*
       }

    db.recursive.section3s =
       (db.section3+, db.simplesect*)
     | db.simplesect+

    db.recursive.blocks.or.section3s =
       (db.all.blocks+, db.recursive.section3s?)
     | db.recursive.section3s

    # The innermost level allows blocks but no further sections.
    db.section3 =
       element section {
          db.section.attlist,
          db.section.info,
          db.all.blocks+,
          db.navigation.components*
       }
Another solution, assuming your validation environment supports Schematron, is simply to add a new rule, as shown in Example 5.8, “Limiting recursive sections to three levels”.
Example 5.8. Limiting recursive sections to three levels

    namespace db = "http://docbook.org/ns/docbook"
    namespace s = "http://www.ascc.net/xml/schematron"
    default namespace = "http://docbook.org/ns/docbook"

    include "docbook.rnc" {
       db.section =
          [
             s:pattern [
                name = "Limit depth of sections"
                s:rule [
                   context = "db:section"
                   s:assert [
                      test = "count(ancestor::db:section) < 2"
                      "Sections can be no more than three levels deep"
                   ]
                ]
             ]
          ]
          element section {
             db.section.attlist,
             db.section.info,
             db.recursive.blocks.or.sections,
             db.navigation.components*
          }
    }
Sometimes what you want to do is not as simple as entirely removing an element. Instead, you want to remove it only from some contexts. The way to accomplish this task is to redefine the patterns used to calculate the elements allowed in those contexts.
Standard DocBook allows any inline element or any block element to appear in a table cell. You might decide that it's unreasonable to allow admonitions (note, caution, warning, etc.) to appear in a table cell.
In order to remove them, you must change what is allowed in an entry, as shown in Example 5.9, “Removing Admonitions from Tables”.
Example 5.9. Removing Admonitions from Tables

    namespace db = "http://docbook.org/ns/docbook"
    default namespace = "http://docbook.org/ns/docbook"

    include "docbook.rnc" {
       db.entry =
          element entry {
             db.entry.attlist,
             (db.all.inlines* | db.some.blocks*)
          }
    }

    db.some.blocks =
       db.somenopara.blocks
     | db.para.blocks
     | db.extension.blocks

    db.somenopara.blocks =
       db.list.blocks
     | db.formal.blocks
     | db.informal.blocks
     | db.publishing.blocks
     | db.graphic.blocks
     | db.technical.blocks
     | db.verbatim.blocks
     | db.bridgehead
     | db.remark
     | db.revhistory
     | db.indexterm
     | db.synopsis.blocks
The extent to which any particular change is easy or hard depends in part on how many patterns need to be changed. The DocBook Technical Committee is generally open to the idea of adding more patterns if it improves the readability of customization layers. Feel free to ask, if you think some refactoring would make your job easier.
Just as there may be more elements than you need, there may be more attributes.
Suppose your processing system doesn't support “continued” lists. You want to remove the continuation attribute from the orderedlist element. There are two ways you could accomplish this. One way would be to redefine the “db.orderedlist.continuation.attribute” pattern so that it matches no attribute; the other would be to redefine the “db.orderedlist.attlist” pattern so that it does not include the continuation attribute. Either would accomplish the goal.
Example 5.10. Removing continuation from orderedlist

    namespace db = "http://docbook.org/ns/docbook"

    include "docbook.rnc" {
       db.orderedlist.continuation.attribute = empty
    }
DocBook defines a whole set of “common attributes”; these attributes appear on every element. Depending on how you're processing your documents, removing some of them can both simplify the authoring task and improve processing speed.
Some obvious candidates are:
Effectivity attributes (arch, os, ...)
If you're not using all of the effectivity attributes in your documents, you can get rid of up to seven attributes in one fell swoop.
xml:lang
If you're not producing multilingual documents, you can remove xml:lang.
remap
The remap attribute is designed to hold the name of a semantically equivalent construct from a previous markup scheme (for example, a Microsoft Word style template name, if you're converting from Word). If you're authoring from scratch, or not preserving previous constructs with remap, you can get rid of it.
xreflabel
If your processing system isn't using xreflabel, it's a candidate as well.
The customization layer in Example 5.11, “Removing Common Attributes” reduces the common attributes to just xml:id, version, and xml:lang.
Example 5.11. Removing Common Attributes

    namespace db = "http://docbook.org/ns/docbook"

    include "docbook.rnc" {
       db.common.base.attributes =
          db.version.attribute?
        & db.xml.lang.attribute?
    }
The xml:id attribute is added in two other patterns, one where it's required and one where it's optional.
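If you only want to drop the effectivity attributes and keep the other common attributes, a narrower customization may be enough. This is a sketch that assumes the common attributes pull in the effectivity attributes through the db.effectivity.attributes pattern described in the naming conventions:

```rnc
namespace db = "http://docbook.org/ns/docbook"

include "docbook.rnc" {
   # Replace the whole collection of effectivity attributes with nothing.
   # Assumes the base schema references this pattern wherever the
   # effectivity attributes are allowed.
   db.effectivity.attributes = empty
}
```

Using empty rather than notAllowed is the safer choice here: if the pattern is referenced as a required part of an attribute list, notAllowed would make every element that uses that list impossible to match.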
Adding a new inline or block element is generally a straightforward matter of creating a pattern for the new element and “|=” adding it to the right pattern. But if your new element is more intimately related to the existing structure of the document, it may require more surgery.
Example 5.12, “Adding a sect6 Element” extends DocBook by adding a sect6 element.
Example 5.12. Adding a sect6 Element

    namespace db = "http://docbook.org/ns/docbook"
    default namespace = "http://docbook.org/ns/docbook"

    include "docbook.rnc" {
       db.sect5.sections =
          (db.sect6+, db.simplesect*)
        | db.simplesect+
    }

    db.sect6.sections = db.simplesect+

    db.sect6.status.attribute = db.status.attribute

    db.sect6.role.attribute = attribute role { text }

    db.sect6.attlist =
       db.sect6.role.attribute?
     & db.common.attributes
     & db.common.linking.attributes
     & db.label.attribute?
     & db.sect6.status.attribute?

    db.sect6.info = db._info.title.req

    db.sect6 =
       element sect6 {
          db.sect6.attlist,
          db.sect6.info,
          ((db.all.blocks+, db.sect6.sections?) | db.sect6.sections),
          db.navigation.components*
       }
Here we've redefined the sections allowed in sect5 to include sect6 and provided a pattern for sect6.
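For comparison, the straightforward case of adding a new block element needs no redefinition at all. The sketch below adds a hypothetical keypoint element; the element name and its db.keypoint.* patterns are invented for illustration, and db.extension.blocks and db.para.blocks are assumed to be the base schema patterns of those names that appear in Example 5.9.

```rnc
namespace db = "http://docbook.org/ns/docbook"
default namespace = "http://docbook.org/ns/docbook"

include "docbook.rnc"

# Allow the new element wherever extension blocks are allowed.
db.extension.blocks |= db.keypoint

# Hypothetical element: a keypoint holding one or more paragraphs.
db.keypoint.role.attribute = attribute role { text }

db.keypoint.attlist =
   db.keypoint.role.attribute?
 & db.common.attributes
 & db.common.linking.attributes

db.keypoint =
   element keypoint {
      db.keypoint.attlist,
      db.para.blocks+
   }
```

Because this adds markup that standard DocBook does not know about, the resulting schema is an extension and should be identified as one.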
The role attribute, found on almost all of the elements in DocBook, is a text attribute that can be used to subclass an element. In some applications, it may be useful to modify the definition of role so that authors must choose one of a specific set of possible values.
In Example 5.13, “Changing role on procedure”, role on the procedure element is constrained to the values required or optional.
Example 5.13. Changing role on procedure

    namespace db = "http://docbook.org/ns/docbook"

    include "docbook.rnc" {
       db.procedure.role.attribute =
          attribute role { "required" | "optional" }
    }
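
Broadening the values of an attribute works in the opposite direction. The following sketch adds a hypothetical value, moved, to the revisionflag attribute. It assumes that db.revisionflag.enumeration is defined as a plain choice of values and that the revisionflag attribute refers to it, as the naming conventions earlier in this chapter describe.

```rnc
namespace db = "http://docbook.org/ns/docbook"

include "docbook.rnc"

# Add one more alternative to the existing list of values.
# "moved" is an invented value used only for this illustration.
db.revisionflag.enumeration |= "moved"
```

Unlike Example 5.13, this change is an extension: documents that use the new value are no longer valid against standard DocBook.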