Generate DocBook from AsciiDoc

Asciidoctor can produce the DocBook 5 XML output format from an AsciiDoc document. Although DocBook XML is not a publishable format, it can be used to tie into an existing publishing toolchain that processes DocBook. This page explains how to use Asciidoctor to convert AsciiDoc to DocBook.

Backend and converter

Asciidoctor’s built-in DocBook converter is registered for the docbook and docbook5 backends. The DocBook converter generates XML that adheres to the DocBook XML schema. There’s a corresponding DocBook tag for each AsciiDoc element.

Backend names

docbook, docbook5

Converter class

Asciidoctor::Converter::DocBook5Converter

Output format

XML

Output file extension

.xml

Generate DocBook

  1. To follow along with the steps below, use your own AsciiDoc file or copy the contents of Example 1 into a new plain text file.

    Example 1. my-document.adoc
    = The Dangers of Wolpertingers
    :url-wolpertinger: https://en.wikipedia.org/wiki/Wolpertinger
    
    Don't worry about gumberoos or splintercats.
    Something far more fearsome plagues the days, nights, and inbetweens.
    Wolpertingers.
    
    == Origins
    
    Wolpertingers are {url-wolpertinger}[ravenous beasts].
  2. Make sure to save the file with the .adoc file extension.

  3. To convert the my-document.adoc document to DocBook 5.0 format, call the processor with the backend flag set to docbook.

    $ asciidoctor -b docbook my-document.adoc
  4. A new XML document, named my-document.xml, will now be present in the current directory.

    $ ls
    my-document.adoc  my-document.xml

    Here’s a snippet of the XML generated by the DocBook converter.

    Example 2. XML generated from AsciiDoc
    <?xml version="1.0" encoding="UTF-8"?>
    <?asciidoc-toc?>
    <?asciidoc-numbered?>
    <article xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en">
    <info>
    <title>The Dangers of Wolpertingers</title>
    <date>2020-12-08</date>
    </info>
    <simpara>Don&#8217;t worry about gumberoos or splintercats.
    Something far more fearsome plagues the days, nights, and inbetweens.
    Wolpertingers.</simpara>
    <section xml:id="_origins">
    <title>Origins</title>
    <simpara>Wolpertingers are <link xl:href="https://en.wikipedia.org/wiki/Wolpertinger">ravenous beasts</link>.</simpara>
    </section>
    </article>
  5. On Linux, you can view the DocBook file with Yelp.

    $ yelp my-document.xml
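The steps above convert one file at a time. As a minimal sketch, the same command can be run over every AsciiDoc file in a directory. The output directory name `build/docbook` and the helper `docbook_name` are illustrative choices, not something Asciidoctor prescribes. The `-D` option sets the destination directory.

```shell
#!/bin/sh
# Sketch: convert every .adoc file in the current directory to DocBook.
# "build/docbook" is an arbitrary output directory chosen for this example.
out_dir=build/docbook

# Hypothetical helper: map an AsciiDoc file name to the XML file name
# the DocBook converter writes (same base name, .xml extension).
docbook_name() {
  printf '%s.xml\n' "$(basename "$1" .adoc)"
}

if command -v asciidoctor >/dev/null 2>&1; then
  for src in *.adoc; do
    [ -e "$src" ] || continue   # the glob matched nothing
    mkdir -p "$out_dir"
    asciidoctor -b docbook -D "$out_dir" "$src"
    echo "$src -> $out_dir/$(docbook_name "$src")"
  done
else
  echo "asciidoctor not found on PATH" >&2
fi
```

Each converted file ends up in the output directory under the name that `docbook_name` reports.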

The DocBook converter produces output that complies with the DocBook 5.0 specification.

A summary of the differences from DocBook 4.5 is as follows:

  • XSD declarations are used on the document root instead of a DTD

  • <info> elements for document info instead of <articleinfo> and <bookinfo>

  • elements that hold the author’s name are wrapped in a <personname> element

  • the id for an element is defined using an xml:id attribute

  • <link> is used for links instead of <ulink>

  • the URL for a link is defined using the xl:href attribute
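A few of these differences can be spot-checked from the command line. The following is a rough sketch, not schema validation: the function name `looks_like_docbook5` is hypothetical, and it relies on plain text matching with `grep`, so treat a positive result as a hint only.

```shell
#!/bin/sh
# Sketch: heuristic check that an XML file follows DocBook 5 conventions.
# This is text matching only; it does not validate against the schema.
looks_like_docbook5() {
  file=$1
  # DocBook 5 declares the DocBook namespace on the document root.
  grep -q 'xmlns="http://docbook.org/ns/docbook"' "$file" || return 1
  # DocBook 4.5 elements such as <ulink> or <articleinfo> should be absent.
  ! grep -Eq '<(ulink|articleinfo|bookinfo)[ >]' "$file"
}

# Only run the check when a file name is passed to the script.
if [ "$#" -gt 0 ]; then
  if looks_like_docbook5 "$1"; then
    echo "$1 looks like DocBook 5"
  else
    echo "$1 does not look like DocBook 5"
  fi
fi
```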

If you’re using the Asciidoctor API, you can generate a DocBook document directly from your application.

Example 3. Generate DocBook output from the API
Asciidoctor.convert_file 'my-document.adoc', backend: 'docbook'

If you need to output DocBook 4.5, you may find the community-supported DocBook 4.5 Converter useful.

Convert DocBook to PDF

Although the Asciidoctor project provides Asciidoctor PDF for performing direct AsciiDoc to PDF conversion, you may opt instead to convert to PDF via DocBook. The DocBook to PDF conversion is handled by the DocBook toolchain.

The DocBook toolchain can prove to be a challenge to set up. This section provides several suggestions for how to use the DocBook toolchain, though this list is by no means exhaustive. If you already have a working DocBook toolchain, then these instructions are not for you.

xmlto

If you’re using a Linux distribution, the xmlto package might be an option for you. xmlto is a simple shell script for converting XML files to various formats using the DocBook toolchain. It accepts DocBook as an input format and can produce PDF as an output format.

To install xmlto, look for the package by the same name and use your package manager to install it (e.g., dnf install xmlto fop or apt-get install xmlto fop). Once you have installed the package, you can use it to generate PDF from DocBook with Apache FOP as follows:

$ xmlto --skip-validation --with-fop pdf doc.xml

If you’re using an RPM-based Linux distribution, you may be able to use the dblatex backend to generate the PDF instead:

$ xmlto --skip-validation pdf doc.xml

The AsciiDoc processor adds several XML processing instructions to support features not provided by DocBook, such as a thematic break and a page break. When using the Apache FOP backend, you need to provide an XSL stylesheet fragment that modifies the default XSL stylesheet to support these processing instructions. You can also use this stylesheet as an opportunity to customize the PDF that Apache FOP produces.

Here’s an example of an XSL stylesheet fragment to get you started:

Example 4. custom.xsl
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <xsl:param name="hyphenate">false</xsl:param>
  <xsl:param name="runinhead.default.title.end.punct"/>
  <xsl:param name="generate.toc">
    <xsl:choose>
      <xsl:when test="/processing-instruction('asciidoc-toc')">
article toc,title
book toc,title,figure,table,example,equation
      </xsl:when>
      <xsl:otherwise>
article nop
book nop
      </xsl:otherwise>
    </xsl:choose>
  </xsl:param>
  <xsl:param name="section.autolabel">
    <xsl:choose>
      <xsl:when test="/processing-instruction('asciidoc-numbered')">1</xsl:when>
      <xsl:otherwise>0</xsl:otherwise>
    </xsl:choose>
  </xsl:param>
  <xsl:template match="processing-instruction('asciidoc-br')">
    <fo:block/>
  </xsl:template>
  <xsl:template match="processing-instruction('asciidoc-hr')">
    <fo:block space-after="1em">
      <fo:leader leader-pattern="rule" rule-thickness="0.5pt" rule-style="solid" leader-length.minimum="100%"/>
    </fo:block>
  </xsl:template>
  <xsl:template match="processing-instruction('asciidoc-pagebreak')">
     <fo:block break-after='page'/>
  </xsl:template>
</xsl:stylesheet>

Pass this stylesheet to xmlto as follows:

$ xmlto --skip-validation --with-fop -m custom.xsl pdf doc.xml
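The AsciiDoc to DocBook step and the DocBook to PDF step can be chained in a small script. This is a sketch under a few assumptions: the function name `adoc_to_pdf` is hypothetical, the source file is assumed to use the .adoc extension, and both asciidoctor and xmlto (with Apache FOP) are assumed to be installed.

```shell
#!/bin/sh
# Sketch: AsciiDoc -> DocBook -> PDF using the commands shown on this page.
# Usage: adoc_to_pdf my-document.adoc [custom.xsl]
adoc_to_pdf() {
  src=$1
  xsl=${2:-}
  # Asciidoctor writes the XML next to the source, swapping the extension.
  # Assumes the source file name ends in .adoc.
  xml="${src%.adoc}.xml"

  asciidoctor -b docbook "$src" || return 1

  if [ -n "$xsl" ]; then
    # Apply the XSL stylesheet fragment on top of the default stylesheet.
    xmlto --skip-validation --with-fop -m "$xsl" pdf "$xml"
  else
    xmlto --skip-validation --with-fop pdf "$xml"
  fi
}

# Only run the conversion when arguments are passed to the script.
if [ "$#" -gt 0 ]; then
  adoc_to_pdf "$@"
fi
```

Keeping the two stages in one function means the conversion stops early if the AsciiDoc step fails, rather than passing a stale XML file to xmlto.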

For more ideas on how to customize the stylesheet, refer to the XSL stylesheets from the fopub project.

fopub

A similar alternative to xmlto is fopub. fopub uses Java and the Gradle build tool to wrap the DocBook toolchain. The only prerequisite to perform the DocBook to PDF conversion using fopub is a Java Development Kit (JDK), which provides the Java runtime.

Please note that the fopub project is archived. However, it still may prove useful.

To get fopub, you must clone the repository. Once you have done so, you can run the fopub script in that repository to convert a DocBook file to PDF.

$ ./fopub README.xml

The benefit of fopub is that it’s preconfigured to convert DocBook with AsciiDoc processing instructions. The stylesheet it provides also smooths out some of the rough edges of the visual styling provided by the DocBook toolchain. However, the project is no longer actively maintained, so keep that in mind when deciding whether to use it. As an alternative, you may give db-toolchain a try. It positions itself as a successor to fopub, albeit a more complex one.

Maven plugins

If you are using Maven to build your docs, then you might consider using the docbkx plugin to generate PDF from DocBook. You can find example projects in the Asciidoctor Maven example repository.

The typical way to use this plugin is in a processing pipeline starting with the Asciidoctor Maven plugin. The Asciidoctor Maven plugin uses Asciidoctor to generate the DocBook file(s), which the docbkx plugin then converts to PDF.