AsciidoctorJ Conversion Process Overview

Before you write your first extension, it helps to have a basic understanding of how Asciidoctor treats a document. As with any language processing tool, the process can be roughly split into three steps:

  1. Parsing: the raw source content is read and analyzed to generate the internal representation, the AST (abstract syntax tree).

  2. Processing: the AST is processed, for example to detect possible errors or to add automatically generated content such as a table of contents (toc).

  3. Output generation: once the final AST is set, it is processed again to generate the desired output. For example, a subsection of the AST representing a title with a paragraph will be converted into its corresponding HTML or PDF output.

Some liberty is taken here to make the process easier to understand. In reality, Asciidoctor has implementation details that deviate from the three steps above.
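The three steps above can be illustrated with a deliberately tiny model. The sketch below is not AsciidoctorJ code: the `ToyPipeline` class, its `Node` type, and the `parse`, `process`, and `render` methods are all hypothetical names chosen for illustration. It assumes a made-up mini-syntax in which a line starting with `= ` is a title and any other non-empty line is a paragraph.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every type and method here is hypothetical and not part of AsciidoctorJ.
class ToyPipeline {

    // A node of the toy AST: kind is "title", "paragraph", or "toc".
    static final class Node {
        final String kind;
        final String text;

        Node(String kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Step 1, parsing: read the raw source and build the internal representation.
    static List<Node> parse(String source) {
        List<Node> ast = new ArrayList<>();
        for (String line : source.split("\n")) {
            if (line.trim().isEmpty()) {
                continue; // blank lines only separate blocks
            }
            if (line.startsWith("= ")) {
                ast.add(new Node("title", line.substring(2)));
            } else {
                ast.add(new Node("paragraph", line));
            }
        }
        return ast;
    }

    // Step 2, processing: add generated content, here a one-line table of contents.
    static List<Node> process(List<Node> ast) {
        List<String> titles = new ArrayList<>();
        for (Node node : ast) {
            if (node.kind.equals("title")) {
                titles.add(node.text);
            }
        }
        List<Node> result = new ArrayList<>();
        result.add(new Node("toc", String.join(", ", titles)));
        result.addAll(ast);
        return result;
    }

    // Step 3, output generation: convert each node of the final AST into HTML.
    static String render(List<Node> ast) {
        StringBuilder html = new StringBuilder();
        for (Node node : ast) {
            if (node.kind.equals("title")) {
                html.append("<h1>").append(node.text).append("</h1>");
            } else if (node.kind.equals("toc")) {
                html.append("<nav>").append(node.text).append("</nav>");
            } else {
                html.append("<p>").append(node.text).append("</p>");
            }
        }
        return html.toString();
    }

    public static void main(String[] args) {
        String source = "= Hello\n\nSome text.";
        System.out.println(render(process(parse(source))));
    }
}
```

Keeping each step as a separate function mirrors the idea that extensions hook into one specific step: some see the raw source, some see the tree, and some see the generated output.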

The different extension types are called in different steps of the conversion process in the following order:

  1. Preprocessors are called when the parser requests the AsciiDoc source, before parsing begins.

  2. IncludeProcessors are called whenever an include:: directive is found while reading the AsciiDoc source.

  3. BlockMacroProcessors and BlockProcessors are called while parsing in the order that they appear in the source document.

  4. Treeprocessors are called after the document has been completely parsed into the Document tree, right before processing.

  5. InlineMacroProcessors are called during output generation in the order that they appear in the document.

  6. DocinfoProcessors are called at the beginning of output generation if they add content to the header, and at the end of output generation if they add content to the footer.

  7. Postprocessors are called after output generation, before the content is written to the destination.
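As a memory aid, the call order listed above can be written down as a small, self-contained model. This is only a sketch: `ExtensionCallOrder`, `ExtensionPoint`, and `callOrder` are hypothetical names and not AsciidoctorJ types, and the model ignores details such as DocinfoProcessors running at either the start or the end of output generation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative model only: these names are hypothetical, not AsciidoctorJ types.
class ExtensionCallOrder {

    // Declared in the same order as the list above, so declaration order is call order.
    enum ExtensionPoint {
        PREPROCESSOR,
        INCLUDE_PROCESSOR,
        BLOCK_OR_BLOCK_MACRO_PROCESSOR,
        TREEPROCESSOR,
        INLINE_MACRO_PROCESSOR,
        DOCINFO_PROCESSOR,
        POSTPROCESSOR
    }

    // Returns a copy of the registered extension points sorted into call order.
    static List<ExtensionPoint> callOrder(List<ExtensionPoint> registered) {
        List<ExtensionPoint> ordered = new ArrayList<>(registered);
        Collections.sort(ordered); // enum constants compare by declaration order
        return ordered;
    }

    public static void main(String[] args) {
        List<ExtensionPoint> registered = new ArrayList<>();
        registered.add(ExtensionPoint.POSTPROCESSOR);
        registered.add(ExtensionPoint.TREEPROCESSOR);
        registered.add(ExtensionPoint.PREPROCESSOR);
        // The registration order does not matter; the conversion step decides the call order.
        System.out.println(callOrder(registered));
    }
}
```

The point of the model is that the order in which extensions are registered does not determine when they run; the step of the conversion process they belong to does.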