Preprocessor Extension

The Preprocessor extension runs prior to the AsciiDoc source being parsed. Specifically, it runs after the Document instance has been created and its PreprocessorReader has been initialized to start reading the lines of the source document.

If the preprocessor extension reads lines from the reader, it will cause side effects. One of these side effects is that document attributes in the header may not be interpreted correctly. Another is that it may interfere with line number tracking. If the extension adds or removes lines, the parser could produce misleading line number information. Thus, interacting with the reader in a preprocessor extension should be avoided, if possible.

A preprocessor extension can still be used to modify options on the Document object, in which case there’s no side effect. It can also be safely used as an event listener, perhaps to run other code. It may also be possible to avoid side effects by reading lines above the actual AsciiDoc document, as is shown in the example on this page. It’s safe to do so because these lines are not interpreted as AsciiDoc.

Prior to invoking the preprocessor extension, Asciidoctor splits the source text of the primary document into lines and normalizes them. It then creates an instance of PreprocessorReader from these lines.

Asciidoctor then passes the Document and PreprocessorReader instances to the Asciidoctor::Extensions::Processor#process method of the extension. If the extension reads lines from reader, those lines will first be preprocessed by Asciidoctor’s internal preprocessor. The extension can modify the Reader and either return the same Reader (or falsy, which is equivalent) or a new Reader instance to replace it. However, as stated above, be careful when reading lines from the original PreprocessorReader instance as it can have side effets.

Preprocessor Extension Example

Purpose

Skim off front matter from the top of the document that gets used by site generators like Jekyll.

sample-with-front-matter.adoc

tags: [announcement, website]
---
= Document Title

content

[subs=+attributes]
.Captured front matter
----
---
{front-matter}
---
----

FrontMatterPreprocessor

class FrontMatterPreprocessor < Asciidoctor::Extensions::Preprocessor
  def process document, reader
    lines = reader.lines # get raw lines
    return reader if lines.empty?
    front_matter = []
    if lines.first.chomp == '---'
      original_lines = lines.dup
      lines.shift
      while !lines.empty? && lines.first.chomp != '---'
        front_matter << lines.shift
      end

      if (first = lines.first).nil? || first.chomp != '---'
        lines = original_lines
      else
        lines.shift
        document.attributes['front-matter'] = front_matter.join.chomp
        # advance the reader by the number of lines taken
        (front_matter.length + 2).times { reader.advance }
      end
    end
    reader
  end
end

Usage

Asciidoctor::Extensions.register do
  preprocessor FrontMatterPreprocessor
end

Asciidoctor.convert_file 'sample-with-front-matter.adoc', safe: :safe