Table Data Formats

Default table syntax

A table is delimited by a vertical bar and three equal signs (|===). It contains cells that are arranged into rows according to the number of columns the table is assigned. The number of columns a table contains can be specified implicitly using the number of cells in the table’s first row or by setting the cols attribute. Each cell is specified by a vertical bar (|).

If you’re new to AsciiDoc tables, Build a Basic Table provides step by step directions for creating your first table.

Style and layout options

Table content can be:

  • styled and aligned by column or cell,

  • aligned by row,

  • duplicated across multiple rows, and

  • marked up by any AsciiDoc syntax.

Table cells can span rows and columns.

You can adjust a table’s:

  • width,

  • orientation, and

  • border style.

You can also specify each column’s width and designate header and footer rows.

Supported data formats

The default table data format is prefix-separated values (PSV); that means the processor creates a new cell each time it encounters a vertical bar (|). AsciiDoc also supports comma-separated values (CSV), tab-separated values (TSV), and delimited data values (DSV).

Escape the cell separator

The parser scans for the cell separator to partition cells before it processes the cell text. So even if you try to hide the cell separator using an inline passthrough, the parser will see it. If the cell contain contains the cell separator, you must escape that character. There are three ways to escape it:

  • Prefix the character with a leading backslash (i.e., \|), which will be removed from the output.

  • Use the {vbar} attribute reference in place of | in content.

  • Change the cell separator used by the table.

Unless you do one of these things, the cell separator will be interpreted as a cell boundary.

Consider the following example, which escapes the cell separator using a leading backslash:

[cols=2*]
|===
|The default separator in PSV tables is the \| character.
|The \| character is often referred to as a "`pipe`".
|===

This table will render as follows:

Result: Converted PSV table that contains pipe characters

The default separator in PSV tables is the | character.

The | character is often referred to as a “pipe”.

Notice that the pipe character appears without the leading backslash (i.e., unescaped) in the rendered result.

An alternative is to use the {vbar} attribute reference as a substitute. This approach produces the same result as the previous example.

[cols=2*]
|====
|The default separator in PSV tables is the {vbar} character.
|The {vbar} character is often referred to as a "`pipe`".
|====

Escaping each cell separator character that appears in the content of a cell can be tedious. There are also times when you can’t or don’t want to modify the cell content (perhaps because it is being included from another file). To address these cases, AsciiDoc allows you to override the cell separator.

The cell separator is controlled using the separator attribute on the table block. You’ll want to select any single character that is not found in the content. A good candidate is the broken bar, or ¦.

Here’s the previous example rewritten using a custom separator.

[cols=2*,separator=¦]
|===
¦The default separator in PSV tables is the | character.
¦The | character is often referred to as a "`pipe`".
|===

Notice that it’s no longer necessary to escape the pipe character in the content of the table cells. You can safely use the original cell separator in the cell content and not worry about it being interpreted as the boundary of a cell.

Delimiter-separated values

Tables can also be populated from data formatted as delimiter-separated values (i.e., data tables). In contrast with the PSV format, in which the delimiter is placed in front of each cell value, the delimiter in a delimiter-separated format (CSV, TSV, DSV) is placed between the cell values (called a separator) and does not accept a cell formatting spec. Each line of data is assumed to represent a single row, though you’ll learn that’s not a strict rule. How the table data gets interpreted is controlled by the format and separator attributes on the table.

What the delimiter?

Aren’t comma-separated values a subset of delimiter-separated values? It really depends on who you consult.

The term “delimiter-separated values” in this text refers to the family of data formats that use a delimiter, including comma-separated values (CSV), tab-separated values (TSV) and delimited data (DSV), all of which are supported in AsciiDoc tables. CSV is the data format used most often.

“Comma-separated values” is really a misleading term since CSV can use delimiters other than , as the field separator (which, in this context, separates cells). What we’re really talking about is how the data is interpreted.

CSV and TSV both use a delimiter and an optional enclosing character, loosely based on RFC 4180. DSV (i.e., delimited data) only uses a delimiter, which can be escaped using a backslash; an enclosing character is not recognized. These parsing rules are described in detail in Data table formats.

Let’s consider an example of using comma-separated values (CSV) to populate an AsciiDoc table with data. To instruct the processor to read the data as CSV, set the value of the format attribute on the table to csv. When the format attribute is set to csv, the default data separator is a comma (,), as seen in the table below.

[%header,format=csv]
|===
Artist,Track,Genre
Baauer,Harlem Shake,Hip Hop
The Lumineers,Ho Hey,Folk Rock
|===
Result: Rendered CSV table
Artist Track Genre

Baauer

Harlem Shake

Hip Hop

The Lumineers

Ho Hey

Folk Rock

This feature is particularly useful when you want to populate a table in your manuscript from data stored in a separate file. You can do so using the include directive between the table delimiters, as shown here:

[%header,format=csv]
|===
include::tracks.csv[]
|===

If your data is separated by tabs instead of commas, set the format to tsv (tab-separated values) instead.

Now let’s consider an example of using delimited data (DSV) to populate an AsciiDoc table with data. To instruct the processor to read the data as DSV, set the value of the format attribute on the table to dsv. When the format attribute is set to dsv, the default data separator is a colon (:), as seen in the table below.

[%header,format=dsv]
|===
Artist:Track:Genre
Robyn:Indestructible:Dance
The Piano Guys:Code Name Vivaldi:Classical
|===
Result: Rendered DSV table
Artist Track Genre

Robyn

Indestructible

Dance

The Piano Guys

Code Name Vivaldi

Classical

Data table formats

The CSV and TSV data formats are parsed differently from the DSV data format. The following two sections outline those differences.

CSV and TSV

Table data in either CSV or TSV format is parsed according to the following rules, loosely based on RFC 4180:

  • The default delimiter for CSV is a comma (,) while the default delimiter for TSV is a tab character.

  • Empty lines are skipped (unless enclosed in a quoted value).

  • Whitespace surrounding each value is stripped.

  • Values can be enclosed in double quotes (").

    • A quoted value may contain zero or more separator or newline characters.

    • A newline begins a new row unless the newline is enclosed in double quotes.

    • A quoted value may include the double quote character if escaped using another double quote ("").

    • Newlines in quoted values are retained.

  • If rows do not have the same number of cells (“ragged” tables), cells are shuffled to fully fill the rows.

    • This is different behavior than Excel, which pads short rows with empty cells.

    • Extra cells at the end of the last row get dropped.

    • As a rule of thumb, data for a single row should be on the same line.

DSV

Table data in DSV format is parsed according to the following rules:

  • The default delimiter for DSV is a colon (:).

  • Empty lines are skipped.

  • Whitespace surrounding each value is stripped.

  • The delimiter character can be included in the value if escaped using a single backslash (\:).

  • If rows do not have the same number of cells (“ragged” tables), cells are shuffled to fully fill the rows.

Custom delimiters

Each data format has a default separator associated with it (csv = comma, tsv = tab, dsv = colon), but the separator can be changed to any character (or even a string of characters) by setting the separator attribute on the table.

Here’s an example of a DSV table that uses a custom separator character (i.e., delimiter):

Example 1. A DSV table with a custom separator
[format=dsv,separator=;]
|===
a;b;c
d;e;f
|===
To make a TSV table, you can set the format attribute to csv and the separator to \t. Though the tsv format is preferred.

The separator is independent of the processing rules for the format. If you set format=dsv and separator=,, the data will be processed using the DSV rules, even though the data looks like CSV.

Shorthand notation for data tables

AsciiDoc provides shorthand notation for specifying the data format of a table. The first position of the table block delimiter (i.e., |===) can be replaced by a built-in delimiter to set the table format (e.g., ,=== for CSV).

To make a CSV table, you can use ,=== as the table block delimiter:

,===
Artist,Track,Genre

Baauer,Harlem Shake,Hip Hop
,===
Result: Rendered CSV table using shorthand syntax
Artist Track Genre

Baauer

Harlem Shake

Hip Hop

To make a DSV table, you can use :=== as the table block delimiter:

:===
Artist:Track:Genre

Robyn:Indestructible:Dance
:===
Result: Rendered DSV table using shorthand syntax
Artist Track Genre

Robyn

Indestructible

Dance

When using either the CSV or DSV shorthand, you do not need to set the format attribute as it’s implied.

To make a TSV table, you can set the format attribute to tsv instead of having to set the format to csv and the separator to \t. In this case, you can use either |=== or ,=== as the table block delimiter. There is no special delimited block notation for a TSV table.

Formatting cells in a data table

The delimited formats do not provide a way to express formatting of individual table cells. Instead, you can apply cell formatting to all cells in a given column using the cols spec on the table:

[format=csv,cols="1h,1a"]
|===
Sky,image::sky.jpg[]
Forest,image::forest.jpg[]
|===

Data tables do not support cells that span multiple rows or columns, since that information can only be expressed at the cell level. You are advised to use the PSV format if you need that functionality.