
PdfQL language

A query language designed to extract structured data from PDF documents using a pipeline of composable stages.

Quick example

select(tables)
    ->filter((item) => item.GetCell(4).Text() == 'Name')
    ->selectMany(tableRows)
    ->map((item) => item.GetCell(1))
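
To make the stage semantics concrete, here is a minimal Python sketch of what the query above might do to an in-memory collection. It is not PdfQL's implementation. The `Pipeline` class, the `get_cell` and `table_cell` helpers, and the sample data are all hypothetical, and two behaviours are assumed rather than taken from the PdfQL documentation: that `GetCell` is 1-based, and that calling it on a table addresses the table's first row.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List


@dataclass
class Pipeline:
    """Hypothetical stand-in for a PdfQL pipeline; every stage returns a new Pipeline."""
    items: List[Any]

    def filter(self, predicate: Callable[[Any], bool]) -> "Pipeline":
        # Keep only the items for which the predicate holds.
        return Pipeline([item for item in self.items if predicate(item)])

    def select_many(self, selector: Callable[[Any], Iterable[Any]]) -> "Pipeline":
        # Expand each item into its children and flatten the result.
        return Pipeline([child for item in self.items for child in selector(item)])

    def map(self, transform: Callable[[Any], Any]) -> "Pipeline":
        # Replace each item with the result of the transform.
        return Pipeline([transform(item) for item in self.items])


def get_cell(row: List[str], index: int) -> str:
    # Assumption: cells are addressed from 1, not 0.
    return row[index - 1]


def table_cell(table: List[List[str]], index: int) -> str:
    # Assumption: GetCell on a table reads from the table's first row.
    return get_cell(table[0], index)


# Toy data: a table is a list of rows, a row is a list of cell texts.
tables = [
    [["Id", "Qty", "Price", "Name"],
     ["1", "2", "9.50", "Widget"],
     ["2", "1", "4.00", "Gadget"]],
    [["Date", "Ref", "Total", "Status"],
     ["2024-01-01", "A-17", "23.00", "Paid"]],
]

result = (
    Pipeline(tables)                                      # select(tables)
    .filter(lambda item: table_cell(item, 4) == "Name")   # ->filter(...)
    .select_many(lambda table: table)                     # ->selectMany(tableRows)
    .map(lambda item: get_cell(item, 1))                  # ->map(...)
)

print(result.items)  # ['Id', '1', '2']
```

Under these assumptions the header row is part of the output, because `selectMany(tableRows)` yields every row of each matching table. Dropping it would need an additional stage.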

▶︎ Stages

Pipeline operators that transform, filter and reduce element collections. Chain them to build precise extraction queries.

λ Expressions

Building blocks used inside stage predicates. Compose them to express complex matching conditions.

🔑 Keywords

Special tokens with predefined meaning in the PdfQL grammar.

📥 Output

Supported output formats for query results.