
Stages

PdfQL is a language for describing how to extract objects from a PDF document. Each instruction is a stage that transforms its input data into the described output.

Syntax

Stages
  : Stage ('->' Stage)*
  ;
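One way to read this rule: a chain of stages behaves like function composition, where each stage receives the output of the previous one. The Python sketch below only illustrates that idea. `run_stages` and the two sample stages are hypothetical names, not part of PdfQL.

```python
from functools import reduce
from typing import Any, Callable, Iterable

# A stage is modelled here as any function from input data to output data.
Stage = Callable[[Any], Any]


def run_stages(data: Any, stages: Iterable[Stage]) -> Any:
    """Feed data through each stage in order, like Stage -> Stage -> ..."""
    return reduce(lambda current, stage: stage(current), stages, data)


# Two hypothetical stages, used only to show the chaining.
def keep_even(items):
    return [x for x in items if x % 2 == 0]


def square(items):
    return [x * x for x in items]


result = run_stages([1, 2, 3, 4], [keep_even, square])
print(result)  # [4, 16]
```

An empty chain returns its input unchanged, which matches the intuition that each stage is the only thing that alters the data.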

PdfQL example

select(tables) // PdfTable[] - Get all tables from a document
    ->filter((item) => item.GetCell(4).Text() == 'Name') // PdfTable[] - Returns only tables where cell #4 contains text 'Name'
    ->selectMany(tableRows) // PdfTableRow[] - Get all table rows from the tables, and flatten the two-dimensional array into a one-dimensional array
    ->map((item) => item.GetCell(1).Text()) // string[] - Get the text of cell #1 from each table row
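
To make the data flow concrete, the following Python sketch mimics the four stages on plain lists. It is not how PdfQL is implemented. The `Table` class, its `cell` method, the zero-based row-by-row cell numbering, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Table:
    """Hypothetical stand-in for a PDF table: a list of rows of cell texts."""
    rows: List[List[str]]

    def cell(self, index: int) -> str:
        # Assumption: cells are counted row by row, starting at 0.
        flat = [text for row in self.rows for text in row]
        return flat[index]


# Made-up sample document with two tables.
tables = [
    Table([["Id", "Age", "City", "Zip", "Name"], ["1", "30", "Oslo", "0150", "Ann"]]),
    Table([["Id", "Age", "City", "Zip", "Total"], ["2", "41", "Rome", "0010", "99"]]),
]

# select(tables): start from every table in the document.
selected = tables
# filter: keep only tables whose cell #4 reads 'Name'.
filtered = [t for t in selected if t.cell(4) == "Name"]
# selectMany(tableRows): flatten the rows of the remaining tables into one list.
rows = [row for t in filtered for row in t.rows]
# map: take the text of cell #1 from each row.
texts = [row[1] for row in rows]

print(texts)  # ['Age', '30']
```

In this sketch the header row appears in the result because no stage removes it. An additional filter stage would be needed to drop it.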

PdfQL stage syntax

A stage can be one of the following:

Stage
  : SelectStage
  | SelectManyStage
  | FilterStage
  | MapStage
  | SingleStage
  | FirstOrDefaultStage
  | FirstStage
  ;