Output Devices

Output devices process PDF pages and generate or extract resources from them.

All output devices inherit from the base output device:

class pyxpdf.xpdf.PDFOutputDevice

Generic PDF Output Device

All PDF output devices inherit from this class.

Currently, three output devices are implemented:

Page Iterator

To iterate over a PDF output device page by page, use page_iterator:

class pyxpdf.xpdf.page_iterator(output, **kwargs)

Iterate over PDF output devices by page.

Parameters

output – PDF output device to iterate over
kwargs – optional arguments passed to the get() method of the output device

Examples

Iterate over page text from TextOutput

>>> text_out = TextOutput(doc)
>>> for page_text in page_iterator(text_out):
...     print(page_text)

Iterate over images from RawImageOutput with a specific crop_box

>>> image_out = RawImageOutput(doc)
>>> for image in page_iterator(image_out, crop_box=(0, 0, 500, 500)):
...     image.show()    # each item is a Pillow image
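
As a rough mental model, page_iterator can be thought of as a generator that calls the device's get() method once per page and forwards any keyword arguments to it. The sketch below is not pyxpdf's implementation and does not use pyxpdf at all. FakeTextOutput, iterate_pages, and the upper option are hypothetical names invented for illustration, and it assumes that get() takes a zero-based page index and that the device reports its page count through len().

```python
from typing import Any, Iterator


class FakeTextOutput:
    """Hypothetical stand-in for an output device; not part of pyxpdf."""

    def __init__(self, pages):
        self._pages = list(pages)

    def __len__(self):
        # Assumption: the device knows how many pages it can produce.
        return len(self._pages)

    def get(self, page_no, upper=False):
        # Assumption: get() takes a zero-based page index plus options.
        text = self._pages[page_no]
        return text.upper() if upper else text


def iterate_pages(output, **kwargs) -> Iterator[Any]:
    """Yield output.get(page_no, **kwargs) for each page, in order."""
    for page_no in range(len(output)):
        yield output.get(page_no, **kwargs)


device = FakeTextOutput(["first page", "second page"])

# Keyword arguments are forwarded unchanged to every get() call.
pages = list(iterate_pages(device, upper=True))
print(pages)
```

Because the iterator is lazy, each page is processed only when the loop reaches it, which matters when get() does expensive work such as rendering an image.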