Warning
pdfnaut is currently in an early stage of development and has only been tested with a small set of compliant documents. Some non-compliant documents may work under strict=False. Expect bugs or issues.
pdfnaut aims to become a PDF processor for parsing PDF 2.0 files.
pdfnaut provides a high-level interface for reading and writing PDF documents as described in the PDF 2.0 specification. It supports actions such as reading and writing metadata, modifying and inserting pages, and creating PDF objects.
pdfnaut requires Python 3.9 or later. To install pdfnaut via pip:
python -m pip install pdfnaut
If you plan to work with encrypted or protected PDF documents, you must install one of the supported crypt providers. See Standard Security Handler in the documentation for details.
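The requirements above can be checked with a short, standard-library-only script. This is a minimal sketch, not part of pdfnaut: `check_environment` is a hypothetical helper, and the provider import names in `CANDIDATE_PROVIDERS` are assumptions, since this README does not name the supported crypt providers. Consult the Standard Security Handler documentation for the actual list.

```python
import sys
from importlib.util import find_spec

# Assumed import names for crypt providers; not stated in this README.
CANDIDATE_PROVIDERS = ("cryptography", "Crypto")


def check_environment(min_version=(3, 9), providers=CANDIDATE_PROVIDERS):
    """Report whether the interpreter and optional packages look ready."""
    return {
        # pdfnaut requires Python 3.9 or later.
        "python_ok": sys.version_info >= min_version,
        # True if a top-level package named pdfnaut can be located.
        "pdfnaut_installed": find_spec("pdfnaut") is not None,
        # Candidate provider packages that can be located, if any.
        "providers_found": [name for name in providers if find_spec(name) is not None],
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

An empty `providers_found` list only means none of the guessed package names were found; it does not by itself mean encrypted documents are unsupported.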
Example 1: Accessing the content stream of a page
from pdfnaut import PdfDocument
pdf = PdfDocument.from_filename("tests/docs/sample.pdf")
for operator in pdf.pages[0].content_stream:
    print(operator)
Example 2: Reading document information
from pdfnaut import PdfDocument
pdf = PdfDocument.from_filename("tests/docs/sample.pdf")
print(pdf.doc_info.title)
print(pdf.doc_info.author)
For more examples of what pdfnaut can do, see the examples directory in the repository or the guides in the documentation.
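Building on Example 2, the sketch below shows one way to print document information defensively, on the assumption that a document may lack an information object or individual fields such as the title or author. `describe_info` is a hypothetical helper, not a pdfnaut function, and the example uses a stand-in object so that it runs without a PDF file.

```python
from types import SimpleNamespace


def describe_info(info, fields=("title", "author")):
    """Return 'field: value' lines, tolerating a missing info object or field."""
    lines = []
    for field in fields:
        # Treat a missing info object or attribute the same as an unset field.
        value = getattr(info, field, None) if info is not None else None
        lines.append(f"{field}: {value if value is not None else '(not set)'}")
    return lines


# Stand-in object for demonstration; with pdfnaut this would be pdf.doc_info.
sample = SimpleNamespace(title="Sample", author=None)
print("\n".join(describe_info(sample)))
```

With a real document, passing `pdf.doc_info` in place of `sample` should behave the same way, provided the attribute names match those used in Example 2.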
Contributions to pdfnaut should follow the Contributing Guidelines. You can contribute in many ways, including adding small features, resolving issues, and writing documentation.
pdfnaut is provided under the terms of the Apache License 2.0.