pdf2image

A Python 3 module that wraps the pdftoppm utility to convert PDF files into PIL Image objects.

How to install

pip install pdf2image

Windows users will have to install pdftoppm themselves (it is distributed as part of Poppler).

Most Linux distributions ship with pdftoppm pre-installed (tested on Ubuntu and Arch Linux).
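Since the module depends on an external binary, it can help to confirm that pdftoppm is reachable before converting anything. The following is a minimal sketch using only the standard library; the helper name pdftoppm_available is hypothetical and not part of pdf2image.

```python
import shutil


def pdftoppm_available(executable="pdftoppm"):
    """Return True if the executable can be found on the PATH.

    Hypothetical helper for illustration; pdf2image does not provide it.
    """
    return shutil.which(executable) is not None


if not pdftoppm_available():
    print("pdftoppm was not found on PATH; install Poppler first.")
```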

How does it work?

from pdf2image import convert_from_path, convert_from_bytes

Then simply do:

images = convert_from_path('/home/kankroc/example.pdf')

OR

# Read the file inside a with block so the handle is closed afterwards
with open('/home/kankroc/example.pdf', 'rb') as f:
    images = convert_from_bytes(f.read())

images will be a list of PIL Image objects, one for each page of the PDF document.
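A common next step is to write each page to disk. The sketch below assumes only that every item in the list exposes PIL's Image.save(path, format) method; the helper name save_pages, the output directory, and the file naming scheme are illustrative choices, not part of pdf2image.

```python
import os


def save_pages(images, out_dir, prefix="page"):
    """Save each page image as a numbered PNG and return the written paths.

    Hypothetical helper: `images` is the list returned by convert_from_path
    or convert_from_bytes.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for number, image in enumerate(images, start=1):
        path = os.path.join(out_dir, "{}_{:03d}.png".format(prefix, number))
        image.save(path, "PNG")  # PIL's Image.save(fp, format)
        paths.append(path)
    return paths
```

With the list from the example above, save_pages(images, 'out') would write out/page_001.png, out/page_002.png, and so on.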

Exception handling

pdftoppm does not raise any exceptions, so any file that could not be converted or processed returns an empty list of images. The philosophy behind this choice is simple: if the file was corrupted or not found, no image could be extracted, and returning an empty list makes sense. (This is up for discussion.)
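Because failures show up as an empty list rather than an exception, callers who prefer an explicit error can add their own check. This is a minimal sketch under that assumption; ConversionError and require_pages are hypothetical names and not part of pdf2image.

```python
import os


class ConversionError(Exception):
    """Hypothetical error type for illustration; not provided by pdf2image."""


def require_pages(images, source):
    """Return `images` unchanged, or raise if the conversion produced nothing.

    `source` is the path that was passed to convert_from_path.
    """
    if not images:
        if not os.path.isfile(source):
            reason = "file not found"
        else:
            reason = "no pages could be extracted"
        raise ConversionError("{}: {}".format(source, reason))
    return images
```

For example, require_pages(convert_from_path(path), path) would turn the empty-list case into an exception while leaving successful conversions untouched.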

Limitations / known issues

A relatively large PDF can use up all available memory and cause the process to be killed.
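One possible workaround, sketched here as an assumption rather than a feature of this module, is to bypass the in-memory list and call pdftoppm directly on a few pages at a time, using its -f (first page) and -l (last page) options so that the output goes to disk. The function names, the chunk size, and the dpi default are illustrative, and the caller is assumed to know the page count already.

```python
import subprocess


def page_ranges(page_count, chunk_size):
    """Yield inclusive, 1-based (first, last) page ranges of at most chunk_size pages."""
    for first in range(1, page_count + 1, chunk_size):
        yield first, min(first + chunk_size - 1, page_count)


def pdftoppm_command(pdf_path, output_root, first, last, dpi=200):
    """Build a pdftoppm command line that renders one page range to PNG files."""
    return [
        "pdftoppm", "-png",
        "-r", str(dpi),
        "-f", str(first),
        "-l", str(last),
        pdf_path, output_root,
    ]


def convert_in_chunks(pdf_path, output_root, page_count, chunk_size=10):
    """Run pdftoppm once per page range; images are written to disk, not kept in memory."""
    for first, last in page_ranges(page_count, chunk_size):
        subprocess.check_call(pdftoppm_command(pdf_path, output_root, first, last))
```

Each run only has to hold one chunk of pages, at the cost of starting pdftoppm several times.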

