This repository provides a serverless solution for converting PDFs to images using AWS Lambda and Poppler, an open-source PDF processing library.
The project packages the Poppler utility pdftocairo into an AWS Lambda layer, so that a Lambda function can call it to convert PDF files into images. Keeping the binary in a layer keeps the function package small and lets other functions reuse the same build.
Prerequisites:
- Docker (for building the Lambda layer)
- AWS CLI and AWS SAM CLI (for deployment)
- Node.js
To build the Poppler layer for ARM64 architecture:
docker build --platform=linux/arm64 --build-arg TARGET_PLATFORM=arm64 -f ./layers/poppler/Dockerfile -t poppler-lambda-layer-arm ./layers/poppler
docker run --rm --platform=linux/arm64 -v "$(pwd)/layers:/workspace" poppler-lambda-layer-arm
For the x86_64 architecture, adjust the commands as follows:
- Platform: linux/amd64
- Target Platform: amd64
- Image Name: poppler-lambda-layer-x86-64
- Set up a new SAM project.
- Define the Lambda function and layer in template.yaml.
- Deploy the application:
sam build
sam deploy --guided
To test, send a POST request to the API endpoint with a base64-encoded PDF. The function returns the first page of the PDF as a base64-encoded PNG image.
curl -s -X POST "YOUR_ENDPOINT_HERE" \
-H "Content-Type: application/json" \
-d "{\"data\":\"$(base64 < sample.pdf | tr -d '\n')\"}" \
| sed -n 's/
This setup allows PDF-to-image conversion in a serverless environment, making it scalable and reusable for document processing tasks.