An OpenClaw skill plus standalone CLI scripts for searching, fetching, and updating documents in a Paperless-ngx instance via its REST API.
- Python 3.10+
- requests (pip install requests)
- Paperless-ngx API token with document access
- Install dependencies: pip install requests
- Create a config.env file (or export environment variables) with the following values:
PAPERLESS_URL=http://localhost:8000
PAPERLESS_TOKEN=your_api_token_here
- Ensure PAPERLESS_URL and PAPERLESS_TOKEN are available in your environment before running the scripts.
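One way to make the two settings available is to read config.env at startup and validate them before any request is made. The sketch below is only an illustration: the helper names load_env_file and require_settings are hypothetical and are not taken from the scripts, and the parser handles plain KEY=VALUE lines only.

```python
import os
from pathlib import Path


def load_env_file(path: str = "config.env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_settings(env=os.environ) -> tuple[str, str]:
    """Return (base_url, token), or raise if either variable is missing."""
    missing = [k for k in ("PAPERLESS_URL", "PAPERLESS_TOKEN") if not env.get(k)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    # Strip a trailing slash so paths can be appended consistently.
    return env["PAPERLESS_URL"].rstrip("/"), env["PAPERLESS_TOKEN"]
```

A caller could run os.environ.update(load_env_file("config.env")) and then require_settings() to fail early with a clear message instead of a later 401.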
python scripts/search.py --query "tax form" --limit 5
python scripts/fetch.py --id 123 --text
python scripts/update_meta.py --id 123 --add-tag important

python scripts/search.py --query "tax form" --tag receipts --after 2024-01-01 --limit 10
python scripts/search.py --correspondent "Acme Corp" --type "Invoice" --json
Supported filters:
- --query: full-text search (server-side; matches OCR/content when indexed by Paperless-ngx)
- --tag: tag name (repeatable)
- --type: document type name
- --correspondent: correspondent name
- --after: created after (YYYY-MM-DD)
- --before: created before (YYYY-MM-DD)
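The filters above have to be translated into query parameters for /api/documents/. The following is a rough sketch of such a mapping, not the script's actual code. The parameter names (tags__id__all, document_type__name__iexact, correspondent__name__iexact, created__date__gt, created__date__lt) are assumptions based on Paperless-ngx's Django-style filters and should be checked against your server version. The sketch also assumes tag names were already resolved to IDs.

```python
from urllib.parse import urlencode


def build_search_params(query=None, tag_ids=(), doc_type=None,
                        correspondent=None, after=None, before=None):
    """Map CLI-style filters to query parameters (names are assumptions)."""
    params = {}
    if query:
        params["query"] = query  # server-side full-text search
    if tag_ids:
        # Assumed filter: documents carrying all of the listed tag IDs.
        params["tags__id__all"] = ",".join(str(t) for t in tag_ids)
    if doc_type:
        params["document_type__name__iexact"] = doc_type
    if correspondent:
        params["correspondent__name__iexact"] = correspondent
    if after:
        params["created__date__gt"] = after   # YYYY-MM-DD
    if before:
        params["created__date__lt"] = before  # YYYY-MM-DD
    return params


# Example: the query string for a search with two resolved tag IDs.
print(urlencode(build_search_params(query="tax form", tag_ids=[3, 7],
                                    after="2024-01-01")))
```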
Output:
- Default: human-readable table with columns id, title, created, correspondent, tags, document_type
- --json: machine-readable JSON array
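The two output modes could be produced from the same list of rows. This is a minimal sketch under the assumption that each row is a dict keyed by the column names listed above, with tag and correspondent IDs already replaced by names. The render and _cell helpers are hypothetical, and the exact table layout of the real script may differ.

```python
import json

COLUMNS = ["id", "title", "created", "correspondent", "tags", "document_type"]


def _cell(value):
    """Turn a field value into display text; lists become comma-separated."""
    if value is None:
        return ""
    if isinstance(value, (list, tuple)):
        return ", ".join(str(v) for v in value)
    return str(value)


def render(rows, as_json=False):
    """Render rows as a JSON array or as a left-aligned text table."""
    if as_json:
        return json.dumps(rows, indent=2)
    cells = [[_cell(row.get(col)) for col in COLUMNS] for row in rows]
    # Each column is as wide as its header or its longest value.
    widths = [max([len(col)] + [len(r[i]) for r in cells])
              for i, col in enumerate(COLUMNS)]
    lines = ["  ".join(col.ljust(w) for col, w in zip(COLUMNS, widths))]
    for r in cells:
        lines.append("  ".join(v.ljust(w) for v, w in zip(r, widths)))
    return "\n".join(line.rstrip() for line in lines)
```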
python scripts/fetch.py --id 123
python scripts/fetch.py --id 123 --out ./downloads/
python scripts/fetch.py --id 123 --text
Behavior:
- Default: downloads the file from /api/documents/{id}/download/
- --text: prints the OCR/text content from /api/documents/{id}/
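The two fetch modes above differ only in the endpoint and in how the response is handled. The sketch below illustrates this with the standard library's urllib instead of requests, so it runs without extra packages. The function names are hypothetical, and reading the text from a content field of the JSON response is an assumption about the API's document representation.

```python
import json
import urllib.request


def auth_headers(token):
    """Token authentication header used for every request."""
    return {"Authorization": f"Token {token}"}


def document_url(base_url, doc_id, download=False):
    """Build the metadata URL, or the download URL when download=True."""
    url = f"{base_url.rstrip('/')}/api/documents/{doc_id}/"
    return url + "download/" if download else url


def fetch_text(base_url, token, doc_id):
    """Return the document text (assumes a 'content' field in the JSON)."""
    req = urllib.request.Request(document_url(base_url, doc_id),
                                 headers=auth_headers(token))
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("content", "")


def download_file(base_url, token, doc_id, out_path):
    """Write the downloaded file bytes to out_path and return the path."""
    req = urllib.request.Request(document_url(base_url, doc_id, download=True),
                                 headers=auth_headers(token))
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as fh:
        fh.write(resp.read())
    return out_path
```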
python scripts/update_meta.py --id 123 --add-tag important --remove-tag inbox
python scripts/update_meta.py --id 123 --title "Q1 Invoice" --correspondent "Acme Corp"
Behavior:
- Resolves tag and correspondent names to IDs via /api/tags/ and /api/correspondents/
- Sends a PATCH to /api/documents/{id}/
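The resolve-then-PATCH behavior above can be sketched as a pure function that builds the request body. This is an illustration under stated assumptions, not the script's implementation: resolve_id and build_patch are hypothetical names, the listings are assumed to be lists of dicts with id and name keys, and the PATCH body is assumed to carry the document's full tag ID list rather than a delta.

```python
def resolve_id(name, items):
    """Find the ID for a name in a list of {'id', 'name'} dicts (case-insensitive)."""
    for item in items:
        if item["name"].lower() == name.lower():
            return item["id"]
    raise LookupError(f"No match for {name!r}")


def build_patch(current_tags, all_tags, all_correspondents,
                add_tags=(), remove_tags=(), title=None, correspondent=None):
    """Build the JSON body for PATCH /api/documents/{id}/."""
    tag_ids = list(current_tags)
    for name in add_tags:
        tid = resolve_id(name, all_tags)
        if tid not in tag_ids:
            tag_ids.append(tid)
    drop = {resolve_id(name, all_tags) for name in remove_tags}
    tag_ids = [t for t in tag_ids if t not in drop]

    payload = {}
    if add_tags or remove_tags:
        payload["tags"] = tag_ids  # full replacement list, not a delta
    if title is not None:
        payload["title"] = title
    if correspondent is not None:
        payload["correspondent"] = resolve_id(correspondent, all_correspondents)
    return payload
```

Only the fields that were actually requested end up in the payload, so a title-only update leaves tags untouched.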
- Authentication uses Authorization: Token {PAPERLESS_TOKEN}.
- Pagination is handled automatically.
- If --text output is empty, OCR may still be processing. Reprocess in Paperless-ngx, then retry.
- A 401/403 error usually means the token is invalid or lacks access.
- Some Paperless-ngx setups may reject certain query parameters; the search script falls back to client-side filtering of metadata when needed.
- Search uses Paperless-ngx server-side full-text search via the query parameter; no document contents are downloaded for searching.
- Full-text results depend on the server index (OCR/content availability is determined by Paperless-ngx settings and processing status).
- Do not commit config.env (it contains secrets). Use config.example.env as the template to share.
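The authentication and pagination notes above can be illustrated with a small generator. It assumes that list endpoints return a JSON object with results and next keys, where next is the URL of the following page or null. That matches the usual Django REST Framework page format but should be verified against your server. The names get_json and iter_results are hypothetical, and urllib stands in for requests to keep the sketch dependency-free.

```python
import json
import urllib.request


def get_json(url, token):
    """GET a URL with token authentication and decode the JSON body."""
    req = urllib.request.Request(url, headers={"Authorization": f"Token {token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_results(first_url, get_page):
    """Yield every item, following the 'next' link until it is empty."""
    url = first_url
    while url:
        page = get_page(url)
        yield from page.get("results", [])
        url = page.get("next")
```

In use, get_page would typically be lambda url: get_json(url, token). Passing the page fetcher as a parameter keeps the loop testable without a running server.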
Q: Why is --text empty or incomplete?
A: OCR may still be running. Reprocess the document in Paperless-ngx and try again.
Q: How do I reprocess a document?
A: Use the Paperless-ngx UI or call the reprocess endpoint (e.g. via /api/documents/bulk_edit/ with method reprocess).
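As a hedged illustration of the API route, the request could be built as follows. The body shape (documents, method, parameters) is an assumption about the bulk_edit endpoint and may vary between Paperless-ngx versions. reprocess_request is a hypothetical helper that only constructs the request; sending it is left to the caller.

```python
import json
import urllib.request


def reprocess_request(base_url, token, doc_ids):
    """Build (but do not send) a POST asking the server to reprocess documents."""
    body = json.dumps({
        "documents": list(doc_ids),
        "method": "reprocess",
        "parameters": {},  # assumed to be empty for this method
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/documents/bulk_edit/",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it.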
Q: I get 401/403 errors. What should I check?
A: Verify that PAPERLESS_TOKEN is a valid API token with document permissions.
Q: Search results look incomplete when using filters.
A: Some servers reject certain filters. The script will fall back to client-side filtering, but if your API blocks the query param entirely, try fewer filters.
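A client-side fallback of this kind might look like the sketch below. It is not the script's actual logic. It assumes each document dict already has tag, type, and correspondent names resolved from IDs, that created is an ISO date string, and that multiple tags must all match. The function name matches is hypothetical.

```python
def matches(doc, tags=(), doc_type=None, correspondent=None,
            after=None, before=None):
    """Return True if a document dict satisfies all given metadata filters."""
    doc_tags = {t.lower() for t in doc.get("tags", [])}
    if any(t.lower() not in doc_tags for t in tags):
        return False
    if doc_type and (doc.get("document_type") or "").lower() != doc_type.lower():
        return False
    if correspondent and (doc.get("correspondent") or "").lower() != correspondent.lower():
        return False
    # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
    created = (doc.get("created") or "")[:10]
    if after and not created > after:
        return False
    if before and not created < before:
        return False
    return True
```

Filtering a fetched list is then a comprehension such as [d for d in docs if matches(d, tags=["receipts"], after="2024-01-01")].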
Q: Does search include OCR/content?
A: Yes, --query uses Paperless-ngx server-side full-text search, which includes OCR/content when it has been indexed by the server.
This skill is designed for OpenClaw, but the scripts work standalone from the command line.