A minimal Go service that queues long-running embedding tasks with self-hosted inference.
⚠️ Queue implementation is TODO
Live on: go-queue-embeddings.onrender.com
⚠️ This demo runs on Render.com's free tier. It may show a 502 error during initialization after periods of inactivity. If this happens, please wait a few seconds and refresh the page.
- Showcase concurrency patterns in Go using worker queues
- Provide a working pipeline for document processing and embedding
- Use modular interfaces for future extensibility
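Since the queue itself is still TODO, the following is only a minimal sketch of the worker-queue pattern the first goal refers to, not the project's implementation. `Task`, `Result`, and `runWorkers` are hypothetical names, and the "work" is a placeholder for the slow embedding call.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical unit of work; in the service it would carry a chunk of text.
type Task struct {
	ID   int
	Text string
}

// Result is a hypothetical output; Length stands in for a real embedding.
type Result struct {
	TaskID int
	Length int
}

// runWorkers starts n goroutines that drain a shared queue and returns
// every result once all tasks have been handled. Result order is not guaranteed.
func runWorkers(n int, tasks []Task) []Result {
	queue := make(chan Task)
	results := make(chan Result)

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range queue {
				// Placeholder for the long-running embedding request.
				results <- Result{TaskID: t.ID, Length: len(t.Text)}
			}
		}()
	}

	// Feed the queue, then close it so the workers leave their range loops.
	go func() {
		for _, t := range tasks {
			queue <- t
		}
		close(queue)
	}()

	// Close results once every worker is done so the collector loop ends.
	go func() {
		wg.Wait()
		close(results)
	}()

	var out []Result
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	tasks := []Task{{ID: 1, Text: "alpha"}, {ID: 2, Text: "be"}, {ID: 3, Text: "gamma!"}}
	for _, r := range runWorkers(2, tasks) {
		fmt.Printf("task %d -> %d\n", r.TaskID, r.Length)
	}
}
```

Closing the queue from the producer side and closing the results channel only after `wg.Wait()` keeps the shutdown order explicit, which is usually the part of this pattern that goes wrong.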
- Build the container:

      docker build -t go-queue-embeddings .

- Run the container (maps port 8080):

      docker run -p 8080:8080 go-queue-embeddings

Prerequisites: Ensure Docker is installed on your system.
- Gin, because it is popular and easy to use
- A loosely applied hexagonal architecture to ensure extensibility, in particular to decouple the logic from the embedding provider and the storage
- Started using Ollama because it has a huge community and is optimized for different hardware out of the box
- The process is saved as JSON in a temp folder for this POC, but the code can be extended to save to a database or other storage in the future
- Plan and progress are tracked in plan.md for clarity and future reference.
- Added Supervisord to run this in a Hugging Face Space, but managing two ports inside Hugging Face caused issues, so we switched to Render.com instead. We can revisit this later, for example by using tfgo instead of Ollama.
- Chose HTMX to keep the view implementation lean, since the focus is the Go service. Server-side React or Next.js could be added later.
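To illustrate the decoupling described in the hexagonal architecture point, here is a rough sketch of how the logic could depend only on interfaces. All names (`EmbeddingProvider`, `ProcessStore`, `Service`, `fakeProvider`, `memoryStore`) are assumptions made for this example and are not taken from the repository; an Ollama adapter or a database-backed store would implement the same interfaces.

```go
package main

import (
	"errors"
	"fmt"
)

// EmbeddingProvider is a hypothetical port for anything that turns text into a vector.
type EmbeddingProvider interface {
	Embed(text string) ([]float32, error)
}

// ProcessStore is a hypothetical port for persisting the vectors of a process.
type ProcessStore interface {
	Save(id string, vectors [][]float32) error
	Load(id string) ([][]float32, error)
}

// fakeProvider is a stand-in adapter that returns the text length as a 1-d vector.
type fakeProvider struct{}

func (fakeProvider) Embed(text string) ([]float32, error) {
	return []float32{float32(len(text))}, nil
}

// memoryStore is a stand-in adapter that keeps results in a map.
type memoryStore struct {
	data map[string][][]float32
}

func newMemoryStore() *memoryStore {
	return &memoryStore{data: map[string][][]float32{}}
}

func (m *memoryStore) Save(id string, vectors [][]float32) error {
	m.data[id] = vectors
	return nil
}

func (m *memoryStore) Load(id string) ([][]float32, error) {
	v, ok := m.data[id]
	if !ok {
		return nil, errors.New("process not found")
	}
	return v, nil
}

// Service holds the core logic and knows only the two ports above.
type Service struct {
	provider EmbeddingProvider
	store    ProcessStore
}

// Process embeds every chunk and stores the vectors under the given ID.
func (s Service) Process(id string, chunks []string) error {
	vectors := make([][]float32, 0, len(chunks))
	for _, c := range chunks {
		v, err := s.provider.Embed(c)
		if err != nil {
			return fmt.Errorf("embed chunk: %w", err)
		}
		vectors = append(vectors, v)
	}
	return s.store.Save(id, vectors)
}

func main() {
	svc := Service{provider: fakeProvider{}, store: newMemoryStore()}
	if err := svc.Process("demo", []string{"hello", "go"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	vectors, _ := svc.store.Load("demo")
	fmt.Println(len(vectors), "vectors stored")
}
```

With this shape, swapping the embedding backend or the storage only means writing a new adapter; `Service` stays unchanged.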
- PDF Upload
  - Route: /upload
  - Receives PDF via POST request
  - Returns UUID process ID for tracking
  - Request fields:
    - pdf: PDF file (multipart/form-data)
    - chunk_strategy: Strategy for PDF text chunking
- Response Format

      {
        "id": "uuid-process-id",
        "status": "processing",
        "progress": 25
      }

- Processing Pipeline
  - PDF is divided into chunks based on strategy
  - Each chunk is sent to embedding service
  - Results saved as JSON file (<process_id>.json)
- Output route
  - Route: /process/<process_id>
  - Returns the JSON file with the status and, if completed, the results
- Process JSON

      {
        "id": "uuid-process-id",
        "status": "processing|completed|failed",
        "progress": 75,
        "data": [
          {
            "id": "uuid-chunk-id",
            "text": "chunk text content",
            "embedding": [0.1, 0.2, ...],
            "metadata": {
              "chunk_size": 512,
              "embedding_model": "model-name"
            }
          }
        ],
        "metadata": {
          "chunk_size": 512,
          "embedding_model": "model-name"
        }
      }