feat: implement unified hook model with SDK, feature tables, and OCI runner#66
feat: implement unified hook model with SDK, feature tables, and OCI runner#66
Conversation
…runner Replace the old validator model with a unified hook system where hooks produce typed features stored in per-convention PostgreSQL tables. This includes the Python SDK for authoring hooks, the OCI container runner, dynamic DDL for feature tables, and the event-driven validation pipeline. Key changes: - Add Python SDK (sdk/py/) with @hook decorator, convention declarations, testing harness, manifest generation, and deploy tooling - Add feature domain with FeatureStore port and PostgresFeatureStore adapter that creates per-convention PG schemas with typed columns from hook manifests - Replace old OCI DockerValidatorRunner with OciHookRunner supporting progress tracking, rejection, feature collection, and security hardening - Replace CheckResult/CheckStatus with HookResult/HookStatus on ValidationRun - Remove dead old validator port (ValidatorRunner, ValidationInputs) - Update validation API route to serve HookResult-based responses - Add Alembic migration for hooks JSON column and feature_tables catalog - Convention creation now creates feature tables before persisting the row - Quote all PG identifiers in dynamic DDL to handle convention IDs with hyphens - Validate record_srn presence before feature insertion - 532 server tests + 76 SDK tests passing
|
Greptile SummaryThis large PR (157 files, +10k/-2k lines) replaces the old validator model with a unified hook system. Hooks are OCI containers that produce typed features stored in per-convention PostgreSQL tables. The PR also adds a Python SDK with Key changes:
Issues found:
Confidence Score: 2/5
|
| Filename | Overview |
|---|---|
| server/osa/domain/deposition/command/create_convention.py | Auth gate changed from at_least(Role.ADMIN) to public(), allowing unauthenticated convention creation including OCI image execution specs. |
| server/osa/infrastructure/oci/runner.py | Solid OCI hook runner with good security hardening (ReadonlyRootfs, CapDrop ALL, no network, PidsLimit, MemorySwap). Minor env injection concern with unsanitized hook name. |
| server/osa/infrastructure/persistence/feature_store.py | Dynamic DDL feature table creation with PG identifier validation. TOCTOU race in check-then-create pattern; concurrent calls could bypass ConflictError. |
| server/osa/infrastructure/oci/source_runner.py | OCI source runner with network access (intentional for upstream APIs). Good security hardening with CapDrop ALL and no-new-privileges. Code largely mirrors OciHookRunner. |
| server/osa/domain/validation/service/validation.py | Clean refactor from validator-based to hook-based validation. Sequential hook execution with halt-on-failure. New validate_deposition method consolidates full workflow. |
| server/osa/domain/deposition/service/convention.py | Convention service creates feature tables before persisting (good atomic ordering). Emits ConventionRegistered event for downstream source triggers. Clean integration with FeatureService and Outbox. |
| server/osa/domain/shared/model/hook.py | Hook domain models (ColumnDef, FeatureSchema, HookManifest, HookLimits, HookDefinition). HookManifest.name lacks pattern validation despite being used in PG identifiers and file paths. |
| server/migrations/versions/add_hooks_and_feature_tables.py | Clean Alembic migration: drops validator_refs, adds hooks/source JSON columns, creates feature_tables catalog. Reversible downgrade included. |
| server/osa/domain/source/service/source.py | Major refactor from in-process source pulling to OCI container execution model. Creates depositions from container output records, moves staged files, and emits continuation events for chunked processing. |
| sdk/py/osa/runtime/entrypoint.py | OCI container entrypoint for hooks. Reads record.json, runs hook, writes features.json or rejection to progress.jsonl. Previously flagged field mismatch (reason vs message) has been addressed. |
| deploy/docker-compose.yml | Docker socket mount added for sibling container execution. Default JWT secret in production compose. Both are security concerns that should be addressed before production use. |
| server/osa/infrastructure/persistence/adapter/storage.py | Extended with hook output dirs, feature reading, source staging dirs, and file move operations. Path traversal protection via _safe_path for uploaded files; hook_name used directly in paths relies on upstream validation. |
Sequence Diagram
sequenceDiagram
participant CLI as SDK Deploy CLI
participant API as OSA Server API
participant CS as ConventionService
participant FS as FeatureService
participant PG as PostgreSQL
participant Outbox as Event Outbox
participant Worker as WorkerPool
participant Docker as Docker Daemon
participant Hook as Hook Container
CLI->>CLI: Build hook/source Docker images
CLI->>API: POST /api/v1/conventions (hooks, source, schema)
API->>CS: create_convention()
CS->>FS: create_table(hook_def) for each hook
FS->>PG: CREATE SCHEMA features + CREATE TABLE + catalog insert
CS->>PG: INSERT convention row
CS->>Outbox: ConventionRegistered event
Note over Worker: On DepositionSubmitted event
Worker->>CS: validate_deposition(dep_srn)
CS->>CS: Build HookInputs (record_json, files_dir)
loop For each HookDefinition
CS->>Docker: Create + start hook container
Docker->>Hook: Run with /osa/in (ro), /osa/out (rw)
Hook->>Hook: Read record.json, write features.json
Hook-->>Docker: Exit
Docker-->>CS: Wait result + inspect OOM
CS->>CS: Parse progress.jsonl, check rejection
end
CS->>Outbox: ValidationCompleted or ValidationFailed
Note over Worker: On RecordPublished event
Worker->>FS: insert_features_for_record()
FS->>PG: Read features.json from cold storage
FS->>PG: Bulk INSERT into features.hook_name table
Last reviewed commit: c473cbf
sdk/py/osa/runtime/entrypoint.py
Outdated
| result = hook_fn(record) | ||
| except Reject as e: | ||
| # Write rejection to progress.jsonl | ||
| entry = {"status": "rejected", "reason": str(e)} |
There was a problem hiding this comment.
Field mismatch - entrypoint writes "reason" but runner reads "message"
The entrypoint writes rejection as {"status": "rejected", "reason": str(e)}, but OciHookRunner._check_rejection() reads entry.message (server/osa/infrastructure/oci/runner.py:212). This means rejections will always return None as the rejection reason.
| entry = {"status": "rejected", "reason": str(e)} | |
| entry = {"status": "rejected", "message": str(e)} |
…pipeline Complete the unified hook model with source ingestion support: SDK: - Add Source base class, SourceContext, SourceRecord, and SourceSchedule types - Add convention() function for declaring schemas, hooks, sources, and file requirements together in code - Add osa deploy CLI: builds OCI images for hooks and sources from pyproject.toml, registers conventions with the server via REST API - Add osa-run-hook and osa-run-source console script entry points (replaces python -m invocation to avoid double-import issues) - Add auto-discovery of convention packages via osa.conventions entry point group - Generate Dockerfiles with build-essential and local SDK staging for projects using path-based SDK dependencies Server: - Add SourceDefinition domain model and SourceRunner port - Add OciSourceRunner adapter (network-enabled containers for upstream API access, bind-mount staging with host path translation for sibling containers) - Add TriggerSourceOnDeploy event handler (ConventionRegistered -> SourceRequested) - Add PullFromSource handler with full deposition creation from source records - Add InsertRecordFeatures handler (deferred feature insertion on record publication) - Add host_data_dir config for Docker sibling container path translation - Add file storage methods for source staging, hook features, and source-to-deposition file moves - Fix OCI runner staging cleanup (onexc permission handler instead of ignore_errors) - Increase hook error log capture to 2000 chars for debugging - Make convention creation public (temporary, for osa deploy without auth) - Add integration tests for convention, deposition, event, and feature store repositories - Remove old in-process source discovery and geo_entrez source plugin
|
@greptile |
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
- Standardize rejection key on "message" (was "reason" in CLI) - Replace RuntimeError with ExternalServiceError in OciSourceRunner - Split staging into sibling input/ and output/ dirs under work_dir - Add find_latest_where to filter events by payload field, fixing TriggerInitialSourceRun and SourceSchedule global query bug
|
@greptile |
- Move business logic from ValidateDeposition handler into ValidationService.validate_deposition() — handler now delegates to service instead of orchestrating repos directly - Move feature insertion logic from InsertRecordFeatures handler into FeatureService.insert_features_for_record() - Delete TriggerInitialSourceRun handler (replaced by TriggerSourceOnDeploy in previous commit) - Move _compute_summary from validation route into ValidationRun.summary property on the domain entity - Add RunStatus.REJECTED to distinguish hook rejections from infrastructure failures - Fix file storage to use dedicated files/ subdirectory instead of writing directly to deposition root - Add per-run isolation for source staging dirs - Add SQL-level list_with_source to ConventionRepository (replaces in-memory filtering) - Add pg identifier validation and conflict detection to feature store DDL - Include image resolution in timeout scope for OCI hook/source runners - Move httpx import to module level in SDK deploy command
|
@greptile |
| OSA_DATABASE__URL: postgresql+asyncpg://${POSTGRES_USER:-postgres}:${POSTGRES_PASSWORD:-osa}@db:5432/${POSTGRES_DB:-osa} | ||
| OSA_DATA_DIR: /data | ||
| OSA_LOGGING__LEVEL: ${LOG_LEVEL:-INFO} | ||
| OSA_AUTH__JWT__SECRET: ${JWT_SECRET:-change-me-in-production-must-be-32-chars-long} |
There was a problem hiding this comment.
Default JWT secret in production compose
The default JWT secret change-me-in-production-must-be-32-chars-long will be used if JWT_SECRET isn't set. While the name is self-documenting, having any default in the production compose file means a misconfigured deployment silently uses an insecure secret. Consider removing the default so the server fails to start if JWT_SECRET isn't explicitly set.
Introduce PgIdentifier type with regex validation to ensure hook and column names are safe PostgreSQL identifiers, preventing SQL injection, path traversal, and environment variable injection attacks
Summary
sdk/py/) with@hookdecorator,convention()declarations, testing harness, manifest generation, and deploy toolingfeaturedomain withFeatureStoreport andPostgresFeatureStoreadapter that creates per-convention PG schemas with typed columns from hook manifestsDockerValidatorRunnerwithOciHookRunnersupporting progress tracking, rejection, feature collection, and security hardeningCheckResult/CheckStatuswithHookResult/HookStatusonValidationRun; remove dead old validator porthooksJSON column andfeature_tablescatalog tableTest plan