Add generic AI job queue lifecycle

Grendgi
2026-06-08 13:32:43 +03:00
parent e2f2adf900
commit 9d65ee323c
9 changed files with 262 additions and 5 deletions

@@ -5,11 +5,31 @@ Technical AI job service for Portal workloads.
The first version owns only AI job lifecycle and metrics. Business data stays in
domain services such as `telephony`, `monitoring-tg` and `monitoring-pf`.
## Generic job contract
The service is intentionally domain-agnostic:
- `owner_service` names the caller, for example `telephony`, `monitoring-tg`,
`monitoring-pf` or a future Portal module.
- `owner_ref` is the caller's stable object reference, for example
`beeline/{call_id}` or `channel/{message_id}`.
- `task_type` describes the technical task class, for example
`transcribe`, `call_analysis`, `tg_analysis`, `pf_competitor_analysis`.
- `model_profile` selects a runtime profile, for example `whisperx`,
`qwen2.5-14b`, `vision`, or a future provider profile.
- `input` and `result` are JSON payloads owned by the caller and worker.
This keeps the AI service as shared infrastructure rather than a
telephony-specific service.
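As a rough illustration of the contract above, a caller-side payload could be modelled like this. This is a minimal sketch, not the service's actual schema: the `JobRequest` class, its `to_json` helper, the `audio_url` key and the example URL are hypothetical; only the field names `owner_service`, `owner_ref`, `task_type`, `model_profile` and `input` come from the README.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class JobRequest:
    """Hypothetical caller-side view of a job; field names follow the README."""

    owner_service: str   # who is asking, e.g. "telephony"
    owner_ref: str       # caller's stable object reference, e.g. "beeline/{call_id}"
    task_type: str       # technical task class, e.g. "transcribe"
    model_profile: str   # runtime profile, e.g. "whisperx"
    # Opaque JSON payload owned by the caller and worker; the keys used
    # below are invented for this example.
    input: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the request as a JSON object with sorted keys."""
        return json.dumps(asdict(self), sort_keys=True)


job = JobRequest(
    owner_service="telephony",
    owner_ref="beeline/12345",
    task_type="transcribe",
    model_profile="whisperx",
    input={"audio_url": "https://example.invalid/call.wav"},
)
print(job.to_json())
```

Because `input` is an untyped mapping, a new caller can introduce its own payload shape without any change to the shared fields.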
## API
- `POST /api/v1/jobs` creates one job.
- `POST /api/v1/jobs/batch` creates many jobs with shared defaults.
- `POST /api/v1/jobs/claim` atomically claims pending jobs for a worker.
- `GET /api/v1/jobs/{id}` returns technical job state and result.
- `POST /api/v1/jobs/{id}/complete` stores a successful job result.
- `POST /api/v1/jobs/{id}/fail` stores a failed job category and message.
- `POST /api/v1/jobs/{id}/retry` resets failed/running jobs to `pending`.
- `GET /api/v1/stats` returns queue and error counters.
- `GET /healthz` returns process health.
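The lifecycle implied by these endpoints can be sketched as a small in-memory state machine. This is an assumption-heavy illustration, not the service's implementation: the `InMemoryJobQueue` class is hypothetical, the `completed` status name is a guess (the README only names `pending`, `failed` and `running`), and the rule that only `running` jobs may be completed or failed is assumed. A lock stands in for whatever atomic claim mechanism the real storage layer uses.

```python
import threading
from collections import Counter
from typing import Any, Dict, List


class InMemoryJobQueue:
    """Hypothetical in-process analogue of the job lifecycle endpoints."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._jobs: Dict[int, Dict[str, Any]] = {}
        self._next_id = 1

    def create(self, task_type: str, payload: Dict[str, Any]) -> int:
        """Analogue of POST /jobs: new jobs start as `pending`."""
        with self._lock:
            job_id = self._next_id
            self._next_id += 1
            self._jobs[job_id] = {
                "id": job_id, "task_type": task_type, "input": payload,
                "status": "pending", "worker": None,
                "result": None, "error": None,
            }
            return job_id

    def claim(self, worker: str, limit: int = 1) -> List[int]:
        """Analogue of POST /jobs/claim: the lock ensures that no job
        is handed to two workers."""
        with self._lock:
            claimed: List[int] = []
            for job in self._jobs.values():
                if len(claimed) >= limit:
                    break
                if job["status"] == "pending":
                    job["status"] = "running"
                    job["worker"] = worker
                    claimed.append(job["id"])
            return claimed

    def _require(self, job_id: int, *allowed: str) -> Dict[str, Any]:
        # Must be called with the lock held.
        job = self._jobs[job_id]
        if job["status"] not in allowed:
            raise ValueError(f"job {job_id} is {job['status']}")
        return job

    def complete(self, job_id: int, result: Dict[str, Any]) -> None:
        """Analogue of POST /jobs/{id}/complete (assumed: running only)."""
        with self._lock:
            job = self._require(job_id, "running")
            job.update(status="completed", result=result)

    def fail(self, job_id: int, category: str, message: str) -> None:
        """Analogue of POST /jobs/{id}/fail (assumed: running only)."""
        with self._lock:
            job = self._require(job_id, "running")
            job.update(status="failed",
                       error={"category": category, "message": message})

    def retry(self, job_id: int) -> None:
        """Analogue of POST /jobs/{id}/retry: failed/running -> pending."""
        with self._lock:
            job = self._require(job_id, "failed", "running")
            job.update(status="pending", worker=None, error=None)

    def stats(self) -> Dict[str, int]:
        """Analogue of GET /stats: job counts per status."""
        with self._lock:
            return dict(Counter(j["status"] for j in self._jobs.values()))


queue = InMemoryJobQueue()
a = queue.create("transcribe", {})
b = queue.create("call_analysis", {})
first = queue.claim("worker-1", limit=1)   # takes the oldest pending job
second = queue.claim("worker-2", limit=5)  # only the remaining job is left
queue.complete(a, {"text": "hello"})
queue.fail(b, "timeout", "model did not respond")
queue.retry(b)                             # failed job returns to pending
print(queue.stats())
```

In a database-backed service the same guarantee would typically come from a transactional update rather than a process-local lock; which mechanism this service uses is not stated here.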