# AI Service

Technical AI job service for Portal workloads.

The first version owns only AI job lifecycle and metrics. Business data stays in domain services such as `telephony`, `monitoring-tg` and `monitoring-pf`.

## Generic job contract

The service is intentionally domain-agnostic:

- `owner_service` names the caller, for example `telephony`, `monitoring-tg`, `monitoring-pf` or a future Portal module.
- `owner_ref` is the caller's stable object reference, for example `beeline/{call_id}` or `channel/{message_id}`.
- `task_type` describes the technical task class, for example `transcribe`, `call_analysis`, `tg_analysis`, `pf_competitor_analysis`.
- `model_profile` selects a runtime profile, for example `whisperx`, `qwen2.5-14b`, `vision`, or a future provider profile.
- `input` and `result` are JSON payloads owned by the caller and worker.

This keeps AI service as shared infrastructure rather than a telephony-specific service.

## API

- `POST /api/v1/jobs` creates one job.
- `POST /api/v1/jobs/batch` creates many jobs with shared defaults.
- `POST /api/v1/jobs/claim` atomically claims pending jobs for a worker.
- `GET /api/v1/jobs/{id}` returns technical job state and result.
- `POST /api/v1/jobs/{id}/complete` stores a successful job result.
- `POST /api/v1/jobs/{id}/fail` stores a failed job category and message.
- `POST /api/v1/jobs/{id}/retry` resets failed/running jobs to `pending`.
- `GET /api/v1/stats` returns queue and error counters.
- `GET /healthz` returns process health.
- `GET /readyz` checks PostgreSQL readiness.
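As an illustration of the job contract and the endpoints listed above, here is a minimal Python sketch that builds (without sending) the requests for creating and completing a job. The field names come from the contract, but the exact JSON body shape, the `{"result": ...}` wrapper for completion, the base URL, and the helper names `build_create_job_request` and `build_complete_request` are assumptions made for this example, not a confirmed schema.

```python
import json
import urllib.request

# Assumption: hypothetical base URL; the real host and port depend on deployment.
BASE_URL = "http://localhost:8080"


def build_create_job_request(owner_service, owner_ref, task_type,
                             model_profile, job_input, base_url=BASE_URL):
    """Build a POST /api/v1/jobs request from the generic contract fields.

    Assumption: the service accepts the contract fields as a flat JSON object.
    """
    body = {
        "owner_service": owner_service,
        "owner_ref": owner_ref,
        "task_type": task_type,
        "model_profile": model_profile,
        "input": job_input,  # caller-owned JSON payload
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_complete_request(job_id, result, base_url=BASE_URL):
    """Build a POST /api/v1/jobs/{id}/complete request.

    Assumption: the worker-owned result is sent under a "result" key.
    """
    return urllib.request.Request(
        url=f"{base_url}/api/v1/jobs/{job_id}/complete",
        data=json.dumps({"result": result}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_create_job_request(
        owner_service="telephony",
        owner_ref="beeline/12345",          # hypothetical call id
        task_type="transcribe",
        model_profile="whisperx",
        job_input={"audio_url": "https://example.invalid/call.wav"},
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

A caller would pass the returned object to `urllib.request.urlopen` to send it. Keeping request construction separate from sending makes the payload easy to inspect and test without a running service.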
## Configuration

- `HTTP_HOST`, default `0.0.0.0`
- `HTTP_PORT`, default `8080`
- `DATABASE_URL`, required
- `MIGRATE_ON_START`, default `true`
- `LLM_BASE_URL`, primary OpenAI-compatible LLM endpoint
- `LLM_API_KEY`, primary LLM API key
- `LLM_MODEL`, default `qwen2.5-14b`
- `LLM_TIMEOUT`, default `5m`
- `WHISPERX_URL`, WhisperX endpoint for transcription jobs
- `OPENCLAW_URL`, optional OpenClaw gateway URL if we route through OpenClaw instead of direct vLLM

## Next integration step

`telephony` should first mirror low-risk analysis jobs into this service while continuing local processing. Remote execution can then be enabled by feature flag per task type.

## OpenClaw note

Current Portal services call the local AI server directly: vLLM for LLM tasks and WhisperX for transcription. OpenClaw is not required for the current `ai-service` queue deployment. It becomes useful if we want centralized model routing, provider fallback, request policy and cross-model gateway behavior.
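For reference, the environment variables from the Configuration section could be read by a worker or client roughly as follows. This is a sketch, not the service's own loader: the `Config` class, `load_config`, and `parse_duration` are hypothetical names, the duration parser assumes values like `5m` use only `h`, `m` and `s` units, and the set of strings accepted as true for `MIGRATE_ON_START` is also an assumption.

```python
import os
import re
from dataclasses import dataclass
from typing import Mapping, Optional


def parse_duration(value: str) -> float:
    """Parse values such as '5m', '30s' or '1h30m' into seconds.

    Assumption: only h/m/s units with integer amounts are supported.
    """
    units = {"h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)([hms])", value)
    # Reject input with anything left over besides number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"unsupported duration: {value!r}")
    return float(sum(int(n) * units[u] for n, u in parts))


@dataclass(frozen=True)
class Config:
    http_host: str
    http_port: int
    database_url: str
    migrate_on_start: bool
    llm_base_url: Optional[str]
    llm_api_key: Optional[str]
    llm_model: str
    llm_timeout_seconds: float
    whisperx_url: Optional[str]
    openclaw_url: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Read settings from env, applying the documented defaults."""
    database_url = env.get("DATABASE_URL")
    if not database_url:
        raise ValueError("DATABASE_URL is required")
    # Assumption: these spellings count as true; anything else is false.
    migrate = env.get("MIGRATE_ON_START", "true").strip().lower() in {"1", "true", "yes"}
    return Config(
        http_host=env.get("HTTP_HOST", "0.0.0.0"),
        http_port=int(env.get("HTTP_PORT", "8080")),
        database_url=database_url,
        migrate_on_start=migrate,
        llm_base_url=env.get("LLM_BASE_URL"),
        llm_api_key=env.get("LLM_API_KEY"),
        llm_model=env.get("LLM_MODEL", "qwen2.5-14b"),
        llm_timeout_seconds=parse_duration(env.get("LLM_TIMEOUT", "5m")),
        whisperx_url=env.get("WHISPERX_URL"),
        openclaw_url=env.get("OPENCLAW_URL"),  # optional, see the OpenClaw note
    )
```

Passing the environment as a mapping keeps the loader testable: `load_config({"DATABASE_URL": "..."})` exercises the defaults without touching the real process environment.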