Grendgi 24c5d89c7b
Add generic LLM worker
2026-06-08 13:52:29 +03:00

AI Service

Technical AI job service for Portal workloads.

The first version owns only the AI job lifecycle and metrics. Business data stays in the domain services, such as telephony, monitoring-tg, and monitoring-pf.

Generic job contract

The service is intentionally domain-agnostic:

  • owner_service names the caller, for example telephony, monitoring-tg, monitoring-pf or a future Portal module.
  • owner_ref is the caller's stable object reference, for example beeline/{call_id} or channel/{message_id}.
  • task_type describes the technical task class, for example transcribe, call_analysis, tg_analysis, pf_competitor_analysis.
  • model_profile selects a runtime profile, for example whisperx, qwen2.5-14b, vision, or a future provider profile.
  • input and result are JSON payloads owned by the caller and worker.

This keeps the AI service shared infrastructure rather than a telephony-specific service.
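
As an illustration of the contract above, a caller could model a job request roughly like this. This is a minimal sketch: the field names come from the list above, but the Go types, the JSON tags, the NewJobRequest helper, and the sample owner_ref value are assumptions rather than the service's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JobRequest mirrors the generic contract fields.
// Assumption: the wire format uses the field names as JSON keys.
type JobRequest struct {
	OwnerService string          `json:"owner_service"`
	OwnerRef     string          `json:"owner_ref"`
	TaskType     string          `json:"task_type"`
	ModelProfile string          `json:"model_profile"`
	Input        json.RawMessage `json:"input"` // opaque payload owned by caller and worker
}

// NewJobRequest (hypothetical helper) marshals a caller-owned input value
// into the opaque input payload.
func NewJobRequest(ownerService, ownerRef, taskType, modelProfile string, input any) (JobRequest, error) {
	raw, err := json.Marshal(input)
	if err != nil {
		return JobRequest{}, fmt.Errorf("marshal input: %w", err)
	}
	return JobRequest{
		OwnerService: ownerService,
		OwnerRef:     ownerRef,
		TaskType:     taskType,
		ModelProfile: modelProfile,
		Input:        raw,
	}, nil
}

func main() {
	// "beeline/12345" is a made-up reference in the beeline/{call_id} shape.
	req, err := NewJobRequest("telephony", "beeline/12345", "call_analysis", "qwen2.5-14b",
		map[string]string{"user": "Classify this text"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	out, _ := json.MarshalIndent(req, "", "  ")
	fmt.Println(string(out))
}
```

Keeping input as json.RawMessage reflects the domain-agnostic intent: the service can store and forward the payload without knowing its schema.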

Built-in workers

The first built-in worker processes llm_chat and chat_completion jobs whose model_profile equals LLM_MODEL.

Input can be either explicit messages:

{
  "messages": [
    {"role": "system", "content": "Answer as JSON."},
    {"role": "user", "content": "Classify this text"}
  ],
  "max_tokens": 256
}

or the compact system and user fields. A completed job's result contains content, model, usage, and duration_ms.
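
One way a worker might accept both input shapes is to normalize them into a single message list before calling the model. The sketch below is illustrative only: the ChatInput type, the NormalizeMessages function, the rule that messages takes precedence over system and user, and the requirement that user be non-empty are all assumptions, not documented behavior.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Message is one chat turn in the explicit form.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// ChatInput covers both input shapes (hypothetical type).
type ChatInput struct {
	Messages  []Message `json:"messages,omitempty"`
	System    string    `json:"system,omitempty"`
	User      string    `json:"user,omitempty"`
	MaxTokens int       `json:"max_tokens,omitempty"`
}

// NormalizeMessages returns explicit messages when present and otherwise
// builds them from the compact fields.
// Assumption: explicit messages win if both forms are supplied.
func NormalizeMessages(raw []byte) ([]Message, error) {
	var in ChatInput
	if err := json.Unmarshal(raw, &in); err != nil {
		return nil, fmt.Errorf("decode input: %w", err)
	}
	if len(in.Messages) > 0 {
		return in.Messages, nil
	}
	if in.User == "" {
		return nil, errors.New("input needs either messages or a user field")
	}
	var msgs []Message
	if in.System != "" {
		msgs = append(msgs, Message{Role: "system", Content: in.System})
	}
	return append(msgs, Message{Role: "user", Content: in.User}), nil
}

func main() {
	msgs, err := NormalizeMessages([]byte(`{"system": "Answer as JSON.", "user": "Classify this text"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, m := range msgs {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```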

API

  • POST /api/v1/jobs creates one job.
  • POST /api/v1/jobs/batch creates many jobs with shared defaults.
  • POST /api/v1/jobs/claim atomically claims pending jobs for a worker.
  • GET /api/v1/jobs/{id} returns technical job state and result.
  • POST /api/v1/jobs/{id}/complete stores a successful job result.
  • POST /api/v1/jobs/{id}/fail stores a failed job category and message.
  • POST /api/v1/jobs/{id}/retry resets failed/running jobs to pending.
  • GET /api/v1/stats returns queue and error counters.
  • GET /api/v1/providers/status checks configured AI providers without returning secrets.
  • GET /healthz returns process health.
  • GET /readyz checks PostgreSQL readiness.
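
To show how the claim and complete endpoints fit together, here is a minimal sketch of one worker iteration. Only the endpoint paths come from the list above. The request and response bodies (worker_id, limit, a JSON array of jobs, a result wrapper) are assumptions, and newFakeService is a local stand-in so the example runs without the real service.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// postJSON posts body as JSON and, if out is non-nil, decodes the JSON response into it.
func postJSON(client *http.Client, baseURL, path string, body, out any) error {
	payload, err := json.Marshal(body)
	if err != nil {
		return fmt.Errorf("encode request: %w", err)
	}
	resp, err := client.Post(baseURL+path, "application/json", bytes.NewReader(payload))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		msg, _ := io.ReadAll(io.LimitReader(resp.Body, 1024))
		return fmt.Errorf("POST %s: %s: %s", path, resp.Status, msg)
	}
	if out == nil {
		return nil
	}
	return json.NewDecoder(resp.Body).Decode(out)
}

// claimedJob is an assumed response shape for a claimed job.
type claimedJob struct {
	ID       string `json:"id"`
	TaskType string `json:"task_type"`
}

// runOnce claims up to limit jobs and marks each one complete.
// A real worker would do the actual processing between the two calls.
func runOnce(client *http.Client, baseURL, workerID string, limit int) (int, error) {
	var jobs []claimedJob
	claim := map[string]any{"worker_id": workerID, "limit": limit} // assumed body
	if err := postJSON(client, baseURL, "/api/v1/jobs/claim", claim, &jobs); err != nil {
		return 0, err
	}
	for _, j := range jobs {
		done := map[string]any{"result": map[string]any{"content": "ok"}} // assumed body
		if err := postJSON(client, baseURL, "/api/v1/jobs/"+j.ID+"/complete", done, nil); err != nil {
			return 0, err
		}
	}
	return len(jobs), nil
}

// newFakeService is a stand-in server for this example, not the real service.
func newFakeService() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/v1/jobs/claim", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, `[{"id":"job-1","task_type":"llm_chat"}]`)
	})
	mux.HandleFunc("/api/v1/jobs/job-1/complete", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newFakeService()
	defer srv.Close()
	n, err := runOnce(srv.Client(), srv.URL, "worker-a", 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("completed jobs:", n)
}
```

On failure, a worker would presumably call the fail endpoint with a category and message instead of complete; that branch is omitted here to keep the sketch short.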

Configuration

  • HTTP_HOST, default 0.0.0.0
  • HTTP_PORT, default 8080
  • DATABASE_URL, required
  • MIGRATE_ON_START, default true
  • LLM_BASE_URL, primary OpenAI-compatible LLM endpoint
  • LLM_API_KEY, primary LLM API key
  • LLM_MODEL, default qwen2.5-14b
  • LLM_TIMEOUT, default 5m
  • WHISPERX_URL, WhisperX endpoint for transcription jobs
  • WORKER_ID, default hostname
  • WORKER_POLL_INTERVAL, default 2s
  • WORKER_CLAIM_LIMIT, default 4
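
The variables and defaults above could be loaded along these lines. This is a sketch under stated assumptions, not the service's actual loader: the Config struct, the LoadConfig function, and the choice to treat an empty value as unset are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config holds the settings listed above (hypothetical struct).
type Config struct {
	HTTPHost           string
	HTTPPort           int
	DatabaseURL        string
	MigrateOnStart     bool
	LLMBaseURL         string
	LLMAPIKey          string
	LLMModel           string
	LLMTimeout         time.Duration
	WhisperXURL        string
	WorkerID           string
	WorkerPollInterval time.Duration
	WorkerClaimLimit   int
}

// LoadConfig reads settings through getenv (pass os.Getenv in production)
// and applies the documented defaults. Empty values count as unset.
func LoadConfig(getenv func(string) string) (Config, error) {
	get := func(key, def string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return def
	}
	hostname, _ := os.Hostname() // best effort; empty if unavailable

	cfg := Config{
		HTTPHost:    get("HTTP_HOST", "0.0.0.0"),
		DatabaseURL: getenv("DATABASE_URL"),
		LLMBaseURL:  getenv("LLM_BASE_URL"),
		LLMAPIKey:   getenv("LLM_API_KEY"),
		LLMModel:    get("LLM_MODEL", "qwen2.5-14b"),
		WhisperXURL: getenv("WHISPERX_URL"),
		WorkerID:    get("WORKER_ID", hostname),
	}
	if cfg.DatabaseURL == "" {
		return Config{}, errors.New("DATABASE_URL is required")
	}
	var err error
	if cfg.HTTPPort, err = strconv.Atoi(get("HTTP_PORT", "8080")); err != nil {
		return Config{}, fmt.Errorf("HTTP_PORT: %w", err)
	}
	if cfg.MigrateOnStart, err = strconv.ParseBool(get("MIGRATE_ON_START", "true")); err != nil {
		return Config{}, fmt.Errorf("MIGRATE_ON_START: %w", err)
	}
	if cfg.LLMTimeout, err = time.ParseDuration(get("LLM_TIMEOUT", "5m")); err != nil {
		return Config{}, fmt.Errorf("LLM_TIMEOUT: %w", err)
	}
	if cfg.WorkerPollInterval, err = time.ParseDuration(get("WORKER_POLL_INTERVAL", "2s")); err != nil {
		return Config{}, fmt.Errorf("WORKER_POLL_INTERVAL: %w", err)
	}
	if cfg.WorkerClaimLimit, err = strconv.Atoi(get("WORKER_CLAIM_LIMIT", "4")); err != nil {
		return Config{}, fmt.Errorf("WORKER_CLAIM_LIMIT: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	// The API key is deliberately not printed.
	fmt.Printf("listening on %s:%d as worker %q, model %s\n",
		cfg.HTTPHost, cfg.HTTPPort, cfg.WorkerID, cfg.LLMModel)
}
```

Passing the lookup function in, rather than calling os.Getenv directly, lets tests supply a map-backed environment.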

Next integration step

The telephony service should first mirror low-risk analysis jobs into this service while continuing to process them locally. Remote execution can then be enabled per task type behind a feature flag.
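
The staged rollout described above could be expressed on the caller side as a per-task-type mode. The sketch below is purely illustrative: the Mode values, the Rollout type, and the Plan method are hypothetical names, and defaulting unknown task types to local-only processing is an assumption.

```go
package main

import "fmt"

// Mode is the rollout stage for one task type (hypothetical).
type Mode string

const (
	ModeLocal  Mode = "local"  // process locally only
	ModeMirror Mode = "mirror" // process locally and also submit to the AI service
	ModeRemote Mode = "remote" // rely on the AI service for execution
)

// Rollout maps task_type to its current stage, acting as the feature flag.
type Rollout map[string]Mode

// Plan reports whether a job of the given type should run locally and
// whether it should be submitted to the AI service.
// Assumption: task types without an entry stay local-only.
func (r Rollout) Plan(taskType string) (runLocal, submitRemote bool) {
	switch r[taskType] {
	case ModeMirror:
		return true, true
	case ModeRemote:
		return false, true
	default:
		return true, false
	}
}

func main() {
	rollout := Rollout{
		"call_analysis": ModeMirror,
		"transcribe":    ModeLocal,
	}
	for _, t := range []string{"call_analysis", "transcribe"} {
		local, remote := rollout.Plan(t)
		fmt.Printf("%s: local=%v remote=%v\n", t, local, remote)
	}
}
```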
