Add generic LLM worker
README.md
@@ -22,6 +22,26 @@ The service is intentionally domain-agnostic:
This keeps the AI service as shared infrastructure rather than a telephony-specific
service.

## Built-in workers

The first built-in worker processes `llm_chat` and `chat_completion` jobs whose
`model_profile` equals `LLM_MODEL`.
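
The selection rule above could be sketched roughly as follows. This is an illustration only, not the service's code: the `should_handle` helper and the `type` key used for the job type are assumptions, and the real job schema may differ. Only the job type names, `model_profile`, and `LLM_MODEL` come from this README.

```python
import os

# Job types the built-in LLM worker handles, as listed in the README.
LLM_JOB_TYPES = {"llm_chat", "chat_completion"}


def should_handle(job: dict, llm_model: str) -> bool:
    """Return True if a worker configured with llm_model should take the job.

    Assumes (hypothetically) that a job is a dict with "type" and
    "model_profile" keys.
    """
    return (
        job.get("type") in LLM_JOB_TYPES
        and job.get("model_profile") == llm_model
    )


if __name__ == "__main__":
    # Fall back to the documented default when LLM_MODEL is unset.
    model = os.environ.get("LLM_MODEL", "qwen2.5-14b")
    print(should_handle({"type": "llm_chat", "model_profile": model}, model))
```

Under this reading, jobs with another type or a different `model_profile` are left for other workers.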

Input can be either explicit messages:

```json
{
  "messages": [
    {"role": "system", "content": "Answer as JSON."},
    {"role": "user", "content": "Classify this text"}
  ],
  "max_tokens": 256
}
```

or compact `system` / `user` fields. The completed job result contains
`content`, `model`, `usage` and `duration_ms`.
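
One plausible way to treat the two input shapes is to expand the compact fields into the explicit `messages` form before calling the model. The sketch below assumes that `system` and `user` each map to a single message with the matching role; that mapping and the `normalize_input` helper are assumptions, not behavior confirmed by this README.

```python
def normalize_input(payload: dict) -> list:
    """Return a chat message list from either supported input shape.

    Hypothetical helper: explicit "messages" are passed through,
    otherwise compact "system" / "user" fields are expanded.
    """
    if "messages" in payload:
        return list(payload["messages"])

    messages = []
    if payload.get("system"):
        messages.append({"role": "system", "content": payload["system"]})
    if payload.get("user"):
        messages.append({"role": "user", "content": payload["user"]})
    if not messages:
        raise ValueError("input needs `messages` or `system` / `user` fields")
    return messages


if __name__ == "__main__":
    compact = {"system": "Answer as JSON.", "user": "Classify this text"}
    print(normalize_input(compact))
```

With this approach both inputs reach the model in the same shape, so the rest of the worker only has to deal with one format.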

## API

- `POST /api/v1/jobs` creates one job.
@@ -48,6 +68,9 @@ service.
- `LLM_MODEL`, default `qwen2.5-14b`
- `LLM_TIMEOUT`, default `5m`
- `WHISPERX_URL`, WhisperX endpoint for transcription jobs
- `WORKER_ID`, default hostname
- `WORKER_POLL_INTERVAL`, default `2s`
- `WORKER_CLAIM_LIMIT`, default `4`
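
As a rough sketch of how the settings listed above and their defaults might be read, the following Python reads the environment and applies the documented defaults. The `parse_duration` and `load_worker_config` helpers and the returned key names are hypothetical. The parser assumes durations are single-unit values such as `5m` or `2s`, which may be narrower than what the service actually accepts.

```python
import os
import re
import socket

# Seconds per supported unit; only simple single-unit durations are handled.
_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration(text: str) -> float:
    """Parse a simple duration such as "2s" or "5m" into seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def load_worker_config(env=os.environ) -> dict:
    """Read worker settings, falling back to the defaults from the README."""
    return {
        "llm_model": env.get("LLM_MODEL", "qwen2.5-14b"),
        "llm_timeout_s": parse_duration(env.get("LLM_TIMEOUT", "5m")),
        "whisperx_url": env.get("WHISPERX_URL"),  # no default documented
        "worker_id": env.get("WORKER_ID") or socket.gethostname(),
        "poll_interval_s": parse_duration(env.get("WORKER_POLL_INTERVAL", "2s")),
        "claim_limit": int(env.get("WORKER_CLAIM_LIMIT", "4")),
    }


if __name__ == "__main__":
    print(load_worker_config())
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check in isolation.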

## Next integration step