1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
|
# CI queue Telegram alerts (`alert_queued_jobs.py`)
`alert_queued_jobs.py` monitors the GitHub Actions queue and sends Telegram alerts when runs are stuck longer than configured thresholds. It runs in the **CI queue Telegram alerts** job of [telegram_scheduled_notifications.yml](../../workflows/telegram_scheduled_notifications.yml) (every 30 min + manual `workflow_dispatch`). That workflow file also contains the **Mute digest to Telegram** job (`send_digest.py`).
**Flow:** Fetch `queued` runs from GitHub API → filter by blacklist → compare wait time to thresholds per workflow type → send 1–2 messages to Telegram (optionally with a call string at the start for a duty bot). Can also send an “all good” message when the queue is empty or no jobs are stuck.
---
## GitHub: Secrets & Variables
**Settings → Secrets and variables → Actions.**
| Type | Name | Description |
|------|------|-------------|
| Secret | `TELEGRAM_YDBOT_TOKEN` | Bot token (passed as `TELEGRAM_BOT_TOKEN` in the workflow). |
| Variable | `GH_ALERTS_TG_LOGINS` | Call string at the **start** of alert messages so the duty bot reacts (e.g. `"/duty ydb-ci"` or `"@user"`). Empty = no call. |
| Variable | `GH_ALERTS_RUNS_BLACK_LIST` | Space-separated run IDs to ignore (passed as `--blacklist`). |
Workflow also uses: `--chat-id 3017506311`, `--send-when-all-good`, `--notify-on-api-errors`.
---
## Script config (top of file)
**WORKFLOW_THRESHOLDS:** list of `(pattern, display_name, threshold_spec)`. `threshold_spec` = fixed hours (e.g. `6`) or time-based `[(start_utc, end_utc, hours), ...]` (overnight when `start > end`).
**Defaults:** PR-check — 1 h (08:00–20:00 UTC), 3 h (20:00–08:00 UTC); Postcommit — 6 h fixed. Other constants: API URL, timeouts, max jobs per message, dashboard link.
---
## Local testing
**Dry-run** (no token/chat needed; prints messages that would be sent):
```bash
pip install requests
python .github/scripts/telegram/alert_queued_jobs.py --dry-run --chat-id 0
```
With call string (e.g. for duty bot):
```bash
GH_ALERTS_TG_LOGINS="/duty ydb-ci" python .github/scripts/telegram/alert_queued_jobs.py --dry-run --chat-id 0
```
Output shows `--- Message 1 ---` / `--- Message 2 ---`. The second message (call + stuck list) only appears when there are runs over the threshold.
**Test Telegram connection** (token + chat required):
```bash
python .github/scripts/telegram/alert_queued_jobs.py --test-connection --bot-token "..." --chat-id "..."
```
---
## CLI & env
| Option / Env | Description |
|--------------|-------------|
| `--dry-run` / `DRY_RUN=true` | Print messages, do not send. |
| `--chat-id`, `--channel` | Telegram chat/channel ID. |
| `--bot-token` | Or `TELEGRAM_BOT_TOKEN`. |
| `--thread-id` | Or `TELEGRAM_THREAD_ID`. |
| `--test-connection` | Only check chat access (getChat). |
| `--send-when-all-good` | Send even when queue is fine. |
| `--notify-on-api-errors` | Notify on GitHub API errors. |
| `--blacklist` | Space-separated run IDs to ignore. |
| `GH_ALERTS_TG_LOGINS` | Same as repo variable (call at message start). |
| `GITHUB_REPOSITORY` | Default `ydb-platform/ydb` for links. |
---
## Message format
- If `GH_ALERTS_TG_LOGINS` is set, **alert and API-error messages start with that string** (so the duty bot sees the command first).
- Message 1: summary (queue size, stuck count, per-workflow stats).
- Message 2 (only when there are stuck jobs): call (if set), then stuck list and dashboard link.
**Requirements:** Python 3.7+, `requests`.
|