tencentcloud-cls-alarm

Tencent Cloud CLS — Alarm Management

Create, read, update, and delete CLS alarm policies, notice groups, and mute shields, and read the alarm execution log.

Setup: see tencentcloud authentication. The region comes from the TENCENTCLOUD_REGION environment variable; CLS alarms are scoped per region.

Companion skill: tencentcloud-cls for log content search.

CLI (preferred)

The skill ships scripts/cls_alarm.py, which wraps every alarm, notice, and shield operation as a subcommand.

A=$SKILL_DIR/scripts/cls_alarm.py

python3 $A alarms                              # list policies
python3 $A alarm <alarm-id>                    # full detail
python3 $A alarm-disable <alarm-id>            # quick mute (Status=false)
python3 $A alarm-enable  <alarm-id>
python3 $A alarm-create --json /tmp/alarm.json --dry-run
python3 $A alarm-modify <alarm-id> --condition '$1.cnt > 20'
python3 $A alarm-delete <alarm-id> --yes       # destructive

python3 $A notices                             # notice groups
python3 $A notice <notice-id>

python3 $A shields                             # mute rules
# `date -v+2H` is BSD/macOS syntax; on GNU/Linux use: date -d '+2 hours' +%s
python3 $A shield-create --notice-id <notice-id> --start $(date +%s) --end $(date -v+2H +%s) --type 1 --reason "deploy"

python3 $A alarm-log --time 6h                 # firing history

For the rare schema field the CLI doesn't surface, fall back to the raw SDK examples below.

When to Use

Inspect existing alarm policies (filter by name, enabled flag)
Create a new log-based alarm: CQL/SQL query → trigger condition → notice group
Modify thresholds, query, period, recipients
Quick enable / disable to mute during incidents
Manage notice groups (recipients: users, groups, webhooks, with escalation)
Time-bounded shield rules to mute during deploys
Pull the alarm execution log to see which alarms fired and when

Dependencies

pip install tencentcloud-sdk-python

Quick start

import os
import json
from tencentcloud.common import credential
from tencentcloud.cls.v20201016 import cls_client, models

cred = credential.EnvironmentVariableCredential().get_credential()
# Falls back to ap-hongkong when TENCENTCLOUD_REGION is unset.
client = cls_client.ClsClient(cred, os.environ.get("TENCENTCLOUD_REGION", "ap-hongkong"))
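
EnvironmentVariableCredential reads the secret pair from the environment, so it can be worth checking the variables up front rather than debugging a failed first request. A minimal pre-flight sketch, assuming the SDK's standard variable names TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY; the helper name missing_env is hypothetical:

```python
import os

# Variable names assumed to be the ones the SDK's environment credential reads.
REQUIRED_VARS = ("TENCENTCLOUD_SECRET_ID", "TENCENTCLOUD_SECRET_KEY")


def missing_env(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling missing_env() before building the client and stopping when the list is non-empty gives a clearer message than an authentication failure later on.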

Key concepts

Object	API methods	What it is
Alarm policy	`DescribeAlarms` / `CreateAlarm` / `ModifyAlarm` / `DeleteAlarm`	Query against one or more topics + condition + notice groups
Notice group	`DescribeAlarmNotices` / `CreateAlarmNotice` / …	Recipients (users / groups / webhooks) + escalation
Shield	`DescribeAlarmShields` / `CreateAlarmShield` / …	Time-bounded mute scoped to a notice group
Alarm log	`GetAlarmLog`	Firing history

Workflows

List alarm policies

req = models.DescribeAlarmsRequest()
req.Limit = 100
# Optional filters: req.Filters = [{"Key": "name", "Values": ["DeepSeek"]}]
resp = client.DescribeAlarms(req)
for a in resp.Alarms:
    print(a.AlarmId, a.Name, "Enable=", a.Enable, "Status=", a.Status)
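
Limit = 100 returns a single page, so accounts with more policies need paging. A minimal sketch, assuming the request accepts Offset alongside Limit and the response carries TotalCount (check both against the API reference); paginate and fetch_alarms are hypothetical helper names:

```python
def paginate(fetch_page, limit=100):
    """Yield items from fetch_page(offset, limit) -> (items, total) until exhausted."""
    offset = 0
    while True:
        items, total = fetch_page(offset, limit)
        if not items:
            return
        yield from items
        offset += len(items)
        if offset >= total:
            return


def fetch_alarms(client, models):
    """Build a page fetcher for DescribeAlarms (Offset / TotalCount are assumed names)."""
    def fetch_page(offset, limit):
        req = models.DescribeAlarmsRequest()
        req.Offset = offset
        req.Limit = limit
        resp = client.DescribeAlarms(req)
        return resp.Alarms or [], resp.TotalCount or 0
    return fetch_page
```

With the client from Quick start, `for a in paginate(fetch_alarms(client, models)): print(a.AlarmId, a.Name)` would walk every page.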

Get a single alarm policy in full

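req = models.DescribeAlarmsRequest()
req.Filters = [{"Key": "alarmId", "Values": ["<alarm-id>"]}]
resp = client.DescribeAlarms(req)
if not resp.Alarms:
    # An empty list usually means a wrong ID or a different region.
    raise SystemExit("No alarm found with that ID in this region")
print(json.dumps(resp.Alarms[0]._serialize(), indent=2, ensure_ascii=False))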

Create an alarm

req = models.CreateAlarmRequest()
req.Name = "OpenAI 5xx burst"
req.AlarmTargets = [{
    "TopicId": "<trace-topic-id>",
    "Query": "* | select count(*) as cnt where status_code >= 500",
    "Number": 1,
    "StartTimeOffset": -5,
    "EndTimeOffset": 0,
    "SyntaxRule": 1,             # 1 = CQL, 0 = Lucene
}]
req.MonitorTime = {"Type": "Period", "Time": 1}   # run the query every 1 minute
req.Condition = "$1.cnt > 10"    # $1 = the target whose Number is 1
req.AlarmPeriod = 5              # repeat-notification interval, in minutes
req.TriggerCount = 1             # fire after 1 consecutive match
req.AlarmLevel = 2               # 0=Warn, 1=Info, 2=Critical
req.AlarmNoticeIds = ["<notice-id-a>", "<notice-id-b>"]
req.MonitorObjectType = 0
req.Enable = True

resp = client.CreateAlarm(req)
print("Created", resp.AlarmId)
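
The Condition string refers to query results by position: $1 is the target whose Number is 1. A small pre-flight check can catch a condition that points at a target that does not exist before the API rejects it. This is only a sketch; check_condition_refs is a hypothetical helper, and the regex covers just the $N.field form used above:

```python
import re

# Matches references such as "$1." and captures the target index.
_REF = re.compile(r"\$(\d+)\.")


def check_condition_refs(condition, alarm_targets):
    """Return the sorted $N indexes used in condition that match no target Number."""
    defined = {int(target["Number"]) for target in alarm_targets}
    used = {int(index) for index in _REF.findall(condition)}
    return sorted(used - defined)
```

An empty list means every reference resolves; for the request above, `check_condition_refs(req.Condition, req.AlarmTargets)` would be the call.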

Quick enable / disable

req = models.ModifyAlarmRequest()
req.AlarmId = "<alarm-id>"
req.Status = False               # mute (Status — runtime), keep Enable=True
client.ModifyAlarm(req)

Enable vs Status are different fields. Enable=False archives the alarm definition; Status=False is the everyday on/off switch you want for muting.

Modify thresholds without rewriting the whole payload

req = models.ModifyAlarmRequest()
req.AlarmId = "<alarm-id>"
req.Condition = "$1.cnt > 20"
req.AlarmPeriod = 10
client.ModifyAlarm(req)

Delete (destructive)

# Confirm with the user first — there's no undo.
req = models.DeleteAlarmRequest()
req.AlarmId = "<alarm-id>"
client.DeleteAlarm(req)
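
One way to enforce the confirmation is to gate the call behind a prompt that requires the exact alarm ID to be typed back. A minimal sketch; confirm_delete is a hypothetical helper, and in a chat setting the prompt would be replaced by asking the user directly:

```python
def confirm_delete(alarm_id, ask=input):
    """Return True only if the user types the alarm ID back exactly."""
    answer = ask(f"Type {alarm_id} to permanently delete this alarm: ")
    return answer.strip() == alarm_id
```

The delete call then runs only inside `if confirm_delete(req.AlarmId):`. Requiring the ID rather than a plain "yes" makes it harder to confirm the wrong alarm by reflex.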

Manage notice groups

# List
req = models.DescribeAlarmNoticesRequest()
req.Limit = 100
for n in client.DescribeAlarmNotices(req).AlarmNotices:
    print(n.AlarmNoticeId, n.Name, n.Type)

# Create
req = models.CreateAlarmNoticeRequest()
req.Name = "Ops on-call"
req.Type = "Trigger"             # Trigger | Recovery | All
req.NoticeReceivers = [{
    "ReceiverType": "Uin",
    "ReceiverIds": [123456],
    "ReceiverChannels": ["Email", "Sms"],
    "StartTime": "00:00:00",
    "EndTime": "23:59:59",
}]
req.WebCallbacks = [{
    "Url": "https://hooks.example.com/cls",
    "CallbackType": "WeCom",
    "Method": "POST",
}]
resp = client.CreateAlarmNotice(req)

Time-bounded shield (mute during deploy)

import time
req = models.CreateAlarmShieldRequest()
req.AlarmNoticeId = "<notice-id>"
req.StartTime = int(time.time())
req.EndTime = int(time.time()) + 7200       # 2 hours
req.Type = 1                                # 1 = full shield, 2 = partial
req.Reason = "Deploy <git-sha>"
client.CreateAlarmShield(req)
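
Computing both timestamps from one clock reading keeps the window exactly as long as intended and avoids platform-specific date flags. A sketch, assuming StartTime and EndTime are epoch seconds as in the example above; shield_window is a hypothetical helper:

```python
import time


def shield_window(hours, now=None):
    """Return (start, end) epoch seconds for a shield lasting `hours` from now."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    start = int(time.time() if now is None else now)
    return start, start + int(hours * 3600)
```

Usage would be `req.StartTime, req.EndTime = shield_window(2)` for a two-hour deploy window.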

Inspect alarm firing history

import time
req = models.GetAlarmLogRequest()
req.From = int((time.time() - 86400) * 1000)   # last 24 hours, in epoch milliseconds
req.To = int(time.time() * 1000)
req.Query = 'alarm_name:"OpenAI*"'
req.Limit = 100
resp = client.GetAlarmLog(req)
for line in resp.Results:
    print(line.Time, line.LogJson)
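
From and To are epoch milliseconds. A relative window such as the CLI's --time 6h can be turned into that pair with a small parser. This is a sketch of one possible parsing, not the CLI's actual implementation; window_ms and the accepted suffixes (m, h, d) are assumptions:

```python
import re
import time

_UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400}


def window_ms(spec, now=None):
    """Convert a spec like '6h' into (from_ms, to_ms) ending at `now` (epoch seconds)."""
    match = re.fullmatch(r"(\d+)([mhd])", spec.strip())
    if not match:
        raise ValueError(f"unsupported time window: {spec!r}")
    seconds = int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
    end = time.time() if now is None else now
    return int((end - seconds) * 1000), int(end * 1000)
```

Usage would be `req.From, req.To = window_ms("6h")`.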

Cloning an alarm

The fastest way to create a similar alarm is to read the existing one, strip read-only fields, tweak, and re-create:

import json
src = models.DescribeAlarmsRequest()      # SDK request constructors take no keyword arguments
src.Filters = [{"Key": "alarmId", "Values": ["<src-id>"]}]
existing = client.DescribeAlarms(src).Alarms[0]
payload = json.loads(json.dumps(existing._serialize()))      # deep clone
for k in ("AlarmId", "CreateTime", "UpdateTime"):
    payload.pop(k, None)
payload["Name"] = "OpenAI 5xx burst v2"
# ...edit other fields...
req = models.CreateAlarmRequest()
req.from_json_string(json.dumps(payload))
print(client.CreateAlarm(req).AlarmId)
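
The strip step can be kept as a pure function so the cleaned payload is easy to inspect before it is sent. A sketch; prepare_clone is a hypothetical helper, and the read-only key list is the one used above, which may need extending if the API rejects other server-populated fields:

```python
import copy

READ_ONLY_KEYS = ("AlarmId", "CreateTime", "UpdateTime")


def prepare_clone(serialized, new_name, overrides=None, read_only=READ_ONLY_KEYS):
    """Return a deep copy of a serialized alarm, minus read-only keys, renamed."""
    payload = copy.deepcopy(serialized)
    for key in read_only:
        payload.pop(key, None)
    payload["Name"] = new_name
    # Apply any field edits (for example a new Condition) on top of the copy.
    payload.update(overrides or {})
    return payload
```

The result can be printed with json.dumps for review and then passed to `req.from_json_string(json.dumps(payload))` as in the snippet above.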

Important reminders

Always preview the payload (json.dumps(req._serialize(), indent=2)) before CreateAlarm / ModifyAlarm: the schema is strict and the error messages are unhelpful.
Destructive ops (DeleteAlarm, DeleteAlarmNotice, DeleteAlarmShield) need explicit user confirmation in chat before running.
Enable (archived?) vs Status (currently on?) are different fields. Use Status for everyday muting.
Shield rules are scoped to a single AlarmNoticeId. Iterate over all notice IDs to list all shields globally.
Default region is ap-hongkong; pass another region via cls_client.ClsClient(cred, "ap-singapore") if needed.
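
Because DescribeAlarmShields takes one AlarmNoticeId, a global listing means walking every notice group. A sketch, assuming the response exposes the shields as AlarmShields (verify against the API reference); list_all_shields is a hypothetical helper, and only the first page of each call is read:

```python
def list_all_shields(client, models, limit=100):
    """Return (notice_id, shield) pairs for every notice group in the region."""
    notice_req = models.DescribeAlarmNoticesRequest()
    notice_req.Limit = limit
    notices = client.DescribeAlarmNotices(notice_req).AlarmNotices or []

    pairs = []
    for notice in notices:
        shield_req = models.DescribeAlarmShieldsRequest()
        shield_req.AlarmNoticeId = notice.AlarmNoticeId
        shield_req.Limit = limit
        resp = client.DescribeAlarmShields(shield_req)
        # AlarmShields is an assumed attribute name; treat a missing value as empty.
        for shield in getattr(resp, "AlarmShields", None) or []:
            pairs.append((notice.AlarmNoticeId, shield))
    return pairs
```

Passing the client and models module in as arguments keeps the helper independent of how the client was built, and makes it easy to exercise against stand-in objects.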

Console links

Alarm console: https://console.cloud.tencent.com/cls/alarm/list
Alarm API reference: https://www.tencentcloud.com/document/product/614/56473

Repository

acedatacloud/skills
