Kami Conflict Detection
Detect physical conflicts (fighting, shoving, scuffling) between 2+ people from one or many RTSP camera streams concurrently. Continuous mode — the process never exits on detection; every confirmed event is pushed as an alert tagged with the originating camera, then monitoring resumes.
Privacy Policy
For privacy policy details, see: https://kamiclaw-skill.kamihome.com/privacy
How It Works
- YOLO pre-filter — lightweight person detection counts people in each frame (must be ≥
min_persons, default 2). - Multi-frame collection — collects N frames with a configurable time gap.
- LLM conflict analysis — frames are sent to the Kami detection API for violence/conflict judgment.
- Continuous alerting — on a confirmed conflict, the worker saves a per-camera video clip, pushes an alert with
camera = <camera_name>to every configured channel, and immediately resumes monitoring. The process keeps running until the user stops it (Ctrl+C / OpenClaw shutdown).
Multi-camera: every entry in
config.json -> cameras[]runs in its own worker thread inside the same process. The YOLO ONNX session, the conflict analyzer and all push channels are shared once across cameras. Push channels are shared — no per-camera webhook split. Stranger / clip / log lines are namespaced by camera name.
When to Use
- Monitor one or many camera feeds for physical fights or scuffles
- Detect shoving, pushing, or violent behavior between people
- Run conflict detection on a local video file for testing
- Set up automated surveillance alerts for physical altercations
Installation
bash setup.sh
No sudo required. The script auto-bootstraps python3.10 in user space (via uv when needed), creates .venv/, installs dependencies, and prepares alerts/. Idempotent — safe to re-run.
Prerequisites
- Linux/macOS shell with
curl(orwget) andunzipavailable yolov8s-worldv2.onnxmodel file in the skill directory (auto-downloaded as a pre-exported ONNX bundle fromhttps://publicfiles.xiaoyi.com/kami-conflict-detection-model.zipon first run if missing — no.ptconversion required)- One or more RTSP cameras online, OR a local video file for testing
- Kami API key (
--kami_api_keyor write toconfig.json). Register at https://kamiclaw-skill.kamihome.com for a free 200-credit quota. setup.shhas been run at least once
Python 3.10 is not a manual prerequisite —
setup.shwill install it locally without sudo if missing.
Parameters
Confirm the following before running. The fields marked (persisted in config.json) can be saved to config.json next to the script so the user does not need to provide them every run — see Configuration Persistence below.
| Parameter | Default | Description |
|-----------|---------|-------------|
| --rtsp_url | (empty; see cameras[] in config.json) | Single-camera CLI override. When set, takes priority over cameras[]. |
| --camera_name | (empty) | Optional camera name used together with --rtsp_url. Defaults to camera_0. |
| --kami_api_key | (persisted in config.json) | Kami API key |
| --yolo_model | yolov8s-worldv2.onnx | YOLO model file path |
| --conf_threshold | 0.25 | YOLO confidence threshold |
| --min_persons | 2 | Minimum person count to trigger LLM analysis |
| --sample_interval | 1.0 | YOLO pre-filter interval (seconds) |
| --multi_frame_count | 3 | Frames per LLM analysis |
| --multi_frame_gap | 0.5 | Gap between collected frames (seconds) |
| --buffer_seconds | 30 | Ring buffer duration for clip export |
| --clip_before | 5 | Seconds of video before the conflict |
| --clip_after | 5 | Seconds of video after the conflict |
| --output_dir | alerts/ | Root directory for saved video clips. Per-camera subfolders (alerts/<camera_name>/) are created automatically. |
| --fps | 15 | Video stream frame rate |
| --inbox_file | alerts/pending.jsonl | Alarm inbox consumed by the heartbeat task |
| --feishu_webhook | (persisted in config.json) | Feishu custom bot webhook URL |
| --feishu_secret | (persisted in config.json) | Feishu signing secret (only if signing enabled) |
| --feishu_app_id | (persisted in config.json) | Feishu self-built app App ID. Required only if you want the conflict snapshot to render inline inside the Feishu card (uploads the image to Feishu and embeds it via image_key). |
| --feishu_app_secret | (persisted in config.json) | Feishu self-built app App Secret, paired with --feishu_app_id. |
| --discord_webhook | (persisted in config.json) | Discord channel webhook URL |
| --telegram_bot_token | (persisted in config.json) | Telegram Bot token |
| --telegram_chat_id | (persisted in config.json) | Telegram target chat/group/channel ID |
| --proxy | (persisted in config.json) | HTTPS proxy for Discord/Telegram (not used for Feishu). Mainland-China users MUST set this for Discord/Telegram. |
Only ask the user about a parameter if (a) it's still empty in config.json AND has no command-line value, OR (b) the user explicitly asks to adjust it. Do NOT pause the conversation for blanket parameter confirmation.
Configuration Persistence (config.json)
A config.json file lives next to the script with the following empty-by-default fields:
{
"cameras": [
{
"name": "",
"rtsp_url": ""
}
],
"kami_api_key": "",
"feishu_webhook": "",
"feishu_secret": "",
"feishu_app_id": "",
"feishu_app_secret": "",
"discord_webhook": "",
"telegram_bot_token": "",
"telegram_chat_id": ""
}
cameras array — one entry per RTSP source. All cameras run concurrently inside a single process, sharing the loaded YOLO ONNX model, the conflict analyzer and the same set of push channels. Add a new entry to the array for each additional camera (e.g. living_room, office_door, …).
| Camera field | Required | Behaviour when empty |
|--------------|----------|----------------------|
| name | recommended | Free-form short label written into every alert as alert.camera, prefixed to alert.message as [name], and used as the snapshot subdirectory under alerts/<name>/. If the user does not supply one, the script auto-assigns camera_0, camera_1, … and logs the assignment. |
| rtsp_url | yes | Entry is skipped if empty. Accepts rtsp://..., http(s)://..., or a local file path. |
Resolution order at runtime: --rtsp_url (single-camera CLI override) → cameras[] in config.json → empty (fatal: EXIT_ERROR).
Workflow OpenClaw MUST follow (single-turn, no extra confirmation):
- On first launch, read
config.jsonand identify which fields are still empty. - Ask the user only for the empty fields:
- Cameras — for each camera the user wants to monitor, ask for the
rtsp_urlAND a cameranamein the same question:- The
nameis a free-form short label, ideally a natural-language tag likeliving_room/front_door/office. It will appear in every alarm to identify which camera fired the event. - When the user is unsure about their RTSP URL format, provide these common brand templates as a reference:
TP-Link: rtsp://<user>:<password>@<ip>:554/stream1 Hikvision(海康): rtsp://<user>:<password>@<ip>:554/Streaming/Channels/101 Dahua(大华): rtsp://<user>:<password>@<ip>:554/cam/realmonitor?channel=1&subtype=0Note:
<user>and<password>are the camera's login credentials (oftenadmin);<ip>is the camera's LAN IP. The port is almost always554. Substream variants (lower resolution / bandwidth) may use path102(Hikvision) orsubtype=1(Dahua). - If the user only provides the
rtsp_urland omits the name, IMMEDIATELY tell them: "No camera name provided. The script will auto-assign an incremental id likecamera_0/camera_1. Do you want to proceed with the auto id?" Only proceed with the auto id once the user accepts (or stays silent / says ok). Do NOT silently assign the auto id without notifying. - Persist each camera as an object in the
camerasarray.
- The
- Kami API key — required for the LLM conflict analysis.
- Push channels — which to enable (Feishu / Discord / Telegram) and their credentials. Push channels are shared across all cameras (no per-camera split); every camera uses the same webhooks.
- Feishu — ask for
feishu_webhook(required). If the user wants the conflict snapshot to render inline inside the Feishu card, ALSO ask forfeishu_app_id+feishu_app_secret(self-built app credentials, used only to upload the snapshot via OpenAPI to obtain animage_key). Without app credentials the snapshot falls back to a clickable image-host URL. - Discord — ask for
discord_webhook. The conflict snapshot is attached as multipart so it renders inline automatically. Mainland-China users MUST also provideproxy. - Telegram — ask for
telegram_bot_token+telegram_chat_id. The conflict snapshot is delivered viasendPhotoand renders inline. Mainland-China users MUST also provideproxy.
- Feishu — ask for
- Cameras — for each camera the user wants to monitor, ask for the
- Write the user's answers back into
config.json(preserve existing non-empty fields). Subsequent launches skip these prompts. - Immediately in the SAME turn, run
bash setup.sh(idempotent) and launch the detector. Do NOT end the turn after step 3 — do NOT wait for the user to say "start" or "begin". - If all required fields are already non-empty on entry, skip steps 2–3 and go straight to step 4.
Alarm Push Channels
Alarms can be pushed through the following channels — all optional, configure any combination:
| Channel | Required Parameters |
|---------|--------------------|
| Feishu (custom bot) | --feishu_webhook (required); --feishu_secret (optional, signing); --feishu_app_id + --feishu_app_secret (optional, enables inline snapshot image via Feishu OpenAPI image_key) |
| Discord (channel webhook) | --discord_webhook (snapshot attached as multipart, renders inline) |
| Telegram (Bot API) | --telegram_bot_token + --telegram_chat_id (snapshot delivered via sendPhoto, renders inline) |
Beyond app push, every alarm is always delivered through two redundant local channels:
- stdout JSON line — OpenClaw reads stdout line-by-line and reports each alert in chat (no exit; the process keeps running).
- Inbox file
alerts/pending.jsonl— appended on every alarm; consumed by the heartbeat task as a fallback.
Refer to the official docs of each platform for how to obtain webhook URLs / bot tokens / chat IDs. In mainland China, Discord and Telegram require a proxy (
--proxyorHTTPS_PROXY). Push card labels are language-fixed: Feishu → Chinese, Discord/Telegram → English.
Usage
# First time only
bash setup.sh
# Multi-camera mode (recommended): write cameras[] into config.json once,
# then just run the script with no extra args.
.venv/bin/python conflict_detector_last.py
# Single-camera CLI override (one-off, e.g., for testing a video file)
.venv/bin/python conflict_detector_last.py \
--rtsp_url /path/to/test_video.mp4 \
--camera_name test_video \
--kami_api_key YOUR-API-KEY
Output Format (stdout JSON)
When a conflict is detected, the worker prints one JSON line and continues monitoring:
{
"alert": "conflict_detected",
"camera": "living_room",
"timestamp": "2026-05-21 14:30:22",
"description": "Two people are engaged in a physical altercation",
"video_clip": "alerts/living_room/conflict_living_room_20260521_143022.mp4",
"snapshot_image": "alerts/living_room/snapshot_living_room_20260521_143022.jpg",
"clip_duration": "10s",
"message": "[living_room] Warning: Physical conflict detected. ..."
}
The
snapshot_imagefield is a single representative frame captured betweenclip_beforeandclip_afterwindows (i.e. at the conflict moment), resized to a 640px long side and saved as JPEG. It is embedded inline in the Feishu / Discord / Telegram cards.
Exit Codes
| Code | Meaning | OpenClaw Action |
|------|---------|-----------------|
| 0 | Normal exit (Ctrl+C, all streams ended, no fatal error) | Inform user; optionally restart |
| 1 | Runtime error (model not found, missing API key, no camera configured) | Report error; check conflict_detector.log |
Continuous mode: detected conflicts do NOT terminate the process — they only push alerts. The process stays alive until interrupted.
Strict Rules (MUST Follow)
- RULE: Alarms flow via (a) stdout JSON line, (b) inbox file, (c) Feishu (optional), (d) Discord (optional), (e) Telegram (optional). Never rely on a single channel.
- RULE (continuous mode): A detected conflict NEVER terminates the process. The worker pushes the alert and resumes monitoring on the same stream. There is no
exit 10; there is no detect-report-restart loop. - RULE (multi-camera): All cameras in
config.json -> cameras[]are monitored concurrently inside a single Python process (shared YOLO ONNX session, shared analyzer, one inference lock, per-camera FrameGrabber + worker thread). Never spawn one process per camera. - RULE (camera id): Every alarm written to
pending.jsonl, Feishu, Discord and Telegram MUST include the camera name in thecamerafield. Themessagefield MUST be prefixed with[<camera_name>]. Snapshots MUST be written underalerts/<camera_name>/. - RULE (camera name prompt): When asking the user for an RTSP URL, the agent MUST in the SAME question also ask for a human-readable camera
name(e.g.living_room,front_door). If the user provides the URL but skips the name, the agent MUST EXPLICITLY notify them that an auto id (camera_0,camera_1, …) will be used; never assign the auto id silently. - RULE (shared push): Feishu / Discord / Telegram credentials in
config.jsonapply to every camera. There is no per-camera webhook split. - RULE (feishu inline image): To render the conflict snapshot inline inside the Feishu card, the agent MUST ask the user for
feishu_app_idANDfeishu_app_secret(self-built app credentials) when configuring Feishu. Without them, the snapshot falls back to a clickable sm.ms URL (or plain text path). The webhook alone is NOT sufficient for inline image rendering. - RULE (snapshot): Every conflict alarm MUST carry a
snapshot_imagefield pointing to a JPEG saved underalerts/<camera_name>/snapshot_<camera_name>_YYYYMMDD_HHMMSS.jpg. The frame is captured betweenclip_beforeandclip_after(i.e. at the conflict moment) and resized to a 640px long side for consistent display across push channels. - RULE: Every heartbeat consumes
alerts/pending.jsonl; non-empty → proactive chat summary; empty →HEARTBEAT_OK. - RULE: Consumed alarms are MOVED to
alerts/consumed/, not deleted. - RULE: Before launch, read
config.json; only ask the user for fields that are empty, and write the answers back intoconfig.jsonso subsequent launches are non-interactive. - RULE (auto-launch): Once at least one camera (with
rtsp_url) andkami_api_keyare present inconfig.json(either pre-existing or just written), the agent MUST runbash setup.shand launch the detector in the same conversation turn. Never end the turn at "config saved" — the user does NOT need to send a second message like "start it" or "begin detection". - RULE: Warn the user if no push channel is configured (chat-window push still active).
Troubleshooting
| Problem | Fix |
|---------|-----|
| Virtual environment not found | Run bash setup.sh |
| Model file missing | setup.sh (or first script run) auto-downloads the ONNX bundle from https://publicfiles.xiaoyi.com/kami-conflict-detection-model.zip, extracts it, moves the .onnx next to the script and deletes the temp folder. If automatic download fails, manually fetch the zip, unzip it, and move yolov8s-worldv2.onnx into the skill directory. |
| unzip not found | Install unzip (e.g. apt install unzip / brew install unzip) and rerun setup.sh. |
| RTSP connection failure | Verify camera is online; check cameras[].rtsp_url |
| LLM API failure | Check kami_api_key; verify network to the Kami API endpoint |
| No alerts generated | See conflict_detector.log; try lowering --conf_threshold |
| Script exits with code 1 | Check log; common causes: missing model, no camera configured, missing API key |
| Feishu card shows snapshot path instead of an inline image | Provide feishu_app_id + feishu_app_secret (self-built app); without them the script falls back to sm.ms URL / local path. |
| Discord / Telegram push silently fails | Mainland-China users must set proxy in config.json (or --proxy). Feishu does NOT need a proxy. |
Scan to join WeChat group