
Integrations

Connect Alertmanager, Grafana, Datadog, PagerDuty, or custom clients.

Connect your monitoring stack to OpenRemedy to automatically create and classify incidents.

All integrations use the same webhook endpoint:

POST https://<your-domain>/api/v1/webhooks/alerts/<tenant-slug>
Content-Type: application/json
X-OpenRemedy-Signature: sha256=<hex digest>
 
{
  "hostname": "web-01",
  "incident_type": "disk_full",
  "severity": "high",
  "evidence": {
    "disk_usage_percent": 95,
    "mount": "/"
  }
}

Supported incident types: service_down, disk_full, cpu_high, memory_high, port_unavailable, custom

Supported severities: critical, high, medium, low, info

Evidence: Any JSON object with relevant data. The classifier uses this to match recipes.


Authentication: HMAC signatures (required)

The webhook endpoint requires every request to be signed with HMAC-SHA256 using the tenant's webhook_secret. Unsigned or incorrectly signed requests receive 401 Unauthorized. The endpoint is also rate-limited to 60 requests/min per source IP.

1. Get your tenant's webhook secret. A secret is auto-generated for each tenant. Fetch it from the tenant settings page in the dashboard, or ask a tenant admin / superadmin.

2. Sign the raw request body. The signature is sha256=<hex> where <hex> is HMAC-SHA256(webhook_secret, raw_body).hexdigest(). The body must be byte-exact — no re-serialisation between signing and sending, or the digest won't match.

3. Send the signature in X-OpenRemedy-Signature.

Examples

bash + openssl + curl:

SECRET="your-tenant-webhook-secret"
BODY='{"hostname":"web-01","incident_type":"disk_full","severity":"high","evidence":{}}'
SIG=$(printf '%s' "$BODY" | openssl dgst -sha256 -hmac "$SECRET" | awk '{print $2}')
 
curl -X POST https://your-platform.example.com/api/v1/webhooks/alerts/my-company \
  -H "Content-Type: application/json" \
  -H "X-OpenRemedy-Signature: sha256=$SIG" \
  -d "$BODY"

Python:

import hashlib, hmac, json, requests
 
secret = "your-tenant-webhook-secret"
body = json.dumps({"hostname": "web-01", "incident_type": "disk_full", "severity": "high", "evidence": {}})
sig = hmac.new(secret.encode(), body.encode(), hashlib.sha256).hexdigest()
 
requests.post(
    "https://your-platform.example.com/api/v1/webhooks/alerts/my-company",
    data=body,  # NOT json= (would re-serialise and break the signature)
    headers={
        "Content-Type": "application/json",
        "X-OpenRemedy-Signature": f"sha256={sig}",
    },
)

Node.js:

import crypto from "crypto";
import fetch from "node-fetch";
 
const secret = "your-tenant-webhook-secret";
const body = JSON.stringify({ hostname: "web-01", incident_type: "disk_full", severity: "high", evidence: {} });
const sig = crypto.createHmac("sha256", secret).update(body).digest("hex");
 
await fetch("https://your-platform.example.com/api/v1/webhooks/alerts/my-company", {
  method: "POST",
  body,
  headers: {
    "Content-Type": "application/json",
    "X-OpenRemedy-Signature": `sha256=${sig}`,
  },
});

For senders that can't sign on the wire (e.g. Grafana's basic webhook, PagerDuty), use a sidecar adapter that takes the upstream payload, signs it, and forwards. The Alertmanager example below shows the pattern.


1. OpenRemedy Monitor Script (No Dependencies)

The simplest option — a bash script that runs via cron on each server.

Install

# Download
sudo mkdir -p /opt/openremedy
sudo curl -sSL https://raw.githubusercontent.com/OpenRemedy/openremedy/main/scripts/openremedy-monitor.sh \
  -o /opt/openremedy/monitor.sh
sudo chmod +x /opt/openremedy/monitor.sh

Configure Cron (every 5 minutes)

# Edit crontab
crontab -e
 
# Add this line:
*/5 * * * * OREMEDY_URL=https://your-platform.example.com OREMEDY_TENANT=my-company DISK_THRESHOLD=90 CPU_THRESHOLD=90 MEM_THRESHOLD=90 CHECK_SERVICES=nginx,mysql,redis CHECK_PORTS=80,443,3306 /opt/openremedy/monitor.sh >> /var/log/openremedy-monitor.log 2>&1

Environment Variables

Variable         Default     Description
OREMEDY_URL      (required)  OpenRemedy API URL
OREMEDY_TENANT   (required)  Your tenant slug
DISK_THRESHOLD   90          Disk usage % to trigger alert
CPU_THRESHOLD    90          CPU load % to trigger alert
MEM_THRESHOLD    90          Memory usage % to trigger alert
CHECK_SERVICES   (empty)     Comma-separated systemd services to monitor
CHECK_PORTS      (empty)     Comma-separated ports to check
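
Before adding the cron entry, you can run the script once by hand with the same variables to confirm it reaches your OpenRemedy instance (values below are placeholders):

OREMEDY_URL=https://your-platform.example.com OREMEDY_TENANT=my-company \
  CHECK_SERVICES=nginx,mysql CHECK_PORTS=80,443 /opt/openremedy/monitor.sh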

What It Checks

  • Disk: All non-virtual mounts. Alerts when usage >= threshold.
  • CPU: 1-minute load average / number of cores. Alerts when >= threshold.
  • Memory: Used/total RAM. Alerts when usage >= threshold.
  • Services: Checks systemctl is-active for each listed service.
  • Ports: Checks ss -tlnp for each listed port.

2. Prometheus + Alertmanager

If you already run Prometheus, configure Alertmanager to forward alerts to OpenRemedy.

Step 1: Prometheus Alert Rules

Create or edit your alert rules file (e.g. /etc/prometheus/rules/openremedy.yml):

groups:
  - name: openremedy
    rules:
      - alert: DiskFull
        expr: (1 - node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100 > 90
        for: 5m
        labels:
          severity: high
          incident_type: disk_full
        annotations:
          summary: "Disk usage above 90% on {{ $labels.instance }}"
          hostname: "{{ $labels.instance }}"
          mount: "{{ $labels.mountpoint }}"
          usage_percent: "{{ $value }}"
 
      - alert: HighCPU
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
        for: 5m
        labels:
          severity: high
          incident_type: cpu_high
        annotations:
          summary: "CPU usage above 90% on {{ $labels.instance }}"
          hostname: "{{ $labels.instance }}"
          cpu_percent: "{{ $value }}"
 
      - alert: HighMemory
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
        for: 5m
        labels:
          severity: high
          incident_type: memory_high
        annotations:
          summary: "Memory usage above 90% on {{ $labels.instance }}"
          hostname: "{{ $labels.instance }}"
          memory_percent: "{{ $value }}"
 
      - alert: ServiceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
          incident_type: service_down
        annotations:
          summary: "Target {{ $labels.instance }} is down"
          hostname: "{{ $labels.instance }}"
 
      - alert: PortDown
        expr: probe_success == 0
        for: 2m
        labels:
          severity: high
          incident_type: port_unavailable
        annotations:
          summary: "Port probe failed on {{ $labels.instance }}"
          hostname: "{{ $labels.instance }}"
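
You can validate the rules file with promtool before reloading Prometheus:

promtool check rules /etc/prometheus/rules/openremedy.yml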

Step 2: Alertmanager Webhook Receiver

Edit /etc/alertmanager/alertmanager.yml:

global:
  resolve_timeout: 5m
 
route:
  receiver: openremedy
  group_by: ['alertname', 'instance']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
 
receivers:
  - name: openremedy
    webhook_configs:
      # The signing adapter from Step 3; it reshapes and signs, then forwards to OpenRemedy
      - url: 'http://localhost:9095/alertmanager'
        send_resolved: true
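
Likewise, amtool can check the Alertmanager config before you restart it:

amtool check-config /etc/alertmanager/alertmanager.yml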

Step 3: Alertmanager Webhook Adapter

Alertmanager sends a different JSON payload than OpenRemedy expects, so a small adapter is needed (native Alertmanager support is planned for a future release).

For now, run this adapter script as a sidecar next to Alertmanager:

#!/usr/bin/env python3
"""Alertmanager → OpenRemedy webhook adapter. Run as a sidecar.
 
Reads the tenant's HMAC secret from the env so it can sign every
forwarded request. See the Authentication section above for how to
fetch the secret.
"""
import hashlib
import hmac
import json
import os
 
from flask import Flask, request, jsonify
import requests
 
app = Flask(__name__)
 
OPENREMEDY_URL = "https://your-platform.example.com/api/v1/webhooks/alerts/my-company"
WEBHOOK_SECRET = os.environ["OPENREMEDY_WEBHOOK_SECRET"]
 
def _signed_post(url, payload):
    body = json.dumps(payload).encode("utf-8")
    sig = hmac.new(WEBHOOK_SECRET.encode(), body, hashlib.sha256).hexdigest()
    return requests.post(
        url,
        data=body,  # raw bytes, NOT json=  (re-serialising would break the signature)
        headers={
            "Content-Type": "application/json",
            "X-OpenRemedy-Signature": f"sha256={sig}",
        },
        timeout=10,
    )
 
@app.route("/alertmanager", methods=["POST"])
def handle():
    data = request.json
    for alert in data.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
 
        payload = {
            "hostname": annotations.get("hostname", labels.get("instance", "unknown")).split(":")[0],
            "incident_type": labels.get("incident_type", "custom"),
            "severity": labels.get("severity", "medium"),
            "evidence": {
                "alertname": labels.get("alertname", ""),
                "status": alert.get("status", ""),
                **{k: v for k, v in annotations.items() if k != "hostname"},
            },
        }
        try:
            _signed_post(OPENREMEDY_URL, payload)
        except Exception as e:
            print(f"Failed to send alert: {e}")
 
    return jsonify({"status": "ok"})
 
if __name__ == "__main__":
    app.run(host="0.0.0.0", port=9095)

Run the adapter with the tenant secret in its environment: OPENREMEDY_WEBHOOK_SECRET=<your-tenant-webhook-secret> python3 alertmanager_adapter.py

Point Alertmanager to: http://localhost:9095/alertmanager
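
To smoke-test the adapter before wiring in Alertmanager, you can post a hand-built payload in Alertmanager's webhook shape (an alerts array) and check that the incident shows up in OpenRemedy:

curl -X POST http://localhost:9095/alertmanager \
  -H "Content-Type: application/json" \
  -d '{"alerts":[{"status":"firing","labels":{"alertname":"DiskFull","incident_type":"disk_full","severity":"high","instance":"web-01:9100"},"annotations":{"summary":"Disk usage above 90% on web-01","mount":"/","usage_percent":"95"}}]}'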


3. Grafana Alerts

Grafana can send webhook notifications to OpenRemedy. Note that Grafana's webhook contact point can't sign the request itself (see the Authentication section), so route it through a signing sidecar; the steps below define the payload, and the note at the end of this section shows the sidecar.

Step 1: Create Contact Point

  1. Go to Alerting → Contact Points → New Contact Point
  2. Name: OpenRemedy
  3. Type: Webhook
  4. URL: https://your-platform.example.com/api/v1/webhooks/alerts/my-company
  5. HTTP Method: POST

Step 2: Create Notification Template

Go to Alerting → Notification Templates → New Template:

{{ define "openremedy" }}
{
  "hostname": "{{ (index .Alerts 0).Labels.instance }}",
  "incident_type": "{{ (index .Alerts 0).Labels.incident_type | default "custom" }}",
  "severity": "{{ (index .Alerts 0).Labels.severity | default "medium" }}",
  "evidence": {
    "alertname": "{{ (index .Alerts 0).Labels.alertname }}",
    "message": "{{ (index .Alerts 0).Annotations.summary }}",
    "value": "{{ (index .Alerts 0).Values }}"
  }
}
{{ end }}

Step 3: Create Alert Rules

Use Grafana's alert rules with labels:

  • incident_type: disk_full, cpu_high, memory_high, service_down, port_unavailable
  • severity: critical, high, medium, low
  • instance: hostname of the affected server
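
A note on signing: because the webhook contact point can't add the X-OpenRemedy-Signature header, point it at a signing sidecar rather than directly at OpenRemedy. Grafana's webhook body, like Alertmanager's, carries an alerts array with labels and annotations, so one option is to reuse the Alertmanager adapter above and expose an extra route for Grafana; a minimal sketch (the /grafana route name is illustrative):

@app.route("/grafana", methods=["POST"])
def handle_grafana():
    # Grafana's payload also has an "alerts" list with labels and annotations,
    # so the Alertmanager handler above can process it unchanged.
    return handle()

Then set the contact point URL from Step 1 to http://<adapter-host>:9095/grafana.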

4. Datadog

Step 1: Create Webhook Integration

  1. Go to Integrations → Webhooks
  2. Click New Webhook
  3. Name: OpenRemedy
  4. URL: https://your-platform.example.com/api/v1/webhooks/alerts/my-company
  5. Payload:
{
  "hostname": "$HOSTNAME",
  "incident_type": "$ALERT_TYPE",
  "severity": "$ALERT_PRIORITY",
  "evidence": {
    "title": "$EVENT_TITLE",
    "message": "$EVENT_MSG",
    "tags": "$TAGS",
    "link": "$LINK",
    "metric": "$ALERT_METRIC",
    "value": "$ALERT_VALUE"
  }
}

Step 2: Use in Monitors

When creating a Datadog Monitor, add @webhook-OpenRemedy to the notification message.

Map Datadog priorities to OpenRemedy severities:

Datadog   OpenRemedy
P1        critical
P2        high
P3        medium
P4        low
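
As with Grafana and PagerDuty, Datadog's webhook can't compute the HMAC signature over the payload, and the raw $ALERT_PRIORITY / $ALERT_TYPE values don't match OpenRemedy's severities and incident types, so the mapping above has to be applied somewhere in between. One option is a small signing sidecar in the spirit of the Alertmanager adapter; a minimal sketch (the /datadog route, port, and fallback behaviour are illustrative, and it assumes the payload template from Step 1):

#!/usr/bin/env python3
"""Datadog → OpenRemedy webhook adapter (sketch)."""
import hashlib
import hmac
import json
import os

from flask import Flask, request, jsonify
import requests

app = Flask(__name__)

OPENREMEDY_URL = "https://your-platform.example.com/api/v1/webhooks/alerts/my-company"
WEBHOOK_SECRET = os.environ["OPENREMEDY_WEBHOOK_SECRET"]

# Datadog $ALERT_PRIORITY values → OpenRemedy severities (table above)
PRIORITY_MAP = {"P1": "critical", "P2": "high", "P3": "medium", "P4": "low"}
KNOWN_TYPES = {"service_down", "disk_full", "cpu_high", "memory_high", "port_unavailable"}

@app.route("/datadog", methods=["POST"])
def handle():
    # The body is the JSON produced by the webhook payload template in Step 1
    data = request.get_json(force=True, silent=True) or {}
    incident_type = data.get("incident_type", "")
    payload = {
        "hostname": data.get("hostname", "unknown"),
        # $ALERT_TYPE is not an OpenRemedy incident type, so fall back to custom
        "incident_type": incident_type if incident_type in KNOWN_TYPES else "custom",
        "severity": PRIORITY_MAP.get(data.get("severity", ""), "medium"),
        "evidence": data.get("evidence", {}),
    }
    body = json.dumps(payload).encode("utf-8")
    sig = hmac.new(WEBHOOK_SECRET.encode(), body, hashlib.sha256).hexdigest()
    requests.post(
        OPENREMEDY_URL,
        data=body,  # raw signed bytes, not json=
        headers={
            "Content-Type": "application/json",
            "X-OpenRemedy-Signature": f"sha256={sig}",
        },
        timeout=10,
    )
    return jsonify({"status": "ok"})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=9097)

If you use it, point the webhook URL from Step 1 at the adapter (e.g. http://<adapter-host>:9097/datadog) instead of OpenRemedy directly; the payload template stays the same.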

5. PagerDuty

Step 1: Create Webhook Extension

  1. Go to Services → Your Service → Integrations
  2. Add Generic Webhook (v3)
  3. URL: your adapter's endpoint from Step 2, e.g. http://<adapter-host>:9096/pagerduty (it must be reachable from PagerDuty)

Step 2: PagerDuty Adapter

PagerDuty's webhook format differs from OpenRemedy's, so run this adapter as a sidecar; it reshapes the payload and signs the forwarded requests:

#!/usr/bin/env python3
"""PagerDuty → OpenRemedy webhook adapter. Run as a sidecar.

Reads the tenant's HMAC secret from the env so it can sign every
forwarded request (see the Authentication section above).
"""
import hashlib
import hmac
import json
import os

from flask import Flask, request, jsonify
import requests

app = Flask(__name__)

OPENREMEDY_URL = "https://your-platform.example.com/api/v1/webhooks/alerts/my-company"
WEBHOOK_SECRET = os.environ["OPENREMEDY_WEBHOOK_SECRET"]

SEVERITY_MAP = {
    "critical": "critical",
    "error": "high",
    "warning": "medium",
    "info": "low",
    # PagerDuty incident urgency values are "high" / "low"
    "high": "high",
    "low": "low",
}

def _signed_post(url, payload):
    body = json.dumps(payload).encode("utf-8")
    sig = hmac.new(WEBHOOK_SECRET.encode(), body, hashlib.sha256).hexdigest()
    return requests.post(
        url,
        data=body,  # raw bytes, NOT json=  (re-serialising would break the signature)
        headers={
            "Content-Type": "application/json",
            "X-OpenRemedy-Signature": f"sha256={sig}",
        },
        timeout=10,
    )

@app.route("/pagerduty", methods=["POST"])
def handle():
    data = request.json
    for event in data.get("messages", []):
        incident = event.get("incident", {})
        payload = {
            "hostname": incident.get("impacted_services", [{}])[0].get("name", "unknown"),
            "incident_type": "custom",
            "severity": SEVERITY_MAP.get(incident.get("urgency", ""), "medium"),
            "evidence": {
                "title": incident.get("title", ""),
                "description": incident.get("description", ""),
                "pagerduty_id": incident.get("id", ""),
                "status": incident.get("status", ""),
                "url": incident.get("html_url", ""),
            },
        }
        try:
            _signed_post(OPENREMEDY_URL, payload)
        except Exception as e:
            print(f"Failed to send alert: {e}")

    return jsonify({"status": "ok"})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=9096)
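
Run the adapter with the tenant secret in its environment: OPENREMEDY_WEBHOOK_SECRET=<your-tenant-webhook-secret> python3 pagerduty_adapter.py (the filename is up to you)

Point the PagerDuty webhook URL (Step 1) to: http://<adapter-host>:9096/pagerduty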

6. Custom / Any HTTP Source

Any system that can send HTTP POST requests can integrate with OpenRemedy. Each request still has to carry the X-OpenRemedy-Signature header described in the Authentication section, so the examples below include the signing step.

Curl Example

SECRET="your-tenant-webhook-secret"
BODY='{"hostname":"web-01","incident_type":"service_down","severity":"critical","evidence":{"service_name":"nginx","service_active":false,"error":"Connection refused on port 80"}}'
SIG=$(printf '%s' "$BODY" | openssl dgst -sha256 -hmac "$SECRET" | awk '{print $2}')
 
curl -X POST https://your-platform.example.com/api/v1/webhooks/alerts/my-company \
  -H "Content-Type: application/json" \
  -H "X-OpenRemedy-Signature: sha256=$SIG" \
  -d "$BODY"

Python Example

import hashlib, hmac, json, requests
 
secret = "your-tenant-webhook-secret"
body = json.dumps({
    "hostname": "db-01",
    "incident_type": "memory_high",
    "severity": "high",
    "evidence": {
        "memory_percent": 94,
        "total_mb": 16384,
        "used_mb": 15400,
        "top_process": "mysqld",
    },
})
sig = hmac.new(secret.encode(), body.encode(), hashlib.sha256).hexdigest()
 
requests.post(
    "https://your-platform.example.com/api/v1/webhooks/alerts/my-company",
    data=body,  # the exact signed bytes, not json=
    headers={
        "Content-Type": "application/json",
        "X-OpenRemedy-Signature": f"sha256={sig}",
    },
)

Node.js Example

import crypto from "crypto";
 
const secret = "your-tenant-webhook-secret";
const body = JSON.stringify({
  hostname: "api-01",
  incident_type: "cpu_high",
  severity: "high",
  evidence: {
    cpu_percent: 97,
    load_1m: 8.5,
    cores: 4,
  },
});
const sig = crypto.createHmac("sha256", secret).update(body).digest("hex");
 
await fetch("https://your-platform.example.com/api/v1/webhooks/alerts/my-company", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "X-OpenRemedy-Signature": `sha256=${sig}`,
  },
  body,
});

Evidence Best Practices

The more context you include in the evidence field, the better OpenRemedy's LLM classifier can match incidents to recipes.

Good evidence for each incident type:

Type              Key Evidence Fields
disk_full         disk_usage_percent, mount, largest_files
cpu_high          cpu_percent, load_1m, cores, top_process
memory_high       memory_percent, total_mb, used_mb, top_process
service_down      service_name, service_active, error, exit_code
port_unavailable  port, port_open, expected_service
custom            Any relevant JSON; the LLM will analyze it
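
For example, a disk_full alert with well-populated evidence might look like this (values are illustrative):

{
  "hostname": "web-01",
  "incident_type": "disk_full",
  "severity": "high",
  "evidence": {
    "disk_usage_percent": 95,
    "mount": "/var",
    "largest_files": [
      "/var/log/app/app.log (4.2G)",
      "/var/lib/docker/overlay2 (3.1G)"
    ]
  }
}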