# Alerts

> LLM-validated volume-spike signals with severity, spread, IOCs, and matched cluster context. The structured form of 'something just happened'.

An **alert** is a structured signal that conversation on a topic just spiked outside its normal pattern, validated by an LLM gate, with enough metadata to act on it without a human having to read the underlying posts. Alerts are one of the three pillars of the Intel API — alongside [trending](/trending) and [narrative](/narrative) — and the only one designed for **push-style consumption**: poll on Watch, subscribe to a webhook on See and Know.

## When an alert fires

The pipeline runs continuously. An alert is emitted when **all four** of the following hold:

1. A keyword's per-window volume crosses **5σ above its 14-day rolling baseline** on a topic or watchlist.
2. The spike is spread across **multiple domains and languages** — single-domain bursts are filtered as noise.
3. An **LLM gate** classifies the spike as a real, describable event (not a recurring meme, scheduled show, or platform artifact).
4. The signal hasn't already been emitted in the current deduplication window.

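The four conditions above can be read as a single predicate. This is a minimal sketch, not the pipeline's actual code: the `Spike` container, the `should_emit` name, and the default thresholds of 2 domains and 2 languages are assumptions for illustration (the text above only says "multiple").

```python
from dataclasses import dataclass


@dataclass
class Spike:
    """Hypothetical container for one candidate spike; not an API type."""
    keyword: str
    sigma: float              # deviations above the 14-day rolling baseline
    domain_count: int
    language_count: int
    llm_is_real_event: bool   # verdict of the LLM gate


def should_emit(spike: Spike, already_emitted: set[str],
                min_domains: int = 2, min_languages: int = 2) -> bool:
    """Return True only when all four emission conditions hold."""
    crosses_threshold = spike.sigma >= 5.0
    is_spread = (spike.domain_count >= min_domains
                 and spike.language_count >= min_languages)
    # Assumed dedup key: the keyword within the current deduplication window.
    is_new = spike.keyword not in already_emitted
    return crosses_threshold and is_spread and spike.llm_is_real_event and is_new
```
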
Low-volume topics like `cyber` or `disinfo` may produce **zero alerts in a 24-hour window**. That is by design: alerts are intentionally rare. Use [`/v1/topics/{t}/volume`](/topics-and-watchlists#analytics-endpoints-on-curated-topics) for raw activity instead.

<Note>
  The default `hours=168` (7 days) on `/v1/topics/{t}/alerts` exists for exactly this reason — it gives quiet topics a useful window without forcing every caller to remember the parameter. Tune down to `hours=24` for high-volume topics like `global`.
</Note>

## The alert envelope

Same JSON shape on every endpoint that returns alerts: `/v1/topics/{t}/alerts`, `/v1/watchlists/{id}/alerts`, and webhook deliveries.

```json theme={null}
{
  "alert_id": "c80fcfed-6818-44ed-a0b9-0eda91d1401c",
  "detected_at": "2026-05-18T04:00:30.148Z",
  "topic": "cyber",
  "signal_type": "volume_spike",
  "source": "aggregator",
  "keyword": "dark web",
  "confidence": 0.72,
  "severity": {
    "deviation_sigma": 6.67,
    "current_value": 24.0,
    "baseline_value": 3.29
  },
  "spread": {
    "domain_count": 14,
    "language_count": 8
  },
  "llm_validated": true,
  "description": "Multiple credible data breach disclosures (Turkish breach, FoxIT/Foxit software, gaming accounts) surfacing on dark web with fact-checker verification signals genuine cybersecurity incidents being reported and discussed across platforms.",
  "sample_posts": [
    {
      "preview": "Turkish operator breach reportedly exposed via dark-web listing — fact-check pending...",
      "domain": "x.com",
      "language": "en",
      "captured_at": "2026-05-18T03:42:11Z"
    }
  ],
  "iocs": {
    "urls": [],
    "ips": [],
    "domains": [],
    "hashes": { "md5": [], "sha1": [], "sha256": [] },
    "cves": [],
    "crypto_wallets": [],
    "emails": []
  },
  "matched_cluster": {
    "cluster_id": 258,
    "cluster_title": "Dark-web breach disclosures, May 2026",
    "narrative_context": "Cluster tracking weekly cadence of breach announcements with fact-checker overlay."
  }
}
```

## Field guide

### Identity

| Field         | Type         | Purpose                                                                                                                                     |
| ------------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------- |
| `alert_id`    | UUID         | Stable, globally unique. **Use for dedup across polls and across webhook redeliveries.**                                                    |
| `detected_at` | ISO-8601 UTC | Wall-clock moment the spike crossed threshold. Not when you fetched it.                                                                     |
| `topic`       | string       | Curated topic slug (e.g. `cyber`). Absent on watchlist alerts; `watchlist_id` is present instead.                                           |
| `source`      | enum         | Pipeline stage that emitted the alert: `aggregator`, `cluster`, `entity`. `aggregator` covers volume spikes; the others are content-driven. |

### Signal type

`signal_type` is the discriminator. Today's stable values:

| `signal_type`       | Meaning                                                    | Carries IOCs?                           |
| ------------------- | ---------------------------------------------------------- | --------------------------------------- |
| `volume_spike`      | Keyword volume on the topic rose 5σ or more above baseline | Sometimes (extracted from sample posts) |
| `keyword_spike`     | Synonym for `volume_spike`, retained for legacy clients    | Sometimes                               |
| `coordination`      | Cross-domain synchronised posting pattern                  | Rare                                    |
| `sentiment_shift`   | Sharp sentiment polarity shift on an established narrative | No                                      |
| `anomaly`           | Statistical outlier that doesn't fit other categories      | No                                      |
| `cluster_emergence` | A new conversation cluster just crystallised               | Often                                   |
| `cluster_death`     | An active cluster collapsed below activity threshold       | No                                      |

Match on `signal_type` to route alerts to the right consumer (SOC vs. brand vs. newsroom).

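One way to route on `signal_type` is a lookup table with a fallback for values your code has not seen yet. A sketch only: the consumer names (`newsroom`, `soc`, `brand`, `disinfo`, `triage`) and the mapping itself are hypothetical, so adjust them to your own teams.

```python
from collections import defaultdict

# Hypothetical consumer names; map them to your own queues or channels.
ROUTES = {
    "volume_spike": "newsroom",
    "keyword_spike": "newsroom",   # legacy synonym for volume_spike
    "coordination": "disinfo",
    "sentiment_shift": "brand",
    "anomaly": "triage",
    "cluster_emergence": "soc",
    "cluster_death": "brand",
}


def route(alerts: list[dict]) -> dict[str, list[dict]]:
    """Group alerts by consumer; unknown signal types fall back to triage."""
    buckets: dict[str, list[dict]] = defaultdict(list)
    for alert in alerts:
        buckets[ROUTES.get(alert["signal_type"], "triage")].append(alert)
    return dict(buckets)
```

The fallback matters because the table above lists today's stable values; a router that raises on an unknown value would drop alerts if a new type appears.
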
### Severity

```json theme={null}
"severity": {
  "deviation_sigma": 6.67,
  "current_value": 24.0,
  "baseline_value": 3.29
}
```

| Field             | Meaning                                                                                                                                                                  |
| ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `deviation_sigma` | How many standard deviations above the 14-day baseline. **5.0 is the floor**; anything higher is unusually loud. 6.67 (the example above) is "drop everything and look." |
| `current_value`   | Raw volume in the detection window                                                                                                                                       |
| `baseline_value`  | Mean volume over the trailing 14 days for the same window length                                                                                                         |

The math, in plain terms: `deviation_sigma = (current_value − baseline_value) / σ_14d`, and the alert is only emitted when `deviation_sigma ≥ 5`.

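As a worked check of that formula, the sample alert's numbers imply a 14-day standard deviation of roughly 3.1. The sample envelope does not include `σ_14d` itself, so the sketch below backs it out from the three published values; `deviation_sigma` here is an illustrative helper, not an API call.

```python
def deviation_sigma(current: float, baseline: float, sigma_14d: float) -> float:
    """(current_value - baseline_value) / sigma_14d, as in the formula above."""
    if sigma_14d <= 0:
        raise ValueError("sigma_14d must be positive")
    return (current - baseline) / sigma_14d


# Back out the standard deviation implied by the sample alert.
implied_sigma_14d = (24.0 - 3.29) / 6.67
print(round(implied_sigma_14d, 3))   # 3.105

# Recomputing from that value reproduces the reported deviation.
print(round(deviation_sigma(24.0, 3.29, implied_sigma_14d), 2))   # 6.67
```

With the same baseline and spread, a window volume of about 19 posts would sit right at the 5σ floor (3.29 + 5 × 3.105 ≈ 18.8).
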
### Spread

The virality footprint. Single-domain spikes — even loud ones — are filtered out. An alert with `domain_count: 14, language_count: 8` is a story crossing platforms and language communities, not one viral tweet.

| Field            | Meaning                                                    |
| ---------------- | ---------------------------------------------------------- |
| `domain_count`   | Distinct source domains carrying the keyword in the window |
| `language_count` | Distinct languages (ISO 639-1 codes) of those posts        |

A common disinfo filter is `domain_count >= 5 AND language_count >= 3` (see [Use cases recipe 4](/use-cases#4-disinformation-early-warning)).

### Confidence and LLM validation

| Field           | Meaning                                                                                                                                                           |
| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `confidence`    | Float 0.0–1.0. The model's estimate that this signal is a real event vs. noise. Use as a UI sort key.                                                             |
| `llm_validated` | Boolean. The LLM gate either confirmed the spike represents a describable real-world event, or didn't. **Filter to `true` for high-stakes downstream consumers.** |
| `description`   | Human-readable, English, 1–3 sentences. Editorial-grade. Drop straight into a Slack alert without rewriting.                                                      |

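A small sketch of using `confidence` as a sort key for display. The `rank_for_display` name is hypothetical, and breaking ties on `deviation_sigma` is an assumption of this example, not documented API behaviour.

```python
def rank_for_display(alerts: list[dict], validated_only: bool = True) -> list[dict]:
    """Sort alerts for a UI: highest confidence first, sigma as tie-breaker."""
    if validated_only:
        pool = [a for a in alerts if a["llm_validated"]]
    else:
        pool = list(alerts)
    return sorted(
        pool,
        key=lambda a: (a["confidence"], a["severity"]["deviation_sigma"]),
        reverse=True,
    )
```
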
### Evidence

```json theme={null}
"sample_posts": [
  { "preview": "...", "domain": "x.com", "language": "en", "captured_at": "..." }
]
```

Each alert carries 3–5 representative posts, with `preview` truncated to \~160 characters. For full content, fetch [`/v1/topics/{t}/posts`](/topics-and-watchlists#analytics-endpoints-on-curated-topics) (See tier and above).

### IOCs

The IOC extractor runs on every alert with text content, including `volume_spike` types. The `iocs` object is always present, though its lists are often empty.

```json theme={null}
"iocs": {
  "urls":           [],
  "ips":            [],
  "domains":        [],
  "hashes":         { "md5": [], "sha1": [], "sha256": [] },
  "cves":           [],
  "crypto_wallets": [],
  "emails":         []
}
```

The shape is **always the full schema, even when empty**. Code can iterate keys safely without `if "cves" in iocs` checks.

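Because every key is always present, a flattening helper needs no existence checks. The sketch below assumes only the schema shown above; `flatten_iocs` and the `hashes.<algo>` label format are illustrative choices.

```python
def flatten_iocs(iocs: dict) -> list[tuple[str, str]]:
    """Return (kind, value) pairs for every indicator in the iocs object."""
    pairs: list[tuple[str, str]] = []
    for kind, values in iocs.items():
        if isinstance(values, dict):
            # "hashes" nests one list per algorithm: md5, sha1, sha256.
            for algo, hashes in values.items():
                pairs.extend((f"hashes.{algo}", h) for h in hashes)
        else:
            pairs.extend((kind, v) for v in values)
    return pairs
```

An alert with no indicators yields an empty list, so `if flatten_iocs(alert["iocs"]):` doubles as a presence check.
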
### Matched cluster

If the spike falls inside an existing conversation cluster, the alert links to it:

```json theme={null}
"matched_cluster": {
  "cluster_id": 258,
  "cluster_title": "Dark-web breach disclosures, May 2026",
  "narrative_context": "Cluster tracking weekly cadence of breach announcements..."
}
```

Drill down with `GET /v1/topics/{t}/clusters/{cluster_id}` (See tier) for the full cluster: top entities, top domains, time-series, full evidence post list. `matched_cluster` is `null` when the spike doesn't fit any active cluster — usually meaning it's a brand-new story.

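A sketch of handling both cases when building a drill-down link. `cluster_url` is a hypothetical helper; it returns `None` for a brand-new story and also for watchlist alerts, since those carry no `topic` to put in the path.

```python
from typing import Optional

BASE = "https://intel-v1.exorde.io"


def cluster_url(alert: dict) -> Optional[str]:
    """Build the cluster drill-down URL, or None when there is nothing to link."""
    cluster = alert.get("matched_cluster")
    topic = alert.get("topic")          # absent on watchlist alerts
    if cluster is None or topic is None:
        return None
    return f"{BASE}/v1/topics/{topic}/clusters/{cluster['cluster_id']}"
```
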
## Endpoints that return alerts

| Endpoint                                      | Tier   | Returns                                 |
| --------------------------------------------- | ------ | --------------------------------------- |
| `GET /v1/topics/{topic}/alerts`               | Watch+ | Alerts for a curated topic              |
| `GET /v1/watchlists/{id}/alerts`              | See+   | Alerts scoped to your watchlist's terms |
| `POST /v1/subscriptions` (with `type: alert`) | See+   | Webhook push delivery, same envelope    |

Query parameters on the polling endpoints:

| Param           | Default | Watch cap | See cap | Know cap |
| --------------- | ------- | --------- | ------- | -------- |
| `hours`         | 168     | 24        | 72      | 168      |
| `limit`         | 50      | 50        | 100     | 200      |
| `signal_type`   | (any)   | —         | —       | —        |
| `min_sigma`     | 5.0     | —         | —       | —        |
| `llm_validated` | (any)   | —         | —       | —        |

Request an `hours` value above your tier cap and the response is silently clamped; the JSON includes the effective window in `query_window`.

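To predict the clamp client-side, the caps in the table can be applied before sending the request. A sketch under assumptions: the lowercase tier keys and the `effective_params` name are made up for this example, and the authoritative value is still whatever the response reports in `query_window`.

```python
# Caps copied from the table above; tier keys are this example's own labels.
HOURS_CAP = {"watch": 24, "see": 72, "know": 168}
LIMIT_CAP = {"watch": 50, "see": 100, "know": 200}


def effective_params(tier: str, hours: int = 168, limit: int = 50) -> dict:
    """Clamp requested values to the tier caps, mirroring the server's clamp."""
    return {
        "hours": min(hours, HOURS_CAP[tier]),
        "limit": min(limit, LIMIT_CAP[tier]),
    }


print(effective_params("watch"))            # {'hours': 24, 'limit': 50}
print(effective_params("see", 168, 500))    # {'hours': 72, 'limit': 100}
```

Note that the default `hours=168` is itself above the Watch and See caps, so callers on those tiers get a shorter window unless they are on Know.
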
## Polling pattern (Watch and See)

```python theme={null}
import os, time, httpx
from datetime import datetime, timezone

BASE = "https://intel-v1.exorde.io"
HEADERS = {"X-API-Key": os.environ["EXORDE_API_KEY"]}
SEEN: set[str] = set()


def poll(topic: str, hours: int = 24) -> list[dict]:
    r = httpx.get(
        f"{BASE}/v1/topics/{topic}/alerts",
        params={"hours": hours, "limit": 50, "llm_validated": True},
        headers=HEADERS,
        timeout=10,
    )
    if r.status_code == 429:
        time.sleep(int(r.headers.get("Retry-After", 5)))
        return []
    r.raise_for_status()
    fresh = [a for a in r.json()["alerts"] if a["alert_id"] not in SEEN]
    SEEN.update(a["alert_id"] for a in fresh)
    return fresh


while True:
    for a in poll("global", hours=24):
        sev = a["severity"]
        ts = datetime.now(timezone.utc).strftime("%H:%M:%S")
        print(f"[{ts}] {a['keyword']:<25} σ={sev['deviation_sigma']:.2f} "
              f"({a['spread']['domain_count']}d × {a['spread']['language_count']}l)")
    time.sleep(60)
```

**Cadence guidance:**

| Tier              | Cadence   | Daily call cost |
| ----------------- | --------- | --------------- |
| Watch             | every 60s | \~1,440 / day   |
| See (poll)        | every 10s | \~8,640 / day   |
| See / Know (push) | webhook   | 0 polling calls |

If you need freshness under 5 seconds, **switch to webhooks**. See [Rate limits](/rate-limits#recommended-polling-cadence).

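The daily figures in the cadence table follow directly from the polling interval. A short arithmetic sketch (`calls_per_day` is an illustrative helper):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def calls_per_day(interval_seconds: int) -> int:
    """Number of polls in 24 hours at a fixed interval."""
    return SECONDS_PER_DAY // interval_seconds


print(calls_per_day(60))   # 1440
print(calls_per_day(10))   # 8640
print(calls_per_day(5))    # 17280
```

At a 5-second interval a single poller already makes 17,280 calls a day per topic, which is why push delivery is the better fit at that freshness.
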
## Webhook delivery (See and Know)

```bash theme={null}
curl -X POST https://intel-v1.exorde.io/v1/subscriptions \
  -H "X-API-Key: $EXORDE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "alert",
    "scope": { "kind": "topic", "topic": "cyber" },
    "delivery": {
      "kind": "webhook",
      "url": "https://your.app/exorde-webhook",
      "secret": "whsec_..."
    },
    "filters": {
      "min_sigma": 6.0,
      "llm_validated": true
    }
  }'
```

Each delivery POSTs the alert envelope (above) to your URL, with these headers:

| Header                     | Meaning                                                                 |
| -------------------------- | ----------------------------------------------------------------------- |
| `X-Exorde-Signature`       | `sha256=<hex>` HMAC of the body, signed with your subscription's secret |
| `X-Exorde-Delivery-Id`     | Unique per delivery attempt; **use to dedup retries**                   |
| `X-Exorde-Subscription-Id` | The subscription that produced this event                               |
| `X-Exorde-Event-Type`      | Always `alert` for this subscription type                               |

Verify the signature server-side **before** trusting the payload:

```python theme={null}
import hmac, hashlib

def verify(body: bytes, signature_header: str, secret: str) -> bool:
    expected = "sha256=" + hmac.new(
        secret.encode(), body, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, signature_header)
```

Webhooks that return non-2xx **N times in a row** are auto-paused and emit `webhook_dead`. Re-enable with `PATCH /v1/subscriptions/{id}` once your endpoint is healthy. See [Errors → Subscription / webhook errors](/errors#subscription-webhook-errors).

## Filtering patterns

**Newsroom — only loud, validated, multi-platform stories:**

```python theme={null}
high_signal = [
    a for a in alerts
    if a["llm_validated"]
    and a["severity"]["deviation_sigma"] >= 6.0
    and a["spread"]["domain_count"] >= 8
]
```

**Threat-intel — only alerts carrying actionable IOCs:**

```python theme={null}
def has_iocs(a: dict) -> bool:
    i = a["iocs"]
    # Check every category in the iocs schema, including emails and nested hashes.
    return bool(
        i["urls"] or i["ips"] or i["domains"]
        or i["cves"] or i["crypto_wallets"] or i["emails"]
        or any(i["hashes"].values())
    )

actionable = [a for a in alerts if has_iocs(a)]
```

**Disinfo — coordinated multi-language pushes only:**

```python theme={null}
suspicious = [
    a for a in alerts
    if a["llm_validated"]
    and a["spread"]["language_count"] >= 3
    and a["spread"]["domain_count"] >= 5
    and a["confidence"] >= 0.7
]
```

## Idempotency and dedup

* **Across polls:** `alert_id` is stable. Keep a `set` of seen IDs (or one Redis key per ID written with `SET ... NX EX <ttl>`, since members added with `SADD` cannot expire individually) and skip duplicates.
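* **Across webhook retries:** The same `alert_id` may be redelivered if your endpoint returned a 5xx, so dedup processing on `alert_id`. `X-Exorde-Delivery-Id` is listed as unique per delivery attempt; log it for tracing and use it to discard an exact replay of one attempt.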
* **Across rotations:** Alerts persist through key rotation. The `alert_id` doesn't reset.

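A minimal sketch of a webhook receiver's core logic that combines signature verification with dedup on `alert_id`, which the field guide describes as stable across polls and redeliveries. Assumptions are labeled: `handle_delivery`, the in-memory `SEEN` store, the three return strings, and the exact-case header lookup are all illustrative, and a production service would use a shared store instead of a process-local dict.

```python
import hashlib
import hmac
import json
import time
from typing import Optional

SEEN: dict[str, float] = {}          # alert_id -> first-seen epoch seconds
RETENTION_SECONDS = 7 * 24 * 3600    # 7 days, matching the guidance on this page


def verify(body: bytes, signature_header: str, secret: str) -> bool:
    """Same check as the verification snippet in the webhook section."""
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)


def handle_delivery(body: bytes, headers: dict[str, str], secret: str,
                    now: Optional[float] = None) -> str:
    """Return "rejected", "duplicate", or "accepted" for one webhook POST."""
    now = time.time() if now is None else now
    # Verify against the raw bytes before parsing anything.
    if not verify(body, headers.get("X-Exorde-Signature", ""), secret):
        return "rejected"
    # Forget IDs older than the retention window so SEEN stays bounded.
    for alert_id, first_seen in list(SEEN.items()):
        if now - first_seen > RETENTION_SECONDS:
            del SEEN[alert_id]
    alert = json.loads(body)
    if alert["alert_id"] in SEEN:
        return "duplicate"
    SEEN[alert["alert_id"]] = now
    return "accepted"
```

Respond with a 2xx for `"duplicate"` as well as `"accepted"`; a non-2xx would only trigger another redelivery of an alert you have already processed.
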
## Operational guidance

* **Don't trust `description` for routing** — it's prose. Route on `signal_type`, `topic`, `severity.deviation_sigma`, `iocs` presence.
* **Always pass `llm_validated: true`** in production filters unless you're explicitly hunting noise.
* **Persist `alert_id` for at least 7 days**, the longest window a poll can cover (`hours=168`). Shorter and you'll re-page the on-call.
* **Show the `trace_id`** (response header `X-Exorde-Trace-Id`) on any UI that surfaces an alert. It's the support handshake.
* **`matched_cluster: null` is a feature**, not missing data — it tells you "this is brand new, not part of an ongoing story."
* **Alerts count against RPM but not monthly quota** when delivered via webhook. Push is the right architecture when you need freshness under 5 seconds.

## What's not an alert

For clarity:

| You want                           | Use this                                                                                               |
| ---------------------------------- | ------------------------------------------------------------------------------------------------------ |
| "What are the top terms right now" | [`/v1/topics/{t}/trending`](/trending)                                                                 |
| "What is the dominant storyline"   | [`/v1/topics/{t}/narrative`](/narrative)                                                               |
| "Show me posts mentioning X"       | [`/v1/topics/{t}/search`](/topics-and-watchlists#analytics-endpoints-on-curated-topics) (See+)         |
| "Track my brand specifically"      | [Watchlists](/topics-and-watchlists#custom-watchlists) (See+)                                          |
| "Editorial weekly summary"         | [`/v1/topics/{t}/reports/latest`](/topics-and-watchlists#analytics-endpoints-on-curated-topics) (Know) |

Alerts are the **push-shaped, machine-routable** view of the data. Everything else is pull-shaped and human-shaped.

<Note>Last reviewed: 2026-05-19. API version 1.2.8.</Note>
