Diagent docs

Embed the widget

Allowed origins

The widget script is public on purpose: anyone can fetch /widget/widget.js. That makes allowed origins the trust boundary that stops a third party from embedding your snippet on their own site and burning your quota.

The contract

Every POST /v1/widget/init reads the request's Origin header (or Referer as a fallback) and checks it against the agent's allowed_origins list. The rules are:

  1. Empty list → 403. Deny everywhere. New agents start empty until you add at least one origin.
  2. Wildcard "*" → allow. Opt-in escape hatch for internal tools and demos. Never set as a default.
  3. Otherwise → exact scheme://host match. No subdomain inference. https://example.com does not permit https://app.example.com.
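
The three rules above can be sketched as a small pure function. This is an illustration only, not the platform's implementation: the names normaliseOrigin and isOriginAllowed are hypothetical, normalisation through the standard URL parser is an assumption, and so is the choice to deny a missing Origin when the list holds explicit entries.

```typescript
// Hypothetical helper: reduces a URL or origin string to scheme://host[:port].
// Returns null when the input does not parse as an http(s) URL.
function normaliseOrigin(value: string): string | null {
  try {
    const url = new URL(value);
    if (url.protocol !== "http:" && url.protocol !== "https:") return null;
    return url.origin; // lowercases the host and drops default ports
  } catch {
    return null;
  }
}

// Hypothetical policy check mirroring rules 1 to 3.
function isOriginAllowed(requestOrigin: string | null, allowedOrigins: string[]): boolean {
  if (allowedOrigins.length === 0) return false;         // rule 1: empty list denies
  if (allowedOrigins.indexOf("*") !== -1) return true;   // rule 2: wildcard allows
  if (requestOrigin === null) return false;              // assumed: no Origin, no match
  const candidate = normaliseOrigin(requestOrigin);
  if (candidate === null) return false;
  // rule 3: exact match only, no subdomain inference
  return allowedOrigins.some((entry) => normaliseOrigin(entry) === candidate);
}
```

With this sketch, isOriginAllowed("https://app.example.com", ["https://example.com"]) is false, because the subdomain is a different origin from its parent.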

Every privileged call re-checks Origin

The init endpoint is where the JWT gets minted, but the JWT is a bearer token, and bearer tokens can be exfiltrated (leaked log, browser dev-tools, XSS on a victim site, MITM on cleartext). To stop a stolen JWT from being replayed from attacker.example, every privileged widget endpoint runs through the VerifyWidgetOrigin middleware, which re-validates the request Origin against the JWT-bound agent's allowed_origins on every call:

  • POST /v1/widget/messages
  • POST /v1/widget/messages/stream (SSE)
  • POST /v1/widget/leads
  • POST /v1/widget/request-human
  • POST /v1/widget/events
  • POST /v1/widget/typing
  • POST /v1/widget/satisfaction
  • POST /v1/widget/coupon/apply
  • POST /v1/widget/conversation/clear
  • GET /v1/widget/conversation/messages
  • DELETE /v1/widget/me

Policy is identical to /v1/widget/init: empty list = deny, "*" = allow (including no-Origin), specific entries = exact normalised match. A request that init would have approved can therefore never be 403'd by the post-init middleware, and a request that init would have denied can never sneak past it either.
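
To make the replay scenario concrete, here is a rough sketch of what a per-request re-check could look like. It is not the VerifyWidgetOrigin source. The types HeaderMap, OriginPolicy, and Verdict and the function names are hypothetical; the Referer fallback is assumed to carry over from init; and the allow-list is assumed to have been loaded already for the agent named in the verified JWT.

```typescript
type HeaderMap = Record<string, string | undefined>;
type OriginPolicy = (origin: string | null, allowedOrigins: string[]) => boolean;

interface Verdict {
  status: 200 | 403;
  body?: { error: { code: string; message: string } };
}

// Prefer the Origin header; otherwise fall back to the origin part of Referer.
function resolveRequestOrigin(headers: HeaderMap): string | null {
  const originHeader = headers["origin"];
  if (originHeader) return originHeader;
  const referer = headers["referer"];
  if (!referer) return null;
  try {
    return new URL(referer).origin;
  } catch {
    return null;
  }
}

// Runs the same policy as init on every privileged call, so a token stolen
// from an allowed site is useless when replayed from another origin.
function verifyWidgetOrigin(
  headers: HeaderMap,
  allowedOrigins: string[],
  policy: OriginPolicy,
): Verdict {
  if (policy(resolveRequestOrigin(headers), allowedOrigins)) {
    return { status: 200 };
  }
  return {
    status: 403,
    body: {
      error: { code: "origin_forbidden", message: "Origin is not allowed for this agent." },
    },
  };
}
```

Passing the policy in as a function keeps the sketch honest about the one property that matters here: init and the middleware must share a single decision function, otherwise the two can drift apart.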

Strict subdomain matching

This is the rule that catches people off guard, so it deserves its own callout:

Listing https://thecodestudio.com in allowed_origins does not permit https://pitchbar.thecodestudio.com. Subdomains are independent; list each one explicitly. This prevents an attacker who controls a subdomain (via DNS or shared hosting) from inheriting trust from the parent.

If you actually want all subdomains, list them individually:

https://example.com
https://www.example.com
https://app.example.com
https://docs.example.com

Adding origins

From the agent's Settings page (/app/agents/{id}/settings), the Allowed origins card has a textarea that takes one origin per line. Saving updates the agent immediately; new init requests use the new list within seconds.

Origins must include the scheme:

Valid                      Invalid
https://example.com        example.com
http://localhost:3000      localhost
https://shop.example.com   *.example.com (wildcards aren't supported except as "*")
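
If you generate the list programmatically, a pre-flight check along these lines can catch the invalid forms in the table before you paste them in. The function name isValidOriginEntry is hypothetical, and two details are assumptions rather than documented behaviour: that only http and https schemes are accepted, and that an entry must not carry a path, query, or fragment.

```typescript
// Hypothetical validator for one line of the Allowed origins textarea.
function isValidOriginEntry(entry: string): boolean {
  const trimmed = entry.trim();
  if (trimmed === "*") return true;                  // the only supported wildcard
  if (trimmed.indexOf("*") !== -1) return false;     // e.g. *.example.com
  if (!/^https?:\/\//i.test(trimmed)) return false;  // the scheme is required
  try {
    const url = new URL(trimmed);
    // Assumed: an origin is scheme + host (+ port) with nothing after it.
    return url.hostname !== "" && url.pathname === "/" && url.search === "" && url.hash === "";
  } catch {
    return false;
  }
}
```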

Testing locally

During development, add http://localhost:3000 (or whatever port you're using) to the agent's allowed origins. Don't use "*" for this; leaving it on by accident in production leaves the agent open.

What happens on rejection

A request from a disallowed origin gets a JSON 403:

{
    "error": {
        "code": "origin_forbidden",
        "message": "Origin is not allowed for this agent."
    }
}

The widget's loader handles this gracefully: the launcher disappears silently rather than throwing a console error, so visitors never see a broken UI. The 403 is logged on the platform side so you can watch for abuse patterns.
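
If you call the widget endpoints from your own test harness or monitoring script, the rejection can be recognised from the status code and the error code together. The helper below is a minimal sketch; isOriginForbidden is a hypothetical name, and the only shape it relies on is the JSON body shown above.

```typescript
// True only for the documented rejection: HTTP 403 with error.code "origin_forbidden".
function isOriginForbidden(status: number, body: unknown): boolean {
  if (status !== 403 || typeof body !== "object" || body === null) return false;
  const error = (body as { error?: unknown }).error;
  if (typeof error !== "object" || error === null) return false;
  return (error as { code?: unknown }).code === "origin_forbidden";
}
```

A typical use would be to pass in the status and the parsed JSON of a response, then treat a true result as "this origin is not on the agent's list" rather than as a transient failure worth retrying.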

What about same-host origins

The check compares scheme + host, where the host includes any port. So http and https are distinct (as they should be), and different ports are different origins (http://localhost:3000 ≠ http://localhost:3001).
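
The snippet below shows how the standard URL parser, which browsers also use when building the Origin header, treats scheme, host, and port. It illustrates the web origin concept in general; whether the platform's own normalisation agrees in every edge case is an assumption. The helper name sameOrigin is hypothetical.

```typescript
// URL#origin is scheme + host + port, with default ports (80, 443) omitted.
function sameOrigin(a: string, b: string): boolean {
  return new URL(a).origin === new URL(b).origin;
}

console.log(sameOrigin("http://localhost:3000", "http://localhost:3001"));     // false: port differs
console.log(sameOrigin("http://example.com", "https://example.com"));          // false: scheme differs
console.log(sameOrigin("https://example.com", "https://example.com:443"));     // true: 443 is the https default
console.log(sameOrigin("https://example.com", "https://EXAMPLE.com/pricing")); // true: host case and path are ignored
```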

Wildcards: when to use, when not

"*" exists for cases where you genuinely don't know the origin in advance:

  • Internal demo agents that get embedded on every prospect's preview site.
  • Sandbox / preview environments where origin churns daily.

For production agents, never use it. The cost of leaving "*" in place is that anyone who finds your data-agent-id can drain your quota. The cost of an explicit list is one minute per new origin.

Restricted paths: the path-level companion

Allowed origins draws the trust boundary at the domain level (only https://shop.example.com can load the widget). Restricted paths is its sibling: a list of URL paths within an already-allowed origin where the widget should NOT mount. Use it to keep the bot off your own /admin, /checkout, or /account flows without touching code.

Each entry is a glob: * is the only wildcard, and it matches greedily across slashes. Comparison is case-insensitive against window.location.pathname:

Pattern      Matches                                    Doesn't match
/admin       /admin, /Admin                             /admin/users (use /admin/* for that)
/admin/*     /admin/users, /admin/billing/invoices      /admin exactly (the bare prefix); list both if you want both
/checkout    /checkout                                  /checkout/confirm
/account/*   /account/profile, /account/security        /Help/account
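
The matching rules in the table can be reproduced with a short glob-to-RegExp conversion. This is a sketch of the described behaviour, not the widget's actual code; globToRegExp and isPathRestricted are hypothetical names.

```typescript
// Turns a restricted-path glob into an anchored, case-insensitive RegExp.
// "*" becomes ".*" (greedy, crosses slashes); everything else is literal.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((literal) => literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp("^" + escaped + "$", "i");
}

// True when any pattern matches the whole pathname.
function isPathRestricted(pathname: string, restrictedPaths: string[]): boolean {
  return restrictedPaths.some((pattern) => globToRegExp(pattern).test(pathname));
}
```

Because each pattern is anchored at both ends, /admin matches only the bare path, which is why the table suggests listing both /admin and /admin/* when you want the whole section covered.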

How it works at runtime

The agent's restricted_paths list rides the same POST /v1/widget/init response as the rest of the config. After init succeeds, the widget checks window.location.pathname against the list. If any pattern matches, the bar never mounts, the trigger engine never starts, and no further HTTP requests are made from that page. The init call itself still happens (the server is the source of truth), so if the overhead matters, also gate at the script-tag level using allowed_origins for the host.

Authoring

From the agent's Settings page, the Restricted paths card has a textarea that takes one path per line. Same UX as Allowed origins. Empty list = no restrictions (widget mounts everywhere within an allowed origin). Up to 32 entries, each up to 200 characters.
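
When the list is produced by a script, the two limits can be checked before submitting. The sketch below is illustrative: parseRestrictedPaths is a hypothetical name, and trimming lines, ignoring blank lines, and measuring length with JavaScript's string length are all assumptions about how the Settings page counts.

```typescript
const MAX_ENTRIES = 32;        // documented limit on list size
const MAX_ENTRY_LENGTH = 200;  // documented limit per entry

// Splits textarea content into entries and reports limit violations.
function parseRestrictedPaths(text: string): { paths: string[]; errors: string[] } {
  const paths = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const errors: string[] = [];
  if (paths.length > MAX_ENTRIES) {
    errors.push(`Too many entries: ${paths.length} (limit ${MAX_ENTRIES}).`);
  }
  paths.forEach((path, index) => {
    if (path.length > MAX_ENTRY_LENGTH) {
      errors.push(`Entry ${index + 1} exceeds ${MAX_ENTRY_LENGTH} characters.`);
    }
  });
  return { paths, errors };
}
```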

Why this is a separate knob from auth

The platform also auto-suppresses the marketing demo widget on authenticated admin/customer routes via a server-side check in the Inertia root layout. That's a hard guarantee that doesn't depend on agent config. restricted_paths is the buyer-side extension: even on a fully unauthenticated marketing site, /checkout shouldn't be cluttered with a sales chat bot.

How auto-index uses origins

Auto-index uses the same allow-list, but with a twist when "*" is set: the page URL being auto-indexed must match the visitor's actual Origin header. That stops a malicious page from auto-indexing arbitrary third-party domains via the wildcard.
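
The wildcard twist can be sketched as follows. This is a guess at the shape of the check, not the platform's code: originOf and canAutoIndex are hypothetical names, and the behaviour for explicit lists (the page's own origin must be listed) is an assumption, since the text above only spells out the "*" case.

```typescript
// Origin of an http(s) URL, or null when it cannot be determined.
function originOf(value: string): string | null {
  try {
    const parsed = new URL(value).origin;
    return parsed === "null" ? null : parsed; // opaque origins serialise as "null"
  } catch {
    return null;
  }
}

function canAutoIndex(
  pageUrl: string,
  requestOrigin: string | null,
  allowedOrigins: string[],
): boolean {
  const pageOrigin = originOf(pageUrl);
  if (pageOrigin === null || allowedOrigins.length === 0) return false;
  if (allowedOrigins.indexOf("*") !== -1) {
    // Wildcard twist: the page must belong to the origin that sent the request.
    return requestOrigin !== null && originOf(requestOrigin) === pageOrigin;
  }
  // Assumed for explicit lists: the page's origin must itself be listed.
  return allowedOrigins.some((entry) => originOf(entry) === pageOrigin);
}
```

Under "*", a request from https://attacker.example asking to index https://victim.example/page is refused, because the page's origin and the request's Origin differ.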